Deep learning has achieved strong results in many practical applications, but network architectures still depend largely on manual design. Neural Architecture Search (NAS) was introduced to free architecture design from this manual process. NAS consists of three main components: the search space, the search strategy, and the performance estimation strategy. Because the search space of NAS is enormous, the search process can be extremely long, so a good search strategy is needed to find a high-performance architecture in a short time. In this paper, we study search strategies for NAS and propose the UCB-ENAS algorithm based on reinforcement learning, which significantly improves search efficiency in a flexible manner. The NAS problem can be regarded as a stateless multi-armed bandit problem, so we combine a long short-term memory (LSTM) network with Upper Confidence Bounds (UCB) to build a controller that generates network architectures, and then use the policy-based REINFORCE algorithm to update the controller parameters to maximize the expected reward. Controller parameters and model parameters are optimized alternately. Extensive experiments show that the proposed algorithm searches network architectures quickly and efficiently: it is faster than ENAS in search speed, and the resulting architectures outperform those found by DARTS (first order). For example, it achieves a perplexity of 56.54 on the PTB dataset.
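To illustrate the multi-armed bandit view mentioned in the abstract: in the standard UCB1 rule (a common instantiation of Upper Confidence Bounds, shown here as background rather than as the paper's exact controller), each arm is scored by its empirical mean reward plus an exploration bonus that shrinks as the arm is pulled more often. The function below is a minimal, self-contained sketch of that rule; the names and the exploration constant `c` are illustrative, not taken from the paper.

```python
import math

def ucb1_select(counts, rewards, c=math.sqrt(2)):
    """Pick the arm maximizing mean reward plus an exploration bonus.

    counts[i]  -- number of times arm i has been pulled
    rewards[i] -- cumulative reward collected from arm i
    c          -- exploration constant (sqrt(2) is the classic UCB1 choice)
    """
    total = sum(counts)
    best, best_score = 0, float("-inf")
    for i, (n, r) in enumerate(zip(counts, rewards)):
        if n == 0:
            return i  # pull every arm once before trusting the bound
        score = r / n + c * math.sqrt(math.log(total) / n)
        if score > best_score:
            best, best_score = i, score
    return best
```

In a NAS setting the "arms" would correspond to architectural choices proposed by the controller, with validation performance serving as the reward; the bonus term keeps rarely tried choices from being discarded too early.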

Research on UCB-ENAS based on reinforcement learning
Song Xue, Bo Zhao, Hanlin Chen, Ruiqi Wang, Baochang Zhang
