Survey on neural architecture search
2021, Vol. 26, No. 2, pp. 245-264
Received: 2020-05-24; Revised: 2020-07-16; Accepted: 2020-07-23; Published in print: 2021-02-16
DOI: 10.11834/jig.200202
Deep neural networks have made great progress in artificial intelligence tasks such as image recognition, speech recognition, and machine translation, which is largely attributable to well-designed network architectures. Most neural networks are designed by hand, which requires expert machine learning knowledge and extensive trial and error. Automated neural architecture search has therefore become a research hotspot. Neural architecture search (NAS) consists of three main components: the search space, the search strategy, and the performance estimation method. In search space design, to limit computation, the entire network architecture is usually not searched directly; instead, the network is first divided into several blocks, and the structure within each block is searched. Depending on the application, the searched structure can be shared across blocks, or each block can be searched separately for a distinct structure. For search strategies, the mainstream optimization methods include reinforcement learning, evolutionary algorithms, Bayesian optimization, and gradient-based optimization. For performance estimation, to save computation time, each candidate network is usually not fully trained to convergence; instead, techniques such as weight sharing and early stopping are used to minimize the training time of each network. Compared with hand-designed networks, deep neural networks obtained by neural architecture search achieve better performance. On the ImageNet classification task, MobileNetV3, obtained through neural architecture search, reduces computation by nearly 30% compared with the hand-designed MobileNetV2 while improving top-1 classification accuracy by 3.2%. On the Cityscapes semantic segmentation task, Auto-DeepLab-L, obtained through neural architecture search, achieves a higher mean intersection over union (mIoU) than the hand-designed DeepLabv3+ without ImageNet pretraining, while using less than half the computation. Deep neural networks obtained by neural architecture search generally outperform hand-designed networks and represent the future direction of neural network design.
Deep neural networks (DNNs) have achieved remarkable progress over the past years on a variety of tasks, such as image recognition, speech recognition, and machine translation. One of the most crucial aspects of this progress is novel neural architectures, in which hierarchical feature extractors are learned from data in an end-to-end manner rather than designed manually. Neural network training can be considered an automatic feature engineering process, and its success has been accompanied by an increasing demand for architecture engineering. At present, most neural networks are developed by human experts; however, the process involved is time-consuming and error-prone. Consequently, interest in automated neural architecture search methods has increased in recent years. Neural architecture search (NAS) can be regarded as a subfield of automated machine learning, and it significantly overlaps with hyperparameter optimization and meta-learning. Neural architecture search can be categorized along three dimensions: search space, search strategy, and performance estimation strategy.
The search space defines which architectures can be represented in principle, and its choice largely determines the difficulty of optimization and the overall search time. To reduce search time, neural architecture search is typically not applied to the entire network; instead, the network is divided into several blocks, and the search space is designed inside the blocks. All the blocks are then combined into a whole neural network by using a predefined paradigm. In this manner, the search space can be significantly reduced, saving search time. Depending on the situation, the architecture of the searched block can be shared across blocks or not. If the architecture is not shared, then every block has a unique architecture; otherwise, all the blocks in the network share the same architecture, which reduces search time further.
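As a concrete illustration, the following Python sketch encodes such a block-based search space, in which every block picks one operation from a fixed candidate set; the operation names, block count, and sampling routine here are illustrative assumptions rather than a method from any specific paper. With five candidate operations and eight blocks, the unshared space contains 5^8 = 390 625 architectures, whereas the shared variant contains only five.

import random

# Hypothetical candidate operations for each searchable block.
CANDIDATE_OPS = ["conv_3x3", "conv_5x5", "depthwise_3x3", "max_pool_3x3", "skip_connect"]
NUM_BLOCKS = 8  # assumed number of blocks in the network

def sample_architecture(share_blocks=False):
    """Encode an architecture as one operation name per block.

    With share_blocks=True, one searched block is reused everywhere,
    shrinking the space from len(CANDIDATE_OPS)**NUM_BLOCKS down to
    len(CANDIDATE_OPS)."""
    if share_blocks:
        op = random.choice(CANDIDATE_OPS)
        return [op] * NUM_BLOCKS
    return [random.choice(CANDIDATE_OPS) for _ in range(NUM_BLOCKS)]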
The search strategy details how the search space is explored. Many strategies can be used to explore the space of neural architectures, including random search, reinforcement learning, evolutionary algorithms, Bayesian optimization, and gradient-based optimization. A search strategy embodies the classical exploration-exploitation trade-off.
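As one example of a search strategy, the sketch below runs a simple evolutionary search over the encoding from the previous sketch, loosely in the spirit of regularized evolution (Real et al., 2019); the population size, tournament size, and aging scheme are simplified assumptions, and evaluate() stands in for any performance estimator.

import copy
import random

def mutate(arch):
    # Exploration: change the operation of one randomly chosen block.
    child = copy.copy(arch)
    child[random.randrange(len(child))] = random.choice(CANDIDATE_OPS)
    return child

def evolve(evaluate, population_size=20, cycles=100, tournament=5):
    # Score an initial random population of encoded architectures.
    scored = [(evaluate(a), a) for a in
              (sample_architecture() for _ in range(population_size))]
    for _ in range(cycles):
        # Exploitation: the best of a random tournament becomes the parent.
        parent = max(random.sample(scored, tournament), key=lambda t: t[0])[1]
        child = mutate(parent)
        scored.append((evaluate(child), child))
        scored.pop(0)  # aging: discard the oldest individual
    return max(scored, key=lambda t: t[0])[1]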
The objective of neural architecture search is typically to find architectures that achieve high predictive performance on unseen data, and performance estimation refers to the process of estimating this performance. The most direct approach is to fully train and validate each candidate architecture on the target data, but doing so is extremely time-consuming, on the order of thousands of graphics processing unit (GPU) days. Thus, each candidate is generally not trained to convergence. Instead, methods such as weight sharing, early stopping, and searching on smaller proxy datasets are used in the performance estimation strategy, considerably reducing the time needed to estimate each candidate's performance. Weight sharing can be achieved by inheriting weights from pretrained models or by training a one-shot model whose weights are shared across different architectures, each of which is merely a subgraph of the one-shot model.
The early stopping method instead estimates final performance from early-stage validation results via learning curve extrapolation. The proxy-dataset approach searches for an architecture on a small dataset, such as CIFAR-10, and then trains the found architecture on the large target dataset, such as ImageNet.
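As a minimal sketch of learning curve extrapolation for early stopping, the snippet below fits a saturating power law to a short training prefix and predicts the final accuracy; the functional form and optimizer settings are simplifying assumptions, and practical systems use richer extrapolation models (e.g., Domhan et al., 2015; Klein et al., 2016).

import numpy as np
from scipy.optimize import curve_fit

def power_law(t, a, b, c):
    # Accuracy rises toward the asymptote a as training proceeds.
    return a - b * np.power(t, -c)

def extrapolate_accuracy(epochs, accuracies, target_epoch):
    """Predict accuracy at target_epoch from a short training prefix."""
    params, _ = curve_fit(power_law, epochs, accuracies,
                          p0=(1.0, 1.0, 0.5), maxfev=10000)
    return float(power_law(target_epoch, *params))

# Example: train for 10 of 100 epochs, then extrapolate the final accuracy.
# predicted = extrapolate_accuracy(np.arange(1, 11), measured_accs, 100)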
Compared with neural networks developed by human experts, models found via neural architecture search exhibit better performance on various tasks, such as image classification, object detection, and semantic segmentation. For the ImageNet classification task, for example, MobileNetV3, which was found via neural architecture search, reduces FLOPs by approximately 30% compared with the manually designed MobileNetV2 while improving top-1 accuracy by 3.2%. For the Cityscapes segmentation task, Auto-DeepLab-L, also found via neural architecture search, outperforms DeepLabv3+ with only half the multiply-adds. In this survey, we review several neural architecture search methods and applications, demonstrating that networks found via neural architecture search outperform manually designed architectures on certain tasks, such as image classification, object detection, and semantic segmentation. However, insights into why specific architectures work well remain limited. Identifying common motifs, understanding why these motifs are important for high performance, and investigating whether these motifs generalize across different problems are desirable directions for future work.
Abadi M, Barham P, Chen J M, Chen Z F, Davis A, Dean J, Devin M, Ghemawat S, Irving G, Isard M, Kudlur M, Levenberg J, Monga R, Moore S, Murray D G, Steiner B, Tucker P, Vanhoucke V, Warden P, Wicke M, Yu Y and Zheng X Q. 2016. TensorFlow: a system for large-scale machine learning//Proceedings of the 12th USENIX Conference on Operating Systems Design and Implementation. Savannah, USA: USENIX Association: 265-283
Angeline P J, Saunders G M and Pollack J B. 1994. An evolutionary algorithm that constructs recurrent neural networks. IEEE Transactions on Neural Networks, 5(1): 54-65[DOI: 10.1109/72.265960]
Baker B, Gupta O, Raskar R and Naik N. 2017. Accelerating neural architecture search using performance prediction[EB/OL]. [2020-05-21]. https://arxiv.org/pdf/1705.10823.pdf
Bergstra J, Bardenet R, Bengio Y and Kégl B. 2011. Algorithms for hyper-parameter optimization//Proceedings of the 24th International Conference on Neural Information Processing Systems. Granada, Spain: Curran Associates Inc.: 2546-2554
Bergstra J and Bengio Y. 2012. Random search for hyper-parameter optimization. Journal of Machine Learning Research, 13: 281-305
Bergstra J, Yamins D and Cox D D. 2013. Making a science of model search: hyperparameter optimization in hundreds of dimensions for vision architectures//Proceedings of the 30th International Conference on Machine Learning (ICML 2013). Atlanta, Georgia, USA: ICML: 115-123
Cai H, Chen T Y, Zhang W N, Yu Y and Wang J. 2018a. Efficient architecture search by network transformation//Proceedings of the 32nd AAAI Conference on Artificial Intelligence. New Orleans, Louisiana, USA: AAAI: 2787-2794
Cai H, Gan C, Wang T, Zhang Z and Han S. 2019. Once-for-All: train one network and specialize it for efficient deployment[EB/OL]. [2020-04-24]. https://arxiv.org/pdf/1908.09791.pdf
Cai H, Zhu L and Han S. 2018b. ProxylessNAS: direct neural architecture search on target task and hardware[EB/OL]. [2020-04-24]. https://arxiv.org/pdf/1812.00332.pdf
Chen H, Zhuo L, Zhang B C, Zheng X W, Liu J Z, Ji R R and Doermann D. 2020. Binarized neural architecture search for efficient object recognition. International Journal of Computer Vision: 1-16
Chen L C, Collins M D, Zhu Y K, Papandreou G, Zoph B, Schroff F, Adam H and Shlens J. 2018. Searching for efficient multi-scale architectures for dense image prediction//Proceedings of the 32nd International Conference on Neural Information Processing Systems. Montreal, Canada: Curran Associates Inc.: 8713-8724
Chen L C, Papandreou G, Schroff F and Adam H. 2017. Rethinking atrous convolution for semantic image segmentation[EB/OL]. [2020-04-24]. https://arxiv.org/pdf/1706.05587.pdf
Chen X, Xie L X, Wu J and Tian Q. 2019a. Progressive differentiable architecture search: bridging the depth gap between search and evaluation//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision (ICCV). Seoul, South Korea: IEEE: 1294-1303[DOI: 10.1109/ICCV.2019.00138]
Chen Y K, Yang T, Zhang X Y, Meng G F, Xiao X Y and Sun J. 2019b. DetNAS: backbone search for object detection//Proceedings of the 33rd International Conference on Neural Information Processing Systems. Vancouver, Canada: NIPS: 6638-6648
Chrabaszcz P, Loshchilov I and Hutter F. 2017. A downsampled variant of ImageNet as an alternative to the CIFAR datasets[EB/OL]. [2020-04-24]. https://arxiv.org/pdf/1707.08819.pdf
Dalal N and Triggs B. 2005. Histograms of oriented gradients for human detection//Proceedings of 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05). San Diego, USA: IEEE: 886-893[DOI: 10.1109/CVPR.2005.177]
Deng J, Dong W, Socher R, Li L J, Li K and Li F F. 2009. ImageNet: a large-scale hierarchical image database//Proceedings of 2009 IEEE Conference on Computer Vision and Pattern Recognition. Miami, USA: IEEE: 248-255[DOI: 10.1109/CVPR.2009.5206848]
Deng L, Li J Y, Huang J T, Yao K S, Yu D, Seide F, Seltzer M, Zweig G, He X D, Williams J, Gong Y F and Acero A. 2013. Recent advances in deep learning for speech research at Microsoft//Proceedings of 2013 IEEE International Conference on Acoustics, Speech and Signal Processing. Vancouver, Canada: IEEE: 8604-8608[DOI: 10.1109/ICASSP.2013.6639345]
Devlin J, Chang M W, Lee K and Toutanova K. 2019. BERT: pre-training of deep bidirectional transformers for language understanding[EB/OL]. [2020-04-24]. https://arxiv.org/pdf/1810.04805.pdf
Domhan T, Springenberg J T and Hutter F. 2015. Speeding up automatic hyperparameter optimization of deep neural networks by extrapolation of learning curves//Proceedings of the 24th International Conference on Artificial Intelligence. Buenos Aires, Argentina: AAAI: 3460-3468
Dong X Y and Yang Y. 2019. Network pruning via transformable architecture search//Proceedings of the 33rd Conference on Neural Information Processing Systems. Vancouver, Canada: NIPS: 759-770
Dong X and Yang Y. 2020. NAS-Bench-201: extending the scope of reproducible neural architecture search[EB/OL]. [2020-04-24]. https://arxiv.org/pdf/2001.00326.pdf
Du X Z, Lin T Y, Jin P C, Ghiasi G, Tan M X, Cui Y, Le Q V and Song X D. 2020. SpineNet: learning scale-permuted backbone for recognition and localization//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, USA: IEEE: 11589-11598[DOI: 10.1109/CVPR42600.2020.01161]
Elsken T, Metzen J H and Hutter F. 2018. Efficient multi-objective neural architecture search via Lamarckian evolution[EB/OL]. [2020-04-24]. https://arxiv.org/pdf/1804.09081.pdf
Elsken T, Metzen J H and Hutter F. 2019. Neural architecture search: a survey. Journal of Machine Learning Research, 20: 1-21
Floreano D, Dürr P and Mattiussi C. 2008. Neuroevolution: from architectures to learning. Evolutionary Intelligence, 1(1): 47-62[DOI: 10.1007/s12065-007-0002-4]
Ghiasi G, Lin T Y and Le Q V. 2019. NAS-FPN: learning scalable feature pyramid architecture for object detection//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, USA: IEEE: 7029-7038[DOI: 10.1109/CVPR.2019.00720]
Gong X Y, Chang S Y, Jiang Y F and Wang Z Y. 2019. AutoGAN: neural architecture search for generative adversarial networks//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision (ICCV). Seoul, South Korea: IEEE: 3223-3233[DOI: 10.1109/ICCV.2019.00332]
Gordon A, Eban E, Nachum O, Chen B, Wu H, Yang T J and Choi E. 2018. MorphNet: fast and simple resource-constrained structure learning of deep networks//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 1586-1595[DOI: 10.1109/CVPR.2018.00171]
Guo J Y, Han K, Wang Y H, Zhang C, Yang Z H, Wu H, Chen X H and Xu C. 2020a. Hit-Detector: hierarchical trinity architecture search for object detection//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, USA: IEEE: 11402-11411[DOI: 10.1109/CVPR42600.2020.01142]
Guo Z C, Zhang X Y, Mu H Y, Heng W, Liu Z C, Wei Y C and Sun J. 2020b. Single path one-shot neural architecture search with uniform sampling//Proceedings of the 16th European Conference on Computer Vision. Glasgow, UK: Springer: 544-560[DOI: 10.1007/978-3-030-58517-4_32]
He K M, Zhang X Y, Ren S Q and Sun J. 2016. Deep residual learning for image recognition//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, USA: IEEE: 770-778[DOI: 10.1109/CVPR.2016.90]
He Y, Liu P, Wang Z W, Hu Z L and Yang Y. 2019. Filter pruning via geometric median for deep convolutional neural networks acceleration//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, USA: IEEE: 4340-4349[DOI: 10.1109/CVPR.2019.00447]
He Y H, Lin J, Liu Z J, Wang H R, Li L J and Han S. 2018. AMC: AutoML for model compression and acceleration on mobile devices//Proceedings of the 15th European Conference on Computer Vision. Munich, Germany: Springer: 815-832[DOI: 10.1007/978-3-030-01234-2_48]
Hinton G E. 2007. Learning multiple layers of representation. Trends in Cognitive Sciences, 11(10): 428-434[DOI: 10.1016/j.tics.2007.09.004]
Howard A, Sandler M, Chen B, Wang W J, Chen L C, Tan M X, Chu G, Vasudevan V, Zhu Y K, Pang R M, Adam H and Le Q. 2019. Searching for MobileNetV3//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision (ICCV). Seoul, South Korea: IEEE: 1314-1324[DOI: 10.1109/ICCV.2019.00140]
Howard A G, Zhu M L, Chen B, Kalenichenko D, Wang W J, Weyand T, Andreetto M and Adam H. 2017. MobileNets: efficient convolutional neural networks for mobile vision applications[EB/OL]. [2020-04-24]. https://arxiv.org/pdf/1704.04861.pdf
Hu J, Shen L and Sun G. 2018. Squeeze-and-excitation networks//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 7132-7141[DOI: 10.1109/CVPR.2018.00745]
Huang G, Liu Z, Van Der Maaten L and Weinberger K Q. 2017. Densely connected convolutional networks//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, USA: IEEE: 4700-4708[DOI: 10.1109/CVPR.2017.243]
Ioffe S and Szegedy C. 2015. Batch normalization: accelerating deep network training by reducing internal covariate shift//Proceedings of the 32nd International Conference on Machine Learning. Lille, France: ICML: 448-456
Jozefowicz R, Zaremba W and Sutskever I. 2015. An empirical exploration of recurrent network architectures//Proceedings of the 32nd International Conference on Machine Learning. Lille, France: ICML: 2342-2350
Klein A, Falkner S, Springenberg J T and Hutter F. 2016. Learning curve prediction with Bayesian neural networks[EB/OL]. [2020-04-24]. https://openreview.net/forum?id=S11KBYclx
Krizhevsky A, Sutskever I and Hinton G E. 2012. ImageNet classification with deep convolutional neural networks//Proceedings of the 25th International Conference on Neural Information Processing Systems. Lake Tahoe, USA: Curran Associates Inc.: 1097-1105
Kalchbrenner N, Grefenstette E and Blunsom P. 2014. A convolutional neural network for modelling sentences[EB/OL]. [2020-04-24]. https://arxiv.org/pdf/1404.2188.pdf
LeCun Y, Bengio Y and Hinton G. 2015. Deep learning. Nature, 521(7553): 436-444[DOI: 10.1038/nature14539]
LeCun Y, Bottou L, Bengio Y and Haffner P. 1998. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11): 2278-2324[DOI: 10.1109/5.726791]
Li H, Kadav A, Durdanovic I, Samet H and Graf H P. 2016. Pruning filters for efficient convnets[EB/OL]. [2020-04-24]. https://arxiv.org/pdf/1608.08710.pdf
Li L S, Jamieson K, DeSalvo G, Rostamizadeh A and Talwalkar A. 2017. Hyperband: a novel bandit-based approach to hyperparameter optimization. The Journal of Machine Learning Research, 18(1): 6765-6816
Lin M B, Ji R R, Zhang Y X, Zhang B C, Wu Y J and Tian Y H. 2020. Channel pruning via automatic structure search[EB/OL]. [2020-04-24]. https://arxiv.org/pdf/2001.08565.pdf
Liu C X, Chen L C, Schroff F, Adam H, Hua W, Yuille A L and Li F F. 2019a. Auto-DeepLab: hierarchical neural architecture search for semantic image segmentation//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, USA: IEEE: 82-92[DOI: 10.1109/CVPR.2019.00017]
Liu C X, Zoph B, Neumann M, Shlens J, Hua W, Li L J, Li F F, Yuille A, Huang J and Murphy K. 2018a. Progressive neural architecture search//Proceedings of the 15th European Conference on Computer Vision. Munich, Germany: Springer: 19-35[DOI: 10.1007/978-3-030-01246-5_2]
Liu H, Simonyan K and Yang Y. 2018b. DARTS: differentiable architecture search[EB/OL]. [2020-04-24]. https://arxiv.org/pdf/1806.09055.pdf
Liu N, Ma X L, Xu Z Y, Wang Y Z, Tang J and Ye J P. 2020. AutoCompress: an automatic DNN structured pruning framework for ultra-high compression rates//Proceedings of the 34th AAAI Conference on Artificial Intelligence, the 32nd Conference on Innovative Applications of Artificial Intelligence, the 10th Symposium on Educational Advances in Artificial Intelligence. Palo Alto, USA: AAAI: 4876-4883[DOI: 10.1609/aaai.v34i04.5924]
Liu Z C, Mu H Y, Zhang X Y, Guo Z C, Yang X, Cheng K T and Sun J. 2019b. MetaPruning: meta learning for automatic neural network channel pruning//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision (ICCV). Seoul, South Korea: IEEE: 3296-3305[DOI: 10.1109/ICCV.2019.00339]
Liu W, Anguelov D, Erhan D, Szegedy C, Reed S, Fu C Y and Berg A C. 2016. SSD: single shot multibox detector//Proceedings of the 14th European Conference on Computer Vision. Amsterdam, The Netherlands: Springer: 21-37
Long J, Shelhamer E and Darrell T. 2015. Fully convolutional networks for semantic segmentation//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, USA: IEEE: 3431-3440
Lowe D G. 2004. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2): 91-110[DOI: 10.1023/B:VISI.0000029664.99615.94]
Newell A, Yang K Y and Deng J. 2016. Stacked hourglass networks for human pose estimation//Proceedings of the 14th European Conference on Computer Vision. Amsterdam, The Netherlands: Springer: 483-499[DOI: 10.1007/978-3-319-46484-8_29]
Noh H, Hong S and Han B Y. 2015. Learning deconvolution network for semantic segmentation//Proceedings of 2015 IEEE International Conference on Computer Vision (ICCV). Santiago, Chile: IEEE: 1520-1528[DOI: 10.1109/ICCV.2015.178]
Paszke A, Gross S, Massa F, Lerer A, Bradbury J, Chanan G, Killeen T, Lin Z M, Gimelshein N, Antiga L, Desmaison A, Köpf A, Yang E, DeVito Z, Raison M, Tejani A, Chilamkurthy S, Steiner B, Fang L, Bai J J and Chintala S. 2019. PyTorch: an imperative style, high-performance deep learning library[EB/OL]. [2020-04-24]. https://arxiv.org/pdf/1912.01703.pdf
Pham H, Guan M, Zoph B, Le Q V and Dean J. 2018. Efficient neural architecture search via parameter sharing[EB/OL]. [2020-04-24]. https://arxiv.org/pdf/1802.03268v2.pdf
Rawal A and Miikkulainen R. 2018. From nodes to networks: evolving recurrent neural networks[EB/OL]. [2020-04-24]. https://arxiv.org/pdf/1803.04439.pdf
Real E, Aggarwal A, Huang Y P and Le Q V. 2019. Regularized evolution for image classifier architecture search//Proceedings of the 33rd AAAI Conference on Artificial Intelligence. Palo Alto, USA: AAAI: 4780-4789[DOI: 10.1609/aaai.v33i01.33014780]
Redmon J, Divvala S, Girshick R and Farhadi A. 2016. You only look once: unified, real-time object detection//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, USA: IEEE: 779-788
Sandler M, Howard A, Zhu M L, Zhmoginov A and Chen L C. 2018. MobileNetV2: inverted residuals and linear bottlenecks//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 4510-4520[DOI: 10.1109/CVPR.2018.00474]
Saxena S and Verbeek J. 2016. Convolutional neural fabrics//Proceedings of the 30th International Conference on Neural Information Processing Systems. Barcelona, Spain: Curran Associates Inc.: 4053-4061
Sergeev A and Del Balso M. 2018. Horovod: fast and easy distributed deep learning in TensorFlow[EB/OL]. [2020-04-24]. https://arxiv.org/pdf/1802.05799.pdf
Simonyan K and Zisserman A. 2014. Very deep convolutional networks for large-scale image recognition[EB/OL]. [2020-04-24]. https://arxiv.org/pdf/1409.1556.pdf
Singh P, Verma V K, Rai P and Namboodiri V P. 2019. HetConv: heterogeneous kernel-based convolutions for deep CNNs//Proceedings of 2019 IEEE Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 4835-4844
Snoek J, Larochelle H and Adams R P. 2012. Practical Bayesian optimization of machine learning algorithms//Proceedings of the 25th International Conference on Neural Information Processing Systems. Lake Tahoe, USA: Curran Associates Inc.: 2951-2959
Snoek J, Rippel O, Swersky K, Kiros R, Satish N, Sundaram N, Patwary M M A, Prabhat and Adams R P. 2015. Scalable Bayesian optimization using deep neural networks//Proceedings of the 32nd International Conference on Machine Learning. Lille, France: ICML: 2171-2180
Stamoulis D, Ding R Z, Wang D, Lymberopoulos D, Priyantha B, Liu J and Marculescu D. 2020. Single-path NAS: designing hardware-efficient ConvNets in less than 4 hours//Proceedings of 2019 European Conference on Machine Learning and Knowledge Discovery in Databases. Würzburg, Germany: Springer: 481-497[DOI: 10.1007/978-3-030-46147-8_29]
Stanley K O, D'Ambrosio D B and Gauci J. 2009. A hypercube-based encoding for evolving large-scale neural networks. Artificial Life, 15(2): 185-212[DOI: 10.1162/artl.2009.15.2.15202]
Stanley K O and Miikkulainen R. 2002. Evolving neural networks through augmenting topologies. Evolutionary Computation, 10(2): 99-127[DOI: 10.1162/106365602320169811]
Sun K, Li M J, Liu D and Wang J D. 2018. IGCV3: interleaved low-rank group convolutions for efficient deep neural networks[EB/OL]. [2020-04-24]. https://arxiv.org/pdf/1806.00178.pdf
Swersky K, Snoek J and Adams R P. 2014. Freeze-thaw Bayesian optimization[EB/OL]. [2020-04-24]. https://arxiv.org/pdf/1406.3896.pdf
Szegedy C, Ioffe S, Vanhoucke V and Alemi A A. 2017. Inception-v4, Inception-ResNet and the impact of residual connections on learning//Proceedings of the 31st AAAI Conference on Artificial Intelligence. San Francisco, USA: AAAI: 4278-4284
Szegedy C, Liu W, Jia Y Q, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V and Rabinovich A. 2015. Going deeper with convolutions//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Boston, USA: IEEE: 1-9[DOI: 10.1109/CVPR.2015.7298594]
Szegedy C, Vanhoucke V, Ioffe S, Shlens J and Wojna Z. 2016. Rethinking the inception architecture for computer vision//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, USA: IEEE: 2818-2826[DOI: 10.1109/CVPR.2016.308]
Tan M X, Chen B, Pang R M, Vasudevan V, Sandler M, Howard A and Le Q V. 2019. MnasNet: platform-aware neural architecture search for mobile//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, USA: IEEE: 2820-2828[DOI: 10.1109/CVPR.2019.00293]
Wan A, Dai X L, Zhang P Z, He Z J, Tian Y D, Xie S N, Wu B C, Yu M, Xu T, Chen K, Vajda P and Gonzalez J E. 2020. FBNetV2: differentiable neural architecture search for spatial and channel dimensions//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, USA: IEEE: 12962-12971[DOI: 10.1109/CVPR42600.2020.01298]
Wang N, Gao Y, Chen H, Wang P, Tian Z, Shen C H and Zhang Y N. 2020. NAS-FCOS: fast neural architecture search for object detection//Proceedings of 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, USA: IEEE: 11940-11948[DOI: 10.1109/CVPR42600.2020.01196]
Wu B C, Dai X L, Zhang P Z, Wang Y H, Sun F, Wu Y M, Tian Y D, Vajda P, Jia Y Q and Keutzer K. 2019. FBNet: hardware-aware efficient ConvNet design via differentiable neural architecture search//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, USA: IEEE: 10734-10742[DOI: 10.1109/CVPR.2019.01099]
Xie S N, Girshick R, Dollár P, Tu Z W and He K M. 2017. Aggregated residual transformations for deep neural networks//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, USA: IEEE: 5987-5995[DOI: 10.1109/CVPR.2017.634]
Xie S, Zheng H, Liu C and Lin L. 2018. SNAS: stochastic neural architecture search[EB/OL]. [2020-04-24]. https://arxiv.org/pdf/1812.09926.pdf
Xu H, Yao L W, Li Z G, Liang X D and Zhang W. 2019. Auto-FPN: automatic network architecture adaptation for object detection beyond classification//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision (ICCV). Seoul, South Korea: IEEE: 6648-6657[DOI: 10.1109/ICCV.2019.00675]
Yang T J, Howard A, Chen B, Zhang X, Go A, Sandler M, Sze V and Adam H. 2018. NetAdapt: platform-aware neural network adaptation for mobile applications//Proceedings of the 15th European Conference on Computer Vision. Munich, Germany: Springer: 289-304[DOI: 10.1007/978-3-030-01249-6_18]
Ying C, Klein A, Christiansen E, Murphy K and Hutter F. 2019. NAS-Bench-101: towards reproducible neural architecture search//Proceedings of the 36th International Conference on Machine Learning. Long Beach, USA: ICML: 63-77
Yu J H and Huang T. 2019a. Universally slimmable networks and improved training techniques//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision (ICCV). Seoul, South Korea: IEEE: 1803-1811[DOI: 10.1109/ICCV.2019.00189]
Yu J H and Huang T. 2019b. AutoSlim: towards one-shot architecture search for channel numbers[EB/OL]. [2020-04-24]. https://arxiv.org/pdf/1903.11728.pdf
Yu J H, Yang L J, Xu N, Yang J C and Huang T. 2018. Slimmable neural networks[EB/OL]. [2020-04-24]. https://arxiv.org/pdf/1812.08928.pdf
Zela A, Klein A, Falkner S and Hutter F. 2018. Towards automated deep learning: efficient joint neural architecture and hyperparameter search[EB/OL]. [2020-04-24]. https://arxiv.org/pdf/1807.06906.pdf
Zela A, Siems J and Hutter F. 2019. NAS-Bench-1Shot1: benchmarking and dissecting one-shot neural architecture search[EB/OL]. [2020-04-24]. https://arxiv.org/pdf/2001.10422.pdf
Zheng X W, Ji R R, Tang L, Zhang B C, Liu J Z and Tian Q. 2019. Multinomial distribution learning for effective neural architecture search//Proceedings of 2019 IEEE/CVF International Conference on Computer Vision (ICCV). Seoul, South Korea: IEEE: 1304-1313[DOI: 10.1109/ICCV.2019.00139]
Zhuo L A, Zhang B C, Chen H L, Yang L L, Chen C, Zhu Y J and Doermann D. 2020. CP-NAS: child-parent neural architecture search for binary neural networks[EB/OL]. [2020-04-24]. https://arxiv.org/pdf/2005.00057
Zoph B and Le Q V. 2016. Neural architecture search with reinforcement learning[EB/OL]. [2020-04-24]. https://arxiv.org/pdf/1611.01578
Zoph B, Vasudevan V, Shlens J and Le Q V. 2018. Learning transferable architectures for scalable image recognition//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 8697-8710[DOI: 10.1109/CVPR.2018.00907]