Abstract
Methods for neural network hyperparameter optimization and meta-modeling are
computationally expensive due to the need to train a large number of model
configurations. In this paper, we show that standard frequentist regression
models can predict the final performance of partially trained model
configurations using features based on network architectures, hyperparameters,
and time-series validation performance data. We empirically show that our
performance prediction models are much more effective than prominent Bayesian
counterparts, are simpler to implement, and are faster to train. Our models can
predict final performance in both visual classification and language modeling
domains, are effective for predicting the performance of drastically varying model
architectures, and can even generalize between model classes. Using these
prediction models, we also propose an early stopping method for hyperparameter
optimization and meta-modeling, which obtains a speedup of up to 6x. Finally, we empirically
show that our early stopping method can be seamlessly incorporated into both
reinforcement learning-based architecture selection algorithms and bandit-based
search methods. Through extensive experimentation, we show that our
performance prediction models and early stopping algorithm are state-of-the-art
in terms of prediction accuracy and speedup achieved, while still identifying
the optimal model configurations.
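To make the abstract's core idea concrete, the sketch below fits a standard frequentist regressor (here scikit-learn's SVR; any comparable regressor would do) on features that concatenate a configuration's hyperparameters with the first few points of its validation curve, then predicts the run's final accuracy. This is a minimal illustration, not the authors' implementation: the hyperparameters (learning rate, width), the simulated training curves, and all names are hypothetical stand-ins for real runs.

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
T = 10   # epochs observed before predicting (an illustrative choice)
H = 100  # horizon: total epochs per training run

def simulate_run(lr, width):
    """Hypothetical stand-in for a real training run: a noisy saturating curve."""
    ceiling = 0.5 + 0.3 * np.tanh(width / 64) - 0.05 * abs(np.log10(lr) + 2)
    epochs = np.arange(1, H + 1)
    return ceiling * (1 - np.exp(-epochs / 20)) + rng.normal(0, 0.01, H)

def featurize(hparams, curve):
    """Hyperparameter features plus the first T points of the validation
    curve -- the kind of feature set the abstract describes."""
    return np.concatenate([hparams, curve[:T]])

# Build a training set from simulated "fully trained" configurations.
runs = []
for _ in range(200):
    lr, width = 10 ** rng.uniform(-4, -1), rng.integers(16, 256)
    runs.append((np.array([np.log10(lr), width]), simulate_run(lr, width)))

X = np.array([featurize(h, c) for h, c in runs])
y = np.array([c[-1] for _, c in runs])   # target: final validation accuracy

# In practice features would be standardized first; omitted here for brevity.
predictor = SVR().fit(X, y)

# Predict the final accuracy of a new, partially trained configuration.
h_new, c_new = np.array([-2.5, 128]), simulate_run(10 ** -2.5, 128)
print("predicted final:", predictor.predict(featurize(h_new, c_new)[None])[0])
print("actual final:   ", c_new[-1])
```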
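An early stopping rule can then be built on top of any such fitted predictor. The following is a hedged sketch of one plausible criterion consistent with the abstract, not necessarily the paper's exact rule: terminate a partially trained configuration when the estimated probability that its final performance will beat the best result observed so far drops below a threshold. Here `sigma` is an assumed Gaussian error scale for the predictor, e.g. estimated from held-out residuals; `delta` is the termination threshold.

```python
import numpy as np
from scipy.stats import norm

def should_stop(predictor, features, best_so_far, sigma, delta=0.05):
    """Stop a run if P(final > best_so_far) < delta, modeling the
    predictor's error as roughly Gaussian with std `sigma`."""
    y_hat = predictor.predict(features[None])[0]
    p_beat_best = 1.0 - norm.cdf(best_so_far, loc=y_hat, scale=sigma)
    return p_beat_best < delta
```

In a search loop, a rule like this would be evaluated every few epochs for each candidate; runs judged unlikely to beat the incumbent are terminated early, which is where the reported speedup would come from.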