Abstract
Designing convolutional neural network (CNN) models for mobile devices is
challenging because mobile models need to be small and fast, yet still
accurate. Although significant effort has been dedicated to designing and
improving mobile models on all three dimensions, it is difficult to manually balance
these trade-offs when there are so many architectural possibilities to
consider. In this paper, we propose an automated neural architecture search
approach for designing resource-constrained mobile CNN models. We propose to
explicitly incorporate latency information into the main objective so that the
search can identify a model that achieves a good trade-off between accuracy and
latency. Unlike previous work, which approximates mobile latency with an
often inaccurate proxy (e.g., FLOPS), in our experiments we directly
measure real-world inference latency by executing the model on a particular
platform, e.g., Pixel phones. To further strike the right balance between
flexibility and search space size, we propose a novel factorized hierarchical
search space that permits layer diversity throughout the network. Experimental
results show that our approach consistently outperforms state-of-the-art mobile
CNN models across multiple vision tasks. On the ImageNet classification task,
our model achieves 74.0\% top-1 accuracy with 76 ms latency on a Pixel phone,
which is 1.5x faster than MobileNetV2 (Sandler et al. 2018) and 2.4x faster
than NASNet (Zoph et al. 2018) with the same top-1 accuracy. On the COCO object
detection task, our model family achieves both higher mAP quality and lower
latency than MobileNets.
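
As an illustration of what folding measured latency into the search objective can look like, the following is a minimal Python sketch. The abstract does not state the form of the objective, so the weighted product `accuracy * (latency / target) ** w`, the exponent value, and the helper names `measure_latency_ms` and `latency_aware_reward` are all assumptions made for this example. The timing helper measures a local callable on the host machine; an on-device measurement on a phone would need a platform-specific harness that is not shown here.

```python
import time
from typing import Callable


def measure_latency_ms(run_inference: Callable[[], None], repeats: int = 10) -> float:
    """Average wall-clock time of one inference call, in milliseconds.

    Hypothetical helper: it times a local callable, standing in for
    executing the candidate model on the target platform.
    """
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    run_inference()  # warm-up call, excluded from the timing
    start = time.perf_counter()
    for _ in range(repeats):
        run_inference()
    elapsed_s = time.perf_counter() - start
    return elapsed_s * 1000.0 / repeats


def latency_aware_reward(
    accuracy: float,
    latency_ms: float,
    target_ms: float,
    w: float = -0.07,  # assumed exponent; negative so slower models score lower
) -> float:
    """Assumed weighted-product objective: accuracy * (latency / target) ** w."""
    if latency_ms <= 0 or target_ms <= 0:
        raise ValueError("latencies must be positive")
    return accuracy * (latency_ms / target_ms) ** w


if __name__ == "__main__":
    # Two hypothetical candidates with equal accuracy but different latency.
    fast = latency_aware_reward(accuracy=0.74, latency_ms=50.0, target_ms=76.0)
    slow = latency_aware_reward(accuracy=0.74, latency_ms=100.0, target_ms=76.0)
    print(f"fast candidate reward: {fast:.4f}")
    print(f"slow candidate reward: {slow:.4f}")
```

With a negative exponent, a candidate that exactly meets the target latency is scored by its accuracy alone, while faster candidates are rewarded and slower ones are penalized smoothly instead of being rejected outright.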