Inbook,

Online Quantization Adaptation for Fault-Tolerant Neural Network Inference

M. Beyer, J. Borrmann, A. Guntoro, and H. Blume.
Pages 243--256. Springer International Publishing AG (2023)
DOI: 10.1007/978-3-031-40923-3_18

Abstract

Neural networks (NNs) are commonly used for environmental perception in autonomous driving applications. Safety aspects in such systems play a crucial role along with performance and efficiency. Since NNs exhibit enormous computational demands, safety measures that rely on traditional spatial or temporal redundancy for mitigating hardware (HW) faults are far from ideal. In this paper, we combine algorithmic properties with dedicated HW features to achieve lightweight fault tolerance. We leverage that many NNs maintain their accuracy when quantized to lower bit widths and adapt their quantization configuration during runtime to counteract HW faults. Instead of masking computations that are performed on faulty HW, we introduce a fail-degraded operating mode. In this mode, reduced precision computations are exploited for NN operations, as opposed to fully losing compute capability. This allows us to maintain important synapses of the network and thus preserve its accuracy. The required HW overhead for our method is minimal because we reuse existing HW features that were originally implemented for functional reasons. To demonstrate the effectiveness of our method, we simulate permanent HW faults in a NN accelerator and evaluate the impact on a NN’s classification performance. We can preserve a NN’s accuracy even at higher error rates, whereas without our method it completely loses its prediction capabilities. Accuracy drops in our experiments range from a few percent to a maximum of 10 %, confirming the improved fault tolerance of the system.

BibTeX key: beyer2023online
entry type: inbook
booktitle: Computer Safety, Reliability, and Security
year: 2023
pages: 243--256
publisher: Springer International Publishing AG
isbn: 9783031409226
comment: This work is supported by the German Federal Ministry of Education and Research (BMBF), project ZuSE-KI-AVF (grant no. 16ME0062).
DOI: 10.1007/978-3-031-40923-3_18
url: http://www.scopus.com/inward/record.url?scp=85172099424&partnerID=8YFLogxK

Cite this publication

%0 Book Section
%1 beyer2023online
%A Beyer, Michael
%A Borrmann, Jan Micha
%A Guntoro, Andre
%A Blume, Holger
%B Computer Safety, Reliability, and Security
%D 2023
%E Guiochet, Jérémie
%E Tonetta, Stefano
%E Bitsch, Friedemann
%I Springer International Publishing AG
%K Approximate Automotive Computing Fault Hardware Network Networks Neural Quantization Tolerance myown
%P 243--256
%R 10.1007/978-3-031-40923-3_18
%T Online Quantization Adaptation for Fault-Tolerant Neural Network Inference
%U http://www.scopus.com/inward/record.url?scp=85172099424&partnerID=8YFLogxK
%X Neural networks (NNs) are commonly used for environmental perception in autonomous driving applications. Safety aspects in such systems play a crucial role along with performance and efficiency. Since NNs exhibit enormous computational demands, safety measures that rely on traditional spatial or temporal redundancy for mitigating hardware (HW) faults are far from ideal. In this paper, we combine algorithmic properties with dedicated HW features to achieve lightweight fault tolerance. We leverage that many NNs maintain their accuracy when quantized to lower bit widths and adapt their quantization configuration during runtime to counteract HW faults. Instead of masking computations that are performed on faulty HW, we introduce a fail-degraded operating mode. In this mode, reduced precision computations are exploited for NN operations, as opposed to fully losing compute capability. This allows us to maintain important synapses of the network and thus preserve its accuracy. The required HW overhead for our method is minimal because we reuse existing HW features that were originally implemented for functional reasons. To demonstrate the effectiveness of our method, we simulate permanent HW faults in a NN accelerator and evaluate the impact on a NN’s classification performance. We can preserve a NN’s accuracy even at higher error rates, whereas without our method it completely loses its prediction capabilities. Accuracy drops in our experiments range from a few percent to a maximum of 10 %, confirming the improved fault tolerance of the system.
%@ 9783031409226

@inbook{beyer2023online,
  abstract = {Neural networks (NNs) are commonly used for environmental perception in autonomous driving applications. Safety aspects in such systems play a crucial role along with performance and efficiency. Since NNs exhibit enormous computational demands, safety measures that rely on traditional spatial or temporal redundancy for mitigating hardware (HW) faults are far from ideal. In this paper, we combine algorithmic properties with dedicated HW features to achieve lightweight fault tolerance. We leverage that many NNs maintain their accuracy when quantized to lower bit widths and adapt their quantization configuration during runtime to counteract HW faults. Instead of masking computations that are performed on faulty HW, we introduce a fail-degraded operating mode. In this mode, reduced precision computations are exploited for NN operations, as opposed to fully losing compute capability. This allows us to maintain important synapses of the network and thus preserve its accuracy. The required HW overhead for our method is minimal because we reuse existing HW features that were originally implemented for functional reasons. To demonstrate the effectiveness of our method, we simulate permanent HW faults in a NN accelerator and evaluate the impact on a NN’s classification performance. We can preserve a NN’s accuracy even at higher error rates, whereas without our method it completely loses its prediction capabilities. Accuracy drops in our experiments range from a few percent to a maximum of 10 %, confirming the improved fault tolerance of the system.},
  added-at = {2024-02-05T16:19:02.000+0100},
  author = {Beyer, Michael and Borrmann, Jan Micha and Guntoro, Andre and Blume, Holger},
  biburl = {https://www.bibsonomy.org/bibtex/2ded706acb53b7cccb8abeee845a0b9b4/fabcho},
  booktitle = {Computer Safety, Reliability, and Security},
  comment = {This work is supported by the German federal ministry of education and research (BMBF), project ZuSE-KI-AVF (grant no. 16ME0062). 10.1007/978-3-031-40923-3_18},
  doi = {10.1007/978-3-031-40923-3_18},
  editor = {Guiochet, Jérémie and Tonetta, Stefano and Bitsch, Friedemann},
  interhash = {3fac64e776f1091b1a7ffffff2a62435},
  intrahash = {ded706acb53b7cccb8abeee845a0b9b4},
  isbn = {9783031409226},
  keywords = {Approximate Automotive Computing Fault Hardware Network Networks Neural Quantization Tolerance myown},
  pages = {243--256},
  publisher = {Springer International Publishing AG},
  timestamp = {2024-03-05T15:40:12.000+0100},
  title = {Online Quantization Adaptation for Fault-Tolerant Neural Network Inference},
  url = {http://www.scopus.com/inward/record.url?scp=85172099424&partnerID=8YFLogxK},
  year = 2023
}
