Flip-and-Patch: A fault-tolerant technique for on-chip memories of CNN accelerators at low supply voltage

Abstract

Aggressively reducing the supply voltage (Vdd) below the minimum safe voltage (Vmin) can lead to significant energy savings in digital circuits. However, operating at such low supply voltages is challenging because manufacturing process variations in current technology nodes induce a high number of permanent faults.
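As a rough first-order sketch of why undervolting pays off (the 1.0 V nominal and 0.6 V low-voltage values below are illustrative assumptions, not figures from this work), dynamic switching energy scales quadratically with the supply voltage:

% First-order dynamic-energy model (illustrative; the voltages are assumed,
% not taken from this work).
\[
  E_{dyn} \propto C \, V_{dd}^{2}
  \qquad\Rightarrow\qquad
  \frac{E_{dyn}(0.6\,\mathrm{V})}{E_{dyn}(1.0\,\mathrm{V})}
  = \left(\frac{0.6}{1.0}\right)^{2} = 0.36
\]

Under this model, lowering Vdd from 1.0 V to 0.6 V would cut dynamic energy in the memory arrays by roughly 64%, at the cost of the higher permanent-fault rate discussed above.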

This work addresses the impact of permanent faults on the accuracy of a Convolutional Neural Network (CNN) inference accelerator whose on-chip activation memories are supplied at a low Vdd below Vmin. Based on a characterization study of fault patterns, this paper proposes Flip-and-Patch, a pair of low-cost microarchitectural techniques that maintain the original accuracy of CNN applications even in the presence of the high number of faults caused by operating at Vdd < Vmin. Unlike existing techniques, Flip-and-Patch remains transparent to the programmer and does not rely on application characteristics, making it easily applicable to real CNN accelerators.

Experimental results show that Flip-and-Patch preserves the original CNN accuracy with a minimal impact on system performance (less than 0.05% for every application), while achieving average energy savings in the activation memories of 10.5% and 46.6% compared to a conventional accelerator operating at the safe and nominal supply voltages, respectively. Compared to the state-of-the-art ThUnderVolt technique, which dynamically adjusts the supply voltage at run time, Flip-and-Patch achieves average energy savings of 3.2%, even when any energy overhead of ThUnderVolt's voltage adjustment is disregarded.