TY - GEN
T1 - DACAPO
T2 - 51st ACM/IEEE Annual International Symposium on Computer Architecture, ISCA 2024
AU - Kim, Yoonsung
AU - Oh, Changhun
AU - Hwang, Jinwoo
AU - Kim, Wonung
AU - Oh, Seongryong
AU - Lee, Yubin
AU - Sharma, Hardik
AU - Yazdanbakhsh, Amir
AU - Park, Jongse
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Deep neural network (DNN) video analytics is crucial for autonomous systems such as self-driving vehicles, unmanned aerial vehicles (UAVs), and security robots. However, real-world deployment faces challenges due to their limited computational resources and battery power. To tackle these challenges, continuous learning exploits a lightweight 'student' model at deployment (inference), leverages a larger 'teacher' model for labeling sampled data (labeling), and continuously retrains the student model to adapt to changing scenarios (retraining). This paper highlights the limitations in state-of-the-art continuous learning systems: (1) they focus on computations for retraining, while overlooking the compute needs for inference and labeling, (2) they rely on power-hungry GPUs, unsuitable for battery-operated autonomous systems, and (3) they are located on a remote centralized server, intended for multi-tenant scenarios, again unsuitable for autonomous systems due to privacy, network availability, and latency concerns. We propose a hardware-algorithm co-designed solution for continuous learning, DACAPO, that enables autonomous systems to perform concurrent executions of inference, labeling, and retraining in a performant and energy-efficient manner. DACAPO comprises (1) a spatially-partitionable and precision-flexible accelerator enabling parallel execution of kernels on sub-accelerators at their respective precisions, and (2) a spatiotemporal resource allocation algorithm that strategically navigates the resource-accuracy tradeoff space, facilitating optimal decisions for resource allocation to achieve maximal accuracy. Our evaluation shows that DACAPO achieves 6.5% and 5.5% higher accuracy than state-of-the-art GPU-based continuous learning systems, Ekya and EOMU, respectively, while consuming 254× less power.
AB - Deep neural network (DNN) video analytics is crucial for autonomous systems such as self-driving vehicles, unmanned aerial vehicles (UAVs), and security robots. However, real-world deployment faces challenges due to their limited computational resources and battery power. To tackle these challenges, continuous learning exploits a lightweight 'student' model at deployment (inference), leverages a larger 'teacher' model for labeling sampled data (labeling), and continuously retrains the student model to adapt to changing scenarios (retraining). This paper highlights the limitations in state-of-the-art continuous learning systems: (1) they focus on computations for retraining, while overlooking the compute needs for inference and labeling, (2) they rely on power-hungry GPUs, unsuitable for battery-operated autonomous systems, and (3) they are located on a remote centralized server, intended for multi-tenant scenarios, again unsuitable for autonomous systems due to privacy, network availability, and latency concerns. We propose a hardware-algorithm co-designed solution for continuous learning, DACAPO, that enables autonomous systems to perform concurrent executions of inference, labeling, and retraining in a performant and energy-efficient manner. DACAPO comprises (1) a spatially-partitionable and precision-flexible accelerator enabling parallel execution of kernels on sub-accelerators at their respective precisions, and (2) a spatiotemporal resource allocation algorithm that strategically navigates the resource-accuracy tradeoff space, facilitating optimal decisions for resource allocation to achieve maximal accuracy. Our evaluation shows that DACAPO achieves 6.5% and 5.5% higher accuracy than state-of-the-art GPU-based continuous learning systems, Ekya and EOMU, respectively, while consuming 254× less power.
UR - https://www.scopus.com/pages/publications/85201156707
U2 - 10.1109/ISCA59077.2024.00093
DO - 10.1109/ISCA59077.2024.00093
M3 - Conference contribution
AN - SCOPUS:85201156707
T3 - Proceedings - International Symposium on Computer Architecture
SP - 1246
EP - 1261
BT - Proceedings - 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture, ISCA 2024
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 29 June 2024 through 3 July 2024
ER -