NA³Os

Neural Approximate Accelerator Architecture Optimization for DNN Inference on Lightweight FPGAs


Embedded Machine Learning (ML) is a fast-growing field that comprises ML algorithms, hardware, and software capable of performing on-device sensor data analysis at extremely low power, thus enabling a variety of always-on and battery-powered applications and services. Running ML-based applications on embedded edge devices attracts strong research and business interest for many reasons, including accessibility, privacy, latency, cost, and security. Embedded ML is primarily represented by artificial intelligence (AI) at the edge (EdgeAI) and on tiny, ultra-resource-constrained devices, a.k.a. TinyML. TinyML demands energy efficiency and low latency while retaining accuracy at acceptable levels, thus mandating optimization of the entire software and hardware stack.
GPUs are the default platform for DNN training workloads due to the high degree of parallelism provided by their massive number of processing cores. However, GPUs are often not an optimal solution for DNN inference acceleration because of their high energy cost and lack of reconfigurability, especially for highly sparse models or customized architectures. Field Programmable Gate Arrays (FPGAs), on the other hand, can offer lower latency and higher energy efficiency than GPUs while providing high customization and faster time-to-market, combined with a potentially longer useful life than ASIC solutions.
In the context of TinyML, NA³Os focuses on a neural approximate accelerator-architecture co-search specifically targeting lightweight FPGA devices. The project investigates design techniques to optimally and automatically map DNNs to resource-constrained FPGAs while exploiting principles of approximate computing. Our particular topics of investigation include:

  • Efficient mapping of DNN operations onto approximate hardware components (e.g., multipliers, adders, DSP Blocks, BRAMs).
  • Techniques for fast and automated design space exploration of DNN mappings defined by a set of approximate operators and a set of FPGA platform constraints (a minimal sketch of such an exploration loop follows this list).
  • Investigation of a hardware-aware neural architecture co-search methodology targeting FPGA-based DNN accelerators.
  • Evaluation of robustness vs. energy efficiency tradeoffs.
  • Finally, all developed methods shall be evaluated experimentally by providing a proper synthesis path and comparing the quality of the generated solutions against state-of-the-art approaches.
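
To make the design space exploration topic above concrete, the following is a minimal, self-contained Python sketch of such an exploration loop. It is purely illustrative and not the NA³Os toolflow: the truncation-based approximate multiplier, the LUT cost numbers, and all function names (approx_mult, layer_error, lut_cost, explore) are hypothetical assumptions, used only to show how candidate approximate-operator configurations can be enumerated, filtered against an FPGA resource budget, and ranked by the error they introduce.

import numpy as np

def approx_mult(a, b, truncated_bits):
    # Approximate multiplier model (assumption): drop the lowest
    # `truncated_bits` bits of each operand before multiplying.
    mask = ~((1 << truncated_bits) - 1)
    return (a & mask) * (b & mask)

def layer_error(weights, activations, truncated_bits):
    # Mean absolute error of approximate vs. exact element-wise products.
    exact = weights.astype(np.int32) * activations.astype(np.int32)
    approx = approx_mult(weights.astype(np.int32),
                         activations.astype(np.int32), truncated_bits)
    return float(np.mean(np.abs(exact - approx)))

def lut_cost(truncated_bits, n_multipliers):
    # Toy cost model (assumption): an exact 8x8 LUT multiplier costs
    # roughly 70 LUTs, and each truncated input bit saves roughly 8 LUTs.
    return n_multipliers * max(70 - 8 * truncated_bits, 10)

def explore(weights, activations, lut_budget, n_multipliers=64):
    # Enumerate truncation levels, keep the configurations that fit the
    # LUT budget, and return them sorted by the error they introduce.
    feasible = []
    for t in range(8):
        cost = lut_cost(t, n_multipliers)
        if cost <= lut_budget:
            feasible.append((layer_error(weights, activations, t), t, cost))
    return sorted(feasible)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.integers(0, 128, size=1024)
    x = rng.integers(0, 128, size=1024)
    for err, t, cost in explore(w, x, lut_budget=3500):
        print(f"truncate {t} bits -> ~{cost} LUTs, mean abs error {err:.1f}")

In the actual project, the candidate set would span multiple approximate operator types and per-layer assignments, and the cost and accuracy figures would be obtained from synthesis results and model evaluation rather than the toy numbers used above.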

Publications