Researchers from the University of Science and Technology of China, led by Shuao Jia, have developed a new optimization framework called FADiff that aims to improve the efficiency of deep neural networks (DNNs) on tensor accelerators. Tensor accelerators are specialized hardware designed to speed up the computations involved in AI tasks. DNNs, such as large language models, are increasingly important in various industries, including energy.
The team’s work focuses on the challenges of deploying DNNs efficiently on these accelerators. The main difficulty lies in the complex design space created by the interaction of intra-layer mapping and inter-layer fusion. Intra-layer mapping refers to how the computation of a single layer is tiled and scheduled across the hardware's compute units and memory levels, while inter-layer fusion executes consecutive layers together so that intermediate results can stay on chip rather than travel to and from off-chip memory. FADiff is a gradient-based optimization framework that automatically identifies high-quality strategies for both. This optimization can accelerate inference for DNN workloads, which matters for real-time applications in the energy sector, such as predictive maintenance and energy management systems.
To achieve this, the researchers first constructed a unified and differentiable analytical cost model. According to the team, this model accurately predicts the energy consumption and latency of both single-layer mappings and various layer fusion strategies. By encoding discrete constraints into the loss function, they can use a gradient-based approach to explore the vast design space efficiently, instead of enumerating candidate configurations one by one. The search targets the best joint strategy for mapping and fusion, balancing energy efficiency and computational speed.
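The general idea of folding discrete constraints into a differentiable loss can be illustrated with a toy example. The sketch below is not FADiff's cost model or code. It assumes a single hypothetical loop of extent N, a made-up buffer capacity, and invented cost coefficients. A tile size and a fuse/no-fuse choice are relaxed to continuous variables, a penalty term stands in for the buffer-capacity constraint, plain gradient descent minimizes the loss, and the result is snapped back to a legal discrete configuration.

```python
import math

# All constants below are hypothetical values chosen for illustration only.
N = 64        # loop extent to tile
CAP = 24.0    # on-chip buffer capacity, in elements
A = 64.0      # reload cost coefficient: off-chip traffic scales as A / tile
B = 6.0       # extra cost of spilling the intermediate tensor when not fused
LAM = 1.0     # weight of the capacity-violation penalty


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def loss_and_grad(t, s):
    """Relaxed cost and its analytical gradient.

    t is a continuous tile size; sigmoid(s) is a soft fuse decision in (0, 1).
    """
    f = sigmoid(s)
    demand = t * (1.0 + f)            # fusing keeps an extra tile on chip
    viol = max(0.0, demand - CAP)     # amount by which the buffer overflows
    loss = A / t + (1.0 - f) * B + LAM * viol ** 2
    d_t = -A / t ** 2 + 2.0 * LAM * viol * (1.0 + f)
    d_s = f * (1.0 - f) * (-B + 2.0 * LAM * viol * t)
    return loss, d_t, d_s


def optimize(steps=2000, lr=0.05):
    """Plain gradient descent on the relaxed variables."""
    t, s = 4.0, 0.0
    for _ in range(steps):
        _, d_t, d_s = loss_and_grad(t, s)
        t = min(max(t - lr * d_t, 1.0), float(N))  # keep tile within [1, N]
        s -= lr * d_s
    return t, s


def discretize(t, s):
    """Snap the relaxed solution to a divisor of N that fits the buffer."""
    fuse = sigmoid(s) > 0.5
    factor = 2 if fuse else 1
    legal = [d for d in range(1, N + 1) if N % d == 0 and d * factor <= CAP]
    tile = min(legal, key=lambda d: abs(d - t))
    return tile, fuse


if __name__ == "__main__":
    t, s = optimize()
    tile, fuse = discretize(t, s)
    print("relaxed tile:", round(t, 2), "-> tile:", tile, "fuse:", fuse)
```

The penalty term lets the optimizer drift slightly past the capacity limit during the search, which is why the final rounding step only considers configurations that actually fit. A real framework would model many loop dimensions, memory levels, and layers at once.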
Experimental results demonstrate that FADiff outperforms existing methods in terms of energy efficiency and latency. This means that the framework can help reduce the energy consumption of AI systems while also speeding up their computations. For the energy industry, this could translate to more efficient AI-driven solutions for tasks like energy forecasting, grid management, and smart meter analysis. The research was published in the Proceedings of the ACM on Measurement and Evaluation of Computing Systems.
This article is based on research available at arXiv.

