Energy-Efficient Deep Learning Via Compression: Green AI


Authors: Rajesh Chaurasiya, Vishal Sharma

Abstract: With the rapid growth of artificial intelligence (AI), deep learning models are becoming more complex and require significant computing power, memory, and energy. This makes it difficult to deploy them on devices with limited resources, such as smartphones, embedded systems, and edge devices. To address this challenge, model compression techniques have emerged as a key solution. These methods reduce the size and computational cost of AI models while keeping their performance close to that of the original models. This paper explores four widely used model compression techniques: pruning, quantization, knowledge distillation, and low-rank factorization. Each technique is explained in terms of how it works, its advantages, and the trade-offs it brings. A special focus is placed on pure compression strategies, which avoid external indexing or lookup tables and are better suited for simple and energy-efficient systems. A case study using a convolutional neural network (CNN) shows that combining pruning and quantization can reduce model size by more than 80% and speed up inference time by 30% with only a small loss in accuracy. The study also highlights key metrics for evaluating compressed models, including memory usage, speed, and accuracy. Finally, the paper discusses real-world applications in mobile devices, healthcare, and autonomous systems, along with future directions such as automated compression tools and energy-aware training. Overall, this research supports the development of more accessible, scalable, and eco-friendly AI by making models lighter and more efficient.
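The combination highlighted in the case study, magnitude pruning followed by 8-bit quantization, can be illustrated with a minimal NumPy sketch. This is an illustrative example only, not the paper's actual CNN experiment; the function names, the 80% sparsity target, and the symmetric int8 scheme are assumptions chosen to mirror the techniques the abstract describes.

```python
import numpy as np

def prune_by_magnitude(weights, sparsity=0.8):
    """Zero out the smallest-magnitude fraction of weights (magnitude pruning)."""
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)
    # Threshold below which weights are considered unimportant and removed.
    threshold = np.partition(flat, k)[k] if k < flat.size else np.inf
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

def quantize_int8(weights):
    """Symmetric uniform quantization of float32 weights to int8."""
    scale = float(np.max(np.abs(weights))) / 127.0
    if scale == 0.0:
        scale = 1.0
    q = np.round(weights / scale).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64)).astype(np.float32)  # stand-in for one CNN layer

pruned, mask = prune_by_magnitude(w, sparsity=0.8)
q, scale = quantize_int8(pruned)
dequant = q.astype(np.float32) * scale  # values seen at inference time

print(f"sparsity achieved: {1 - mask.mean():.2f}")
print(f"dense bytes: {w.nbytes} -> int8 bytes: {q.nbytes}")
print(f"max quantization error: {np.max(np.abs(pruned - dequant)):.4f}")
```

Even before any sparse storage format is applied, the int8 representation alone cuts memory by 4x relative to float32; storing only the surviving 20% of weights (e.g. in a compressed sparse format) accounts for the rest of the >80% size reduction the abstract reports.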

DOI: http://doi.org/10.5281/zenodo.15718659
