How ‘slimmed-down’ large language models can reduce AI’s environmental and energy footprint
Publish Date: 2026-05-22 11:52:00
Source Domain: news.engineering.utoronto.ca
- New research led by Professor Samin Aref (MIE) proposes better methods to reduce the environmental and energy impact of generative artificial intelligence systems such as large language models (LLMs).
- The research highlights the effectiveness of quantization, a technique that compresses LLMs by reducing the precision of their parameters, so the models use less energy while performance stays almost intact.
- Two conference papers, one winning the best paper award, detail these findings and explore compression methods through partial retraining and distribution alignment training.
- Partial retraining compresses LLMs from 16-bit precision to 3-bit and 2-bit precision while minimizing performance loss using a specialized regularization term.
- Distribution alignment training improves the recovery of losses in quantized language models by up to 20.37%, achieving better trade-offs between compression and accuracy.
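
To make the idea of quantization concrete, here is a minimal sketch of plain round-to-nearest uniform quantization applied to a list of stand-in weights. It is not the partial retraining or distribution alignment method from the papers; it only illustrates what reducing parameters to 3-bit or 2-bit precision means and why fewer bits cost more accuracy. The function names, the random Gaussian weights, and the per-list scale and offset are all assumptions chosen for illustration.

```python
import random


def quantize_uniform(weights, bits):
    """Map each weight to one of 2**bits evenly spaced levels between min and max."""
    levels = 2 ** bits
    lo, hi = min(weights), max(weights)
    # Step size between neighbouring levels; fall back to 1.0 if all weights are equal.
    scale = (hi - lo) / (levels - 1) or 1.0
    # Round to the nearest level and clamp to the valid integer code range.
    codes = [min(levels - 1, max(0, round((w - lo) / scale))) for w in weights]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Reconstruct approximate weights from the integer codes."""
    return [lo + c * scale for c in codes]


def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


random.seed(0)
# Hypothetical stand-in for one layer's parameters.
weights = [random.gauss(0.0, 1.0) for _ in range(1024)]

for bits in (3, 2):
    codes, scale, lo = quantize_uniform(weights, bits)
    restored = dequantize(codes, scale, lo)
    ratio = 16 / bits  # storage relative to 16-bit weights, ignoring scale and offset
    error = mean_squared_error(weights, restored)
    print(f"{bits}-bit: about {ratio:.2f}x smaller, reconstruction MSE {error:.4f}")
```

Going from 16 bits to 3 bits shrinks weight storage by roughly 5.3 times, and going to 2 bits shrinks it by 8 times, but the reconstruction error grows as levels are removed. That error is what the training-based methods described above aim to limit.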