How ‘slimmed-down’ large language models can reduce AI’s environmental and energy footprint

https://news.engineering.utoronto.ca/how-slimmed-down-large-language-models-can-reduce-ais-environmental-and-energy-footprint/

Publish Date: 2026-05-22 11:52:00

Source Domain: news.engineering.utoronto.ca

  • New research led by Professor Samin Aref (MIE) proposes better methods to reduce the environmental and energy impact of generative artificial intelligence systems such as large language models (LLMs).
  • The research highlights the effectiveness of quantization, a technique that compresses LLMs by reducing the numerical precision of their parameters, so that the models use less energy while their performance stays almost intact.
  • Two conference papers, one of which won the best paper award, detail these findings and explore two compression methods: partial retraining and distribution alignment training.
  • Partial retraining compresses LLMs from 16-bit precision to 3-bit and 2-bit precision while minimizing performance loss using a specialized regularization term.
  • Distribution alignment training improves the recovery of performance lost in quantized language models by up to 20.37%, achieving a better trade-off between compression and accuracy.
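
To make the idea of quantization in the points above concrete, here is a minimal sketch of plain uniform quantization, in which each parameter is rounded to the nearest of 2^bits evenly spaced values. It is an illustration only, not the method from the papers: the weights are randomly generated stand-ins, the function names `quantize_uniform` and `mean_squared_error` are hypothetical, and the partial retraining, regularization term, and distribution alignment training described above are not modelled.

```python
import random


def quantize_uniform(weights, bits):
    """Round each weight to the nearest of 2**bits evenly spaced levels.

    The levels span the range from the smallest to the largest weight.
    This is a simple illustrative scheme, not the one used in the research.
    """
    levels = 2 ** bits
    lo, hi = min(weights), max(weights)
    if hi == lo:
        # All weights are identical, so there is nothing to round.
        return list(weights)
    step = (hi - lo) / (levels - 1)
    return [lo + round((w - lo) / step) * step for w in weights]


def mean_squared_error(original, approx):
    """Average squared difference between two equally long lists."""
    return sum((x - y) ** 2 for x, y in zip(original, approx)) / len(original)


# Assumption: random values stand in for the parameters of a real model.
random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(10_000)]

for bits in (8, 3, 2):
    quantized = quantize_uniform(weights, bits)
    error = mean_squared_error(weights, quantized)
    # Storage ratio relative to 16-bit parameters; it ignores the small
    # overhead of storing the range (lo and step) alongside the codes.
    ratio = 16 / bits
    print(f"{bits}-bit: at most {2 ** bits} distinct values, "
          f"{ratio:.1f}x smaller than 16-bit, MSE = {error:.2e}")
```

Running the sketch shows the trade-off the research targets: fewer bits mean a smaller model but a larger rounding error, which is the loss that techniques such as partial retraining aim to recover.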