New technique makes AI models leaner and faster while they’re still learning | MIT News
https://news.mit.edu/2026/new-technique-makes-ai-models-leaner-faster-while-still-learning-0409
Publish Date: 2026-04-09 09:00:00
Source Domain: news.mit.edu
-
New Compression Method During Training: Researchers have developed CompreSSM, a technique that compresses state-space models while they are still training, sidestepping the usual trade-off between a smaller model and a better-performing one.
-
Advantages Over Traditional Methods: Unlike conventional pruning or knowledge distillation methods, CompreSSM enables informed compression mid-training, saving computational resources and time without sacrificing accuracy.
-
Efficiency and Performance: On image benchmarks such as CIFAR-10, CompreSSM-trained models match the performance of larger models while training up to 1.5 times faster; on the Mamba state-space architecture, the method achieves training speedups of up to 4x.
-
Theoretical Grounding: The method leverages control theory, specifically Hankel singular values, to rank model components by their importance early in training, ensuring only significant parts are retained.
-
Safe Compression: CompreSSM includes a safety net feature allowing practitioners to revert to saved checkpoints if a compression step adversely affects performance.
-
Effectiveness on Certain Models: The method works particularly well on multi-input, multi-output models and linear time-invariant systems, with potential extensions to other state-space model variants and, eventually, to transformer-style architectures.
-
Future Directions: Researchers plan to extend CompreSSM to more complex architectures and dynamical systems, broadening its applicability to modern AI systems.
-
Conference and Support: The work was accepted for presentation at the International Conference on Learning Representations 2026 and was supported by several institutions, including the Max Planck Institute and the U.S. Office of Naval Research.
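
To make the "Theoretical Grounding" point above more concrete, here is a minimal sketch of how Hankel singular values can be computed for a small, stable, discrete-time linear time-invariant system x[k+1] = A x[k] + B u[k], y[k] = C x[k]. This is the generic textbook computation from control theory, not the CompreSSM implementation, which the summary does not describe. The function names (`gramian`, `hankel_singular_values`, `states_to_keep`), the toy matrices, and the 99 percent retention threshold are all assumptions made for illustration.

```python
import numpy as np


def gramian(A, Q):
    """Solve the discrete Lyapunov equation P = A P A^T + Q.

    Uses the identity vec(A P A^T) = (A kron A) vec(P), which is fine
    for small state dimensions (the linear system is n^2 by n^2).
    """
    n = A.shape[0]
    if np.max(np.abs(np.linalg.eigvals(A))) >= 1.0:
        raise ValueError("A must be stable (spectral radius < 1)")
    lhs = np.eye(n * n) - np.kron(A, A)
    return np.linalg.solve(lhs, Q.reshape(-1)).reshape(n, n)


def hankel_singular_values(A, B, C):
    """Return the Hankel singular values in descending order."""
    P = gramian(A, B @ B.T)        # controllability Gramian
    Q = gramian(A.T, C.T @ C)      # observability Gramian
    eig = np.linalg.eigvals(P @ Q).real
    # Round-off can push tiny eigenvalues slightly below zero.
    return np.sort(np.sqrt(np.clip(eig, 0.0, None)))[::-1]


def states_to_keep(hsv, energy=0.99):
    """Smallest number of leading values whose sum reaches `energy`
    of the total (a hypothetical retention rule for this sketch)."""
    cumulative = np.cumsum(hsv) / np.sum(hsv)
    return min(len(hsv), int(np.searchsorted(cumulative, energy)) + 1)


# Toy system (illustrative values): the third state is only weakly
# coupled to the input and output, so it should rank last.
A = np.diag([0.9, 0.5, 0.1])
B = np.array([[1.0], [1.0], [0.01]])
C = np.array([[1.0, 1.0, 0.01]])

hsv = hankel_singular_values(A, B, C)
print("Hankel singular values:", hsv)
print("States to keep:", states_to_keep(hsv, energy=0.99))
```

In this toy system the weakly coupled state receives a Hankel singular value several orders of magnitude smaller than the others, which is the kind of signal that allows low-importance state dimensions to be identified and dropped. Solving the Lyapunov equations through a Kronecker product is chosen here only to keep the sketch dependency-light; it scales poorly, and a dedicated solver would be the usual choice for larger state dimensions.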