Teaching AI models to say “I’m not sure” | MIT News

Source Domain: news.mit.edu

Overconfidence in AI Systems: Modern AI reasoning models, according to researchers at MIT’s CSAIL, state their answers with the same high certainty whether they are right or merely guessing, a problem the researchers trace to how the models are trained.
Issue with Reinforcement Learning: The reinforcement learning used to train these models rewards only correct answers, without distinguishing sound reasoning from a lucky guess. This fosters overconfidence and makes outputs unreliable in critical applications.
RLCR Method Developed: Researchers have introduced RLCR (Reinforcement Learning with Calibration Rewards), a technique that trains models to output both answers and calibrated confidence estimates, effectively addressing overconfidence.
Effective Results: In experiments, RLCR reduced calibration error by up to 90% while maintaining or improving accuracy, both on the tasks it was trained on and on new ones.
Practical Utility: Using RLCR’s confidence estimates to select among or weight candidate answers improves both accuracy and calibration.
Added Value of Uncertainty Reasoning: Feeding a model’s uncertainty reasoning to a classifier as part of its input improved the classifier’s performance, indicating that the model’s explicit reasoning about its own uncertainty has practical value.
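
To make the points above concrete, here is a minimal Python sketch of three ideas the summary mentions: a reward that accounts for stated confidence, a calibration-error measure, and confidence-weighted answer selection. The summary does not give RLCR's reward formula, so the Brier-score penalty used here is an assumption, and the function names (`calibration_reward`, `expected_calibration_error`, `confidence_weighted_vote`) are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def calibration_reward(is_correct, confidence):
    """Correctness reward minus a Brier-style penalty (assumed form, not from the source).

    A confident wrong answer is punished far more than a hesitant wrong answer,
    so guessing with high stated confidence no longer pays off.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    outcome = 1.0 if is_correct else 0.0
    return outcome - (confidence - outcome) ** 2


def expected_calibration_error(confidences, outcomes, n_bins=10):
    """Size-weighted average of |accuracy - mean confidence| over equal-width bins."""
    bins = defaultdict(list)
    for conf, outcome in zip(confidences, outcomes):
        index = min(int(conf * n_bins), n_bins - 1)  # confidence 1.0 goes in the last bin
        bins[index].append((conf, outcome))
    total = len(confidences)
    ece = 0.0
    for items in bins.values():
        mean_conf = sum(conf for conf, _ in items) / len(items)
        accuracy = sum(outcome for _, outcome in items) / len(items)
        ece += len(items) / total * abs(accuracy - mean_conf)
    return ece


def confidence_weighted_vote(candidates):
    """Return the answer whose candidates have the largest summed confidence."""
    scores = defaultdict(float)
    for answer, confidence in candidates:
        scores[answer] += confidence
    return max(scores, key=scores.get)


if __name__ == "__main__":
    # Same stated confidence, very different reward depending on correctness.
    print(calibration_reward(True, 0.9), calibration_reward(False, 0.9))
    # Two hesitant votes for "17" lose to one confident vote for "42".
    print(confidence_weighted_vote([("42", 0.9), ("17", 0.3), ("17", 0.4)]))
```

Under this assumed reward, a wrong answer stated with 0.9 confidence scores about -0.81, while the same wrong answer stated with 0.1 confidence scores about -0.01, so honest uncertainty is the better strategy when the model is guessing. The weighted vote also differs from plain majority voting: in the example, "17" has more votes but less total confidence than "42".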
