Artificial intelligence ready to score full marks on one of world’s most challenging tests
https://www.gbnews.com/news/artificial-intelligence-full-marks-test-google-gemini
Publish Date: 2026-03-30 14:41:00
Source Domain: www.gbnews.com
- Google’s Gemini model has scored 45.9 percent on “Humanity’s Last Exam,” a significant leap over earlier results.
- The test, designed to measure the divide between machine learning and human intellect, comprises 2,500 questions across roughly 100 disciplines requiring doctoral-level comprehension.
- The test was developed jointly by Scale and the Center for AI Safety, drawing on more than 70,000 questions proposed by experts from approximately 50 countries.
- The benchmark’s purpose is to evaluate both the breadth and the depth of knowledge and reasoning in AI systems, measuring them against the standard of an expert across all fields.
- Researchers such as Calvin Zhang of Scale have noted the rapid recent advancement of AI models, which has led to predictions that full marks could be achieved within twelve months.
- While some models, like Google’s Gemini and Anthropic’s Claude, show improving performance, others still lag, indicating persistent gaps in AI’s understanding.
- Experts like Dr. Tung Nguyen stress that the test highlights the importance of human expertise in depth, context, and specialized knowledge.
- Industry representatives such as Kate Olszewska are optimistic that full marks could be achieved quickly if enough resources and focus are directed toward that goal.