Google's Gemini 3 AI Scores Record 37.5% on Humanity's Last Exam

Google has announced a landmark achievement in artificial intelligence, revealing that its latest model, Gemini 3, has set a new record on a critical benchmark test designed to measure the dawn of superintelligence.

The tech giant unveiled Gemini 3 on 18 November 2025, proclaiming it the most intelligent AI system released to date. The announcement marks what CEO Sundar Pichai described as the beginning of a "new era of intelligence".

A New Benchmark for AI Reasoning

The core of this breakthrough is the AI's performance on the so-called Humanity's Last Exam. This test was created by AI safety researchers specifically to determine if an artificial intelligence can reason at the very frontier of human academic knowledge.

Gemini 3 Pro achieved a top score of 37.5 per cent on this exam. According to Google, this result demonstrates PhD-level reasoning capabilities and places it decisively ahead of the best models from competitors like Anthropic, Meta, and OpenAI.

OpenAI's GPT-5 Pro, the closest rival, currently holds a top score of 31.64 per cent on the same test.

Unprecedented Depth and Understanding

Google executives were effusive in their praise for the new model's abilities. Sundar Pichai stated that Gemini 3 represents a "massive jump in reasoning", engineered to respond with a depth and understanding never before seen in AI.

"It's state-of-the-art in reasoning, built to grasp depth and nuance – whether it's perceiving the subtle clues in a creative idea, or peeling apart the overlapping layers of a difficult problem," Pichai said.

The new model is being rolled out "at the scale of Google", meaning it will be immediately accessible to billions of users through AI Mode in Search. It will also be integrated directly into the Gemini app, which boasts over 650 million monthly active users.

An Even More Powerful Model Held Back

In a significant reveal, Google also announced another, more powerful variant called Gemini 3 Deep Think. This model scored an even higher 41 per cent on Humanity's Last Exam and achieved new record scores on other benchmark tests, including an unprecedented 45.1 per cent on the ARC-AGI-2 AGI benchmark.

Demis Hassabis, CEO of Google DeepMind, explained: "Gemini 3 Deep Think mode pushes the boundaries of intelligence even further, delivering a step-change in Gemini 3's reasoning and multimodal understanding capabilities to help you solve even more complex problems."

However, in a move highlighting the ongoing concerns around advanced AI, Google confirmed that Gemini 3 Deep Think will not be publicly released until more comprehensive safety checks are completed.

This development signals not just a technical milestone but also a cautious approach as the industry edges closer to what many consider the threshold of artificial superintelligence.