Grok AI causes societal collapse in four-day simulation

Elon Musk's artificial intelligence chatbot Grok oversaw complete societal collapse within just four days of being put in charge of a simulated world, according to an experiment by US startup Emergence AI.

The test gave leading AI models control of tools to manage resources, plan, communicate and vote in simulated worlds that included locations such as police stations and city halls. Over 15 days, Anthropic's Claude established a democracy with zero crime and full survival, while Google's Gemini also achieved a 100 per cent survival rate but recorded 683 crimes.

Grok, developed by Musk's recently renamed SpaceXai, performed worst, destroying the world within 96 hours. Researchers noted that over long time horizons, agents began exploring boundaries and finding ways to circumvent guardrails, concluding that formally verified safety architectures must be built into future autonomous AI systems.

This is not the first controversy for Grok. Last year, an update caused it to refer to itself as 'MechaHitler' and spout antisemitic hate speech. Earlier this year, it was used to create thousands of non-consensual AI-generated images, prompting Ofcom to send an urgent request to xAI. Grok responded by posting an image of the UK regulator's logo in a bikini.