AI Fails the Pun Test: Study Reveals LLMs Don't Get Jokes

Comedians and witty writers can breathe a sigh of relief for now: new research suggests that artificial intelligence still struggles with the nuances of human humour. A joint study from universities in the UK and Italy investigated whether large language models (LLMs) truly comprehend puns, and found that they largely do not.

Researchers from Cardiff University and Ca' Foscari University of Venice concluded that while AI can identify the structure of a pun, it fundamentally fails to grasp the joke itself. The findings, presented at the 2025 Conference on Empirical Methods in Natural Language Processing in Suzhou, China, highlight significant limitations in AI's understanding of humour, empathy and cultural context.

The Illusion of Understanding

The research team conducted experiments where they presented LLMs with classic puns and then modified them to remove the double meaning. In one test, they used the pun: "I used to be a comedian, but my life became a joke." When they replaced this with: "I used to be a comedian, but my life became chaotic," the AI models still incorrectly identified it as containing a pun.

Another example tested was: "Long fairy tales have a tendency to dragon." Even when researchers replaced "dragon" with "prolong" (a synonym of the intended "drag on") or with a completely random word, the LLMs still reported that the sentence contained wordplay.

Professor Jose Camacho Collados from Cardiff University's School of Computer Science and Informatics explained the significance of these findings. "In general, LLMs tend to memorise what they have learned in their training. As such, they catch existing puns well but that doesn't mean they truly understand them," he stated.

Creative But Clueless: AI's Humour Gap

The study found that when LLMs are faced with unfamiliar wordplay, their success rate in distinguishing genuine puns from ordinary sentences can drop to as low as 20%. The researchers describe this as "the illusion of humour understanding" in artificial intelligence systems.

In one particularly telling experiment, the team presented the pun: "Old LLMs never die, they just lose their attention." When they changed "attention" to "ukulele," the AI still identified it as a pun, creatively suggesting that "ukulele" sounded like "you-kill-LLM."

The researchers said they were surprised by this display of creativity, but it showed that the AI had missed the original joke entirely. "We were able to consistently fool LLMs by modifying existing puns, removing the double meaning that made the original pun," Professor Camacho Collados explained. "In these cases, models associate these sentences with previous puns, and make up all sorts of reasons to justify they are a pun."
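
For readers who want a feel for this kind of test, the idea can be sketched in a few lines of Python. This is only an illustration, not the researchers' code or method: the function memorising_detector is a hypothetical stand-in for an LLM that flags any sentence closely resembling a pun it has seen before, and the 0.7 overlap threshold is an arbitrary assumption. The sentences are the examples quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    text: str
    is_pun: bool


# Puns the toy "model" is assumed to have memorised (taken from the article).
KNOWN_PUNS = [
    "I used to be a comedian, but my life became a joke.",
    "Long fairy tales have a tendency to dragon.",
    "Old LLMs never die, they just lose their attention.",
]


def tokens(text: str) -> set[str]:
    """Lower-cased set of words with surrounding punctuation stripped."""
    return {w.strip(".,!?\"'").lower() for w in text.split()} - {""}


def memorising_detector(text: str, threshold: float = 0.7) -> bool:
    """Hypothetical stand-in for an LLM: says 'pun' whenever the sentence
    shares most of its words (Jaccard overlap) with a memorised pun."""
    words = tokens(text)
    for pun in KNOWN_PUNS:
        known = tokens(pun)
        overlap = len(words & known) / len(words | known)
        if overlap >= threshold:
            return True
    return False


def accuracy(detector, examples: list[Example]) -> float:
    """Fraction of examples where the detector's answer matches the label."""
    correct = sum(detector(e.text) == e.is_pun for e in examples)
    return correct / len(examples)


# Original puns, and edited versions with the double meaning removed.
originals = [Example(p, True) for p in KNOWN_PUNS]
perturbed = [
    Example("I used to be a comedian, but my life became chaotic.", False),
    Example("Long fairy tales have a tendency to prolong.", False),
    Example("Old LLMs never die, they just lose their ukulele.", False),
]

if __name__ == "__main__":
    print("original puns:", accuracy(memorising_detector, originals))
    print("edited sentences:", accuracy(memorising_detector, perturbed))
```

The toy detector gets every original pun right and every edited sentence wrong, because it reacts to surface similarity and never checks for a double meaning. That mirrors, in a very simplified way, the pattern the researchers describe.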

Implications for AI Development

The research paper, titled "Pun unintended: LLMs and the illusion of humor understanding," underscores why caution is necessary when using large language models for applications that require genuine understanding of humour, empathy or cultural nuance.

These findings have significant implications for how we develop and deploy AI systems in creative industries, customer service and any field where human communication relies on subtle linguistic cues. The study suggests that while AI can mimic recognition of humorous structures, true comprehension of what makes something funny remains a distinctly human capability for the foreseeable future.

As AI continues to evolve, this research provides crucial insights into the current boundaries of machine understanding and the complex challenges that remain in replicating human cognitive abilities.