AI Decoder Translates Brain Activity into Text Non-Invasively
AI Decoder Translates Brain Activity into Text Non-Invasively

An AI-based decoder that can translate brain activity into a continuous stream of text has been developed, in a breakthrough that allows a person’s thoughts to be read non-invasively for the first time. The decoder could reconstruct speech with uncanny accuracy while people listened to a story – or even silently imagined one – using only fMRI scan data. Previous language decoding systems have required surgical implants, and the latest advance raises the prospect of new ways to restore speech in patients struggling to communicate due to a stroke or motor neurone disease.

Dr Alexander Huth, a neuroscientist who led the work at the University of Texas at Austin, said: “We were kind of shocked that it works as well as it does. I’ve been working on this for 15 years … so it was shocking and exciting when it finally did work.” The achievement overcomes a fundamental limitation of fMRI, which is that while the technique can map brain activity to a specific location with incredibly high resolution, there is an inherent time lag, making tracking activity in real-time impossible. This lag exists because fMRI scans measure the blood flow response to brain activity, which peaks and returns to baseline over about 10 seconds.

The advent of large language models – the kind of AI underpinning OpenAI’s ChatGPT – provided a new way in. These models are able to represent, in numbers, the semantic meaning of speech, allowing scientists to look at which patterns of neuronal activity corresponded to strings of words with a particular meaning rather than attempting to read out activity word by word. The learning process was intensive: three volunteers were required to lie in a scanner for 16 hours each, listening to podcasts. The decoder was trained to match brain activity to meaning using a large language model, GPT-1, a precursor to ChatGPT.

Wide Pickt banner — collaborative shopping lists app for Telegram, phone mockup with grocery list

Later, the same participants were scanned listening to a new story or imagining telling a story, and the decoder was used to generate text from brain activity alone. About half the time, the text closely – and sometimes precisely – matched the intended meanings of the original words. “Our system works at the level of ideas, semantics, meaning,” said Huth. “This is the reason why what we get out is not the exact words, it’s the gist.” For instance, when a participant was played the words “I don’t have my driver’s licence yet”, the decoder translated them as “She has not even started to learn to drive yet”.

The participants were also asked to watch four short, silent videos while in the scanner, and the decoder was able to use their brain activity to accurately describe some of the content, the paper in Nature Neuroscience reported. “For a non-invasive method, this is a real leap forward compared to what’s been done before, which is typically single words or short sentences,” Huth said. However, the decoder sometimes struggled with certain aspects of language, including pronouns. “It doesn’t know if it’s first-person or third-person, male or female,” said Huth. “Why it’s bad at this we don’t know.”

The decoder was personalised and when the model was tested on another person the readout was unintelligible. It was also possible for participants on whom the decoder had been trained to thwart the system, for example by thinking of animals or quietly imagining another story. Jerry Tang, a doctoral student at the University of Texas at Austin and a co-author, said: “We take very seriously the concerns that it could be used for bad purposes and have worked to avoid that. We want to make sure people only use these types of technologies when they want to and that it helps them.”

Pickt after-article banner — collaborative shopping lists app with family illustration