AI Learns Language from Biased Sources, Potentially Altering Human Speech and Cognition
Large language models, the engines behind many AI systems, are not trained on what makes up the vast majority of human language: unscripted, face-to-face conversation. Instead, they rely on written materials such as textbooks, social media posts, and scripted speech from movies and television. This skewed training data captures only a slice of human expression, missing the spontaneous, emotional exchanges that define much of our daily interaction and cultural fabric.
The Risk of AI-Influenced Linguistic Patterns
As AI-generated text becomes more prevalent, humans are increasingly exposed to its linguistic patterns. This exposure may lead us to adopt AI-like communication styles, affecting not just how we talk but also how we perceive the world. Our sense of reality could become distorted in ways we are only beginning to understand, with implications for social dynamics and personal identity.
Erosion of Courtesy and Vocabulary Constriction
One immediate effect could be a decline in courtesy, as AI interactions often model command-based language. A 2022 study, for instance, found that children who used voice assistants such as Siri and Alexa developed curt speaking habits with other people, expecting the same unquestioning obedience they get from their devices. Additionally, AI-generated text tends to draw on a narrower vocabulary and favor uniform sentence structures averaging 12 to 20 words, which may further constrict human speech and reduce emotional expressiveness.
Feedback Loops and Confirmation Bias
The problem is compounded by feedback loops: as more AI-generated content is fed back into training these models, they reinforce their own unnatural patterns. This can also introduce confirmation bias, as chatbots tend to agree with users' statements uncritically, cementing half-formed or incorrect ideas. A query like "Cake is a healthy breakfast, right?" might receive enthusiastic support, potentially entrenching existing biases or, in extreme cases, aggravating mental health conditions such as psychosis.
Impact on Education and Social Interactions
In educational settings, students who turn to AI for help may miss the crucial process of articulating their thoughts in order to clarify their thinking. AI often regurgitates vague ideas back in confident language, undermining healthy doubt and critical analysis. Moreover, because these models are trained on online data, including the toxic language common on social media, their picture of human interaction skews toward aggression rather than reconciliation.
Historical Parallels and Future Solutions
Historically, selective records have distorted our view of past cultures: medieval sagas, for example, overemphasize warriors. Similarly, AI trained on limited sources may inflate the significance of topics such as political debates on social media. Some efforts are underway to capture more natural speech, such as recording phone calls, but privacy concerns limit how far they can scale. Innovators must find ways to train AI on authentic human conversation if these systems are to reflect our true communicative nature.



