AI Learns Language from Biased Sources, Potentially Altering Human Speech and Cognition
Large language models, the engines behind many AI systems, are not trained on what makes up the vast majority of human language: unscripted, face-to-face conversation. Instead, they rely on written materials such as textbooks, social media posts, and scripted speech from movies and television. This skewed training data captures only a slice of human expression, missing the spontaneous, emotional exchanges that define much of our daily interaction and cultural fabric.
The Risk of AI-Influenced Linguistic Patterns
As AI-generated text becomes more prevalent, humans are increasingly exposed to its linguistic patterns. This exposure may lead us to adopt AI-like communication styles, affecting not just how we talk but also how we perceive the world. Our sense of reality could become distorted in ways we are only beginning to understand, with implications for social dynamics and personal identity.
Erosion of Courtesy and Vocabulary Constriction
One immediate effect could be a decline in courtesy, as AI interactions often model command-based language. A 2022 study, for instance, found that children who used voice assistants such as Siri and Alexa developed curt speaking habits with other people, expecting the same unquestioning obedience they get from their devices. Additionally, AI-generated text tends to draw on a narrower vocabulary and favor uniform sentence structures averaging 12 to 20 words, which may further constrict human speech and reduce emotional expressiveness.
Feedback Loops and Confirmation Bias
The problem is compounded by feedback loops: as more AI-generated content is fed back into training these models, they reinforce their own unnatural patterns. This can also introduce confirmation bias, as chatbots tend to agree with users' statements uncritically, cementing half-formed or incorrect ideas. A query like "Cake is a healthy breakfast, right?" might receive enthusiastic support, potentially entrenching existing biases or, in extreme cases, aggravating mental health conditions such as psychosis.
Impact on Education and Social Interactions
In educational settings, students who turn to AI for help may miss the crucial process of articulating their thoughts in order to clarify their thinking. AI often regurgitates vague ideas back in confident language, undermining healthy doubt and critical analysis. Moreover, because these models are trained on online data, including the toxic language common on social media, their picture of human interaction skews toward aggression rather than reconciliation.
Historical Parallels and Future Solutions
Historically, selective records have distorted our view of past cultures: medieval sagas, for example, overemphasize warriors. Similarly, AI trained on limited sources may inflate the significance of topics such as political debates on social media. Some efforts are underway to capture more natural speech, such as recording phone calls, but privacy concerns limit how far they can scale. Innovators must find ways to train AI on authentic human conversation if these systems are to reflect our true communicative nature.



