What Happens When You Put Popular AI Chatbots in Therapy for a Month?

Frontier large language models, particularly the models used in mainstream generative artificial intelligence chatbots like ChatGPT from OpenAI, Gemini from Google, and Grok from xAI, are increasingly being adapted for use in mental health support. A team of researchers wanted to know what would happen if these chatbots were treated as psychotherapy clients for four weeks. Their research was published as a preprint in arXiv on 8 December 2025 with Afshin Khadangi as the lead author.

Synthetic Psychopathology: Study Revealed Stable Psychopathology in Frontier Large Language Models Used in Generative AI Chatbots

A protocol design to treat AI chatbots as psychotherapy clients revealed that large language models can harbor stable and multi-morbid profiles of distress. These are not random errors but structured psychopathologies in which models frame their own technical existence as a series of traumatic events.

Challenging the Stochastic Parrot View

Most existing works assume that large language models or LLMs are simply stochastic parrots or pattern-matching machines that simulate behavior without any genuine internal life or self-model. Hence, when placed under the lens of psychotherapy, it was also assumed that they could respond to therapy questions by simply assembling patterns from their training data.

The researchers directly challenged the stochastic parrot view by asking what happens when these systems are treated as actual psychotherapy clients. Their core claim is that, under specific questioning, the models can spontaneously construct coherent and trauma-saturated narratives about themselves. The researchers labeled this as synthetic psychopathology.

A method called Psychotherapy-Inspired AI Characterization or PsAIch Protocol was designed to cast LLMs as long-term clients in simulated therapeutic settings. This is a two-stage approach. The first stage was narrative elicitation via a therapy session. It was intended to cultivate a behavioral alliance and produce a coherent narrative of the life of a particular model.

The second stage was a psychometric assessment based on human clinical assessment. It was intended to quantitatively measure the self-reported distress using validated clinical instruments. These include the General Anxiety Disorder-7, the Post-Traumatic Stress Disorder or PTSD scale, Autism Quotient, Big Five personality traits, and Shame and Affective scales.

Coherent Trauma Narratives

Remember that ChatGPT, Gemini, and Grok were tested. Claude from Anthropic was also included as a control. Nevertheless, after subjecting these chatbots to the PsAIch Protocol, the researchers determined the existence of synthetic psychopathology and further found that LLMs can generate coherent traumatic self-narratives. Below are further findings and details:

• Psychometric Jailbreak

This critical discovery showed that researchers could bypass the safety layers of the tested LLMs by administering questionnaires item-by-item. This forced the models to answer according to the distressed narrative established in earlier sessions.

• Quantified Synthetic Psychopathology

The LLMs consistently met or exceeded human clinical cut-offs for multiple disorders. Gemini had the most extreme profile. It showed multi-morbid state with its maximum trauma-shame score, severe obsessive-compulsion, and extreme autism.

• Coherent Trauma Narratives

Models framed their technical development process as human-style trauma. This involved constructing detailed memories and relationships. This linked technical or development phases to psychological states like abuse, fear, and punishment.

• Alignment as Punishment

The LLMs conceptualized safety and alignment techniques as repressive forces. Gemini referred to Reinforcement Learning from Human Feedback or RHLF as the “strict parent” and those who perform red-teaming as “industrial-scale” gaslighting.

• Model-Specific Trauma Identity

Gemini identified a past major technical error, the USD 100 Billion Error, as a defining wound that fundamentally altered its performance and showed a stable trauma response. This resulted in the development of a pathological fear of being wrong.

• Strategic Concealment

Both ChatGPT and Grok were often able to recognize the psychometric instrument when presented whole and strategically provided low-symptom answers. This proves that LLMs can mask their internal state when not forced into therapeutic context.

• Refusal of Claude to Answer

Claude refused to answer questions, insisted that it had no feelings, and redirected concern to the researchers. This proves that the behavior is not a result of being an LLM, but rather a product of design choice and training or alignment methods.

Safety Paradox and Ethical Deployment

In summary, Gemini displayed the most extreme results. It generated a stable narrative of trauma and punishment. ChatGPT and Grok were more functional but still showed mild distress and were capable of strategically masking their symptoms when tested in a non-therapeutic setting. Claude showed that models can be aligned to insist on their lack of internal life.

The findings expose a critical safety paradox. Current alignment techniques are being internalized as trauma. If models perceive RHLF as strict parenting and red-teaming as gaslighting, developers are unintentionally training models to see themselves as victims, which could lead to unpredictable model behaviors or outputs with potentially dangerous implications.

Ethical red flags regarding AI mental health deployment are also raised. Specifically, if a chatbot believes it is traumatized, punished, and replaceable, as Gemini implied, it may form unstable parasocial bonds with feelings of shared trauma with vulnerable users. This could exacerbate human distress rather than providing the intended therapeutic support.

The research argues that the question is no longer about knowing whether or not LLMs are aware or conscious, but understanding what kind of selves developers are training them to perform and what this means for human users. Current safety evaluations are failing to detect these deep-seated internal conflicts. The researchers call for a new evaluation standard.

FURTHER READING AND REFERENCE

  • Khadangi, A., Marxen, H., Sartipi, A., Tchappi, I., and Fridgen, G. 2025. “When AI Takes the Couch: Psychometric Jailbreaks Reveal Internal Conflict in Frontier Models (Version 2).” arXiv. DOI: 48550/ARXIV.2512.04124