Ambient AI in Psychiatry and Psychology: Clinical documentation challenges and how to solve them

TL;DR
Ambient scribing works differently in psychiatry and psychology than in most other specialties. Unspoken content, multi-voice dynamics, and the gap between what is said and what it means clinically all require deliberate workflow design and careful template work.
- Much of what matters is never verbalized. The workflow and architecture needed to capture it must be planned from the start
- Multi-voice consultations require templates designed to keep perspectives clearly attributed
- The system captures what is said, not what it means. Clinical interpretation stays with the clinician
- Reflective summaries and post-consultation dictation make a meaningful difference to note quality
- Template design takes longer and requires more clinical involvement than most customers expect
This article covers what ambient scribing does and doesn't do in psychiatric and psychological settings, what works well, what requires workflow adjustments, and what to expect from an implementation.
What makes psychiatry and psychology different
The information that matters most is often unspoken
In somatic medicine, clinically relevant information tends to enter the room through speech. A clinician comments on examination findings, reads a measurement aloud, narrates what they observe. That content enters the audio and can be captured.
In psychiatry, the equivalent (the clinician's observations of the patient's presentation) almost never gets verbalized. Eye contact, affect, speech tempo, motor restlessness, energy level, grooming: these are central to a psychiatric status examination, but clinicians do not narrate them to the patient. The result is that a significant category of clinical documentation lies entirely outside the audio stream.
The practical implication is that objective findings need to be handled separately, typically through a brief post-consultation dictation that gets integrated into the note. This is a workflow change that needs to be planned for from the start, not left to emerge mid-deployment.
Relevance is hard to automate
In most specialties, there is a fairly clear boundary between what belongs in a clinical note and what doesn't. In psychiatry, that boundary is much harder to draw automatically. Almost anything a patient says could be clinically meaningful depending on the consultation's purpose.
A patient mentioning that they frequently lose their keys is irrelevant in most clinical contexts. In a psychiatric assessment, it may be a significant behavioral indicator. A system working without that context has no reliable way to know the difference.
This makes pertinence filtering genuinely harder in psychiatric settings, and it means that providing the system with good contextual information about the consultation type and clinical purpose has a measurable effect on output quality. We've built this into how we approach prompt design for these deployments.
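As a rough illustration of what "contextual information" could look like, the sketch below bundles the consultation type and clinical purpose into a short preamble that would precede the transcript. The `ConsultationContext` class, its field names, and the preamble wording are all hypothetical; this is not Corti's prompt format, only one way the idea might be expressed.

```python
from dataclasses import dataclass, field


@dataclass
class ConsultationContext:
    """Hypothetical container for context supplied alongside the audio."""
    consultation_type: str   # e.g. "diagnostic evaluation"
    clinical_purpose: str    # why the consultation is taking place
    participants: list[str] = field(default_factory=list)

    def to_preamble(self) -> str:
        """Render the context as plain text to prepend to a prompt."""
        lines = [
            f"Consultation type: {self.consultation_type}",
            f"Clinical purpose: {self.clinical_purpose}",
        ]
        if self.participants:
            lines.append("Participants: " + ", ".join(self.participants))
        return "\n".join(lines)


# Illustrative values only.
context = ConsultationContext(
    consultation_type="diagnostic evaluation",
    clinical_purpose="assess attention and everyday functioning",
    participants=["patient", "psychiatrist"],
)
preamble = context.to_preamble()
print(preamble)
```

With a purpose like this stated up front, a remark about frequently lost keys has a reason to be kept rather than filtered out as small talk.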
The note captures what was said and not what it means
This is probably the most important limitation to understand, and the one we communicate most directly to clinical customers.
A psychologist once described it to us this way: the note should capture what the patient says, but what is clinically interesting is often not the information itself but what that information reveals. A patient describes a conflict at work. The episode itself isn't what matters. What matters is what it reveals about the patient's emotional patterns, coping strategies, and functioning. That interpretation is a product of clinical judgment. It is rarely verbalized explicitly during the consultation.
An ambient system can only work with what is said. It will produce an account of the work episode that is accurate but not clinically complete. Some clinicians find this genuinely useful as a starting scaffold for their own notes. Others find the gap too significant for certain consultation types.
Ambient scribing delivers real value in psychiatric and psychological documentation, but the value is in capturing, organizing, and structuring the spoken content of the consultation. The clinical interpretation remains the clinician's work.
Consultations are long and content-rich
Psychiatric and psychological consultations routinely run to 60, 75, or 90 minutes. Therapy sessions, diagnostic evaluations, and complex case reviews all run long.
This has direct implications for documentation pipelines. Early in our work in this space, we encountered notes that were incomplete or shallow despite rich consultation content. A short note on a 90-minute session is a sign that something has gone wrong. Addressing this required architectural changes to how we handle large volumes of content, and it's something we now design for explicitly from the start of a psychiatric implementation.
What we learned in practice
1. Multiple voices in the room
One of the more nuanced challenges we encountered, particularly in child and adolescent psychiatry, is that consultations frequently involve more than just the patient and the clinician. The constellation of participants varies considerably: a consultation might involve parents and a psychiatrist without the patient present, a teacher and parents discussing school observations, or a caregiver sharing their perspective on how the child functions at home. The same dynamic appears in adult psychiatry, where a relative accompanying the patient contributes their own observations about functioning at home.
In these consultations, the documentation task is not simply to capture what was said. It is to keep clear whose perspective is whose. The patient's own account, the parent's observations, and the clinician's assessment are three distinct layers of information. Conflating them is a documentation error with real clinical consequences. For example, a parent's description of a child's behavior should not read as the child's self-report, and neither should be presented as clinical fact without attribution.
The practical solution involves both template design and prompt architecture. Templates need to be structured to receive input from multiple perspectives, and the system needs explicit instructions to attribute statements correctly. This is an area where we have invested significant effort with our psychiatric customers, and it is reflected in how we approach template design and implementation from the start.
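To make the attribution idea concrete, here is a minimal sketch of a template structure that keeps each perspective in its own section. The role names, section titles, and the `group_by_perspective` helper are assumptions made for illustration, and the sample statements are invented; the sketch also assumes speaker roles have already been identified upstream.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical mapping from speaker role to the template section it feeds.
SECTION_FOR_ROLE = {
    "patient": "Patient's own account",
    "parent": "Parent's observations",
    "clinician": "Clinician's assessment",
}


@dataclass(frozen=True)
class Statement:
    role: str  # who said it
    text: str  # what was said


def group_by_perspective(statements):
    """Place each statement in the section for its speaker's role.

    Statements from roles the template does not know about go to a review
    section instead of being merged into someone else's perspective.
    """
    sections = defaultdict(list)
    for s in statements:
        section = SECTION_FOR_ROLE.get(s.role, "Unattributed (needs review)")
        sections[section].append(s.text)
    return dict(sections)


# Invented sample statements.
note = group_by_perspective([
    Statement("parent", "He struggles to settle at bedtime."),
    Statement("patient", "I sleep fine."),
    Statement("teacher", "He is restless in class."),
])
for section, items in note.items():
    print(section)
    for item in items:
        print(f"  - {item}")
```

The design choice worth noting is the fallback: a statement from an unmapped role, such as the teacher here, is flagged for review rather than quietly filed under the patient or the parent.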
2. Separating voices within a single topic
A related challenge emerged around medication conversations. In consultations involving a new prescription or a medication review, several distinct types of content often appear within the same exchange:
- The patient's own reasons for wanting or resisting the medication
- The clinician's clinical rationale
- The patient's current experience of a medication's effects
- The clinician's explanation of what side effects to expect
All of these matter. But they are different kinds of information, and mixing them produces notes where it is unclear whether a reported effect came from the patient or was described by the clinician as a possibility. In psychiatry, that distinction has implications for how the note is read and how it informs future care.
Getting this right requires templates and prompts that are designed with these distinctions in mind. The same logic applies to psychoeducation. A significant part of many psychiatric consultations involves the clinician explaining conditions, mechanisms, and treatment rationale to the patient, and these explanations can easily get mixed in with clinical findings if the template and prompts aren't designed to distinguish them. Getting that separation right is something we design for explicitly in psychiatric implementations.
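One way to picture this separation is to tag each medication-related statement with the kind of content it is and render each kind under its own label. The `Kind` enum, its label wording, and `render_medication_section` are hypothetical names for this sketch, and the example statements are invented.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    """The four content types listed above (label wording is illustrative)."""
    PATIENT_REASONS = "Patient's reasons for or against the medication"
    CLINICIAN_RATIONALE = "Clinician's rationale"
    PATIENT_EXPERIENCE = "Effects reported by the patient"
    CLINICIAN_EXPLANATION = "Possible side effects explained by the clinician"


@dataclass(frozen=True)
class MedicationStatement:
    kind: Kind
    text: str


def render_medication_section(statements):
    """Render one labeled block per kind, in enum order, skipping empty kinds."""
    lines = []
    for kind in Kind:
        texts = [s.text for s in statements if s.kind is kind]
        if texts:
            lines.append(f"{kind.value}:")
            lines.extend(f"  - {t}" for t in texts)
    return "\n".join(lines)


# Invented sample statements.
section = render_medication_section([
    MedicationStatement(Kind.CLINICIAN_EXPLANATION, "Nausea may occur in the first weeks."),
    MedicationStatement(Kind.PATIENT_EXPERIENCE, "Reports feeling drowsy in the mornings."),
])
print(section)
```

Because every line sits under an explicit label, a later reader can tell that the drowsiness was reported by the patient while the nausea was only described as a possibility.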
3. Working with what the system can and cannot hear
Two distinct strategies help bridge the gap between what happens in the consultation and what ends up in the note.
During the consultation, brief reflective summaries, which are a natural part of many clinical interactions, serve a dual purpose. When a clinician consolidates what has emerged across the conversation, for example confirming that a patient has described persistent low mood over the past three months, it provides the system with a clear, structured signal that improves the quality of the generated note.
Objective findings are a different matter. Clinicians would not typically narrate observations like reduced eye contact or blunted affect while the patient is in the room. The practical solution is to restart the recording briefly after the patient has left to make a short post-consultation dictation covering the mental status observations. This gets integrated into the note alongside the content captured during the session.
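The integration step might look something like the sketch below, where dictated observations are appended to their own section and every entry is tagged with its source. `DraftNote`, `integrate_dictation`, and the section titles are assumptions for illustration rather than a description of how Corti merges the two recordings.

```python
from dataclasses import dataclass, field


@dataclass
class DraftNote:
    """Hypothetical note structure: section title -> list of entries."""
    sections: dict[str, list[str]] = field(default_factory=dict)

    def add(self, section: str, entry: str, source: str) -> None:
        # Tag each entry with its source so a reviewer can see where it came from.
        self.sections.setdefault(section, []).append(f"{entry} [{source}]")


def integrate_dictation(note: DraftNote, observations: list[str]) -> DraftNote:
    """Append post-consultation observations to a dedicated section."""
    for obs in observations:
        note.add("Mental status observations", obs,
                 source="post-consultation dictation")
    return note


note = DraftNote()
note.add("History", "Persistent low mood over the past three months.",
         source="session audio")
integrate_dictation(note, ["Reduced eye contact.", "Blunted affect."])

for title, entries in note.sections.items():
    print(title)
    for entry in entries:
        print(f"  - {entry}")
```

Keeping the source tag on each entry preserves the distinction between what was said in the room and what the clinician observed and recorded afterwards.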
Both of these are small workflow adjustments that most clinicians adapt to quickly, but they need to be introduced and planned for during implementation rather than left to clinicians to figure out on their own.
4. Ambient dictation as an alternative mode
For clinicians who find the gap between spoken consultation and generated note too large, ambient dictation offers a different approach. Rather than the system extracting and structuring facts from the consultation, the clinician loosely narrates their own account and assessment after the patient has left. The system organizes and formats that narration.
This is neither traditional dictation nor ambient documentation but a hybrid mode that reduces the layers between the clinician's thinking and the final note. Some clinicians strongly prefer this mode, particularly where having a new technology present during the session feels disruptive to the therapeutic relationship.
5. Template design as a clinical process
One of the clearest lessons from our implementations is that getting templates right in this specialty takes more time and more clinical involvement than customers typically anticipate.
The variation between psychiatry, psychology, and psychotherapy is real and consequential. What a psychiatrist needs in a note differs significantly from what a psychologist produces, which differs again from what a psychotherapist documents. What gets described in a requirements document rarely captures these differences fully, and the gap between what was described and what clinicians actually wanted has been a consistent source of iteration work.
What works is starting with a single template, using real anonymized clinical notes as ground truth rather than written descriptions, and validating with clinical end users early. We build explicit iteration time into psychiatric implementations for this reason, and we insist on working directly with the clinicians who will use the system. To help with this, we have created a template assembler that allows users to create and modify their own templates and sections directly.
6. FactsR™ by Corti: the clinical reasoning engine
One of the practical questions we hear from psychiatric customers is how the system holds up across a 90-minute session without losing clinical detail along the way. It is a fair concern. Traditional ambient solutions process a raw transcript after the consultation has ended, which means long, content-rich sessions can produce unfocused notes where relevant clinical detail gets buried.
FactsR™, Corti's clinical reasoning engine, approaches the problem differently. It continuously extracts and validates structured clinical facts in real time as the session unfolds, so by the time the consultation ends, the note is already built from a distilled, structured account of what mattered. Length is never a constraint, and nothing gets lost in the volume.
What to expect from an implementation
Ambient scribing in psychiatry and psychology delivers real value, but it delivers it differently than in some other specialties. The most accurate framing is that the system produces a structured, organized first draft of the spoken content of the consultation: one that captures the patient's narrative, organizes it into the right sections, and gives the clinician a strong foundation to work from.
What it does not replace is the clinician's professional interpretation of that content, their observational findings, or their clinical judgment about what a consultation means. These remain the clinician's contribution. A good implementation is one where the technology handles the documentation burden of the spoken content effectively, leaving the clinician's time and attention for the parts that require their expertise.
Implementations that go well share a few common features:
- Realistic expectations from the start
- Clinical end users involved in template design
- A clear plan for how objective findings will be handled
- A recognition that some adjustment period is normal as clinicians adapt to a new documentation workflow
Corti is available to discuss any of this in more detail, and to get more specific about what an implementation means for your setting.
