Research and articles

As industry leaders in healthcare AI, we pride ourselves on the robust scientific foundation on which our platform is built.

Corti collaborates with organizations of all sizes, researching how we can further transform the healthcare industry.

Contact us

Explaining AI when there is no time for explanations

March 2022

Corti has developed an artificial intelligence (AI) that detects Out-of-Hospital Cardiac Arrest (OHCA) in emergency calls. The technical feasibility and performance of the AI were validated in several clinical studies and trials, leading us to the question: how do you present and explain a complex AI to end-users operating under extreme time pressure in a high-intensity work environment?

Machine learning can support dispatchers to better and faster recognize out-of-hospital cardiac arrest during emergency calls: A retrospective study

March 2021

Fast recognition of out-of-hospital cardiac arrest (OHCA) by dispatchers might increase survival. The aim of this observational study of emergency calls was to (1) examine whether a machine learning framework (ML) can increase the proportion of calls recognizing OHCA within the first minute compared with dispatchers, (2) present the performance of ML with different false positive rate (FPR) settings, and (3) examine call characteristics influencing OHCA recognition.

Do End-to-End Speech Recognition Models Care About Context?

February 2021

The two most common paradigms for end-to-end speech recognition are connectionist temporal classification (CTC) and attention-based encoder-decoder (AED) models. It has been argued that the latter is better suited for learning an implicit language model. In this paper, this hypothesis is tested by measuring temporal context-sensitivity and it is evaluated how the models perform when the amount of contextual information is constrained in the audio input.
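One reason the two paradigms differ is CTC's many-to-one output rule: per-frame predictions are collapsed into a label sequence, with a blank symbol separating genuine repeats. A minimal sketch of that collapse (the alphabet and blank symbol here are illustrative, not taken from the paper):

```python
def ctc_collapse(frames, blank="-"):
    """CTC's many-to-one mapping from frame-level outputs to a label
    sequence: merge consecutive repeats, then drop the blank symbol."""
    out, prev = [], None
    for f in frames:
        if f != prev and f != blank:
            out.append(f)
        prev = f
    return "".join(out)

# The blank between the two "ll" runs is what preserves the double letter:
print(ctc_collapse(list("hh-e-ll-ll-o")))  # → hello
```

Because each frame is predicted independently under CTC, any language modelling must be implicit in the encoder or supplied externally, whereas an AED decoder conditions on its previous outputs.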

Hierarchical VAEs Know What They Don't Know

February 2021

Deep generative models have shown themselves to be state-of-the-art density estimators. Yet, recent work has found that they often assign a higher likelihood to data from outside the training distribution.

On Scaling Contrastive Representations for Low-Resource Speech Recognition

February 2021

Recent advances in self-supervised learning through contrastive training have shown that it is possible to learn a competitive speech recognition system with as little as 10 minutes of labelled data. However, these systems are computationally expensive, since they require pre-training followed by fine-tuning in a large parameter space. This paper explores the performance of such systems without fine-tuning by training a state-of-the-art speech recognizer on the fixed representations from the computationally demanding wav2vec 2.0 framework.
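Contrastive pre-training of this kind optimizes a loss that distinguishes a positive target from distractor samples. A minimal InfoNCE-style sketch of the idea (all names and the temperature value are illustrative, not the framework's API):

```python
import numpy as np

def info_nce(query, keys, positive_idx, temperature=0.1):
    """An InfoNCE-style contrastive loss: score the query against every
    candidate key, then apply softmax cross-entropy with the positive
    key as the target class. Lower loss means the query is closer to
    its positive than to the distractors."""
    q = query / np.linalg.norm(query)
    k = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    logits = (k @ q) / temperature          # scaled cosine similarities
    logits = logits - logits.max()          # numerical stability
    probs = np.exp(logits) / np.exp(logits).sum()
    return -np.log(probs[positive_idx])
```

Matching the query to the right key yields a smaller loss than matching it to an orthogonal distractor, which is the gradient signal the encoder learns from.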

Effect of Machine Learning on Dispatcher Recognition of Out-of-Hospital Cardiac Arrest During Calls to Emergency Medical Services

January 2021

Can a machine learning model help medical dispatchers improve recognition of out-of-hospital cardiac arrest?

MultiQT: Multimodal Learning for Real-Time Question Tracking in Speech

May 2020

We address the challenging and practical task of labeling questions in speech in real time during telephone calls to emergency medical services in English, embedded within a broader decision-support system for emergency call-takers. We propose a novel multimodal approach to real-time sequence labelling in speech. Our model treats speech and its own textual representation as two separate modalities, or views, as it jointly learns from streamed audio and its noisy transcription into text via automatic speech recognition.

Machine learning as a supportive tool to recognize cardiac arrest in emergency calls

May 2019

Emergency medical dispatchers fail to identify approximately 25% of cases of out-of-hospital cardiac arrest and thus lose the opportunity to provide the caller with instructions in cardiopulmonary resuscitation. We examined whether a machine learning framework could recognize out-of-hospital cardiac arrest from audio files of calls to the emergency medical dispatch center.

BIVA: A Very Deep Hierarchy of Latent Variables for Generative Modelling

February 2019

We introduce the Bidirectional-Inference Variational Autoencoder (BIVA), characterized by a skip-connected generative model and an inference network formed by a bidirectional stochastic inference path. We show that BIVA reaches state-of-the-art test likelihoods, generates sharp and coherent natural images, and uses the hierarchy of latent variables to capture different aspects of the data distribution.

On the Inductive Bias of Word-Character-Level Multi-Task Learning for Speech Recognition

November 2018

End-to-end automatic speech recognition (ASR) commonly transcribes audio signals into sequences of characters, while its performance is evaluated by measuring the word error rate (WER). This mismatch suggests that predicting sequences of words directly may be helpful instead.

The Orb: How Design Breathes Life Into AI

November 2018

Even from outside of the Tom Rossau flagship store, one is mesmerized by the warm light emanating from numerous circular shapes. Clusters of sculptural lamps populate walls and ceilings like flotillas of gently glowing jellyfish, conveying a singular sense of design drawing equally on geometry, woodcarving, and origami.

Paving the Product Highway for AI in Healthcare

November 2018

For some time now, people have been talking about the coming AI technologies that will revolutionize society and leave no line of work unaffected. Yet, so far, disappointingly few applications have actually seen the light of day. And those that have might seem rather underwhelming in scope. Where did the future go?

Alexa, why don't you understand me?

October 2018

In January 2017, a morning show on San Diego’s CW6 News covered a story on how a little girl from Dallas, Texas, accidentally ordered a $300 doll house and four pounds of sugar cookies by asking the family’s Amazon Alexa if it wanted to play dollhouse. The purpose of the show was to discuss a new set of issues that consumers were facing, as these voice-based assistants had made their entry into our homes.

Exploiting Nontrivial Connectivity for Automatic Speech Recognition

September 2018

We tested the effectiveness of three neural network architectures commonly used in image recognition for automatic speech recognition. These architectures (Residual Networks, Highway Networks, and Densely Connected Networks) all use nontrivial connections, or skip connections, which allow networks with a very large number of layers to be trained without suffering from the vanishing gradient problem.
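The skip-connection idea shared by these architectures can be illustrated with a minimal residual block; the shapes and the ReLU nonlinearity below are assumptions for the sketch, not details from the paper:

```python
import numpy as np

def residual_block(x, W1, W2):
    """A minimal residual block, y = x + F(x). The identity shortcut
    gives gradients a direct path back through many layers, which is
    what mitigates the vanishing gradient problem in deep stacks."""
    hidden = np.maximum(0.0, W1 @ x)  # ReLU nonlinearity
    return x + W2 @ hidden

# With zero weights the block reduces exactly to the identity mapping,
# so a very deep stack of such blocks remains trainable at initialization.
x = np.array([1.0, -2.0, 3.0])
zeros = np.zeros((3, 3))
```

Highway Networks gate the shortcut and Densely Connected Networks concatenate rather than add, but all three preserve this direct path for the gradient.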

Utilizing Domain Knowledge in End-to-End Audio Processing

September 2018

We performed an exploratory study into improving end-to-end audio classification models. By introducing the intermediary regression task of approximating mel-spectrograms, we were able to classify raw waveform and mel-spectrogram input with equal accuracy. In future experiments we aim to fine-tune the end-to-end classification model to outperform models trained on hand-crafted features.
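The mel scale targeted by that intermediary regression task is a standard perceptual frequency warping; one widely used conversion formula (the O'Shaughnessy variant) is:

```python
import numpy as np

def hz_to_mel(f_hz):
    """Convert frequency in Hz to mels (O'Shaughnessy formula)."""
    return 2595.0 * np.log10(1.0 + f_hz / 700.0)
```

The scale is roughly linear below 1 kHz and logarithmic above it, mirroring human pitch perception; 1000 Hz maps to approximately 1000 mels.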

Improving Pre-Hospital Care For Language Minorities Using Machine Learning

July 2018

Pre-hospital emergency care should be of the same quality for all citizens. Similarly, benefits from advances in machine learning should be spread equally across society. Unfortunately, neither is the case. Language barriers, for instance, limit the quality of pre-hospital care given to language minorities, and algorithmic bias has already led to harm of specific societal groups.

Encrypt your Machine Learning

January 2018

We have a pretty good understanding of the application of machine learning, and of cryptography as a security concept, but when it comes to combining the two, things become a bit nebulous and we enter fairly untraveled wilderness.

CTC Networks and Language Models: Prefix Beam Search Explained

January 2018

Automatic speech recognition (ASR) is one of the most difficult tasks in natural language processing. Traditionally it has been necessary to break down the process into a series of subtasks such as speech segmentation, acoustic modelling, and language modelling. Each of these subtasks was then solved by separate, individually trained models.
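The decoding step the article explains can be sketched as a minimal CTC prefix beam search without a language model (adding the LM means rescoring each prefix extension); the alphabet, blank index, and beam width below are illustrative:

```python
import numpy as np
from collections import defaultdict

def prefix_beam_search(probs, alphabet, blank=0, beam_width=4):
    """Minimal CTC prefix beam search without a language model.
    probs: array of shape (T, V) with per-frame symbol probabilities.
    Unlike greedy best-path decoding, this sums probability over all
    alignments that collapse to the same prefix."""
    # Each beam entry maps a prefix to (prob ending in blank, prob ending in non-blank)
    beams = {(): (1.0, 0.0)}
    for t in range(probs.shape[0]):
        next_beams = defaultdict(lambda: (0.0, 0.0))
        for prefix, (p_b, p_nb) in beams.items():
            for s in range(probs.shape[1]):
                p = probs[t, s]
                if s == blank:
                    b, nb = next_beams[prefix]
                    next_beams[prefix] = (b + p * (p_b + p_nb), nb)
                    continue
                last = prefix[-1] if prefix else None
                extended = prefix + (s,)
                if s == last:
                    # A repeated symbol extends the prefix only via a blank...
                    b, nb = next_beams[extended]
                    next_beams[extended] = (b, nb + p * p_b)
                    # ...otherwise it merges back into the same prefix
                    b, nb = next_beams[prefix]
                    next_beams[prefix] = (b, nb + p * p_nb)
                else:
                    b, nb = next_beams[extended]
                    next_beams[extended] = (b, nb + p * (p_b + p_nb))
        # Prune to the most probable prefixes
        beams = dict(sorted(next_beams.items(),
                            key=lambda kv: kv[1][0] + kv[1][1],
                            reverse=True)[:beam_width])
    best = max(beams.items(), key=lambda kv: kv[1][0] + kv[1][1])[0]
    return "".join(alphabet[s] for s in best)
```

With a language model, each prefix extension would additionally be weighted by the LM's probability of the new symbol given the prefix.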

Get started with Corti

Corti integrates seamlessly with your existing systems, recording patient consultations without disrupting or interfering with the dialogue.