By clicking “Accept All Cookies”, you agree to the storing of cookies on your device to enhance site navigation, analyze site usage, and assist in our marketing efforts. View our Privacy Policy for more information.
Thought leadership

Why is medical coding so difficult?

Why is medical coding so difficult?

Medical coding is a vital process in the healthcare industry, involving the translation of medical diagnoses, procedures, services, and equipment into standardized alphanumeric codes. These codes are derived from several coding systems, the most common of which is the International Classification of Diseases (ICD).

The primary purpose of medical coding is to ensure that the vast amount of information in patient records is accurately and efficiently captured for billing, research, and statistical purposes. It allows for standardized communication across the healthcare system, reducing errors and improving the efficiency of healthcare delivery. The coding process involves reviewing clinical documentation and assigning appropriate codes that reflect the diagnoses and procedures performed during patient encounters​.

Overall, medical coding is a crucial component in the administration of healthcare, impacting everything from patient care to reimbursement and health statistics.

Medical coding may sound simple, but it is not. In ICD-10-CM, the code system used for diagnoses in the US, there are more than 70,000 unique medical codes, and hundreds of new codes are added each year. The vast number of codes results from the extremely specific medical code system. For instance, there are more than 300 medical codes representing diabetes mellitus. There are also medical codes that represent absurdly specific conditions, causes of injury, and places of occurrence.

W59.22 - Struck by a turtle
- Burn due to water-skis on fire
- Bizarre personal appearance
- Problems in relationship with in-laws
Y92.146 - Swimming-pool of prison as the place of occurrence of the external cause

Medical documents are often lengthy, especially for inpatient care in the US. For instance, an average admission in the Beth Israel Deaconess Medical Center intensive care unit in Boston contains more than 10,000 words of documentation and is annotated with 14 medical codes (reference). 

Medical coding is error-prone and time-consuming due to the sheer number of codes and the length of medical documents. Coding errors can have serious consequences: inaccuracies in a patient's medical history can affect future treatments, and incorrect billing can result in healthcare providers not being reimbursed by insurance companies. The slow process of medical coding also contributes to high administrative costs in healthcare, and it takes time away from patients in hospitals where physicians code.

Previous solutions to assist medical coders

To address these challenges, many companies offer systems that aim to reduce the time and errors associated with medical coding by automatically detecting medical codes from the documentation. Most of these systems rely on a keyword-matching approach, which involves defining a set of phrases for each medical code and suggesting a code when one of the phrases appears in the text.

While this approach is intuitive, describing all possible phrases for all 70,000 codes is impractical. Moreover, medical coding is more complex than simply matching phrases to codes. The system must understand negations, combine information from multiple sources, and understand how different conditions and symptoms relate.


Corti is designed to assist, not replace, medical coders. Our AI reads medical documentation and suggests relevant medical codes, which the medical coder can then accept or reject, as well as add any missed codes. This approach speeds up medical coding.

Our AI has several advantages over keyword-based systems. It understands paraphrasing, negations, and abbreviations in medical texts without us manually defining rules. Additionally, it extracts information from the entire document, not just specific phrases, allowing it to use long-range dependencies to predict appropriate codes. We detailed state-of-the-art systems in our peer-reviewed article, Automated Medical Coding on MIMIC-III and MIMIC-IV: A Critical Review and Replicability Study.

Black box AI

The increased accuracy of AI comes at the expense of transparency. AI systems are often black boxes; we know what they predict but not why. The lack of transparency makes validating medical code suggestions time-consuming.

Consider the following example:

This example depicts the medical document on the left and the subsequent, AI-suggested medical codes on the right. When the AI suggests medical codes, the medical coder must then validate these suggestions by finding the relevant evidence in the text.

With thousands of words and dozens of suggestions, this can be prohibitively slow. Furthermore, the medical coder may incorrectly dismiss a suggestion if they don’t find the evidence. To avoid medical coders from ignoring AI suggestions, we must provide tools that simplify the validation process.

Explainable AI

With explainable AI, we can highlight the spans of text that caused the AI's code suggestions, assisting medical coders in verifying these suggestions by directing them to the evidence. 

The example below demonstrates how such explanations simplify validation;

Corti’s explainable AI research

AI explainability is important in many cases, and there are several methods already in use, but they’re poorly suited for medical coding. Let’s look at why.

Surrogate-model methods train an AI to predict the words that impact another AI’s suggestions. However, there is no guarantee that the predicted words will be the actual cause of the suggestions. It is like dog owners explaining their dog's actions; we risk misleading explanations.

Perturbation-based methods
remove words from the text and measure how they change the AI suggestions. If removing a word has a high impact, the word should be highlighted. However, removing one word can be insufficient; a condition can be mentioned multiple times, so removing one mention may not change the AI suggestion. Therefore, we should test the impact of removing combinations of words, ideally all possible combinations. However, in a text comprised of a thousand words, there are more combinations than atoms in the universe. While there are faster methods that approximate such explanations, they are still prohibitively slow on long documents.

Gradient-based methods are common in image classification (AI that predicts what is in an image). They calculate the input gradient of an output, which is equivalent to measuring how much wiggling the pixel values in the image would wiggle the AI’s output. What these methods measure in text is less intuitive because we cannot wiggle words. We found these methods to perform poorly in medical coding.

Attention-based methods are used in most previous research papers on explainable AI in medical coding. Most modern AI uses attention. Like humans, the AI learns to focus on the most important words while ignoring the rest. Attention-based methods extract which words the model attended to when suggesting a medical code. While we have found them to be more effective than the previously mentioned methods, they could be better. In three separate studies, medical coders deemed about a third of such explanations clinically irrelevant.

Due to these issues, we’re actively researching how to generate better explanations. Recently, we found a simple approach to alleviating some of the weaknesses of attention-based methods, drastically improving the explanations. We will of course publish the approach and results in an upcoming research article for all to see.

We look forward to showing you our progress and introducing our new solutions to physicians, secretaries, and medical coders to make administration faster and better!