Breaking the Code in Electronic Healthcare Data


The 10th revision of the International Classification of Diseases (ICD-10) was implemented in the United States in October 2015, after a delay of more than 25 years [1] and amid anxiety [2] reminiscent of the Y2K panic. The ICD, maintained by the World Health Organization (WHO), was originally created to provide a worldwide standard for reporting and statistically analyzing disease. The United States, however, has adopted the international standard for a much more nefarious purpose: billing.

For the 9th revision of the ICD, the United States adapted the standard into a "Clinical Modification" version, the ICD-9-CM, which contains both diagnosis and procedure codes, with the Centers for Medicare and Medicaid Services (CMS) overseeing changes. The downstream consequence of building ICD codes around billing is a lack of sensitivity and specificity when those codes are reused for large-scale electronic healthcare research. The increased complexity of ICD-10 may exacerbate the problem (more to come on ICD-10 in future articles, and if you are really interested in healthcare standards, check out this textbook).

Billing codes are entered into large healthcare databases (described in my previous article) through a variety of mechanisms:

1) They can come from the direct entry from a clinician in a clinical encounter (e.g. outpatient clinic visit).

2) They can come from an inpatient admission with hospital billing via a medical coding specialist [3].

3) They can be associated with a laboratory or a radiologic test (e.g. fasting lipid panel).

When these codes are aggregated across healthcare systems in large electronic databases, extracting patients by ICD code can produce some remarkable inconsistencies.

For the non-clinical folks reading this, there is a world of difference between Type I and Type II Diabetes (lack of insulin production vs. decreased sensitivity to insulin). I also won't get into the entities that most clinicians are not aware of, such as pancreatic type diabetes (type 3C) or maturity onset diabetes of the young (MODY). So how can these vastly different disease entities end up under the same code family (ICD-9-CM 250), and how do we deal with this when working with healthcare data for quality improvement and research?

For this I will use our Informatics for Integrating Biology & the Bedside (I2B2) installation, a de-identified, count-based system (note that it does not cover our entire INPC dataset), to demonstrate how this shakes out on real patient data.

Clinical question: How many patients with type II diabetes with neurological manifestations have adequate control of their blood sugars?

In our de-identified set we look for the ICD-9-CM codes. In this case 250.61 (Diabetes with neurological manifestations, type I) has 3,405 patients and 250.60 (Diabetes with neurological manifestations, type II) has 13,366. If we use the tool to look for overlap, we find that 1,545 (9.2 percent) of patients have been diagnosed with both diseases (i.e. have both codes). If we take a further step back and look at the overlap between uncomplicated type I (250.01) and type II (250.00), we find 24,823 and 146,051 patients respectively, with 16,281 (9.5 percent) overlapping. Even worse, of the patients coded as type II diabetes "not stated as uncontrolled" (n=146,051), 16,444 (11.2 percent) have a HgbA1C value outside the controlled range (≥7 percent). Humorously, 9,158 (6.3 percent) received the "not stated as uncontrolled" code at the same encounter as a HgbA1C ≥7 percent.
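To make the overlap arithmetic explicit, here is a minimal Python sketch that recomputes the overlap percentages from the counts quoted above. The function name `overlap_summary` and its dictionary keys are my own illustrative choices, not part of I2B2. The choice of denominator is also an assumption: dividing the overlap by the sum of the two code counts reproduces the 9.2 and 9.5 percent figures, while the other two denominators are shown only for comparison.

```python
def overlap_summary(n_a, n_b, n_both):
    """Express an overlap count against three candidate denominators."""
    distinct_patients = n_a + n_b - n_both  # patients carrying either code
    return {
        # Overlap relative to the two code counts simply added together.
        "pct_of_summed_counts": round(100 * n_both / (n_a + n_b), 1),
        # Overlap relative to distinct patients with at least one code.
        "pct_of_distinct_patients": round(100 * n_both / distinct_patients, 1),
        # Overlap relative to the smaller of the two code cohorts.
        "pct_of_smaller_cohort": round(100 * n_both / min(n_a, n_b), 1),
    }


# Counts quoted in the text: 250.61 vs. 250.60, then 250.01 vs. 250.00.
neuro = overlap_summary(3405, 13366, 1545)
uncomplicated = overlap_summary(24823, 146051, 16281)

for label, summary in (("neurological", neuro), ("uncomplicated", uncomplicated)):
    print(label, summary)
```

Seen against the smaller cohort, the overlap looks far larger: by this arithmetic, roughly 45 percent of patients coded 250.61 and roughly 66 percent of patients coded 250.01 also carry the corresponding type II code.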

While this is highly troubling from a reporting standpoint (what the ICD was initially intended for), it can wreak havoc on clinical research. One approach is to require that the same code appear more than once (e.g. 250.00 > 1 time). If we do that, we drop from 146,051 to 101,331 patients. However, if we overlap 250.00 (>1x, n=101,331) and 250.01 (>1x, n=17,167), we still find 9,524 patients in both groups, and we may well lose patients who were coded appropriately but only once.
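The "more than once" rule can be sketched in a few lines of Python. Everything here is hypothetical: the patient identifiers, the encounter rows, and the helper name `patients_with_repeated_code` are invented for illustration, and a count-based tool such as I2B2 would apply this kind of constraint through its own query interface rather than through code like this.

```python
from collections import Counter


def patients_with_repeated_code(rows, code, min_count=2):
    """Return patient IDs whose records contain `code` at least `min_count` times."""
    counts = Counter(patient_id for patient_id, c in rows if c == code)
    return {patient_id for patient_id, n in counts.items() if n >= min_count}


# Hypothetical (patient_id, ICD-9-CM code) rows, one per coded encounter.
rows = [
    ("p1", "250.00"), ("p1", "250.00"),                    # type II coded twice
    ("p2", "250.00"),                                      # type II coded once
    ("p3", "250.00"), ("p3", "250.00"),
    ("p3", "250.01"), ("p3", "250.01"),                    # both codes, twice each
    ("p4", "250.01"), ("p4", "250.01"),                    # type I coded twice
]

type2_repeated = patients_with_repeated_code(rows, "250.00")
type1_repeated = patients_with_repeated_code(rows, "250.01")
still_ambiguous = type2_repeated & type1_repeated

print(sorted(type2_repeated), sorted(type1_repeated), sorted(still_ambiguous))
```

In this toy data, patient p2 is dropped despite possibly being coded correctly, and patient p3 remains in both groups, which mirrors the two failure modes described above.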

Another typical approach is to limit patients based on a specific medication (e.g. metformin). However, many medications are used across diseases and cannot differentiate between entities (e.g. insulin). Among patients with 250.00, 37,565 (25.7 percent) have been on metformin, while among patients with 250.01, 2,538 (10.2 percent) have been.

We can stack these methods to become more specific; however, doing so decreases our sensitivity, meaning we capture fewer of the patients who truly have the desired characteristics. With all this in mind, it is mind-boggling that ICD codes are used for high-quality research publications that ultimately affect patient care (and our ability to state that coffee prolongs life). Do we really know which patients have type II diabetes in a population-based study? The answer in this case, and for many other research questions, is unfortunately no. We would struggle to separate these very different diseases without manually reviewing charts, an impractical scenario when looking at outcomes across large populations.
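The trade-off between specificity and sensitivity can be illustrated with a small Python sketch. All of the data is invented: ten hypothetical patients, a hypothetical chart-review "truth" set, and three hypothetical, increasingly strict selection rules. None of these numbers come from our dataset; the only point is the direction in which the two measures move as rules are stacked.

```python
def sensitivity_specificity(flagged, truth, population):
    """Compare a rule's flagged patients against a reference standard."""
    true_pos = len(flagged & truth)
    false_neg = len(truth - flagged)
    false_pos = len(flagged - truth)
    true_neg = len(population - flagged - truth)
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity


# Hypothetical cohort of 10 patients; 1-4 truly have type II diabetes
# according to an imagined manual chart review.
population = set(range(1, 11))
truth = {1, 2, 3, 4}

# Hypothetical rules, each stricter than (and nested inside) the previous one.
rules = {
    "250.00 at least once": {1, 2, 3, 4, 5, 6},
    "250.00 more than once": {1, 2, 3, 5},
    "250.00 more than once + metformin": {1, 2},
}

results = {
    name: sensitivity_specificity(flagged, truth, population)
    for name, flagged in rules.items()
}

for name, (sens, spec) in results.items():
    print(f"{name}: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy example each added rule removes false positives but also removes true cases, so specificity rises while sensitivity falls.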

These challenges will lead us to future discussions on the use of natural language processing (NLP) and statistical modeling with electronic healthcare data to create the best methods for accurately identifying true disease.