The use of billing codes in large data sets for identifying diagnoses can result in incorrect identification of up to two-thirds of cases, according to a new study led by UCLA.
Medical research databases such as those from the Centers for Medicare & Medicaid Services or the National Inpatient Sample often depend on ambulatory billing codes to identify diseases and treatments. However, the accuracy of these codes is rarely validated in the studies that use the data, according to a report published in the British Journal of Surgery.
The study focused on hernia diagnoses, but the researchers warn that similar inaccuracies could occur with other conditions when relying on billing codes. Dr. Edward Livingston, health sciences professor of surgery at UCLA's David Geffen School of Medicine and lead author of the report, explained, "Researchers often assume a disease is present if its code appears in a large dataset. Our research shows this isn't always true. Studies relying solely on these codes might draw incorrect conclusions because of misidentification."
The team analyzed records for 1.36 million patients and identified 41,700 with a hernia diagnosis code.
Imaging was available for 28,600 of those patients: 12,800 (45%) were coded as diaphragmatic hernias, 7,000 (24%) as ventral hernias, and 8,800 (31%) as inguinal hernias. Imaging confirmed a hernia in only 10,234 (36%) of them: 4,325 (34%) of the diaphragmatic, 3,069 (44%) of the ventral, and 2,840 (32%) of the inguinal cases.
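The percentages reported here can be rechecked with a few lines of Python. This is only an illustrative sketch, not the study's method: the variable and function names are ours, and it assumes that the per-type coded counts (12,800, 7,000, and 8,800) are the denominators for the per-type confirmed counts.

```python
# Counts as reported in the article; all names here are illustrative.
coded = {"diaphragmatic": 12_800, "ventral": 7_000, "inguinal": 8_800}
confirmed = {"diaphragmatic": 4_325, "ventral": 3_069, "inguinal": 2_840}


def confirmation_rate(confirmed_count, coded_count):
    """Share of coded cases that imaging confirmed."""
    return confirmed_count / coded_count


# Per-type confirmation rates (assumes coded counts are the denominators).
rates = {kind: confirmation_rate(confirmed[kind], coded[kind]) for kind in coded}

# Overall rate across all coded patients with imaging.
total_coded = sum(coded.values())
total_confirmed = sum(confirmed.values())
overall = confirmation_rate(total_confirmed, total_coded)

for kind, rate in rates.items():
    print(f"{kind}: {rate:.0%} confirmed")
print(f"overall: {overall:.0%} confirmed, {1 - overall:.0%} not confirmed")
```

Under that assumption, the per-type counts sum to the 28,600 patients with images and the 10,234 confirmed cases, and the roughly 64% of coded cases left unconfirmed is consistent with the "up to two-thirds" figure.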
The researchers speculate that discrepancies arise because doctors base their coding on initial symptoms rather than the final diagnosis. For example, if a patient's visit is coded for a possible hernia, this code remains even if further tests disprove it.
"This study exposes a fundamental flaw in using administrative data to identify diseases," they write. "Diagnoses are considered but not always proven during the initial coding process."
"We've found that dependence on billing codes for identifying hernias could lead to misidentification in two-thirds of cases, highlighting limitations in administrative data used for clinical research. Validating diagnosis through accurate coding is crucial before relying on these insights."
The study was co-authored by Hila Zilicha, Dr. Douglas Bell, and Dr. Yijun Chen.