Medical Code Ontologies

OneEHR provides utilities for working with medical code systems, including ICD diagnosis codes, CCS grouping, and ATC drug classification.

ICD-9 / ICD-10

Normalize, parse, and group ICD diagnosis and procedure codes.

from oneehr.medcode import ICD9, ICD10

# Normalize codes (strip dots and whitespace, convert to uppercase)
ICD9.normalize("401.9")   # → "4019"
ICD10.normalize("I10.0")  # → "I100"

# Chapter-level grouping
ICD9.chapter("401.9")     # → "Circulatory system"
ICD10.chapter("E11.9")    # → "Endocrine/metabolic"

# 3-character category (the first three characters of the normalized code)
ICD9.category("401.9")    # → "401"
ICD10.category("I10.0")   # → "I10"

# Parse into structured ICDCode object
code = ICD9.parse("401.9")
# ICDCode(version=9, code='4019', chapter='Circulatory system')
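
The normalization and category steps above can be approximated in a few lines of plain Python. This is a minimal sketch of the behavior the comments describe, not OneEHR's implementation; the function names `normalize_icd` and `icd_category` are hypothetical, and the 4-character category for ICD-9 E-codes is an assumption based on the ICD-9-CM layout rather than anything the library documents.

```python
def normalize_icd(code: str) -> str:
    """Strip whitespace and dots, then uppercase (assumed behavior)."""
    return "".join(code.split()).replace(".", "").upper()


def icd_category(code: str, version: int) -> str:
    """Return the category prefix of a code.

    Assumption: categories are the first 3 characters, except ICD-9
    E-codes, whose category is 4 characters (for example E880).
    """
    norm = normalize_icd(code)
    width = 4 if version == 9 and norm.startswith("E") else 3
    return norm[:width]


print(normalize_icd(" 401.9 "))   # 4019
print(icd_category("E11.9", 10))  # E11
print(icd_category("E880.1", 9))  # E880
```

Passing the version explicitly matters because a leading "E" means an external-cause code in ICD-9 but an endocrine code in ICD-10.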

ICD-9 Chapters

Range     Chapter
001-139   Infectious diseases
140-239   Neoplasms
240-279   Endocrine/metabolic
280-289   Blood diseases
290-319   Mental disorders
320-389   Nervous system
390-459   Circulatory system
460-519   Respiratory system
520-579   Digestive system
580-629   Genitourinary system
780-799   Symptoms/signs
800-999   Injury/poisoning
E-codes   External causes
V-codes   Supplementary factors
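
A range table like this maps naturally onto a sorted lookup. The sketch below copies the ranges listed above and resolves a code with `bisect`; it is an illustration under assumptions, not the library's code, and `icd9_chapter` is a hypothetical name. Because the table lists no chapter for 630-779, codes in that interval return `None` here.

```python
from bisect import bisect_right
from typing import Optional

# (start, end, label) rows copied from the table above.
ICD9_RANGES = [
    (1, 139, "Infectious diseases"),
    (140, 239, "Neoplasms"),
    (240, 279, "Endocrine/metabolic"),
    (280, 289, "Blood diseases"),
    (290, 319, "Mental disorders"),
    (320, 389, "Nervous system"),
    (390, 459, "Circulatory system"),
    (460, 519, "Respiratory system"),
    (520, 579, "Digestive system"),
    (580, 629, "Genitourinary system"),
    (780, 799, "Symptoms/signs"),
    (800, 999, "Injury/poisoning"),
]
_STARTS = [start for start, _, _ in ICD9_RANGES]


def icd9_chapter(code: str) -> Optional[str]:
    """Look up the chapter label for an ICD-9 code, or None if unlisted."""
    norm = code.strip().replace(".", "").upper()
    if norm.startswith("E"):
        return "External causes"
    if norm.startswith("V"):
        return "Supplementary factors"
    if len(norm) < 3 or not norm[:3].isdigit():
        return None
    category = int(norm[:3])
    # Find the last range whose start is <= category, then check its end.
    idx = bisect_right(_STARTS, category) - 1
    if idx < 0:
        return None
    _, end, label = ICD9_RANGES[idx]
    return label if category <= end else None


print(icd9_chapter("401.9"))  # Circulatory system
print(icd9_chapter("650"))    # None (range not listed in the table)
```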

Cross-Mapping (GEM)

Load the CMS General Equivalence Mappings (GEM) files to convert ICD-9 codes to ICD-10. One ICD-9 code can map to several ICD-10 codes, so the lookup returns a list:

from oneehr.medcode.icd import load_gem_mapping, icd9_to_icd10

load_gem_mapping("2018_I9gem.txt", direction="9to10")
icd9_to_icd10("4019")  # → ["I10"]
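
To make the one-to-many behavior concrete, here is a rough sketch of parsing a GEM-style file into a dictionary of lists. It assumes the whitespace-separated "source target flags" layout and treats `NoDx` as a no-mapping sentinel; both are assumptions about the file, and `parse_gem` is a hypothetical helper rather than part of OneEHR. Apart from the `4019` line, the sample rows are synthetic placeholders, not real codes.

```python
import io
from collections import defaultdict


def parse_gem(lines):
    """Build {source_code: [target_code, ...]} from GEM-style lines."""
    mapping = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        source, target = parts[0].upper(), parts[1].upper()
        if target == "NODX":
            continue  # assumed sentinel for "no equivalent code"
        if target not in mapping[source]:
            mapping[source].append(target)
    return dict(mapping)


# Synthetic sample in the assumed three-column layout.
sample = io.StringIO(
    "4019 I10 00000\n"
    "X001 T001 10000\n"
    "X001 T002 10000\n"
    "\n"
)
gem = parse_gem(sample)
print(gem["4019"])  # ['I10']
print(gem["X001"])  # ['T001', 'T002']
```

Keeping a list per source code preserves every candidate target, which leaves the choice among them to the caller.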

CCS Grouping

The Clinical Classifications Software (CCS) collapses thousands of ICD codes into roughly 280 clinically meaningful categories.

from oneehr.medcode import CCSGrouper

grouper = CCSGrouper("ccs_single_level.csv")
grouper.group("4019")        # → "98"
grouper.label("98")          # → "Essential hypertension"
grouper.group_with_label("4019")  # → ("98", "Essential hypertension")
print(len(grouper))          # number of mapped codes
print(grouper.categories)    # all CCS category IDs

The CCS mapping file can be downloaded from the AHRQ Healthcare Cost and Utilization Project (HCUP) website.
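
Conceptually, a CCS grouper is two dictionaries: code to category ID, and category ID to label. The sketch below shows that idea on an in-memory CSV. The column names (`icd_code`, `ccs_id`, `ccs_label`) and the `SimpleGrouper` class are hypothetical, and the layout is simplified, so a file downloaded from AHRQ would likely need its own parsing step.

```python
import csv
import io


class SimpleGrouper:
    """Toy code-to-category lookup; not the CCSGrouper implementation."""

    def __init__(self, rows):
        self._code_to_group = {}
        self._labels = {}
        for row in rows:
            code = self._clean(row["icd_code"])
            group = row["ccs_id"].strip()
            self._code_to_group[code] = group
            self._labels[group] = row["ccs_label"].strip()

    @staticmethod
    def _clean(code):
        # Same normalization idea as for ICD codes: no dots, uppercase.
        return code.replace(".", "").strip().upper()

    def group(self, code):
        return self._code_to_group.get(self._clean(code))

    def label(self, group):
        return self._labels.get(group)

    def __len__(self):
        return len(self._code_to_group)


# Hypothetical simplified layout with three example rows.
data = io.StringIO(
    "icd_code,ccs_id,ccs_label\n"
    "25000,49,Diabetes mellitus without complication\n"
    "25001,49,Diabetes mellitus without complication\n"
    "4280,108,Congestive heart failure; nonhypertensive\n"
)
grouper = SimpleGrouper(csv.DictReader(data))
print(grouper.group("250.00"))  # 49
print(grouper.label("108"))     # Congestive heart failure; nonhypertensive
print(len(grouper))             # 3
```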


ATC Drug Classification

The Anatomical Therapeutic Chemical (ATC) system classifies drugs in a 5-level hierarchy.

from oneehr.medcode import ATCHierarchy

atc = ATCHierarchy()

# Level detection
ATCHierarchy.level("A")        # → 1 (Anatomical main group)
ATCHierarchy.level("A02")      # → 2 (Therapeutic subgroup)
ATCHierarchy.level("A02BC01")  # → 5 (Chemical substance)

# Parent extraction
ATCHierarchy.parent("A02BC01", target_level=1)  # → "A"
ATCHierarchy.parent("A02BC01", target_level=2)  # → "A02"

# Name lookup (built-in for level 1)
atc.group_name("A02BC01", level=1)  # → "Alimentary tract and metabolism"

# 14 main groups
atc.main_groups  # {'A': 'Alimentary...', 'B': 'Blood...', ...}
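
Level detection and parent extraction both follow from the fixed code lengths of the ATC hierarchy: 1, 3, 4, 5 and 7 characters for levels 1 to 5. The sketch below illustrates that rule with hypothetical helpers `atc_level` and `atc_parent`; it is not the ATCHierarchy source, and the error handling is an assumption.

```python
# ATC code length for each level: 1, 3, 4, 5, 7 characters.
_LENGTH_TO_LEVEL = {1: 1, 3: 2, 4: 3, 5: 4, 7: 5}
_LEVEL_TO_LENGTH = {level: length for length, level in _LENGTH_TO_LEVEL.items()}


def atc_level(code: str) -> int:
    """Return the hierarchy level implied by the code's length."""
    code = code.strip().upper()
    if len(code) not in _LENGTH_TO_LEVEL:
        raise ValueError(f"unexpected ATC code length: {code!r}")
    return _LENGTH_TO_LEVEL[len(code)]


def atc_parent(code: str, target_level: int) -> str:
    """Truncate a code to its ancestor at target_level."""
    code = code.strip().upper()
    if target_level not in _LEVEL_TO_LENGTH:
        raise ValueError("target_level must be between 1 and 5")
    if target_level > atc_level(code):
        raise ValueError("target_level is deeper than the code itself")
    return code[: _LEVEL_TO_LENGTH[target_level]]


print(atc_level("A02BC01"))      # 5
print(atc_parent("A02BC01", 3))  # A02B
print(atc_parent("A02BC01", 4))  # A02BC
```

Since every ancestor is a prefix of its descendants, rolling drug codes up to a coarser level needs no lookup table; only the names do.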

Load a full ATC mapping for deeper levels:

atc = ATCHierarchy("atc_codes.csv")  # columns: atc_code, atc_name

CodeMapper

CodeMapper is a unified interface for mapping event codes in dynamic tables. Collapsing raw codes into coarser groups reduces dimensionality before binning.

from oneehr.medcode import ATCHierarchy, CCSGrouper, CodeMapper

mapper = CodeMapper()

# The three diagnosis mappings below are alternatives for the "DX_" prefix; pick one.

# Map ICD codes to chapter-level groups
mapper.add_icd_chapter_mapping(version=9, prefix="DX_")

# Or to 3-digit categories
mapper.add_icd_category_mapping(version=9, prefix="DX_")

# Or to CCS groups
grouper = CCSGrouper("ccs_single_level.csv")
mapper.add_ccs_mapping(grouper, prefix="DX_")

# Map drug codes to ATC level-1 groups
atc = ATCHierarchy()
mapper.add_atc_mapping(atc, level=1, prefix="RX_")

# Apply to an existing events DataFrame (events_df)
mapped_events = mapper.apply(events_df)
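
One way to picture prefix-based mapping is as a list of (prefix, function) rules applied to each event code. The sketch below uses plain dictionaries instead of a DataFrame to stay dependency-free. The `PrefixMapper` class, the `code` field name, and the rule that the prefix is kept and only the remainder is rewritten are all assumptions for illustration, not CodeMapper's documented behavior.

```python
from typing import Callable, Dict, List, Tuple


class PrefixMapper:
    """Toy rule-based mapper: the first matching prefix wins."""

    def __init__(self) -> None:
        self._rules: List[Tuple[str, Callable[[str], str]]] = []

    def add(self, prefix: str, fn: Callable[[str], str]) -> None:
        self._rules.append((prefix, fn))

    def map_code(self, code: str) -> str:
        for prefix, fn in self._rules:
            if code.startswith(prefix):
                # Keep the prefix and rewrite only the raw code after it.
                return prefix + fn(code[len(prefix):])
        return code  # no rule matched: leave the code unchanged

    def apply(self, events: List[Dict]) -> List[Dict]:
        # Return new records so the input list is not mutated.
        return [{**e, "code": self.map_code(e["code"])} for e in events]


mapper = PrefixMapper()
mapper.add("DX_", lambda c: c.replace(".", "")[:3])  # 3-character category
mapper.add("RX_", lambda c: c[:1])                   # ATC level-1 group

events = [
    {"patient_id": 1, "code": "DX_401.9"},
    {"patient_id": 1, "code": "DX_401.1"},
    {"patient_id": 1, "code": "RX_A02BC01"},
    {"patient_id": 2, "code": "LAB_GLU"},
]
mapped = mapper.apply(events)
print([e["code"] for e in mapped])  # ['DX_401', 'DX_401', 'RX_A', 'LAB_GLU']
```

In this example four distinct raw codes become three, which is the dimensionality reduction the mapping is meant to provide.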

Integration with Preprocessing

Apply code mapping before running oneehr preprocess:

import pandas as pd
from oneehr.medcode import CodeMapper

# Load and map
events = pd.read_csv("data/dynamic.csv")
mapper = CodeMapper()
mapper.add_icd_chapter_mapping(version=9)
mapped = mapper.apply(events)
mapped.to_csv("data/dynamic_mapped.csv", index=False)

Then point your TOML config to the mapped file:

[dataset]
dynamic = "data/dynamic_mapped.csv"