ICD Hierarchy

EHR-based studies offer several advantages: they are cost-efficient and allow large-scale longitudinal analyses. Electronic health records (EHRs) make it possible to analyze hundreds of human diseases, drug responses, and many observable clinical traits. In particular, they enable phenome-wide association studies (PheWAS). Phecodes were designed to facilitate these large-scale studies.

To define a patient’s phenotype, EHR-based studies rely on billing codes. However, these codes are not always organized meaningfully for high-throughput phenotypic analyses. Phecodes regroup ICD-9 and ICD-10-CM billing codes into clinically relevant phenotypes to facilitate clinical research that uses ICD codes.

About App

The Phecode to ICD map showcases the knowledge graph that maps the relationship between the International Classification of Diseases (ICD) coding system and the Phecode system. The major concepts involved and the main components of the app are described below.

ICD codes

ICD codes, maintained by the World Health Organization, are used in Electronic Health Record (EHR) systems to catalog a wide array of diseases, symptoms, and patient conditions as distinct identifiers. ICD-9 and ICD-10 are revisions of the classification; their clinical modifications (ICD-9-CM and ICD-10-CM) are used in the United States healthcare system for clinical documentation, billing, and regulatory reporting.


Phecodes

The Phecode system, short for “phenotype code”, was developed to group ICD codes into more manageable, clinically relevant categories that represent specific diseases or phenotypic traits, in support of clinical research such as phenome-wide association studies (PheWAS). An individual phecode can correspond to one or more ICD codes.


Rollup

“Rollup” refers to the process by which detailed codes are aggregated into broader categories. Because the phecode classification is hierarchical, it includes broader disease codes that can correspond to one or more codes in the ICD-9 or ICD-10-CM classifications. Rollup should be applied thoroughly: each ICD-9 or ICD-10-CM code is carried up to every applicable parent code.

Quick Start Guide

  1. Select an ICD code, Phecode entry, or phenotype of interest from the data table. Alternatively, type in the search box at the top of each column to filter the table.
  2. Look at the sunburst chart and hierarchical tree visualizations below the table, which display the classification and hierarchical structure of the Phecode-ICD mapping.
  3. On either visualization:
    • Hover over a node/wedge to see a detailed description of the code.
    • Click on a node/wedge to expand or collapse the hierarchy.
    • To see the complete tree structure associated with a node of interest, click on the wheel icon in the top right corner to change the max depth of the hierarchy.
    • (For sunburst plot only) Download the plot by clicking the camera icon in the upper right corner.


Phecode mapping with ICD-9 and ICD-10-CM codes (top left)

  • Phecode, phenotype: the phecode and its corresponding clinical phenotype
  • ICD version, ICD code, ICD Description: the ICD code, its ICD version (ICD-9 or ICD-10-CM), and its clinical description
  • Rollup: denotes whether ICD codes mapped to this phecode also map to its parent phecodes. For example, if rollup is 1, an ICD code that maps to Phecode 008.11 also maps to Phecodes 008.1 and 008.
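
The rollup behavior in the example above can be sketched in a few lines of Python. This is only an illustration, not the app's implementation: the `MappingRow` fields and both function names are hypothetical, and the sketch assumes that a parent phecode is obtained by dropping one trailing decimal digit at a time, as the 008.11 example suggests.

```python
from typing import NamedTuple


class MappingRow(NamedTuple):
    """One row of the mapping table (field names are illustrative)."""
    phecode: str
    icd_version: str
    icd_code: str
    rollup: int


def phecode_chain(phecode: str) -> list[str]:
    """Return the phecode followed by its parents, most specific first."""
    root, _, decimals = phecode.partition(".")
    # Assumption: dropping one trailing decimal digit yields each parent,
    # e.g. 008.11 -> 008.1 -> 008.
    chain = [f"{root}.{decimals[:n]}" for n in range(len(decimals), 0, -1)]
    chain.append(root)
    return chain


def phecodes_for_row(row: MappingRow) -> list[str]:
    """Phecodes an ICD code counts toward, given the row's rollup flag."""
    return phecode_chain(row.phecode) if row.rollup == 1 else [row.phecode]


# "ICD-PLACEHOLDER" is a made-up code, not a real mapping entry.
row = MappingRow("008.11", "ICD-9", "ICD-PLACEHOLDER", rollup=1)
print(phecodes_for_row(row))  # ['008.11', '008.1', '008']
```

With rollup set to 0, the same row would count toward Phecode 008.11 only.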


Sunburst chart (bottom left)

  • The highlighted text/node shows the hierarchical structure related to the selected ICD Code on the sunburst plot.
  • The inner circle shows the phecodes for a given phenotype of interest; the outer rings show the ICD system (ICD-9 or ICD-10-CM) and the ICD codes that fall under the corresponding phecode.

Hierarchical tree (bottom right)

  • The highlighted text/node shows the hierarchical structure related to the selected ICD Code on the tree.
  • The root of the tree displays the parent phecodes, and the branches show the corresponding ICD codes grouped by their system (ICD-9, ICD-10-CM).
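
Both visualizations draw on the same nested structure: a phecode, then the ICD system, then the ICD codes. A minimal Python sketch of how flat mapping rows could be nested this way is shown below. The function name and the tuple layout are assumptions made for illustration, the ICD codes are placeholders, and nesting child phecodes under their parents is left out for brevity.

```python
from collections import defaultdict


def build_hierarchy(rows):
    """Nest (phecode, icd_version, icd_code) rows as phecode -> version -> codes."""
    tree = defaultdict(lambda: defaultdict(list))
    for phecode, icd_version, icd_code in rows:
        tree[phecode][icd_version].append(icd_code)
    # Convert to plain dicts so the result is easy to print or serialize.
    return {p: dict(versions) for p, versions in tree.items()}


# Placeholder ICD codes, for illustration only.
rows = [
    ("008", "ICD-9", "CODE-A"),
    ("008", "ICD-10-CM", "CODE-B"),
    ("008.1", "ICD-9", "CODE-C"),
]
hierarchy = build_hierarchy(rows)
print(hierarchy["008"])  # {'ICD-9': ['CODE-A'], 'ICD-10-CM': ['CODE-B']}
```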

Notes for both visualizations

  • *: The ICD code maps exclusively to a specific Phecode (XXX.XX or XXX.X) and not its parent Phecode
  • **: The ICD code maps to both Phecode XXX.XX and its parent Phecode XXX.X, but does not map to XXX
  • G: An ICD node with the prefix G: indicates an ICD group that includes child ICD codes and can be expanded

Use Cases

To facilitate large-scale clinical studies that involve a defined phenotype, this app allows researchers to select one or more phenotypes of interest and collect the corresponding ICD-9 and ICD-10-CM codes based on the underlying Phecode-ICD mapping. These codes can then be used to filter and subset large EHR datasets.
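
As a rough illustration of this workflow, the Python sketch below collects the ICD codes for a selected phenotype and uses them to subset a list of EHR records. The dictionary keys, function names, phenotype labels, and codes are all hypothetical placeholders; an actual export from the app or an actual EHR extract may be structured differently.

```python
def icd_codes_for(mapping, phenotypes):
    """Collect (ICD version, ICD code) pairs mapped to the chosen phenotypes."""
    wanted = set(phenotypes)
    return {
        (row["icd_version"], row["icd_code"])
        for row in mapping
        if row["phenotype"] in wanted
    }


def filter_records(records, codes):
    """Keep only the EHR records whose ICD code is in the selected set."""
    return [r for r in records if (r["icd_version"], r["icd_code"]) in codes]


# Placeholder mapping rows and EHR records, for illustration only.
mapping = [
    {"phenotype": "Phenotype A", "icd_version": "ICD-9", "icd_code": "CODE-1"},
    {"phenotype": "Phenotype A", "icd_version": "ICD-10-CM", "icd_code": "CODE-2"},
    {"phenotype": "Phenotype B", "icd_version": "ICD-9", "icd_code": "CODE-3"},
]
records = [
    {"patient_id": 1, "icd_version": "ICD-9", "icd_code": "CODE-1"},
    {"patient_id": 2, "icd_version": "ICD-9", "icd_code": "CODE-3"},
    {"patient_id": 3, "icd_version": "ICD-10-CM", "icd_code": "CODE-2"},
]

codes = icd_codes_for(mapping, ["Phenotype A"])
cohort = filter_records(records, codes)
print([r["patient_id"] for r in cohort])  # [1, 3]
```

Matching on the (version, code) pair rather than the code alone avoids confusing an ICD-9 code with an identically written ICD-10-CM code.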


Wei WQ, Bastarache LA, Carroll RJ, Marlo JE, Osterman TJ, Gamazon ER, Cox NJ, Roden DM, Denny JC. Evaluating phecodes, clinical classification software, and ICD-9-CM codes for phenome-wide association studies in the electronic health record. PLoS One. 2017 Jul 7;12(7):e0175508. doi: 10.1371/journal.pone.0175508. PMID: 28686612; PMCID: PMC5501393.

Wu P, Gifford A, Meng X, Li X, Campbell H, Varley T, Zhao J, Carroll R, Bastarache L, Denny JC, Theodoratou E, Wei WQ. Mapping ICD-10 and ICD-10-CM Codes to Phecodes: Workflow Development and Initial Evaluation. JMIR Med Inform. 2019;7(4):e14325. doi: 10.2196/14325.
