References Phenotype

Curation of Phenotype — listings under the References tab.

Summary Table

| NAME | CATEGORY | Main citation | YEAR |
|---|---|---|---|
| Phecode | Clinical coding | Wei WQ et al., PLoS One, 2017 | 2017 |
| ICD-10 | Disease classification | 1990 | 1990 |
| ICD-11 | Disease classification | 2019 | 2019 |
| ICD-9 | Disease classification | 1977 | 1977 |
| ATC | Drug classification | 1976 | 1976 |

Clinical coding

Phecode

Phenotypes
PUBMED_LINK
28686612
DESCRIPTION
Phecodes represent one strategy for defining phenotypes for research using EHR data. They are a high-throughput phenotyping tool based on ICD (International Classification of Diseases) codes that can be used to rapidly define the case/control status of thousands of clinically meaningful diseases and conditions.
URL
https://phewascatalog.org/phecodes
KEYWORDS
PheWAS; EHR; ICD; phenotype grouping
TITLE
Evaluating phecodes, clinical classification software, and ICD-9-CM codes for phenome-wide association studies in the electronic health record.
Main citation
Wei WQ, Bastarache LA, Carroll RJ, Marlo JE, ... & Denny JC. (2017) Evaluating phecodes, clinical classification software, and ICD-9-CM codes for phenome-wide association studies in the electronic health record. PLoS One, 12 (7) e0175508. doi:10.1371/journal.pone.0175508. PMID 28686612
ABSTRACT
OBJECTIVE: To compare three groupings of Electronic Health Record (EHR) billing codes for their ability to represent clinically meaningful phenotypes and to replicate known genetic associations. The three tested coding systems were the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) codes, the Agency for Healthcare Research and Quality Clinical Classification Software for ICD-9-CM (CCS), and manually curated "phecodes" designed to facilitate phenome-wide association studies (PheWAS) in EHRs. METHODS AND MATERIALS: We selected 100 disease phenotypes and compared the ability of each coding system to accurately represent them without performing additional groupings. The 100 phenotypes included 25 randomly-chosen clinical phenotypes pursued in prior genome-wide association studies (GWAS) and another 75 common disease phenotypes mentioned across free-text problem lists from 189,289 individuals. We then evaluated the performance of each coding system to replicate known associations for 440 SNP-phenotype pairs. RESULTS: Out of the 100 tested clinical phenotypes, phecodes exactly matched 83, compared to 53 for ICD-9-CM and 32 for CCS. ICD-9-CM codes were typically too detailed (requiring custom groupings) while CCS codes were often not granular enough. Among 440 tested known SNP-phenotype associations, use of phecodes replicated 153 SNP-phenotype pairs compared to 143 for ICD-9-CM and 139 for CCS. Phecodes also generally produced stronger odds ratios and lower p-values for known associations than ICD-9-CM and CCS. Finally, evaluation of several SNPs via PheWAS identified novel potential signals, some seen in only using the phecode approach. Among them, rs7318369 in PEPD was associated with gastrointestinal hemorrhage. CONCLUSION: Our results suggest that the phecode groupings better align with clinical diseases mentioned in clinical practice or for genomic studies. ICD-9-CM, CCS, and phecode groupings all worked for PheWAS-type studies, though the phecode groupings produced superior results.
DOI
10.1371/journal.pone.0175508
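
EXAMPLE
The description above says phecodes can be used to define case/control status from ICD codes. The following is a minimal sketch of that idea, not the PheWAS Catalog's own implementation. The three mapping rows are a toy excerpt chosen for illustration (a real analysis would load the full map from the URL above), the function name `phecode_status` is hypothetical, and the rule "at least 2 code occurrences for a case, 0 for a control, anything in between excluded" is an assumed convention rather than something stated in this listing.

```python
from collections import Counter

# Toy ICD-9-CM -> phecode rows, for illustration only (assumption);
# the full map is distributed via phewascatalog.org.
ICD9_TO_PHECODE = {
    "250.00": "250.2",  # diabetes codes grouped under one phecode
    "250.02": "250.2",
    "401.1": "401.1",   # essential hypertension
}


def phecode_status(patient_codes, phecode, min_count=2):
    """Return {patient_id: True | False | None} for one phecode.

    True  = case (at least `min_count` mapped code occurrences)
    False = control (no mapped occurrences)
    None  = excluded (some occurrences, but below the threshold)
    The threshold is an assumed convention, not taken from the source.
    """
    status = {}
    for patient_id, icd_codes in patient_codes.items():
        # Unmapped ICD codes become None and are simply never counted
        # towards any phecode.
        counts = Counter(ICD9_TO_PHECODE.get(code) for code in icd_codes)
        n = counts[phecode]
        if n >= min_count:
            status[patient_id] = True
        elif n == 0:
            status[patient_id] = False
        else:
            status[patient_id] = None
    return status


# Hypothetical patients, each with a list of recorded ICD-9-CM codes.
patients = {
    "p1": ["250.00", "250.02", "401.1"],
    "p2": ["401.1"],
    "p3": ["250.00"],
}
result = phecode_status(patients, "250.2")
print(result)
```

Here `p1` has two codes that roll up to the same phecode and is treated as a case, `p2` has none and is a control, and `p3` has a single occurrence and is excluded. Grouping several ICD codes under one phecode is what lets the same function be run across thousands of phenotypes without custom code lists.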

Disease classification

ICD-10

Phenotypes
FULL NAME
International Classification of Diseases 10th Revision
DESCRIPTION
WHO standard for coding diagnoses, symptoms, and external causes of injury or disease in health records and vital statistics. Widely used for epidemiology, billing, and EHR-based research (including mapping to other systems such as Phecodes).
URL
https://icd.who.int/browse10/2019/en
KEYWORDS
ICD-10; clinical coding; diagnosis; mortality; morbidity
Main citation
World Health Organization. International Statistical Classification of Diseases and Related Health Problems, 10th Revision (ICD-10). Geneva: WHO.

ICD-11

Phenotypes
FULL NAME
International Classification of Diseases 11th Revision
DESCRIPTION
WHO's successor to ICD-10 for mortality and morbidity statistics (ICD-11 for MMS), with a digital-first structure, updated clinical detail, and tooling for implementation in health systems and research.
URL
https://icd.who.int/en
KEYWORDS
ICD-11; clinical coding; MMS; mortality; morbidity
Main citation
World Health Organization. International Classification of Diseases for Mortality and Morbidity Statistics, 11th Revision (ICD-11). Geneva: WHO.

ICD-9

Phenotypes
FULL NAME
International Classification of Diseases 9th Revision
DESCRIPTION
Earlier WHO disease and injury classification still present in many historical EHR and claims datasets (e.g. ICD-9-CM in US data). Useful for harmonization with ICD-10/11 and aggregated phenotypes such as Phecodes.
URL
https://icd.who.int/browse9/l/en
KEYWORDS
ICD-9; ICD-9-CM; clinical coding; legacy EHR
Main citation
World Health Organization. International Classification of Diseases, 9th Revision (ICD-9). Geneva: WHO.

Drug classification

ATC

Phenotypes
FULL NAME
Anatomical Therapeutic Chemical Classification System
DESCRIPTION
Hierarchical system for classifying drugs by anatomical site, therapeutic and pharmacological subgroup, and chemical substance. Maintained by the WHO Collaborating Centre for Drug Statistics Methodology; widely used in prescription and claims data for exposure phenotypes.
URL
https://atcddd.fhi.no/atc/structure_and_principles/
KEYWORDS
ATC; drugs; pharmacoepidemiology; medication coding
Main citation
WHO Collaborating Centre for Drug Statistics Methodology. Anatomical Therapeutic Chemical (ATC) classification system. Norwegian Institute of Public Health.
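
EXAMPLE
The hierarchy described above is encoded in the ATC code itself: one letter, two digits, two letters, then two digits, giving five nested levels. The sketch below splits a code into its level prefixes so that prescriptions can be grouped at any level. The function name `atc_levels` is hypothetical, and the example code A10BA02 (metformin) is used only as an illustration.

```python
import re

# Five ATC levels: letter, two digits, letter, letter, two digits.
ATC_PATTERN = re.compile(r"^([A-Z])([0-9]{2})([A-Z])([A-Z])([0-9]{2})$")


def atc_levels(code):
    """Split a full 7-character ATC code into its five level prefixes.

    Returns a list ordered from the broadest level (anatomical group)
    to the most specific one (chemical substance).
    """
    match = ATC_PATTERN.match(code.strip().upper())
    if match is None:
        raise ValueError(f"not a full 5-level ATC code: {code!r}")
    parts = match.groups()
    # Each level's code is the concatenation of all parts up to that level.
    return ["".join(parts[: i + 1]) for i in range(len(parts))]


levels = atc_levels("A10BA02")
print(levels)  # ['A', 'A10', 'A10B', 'A10BA', 'A10BA02']
```

For exposure phenotypes, truncating to a higher level (for example the first three characters) groups related drugs together, while the full code identifies a single substance.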