HLA

Summary Table

| NAME | CATEGORY | CITATION | YEAR |
|---|---|---|---|
| Sakaue | HLA analyses tutorial | Sakaue S, Gurajala S, Curtis M, Luo Y, ...&, Raychaudhuri S. (2023) Tutorial: a statistical genetics guide to identifying HLA alleles driving complex disease Nat. Protoc., 18 (9) 2625-2641. doi:10.1038/s41596-023-00853-4. PMID 37495751 | 2023 |
| Four-digit Multi-ethnic HLA v1 (2021) | HLA imputation panel | Luo Y, Kanai M, Choi W, Li X, ...&, Raychaudhuri S. (2021) A high-resolution HLA reference panel capturing global population diversity enables multi-ancestry fine-mapping in HIV host response Nat. Genet., 53 (10) 1504-1516. doi:10.1038/s41588-021-00935-7. PMID 34611364 | 2021 |
| Four-digit Multi-ethnic HLA v2 (2022) | HLA imputation panel | Luo Y, Kanai M, Choi W, Li X, ...&, Raychaudhuri S. (2021) A high-resolution HLA reference panel capturing global population diversity enables multi-ancestry fine-mapping in HIV host response Nat. Genet., 53 (10) 1504-1516. doi:10.1038/s41588-021-00935-7. PMID 34611364 | 2021 |
| Han-MHC | HLA imputation panel | Zhou F, Cao H, Zuo X, Zhang T, ...&, Zhang X. (2016) Deep sequencing of the MHC region in the Chinese population contributes to studies of complex disease Nat. Genet., 48 (7) 740-746. doi:10.1038/ng.3576. PMID 27213287 | 2016 |
| HLA-TAPAS | HLA imputation pipeline | Luo Y, Kanai M, Choi W, Li X, ...&, Raychaudhuri S. (2021) A high-resolution HLA reference panel capturing global population diversity enables multi-ancestry fine-mapping in HIV host response Nat. Genet., 53 (10) 1504-1516. doi:10.1038/s41588-021-00935-7. PMID 34611364 | 2021 |
| CookHLA | HLA imputation tool | Cook S, Choi W, Lim H, Luo Y, ...&, Han B. (2021) Accurate imputation of human leukocyte antigens with CookHLA Nat. Commun., 12 (1) 1264. doi:10.1038/s41467-021-21541-5. PMID 33627654 | 2021 |
| DEEP*HLA | HLA imputation tool | Naito T, Suzuki K, Hirata J, Kamatani Y, ...&, Okada Y. (2021) A deep learning method for HLA imputation and trans-ethnic MHC fine-mapping of type 1 diabetes Nat. Commun., 12 (1) 1639. doi:10.1038/s41467-021-21975-x. PMID 33712626 | 2021 |
| HIBAG | HLA imputation tool | Zheng X, Shen J, Cox C, Wakefield JC, ...&, Weir BS. (2014) HIBAG--HLA genotype imputation with attribute bagging Pharmacogenomics J., 14 (2) 192-200. doi:10.1038/tpj.2013.18. PMID 23712092 | 2014 |
| HLARIMNT | HLA imputation tool | Tanaka, K., Kato, K., Nonaka, N., & Seita, J. (2022). Efficient HLA imputation from sequential SNPs data by Transformer. arXiv preprint arXiv:2211.06430. | NA |
| SNP2HLA | HLA imputation tool | Jia X, Han B, Onengut-Gumuscu S, Chen WM, ...&, de Bakker PI. (2013) Imputing amino acid polymorphisms in human leukocyte antigens PLoS One, 8 (6) e64683. doi:10.1371/journal.pone.0064683. PMID 23762245 | 2013 |

HLA analyses tutorial

Sakaue

  • NAME : Sakaue
  • URL : https://github.com/immunogenomics/HLA_analyses_tutorial
  • KEYWORDS : HLA analyses tutorial
  • TITLE : Tutorial: a statistical genetics guide to identifying HLA alleles driving complex disease
  • DOI : 10.1038/s41596-023-00853-4
  • ABSTRACT : The human leukocyte antigen (HLA) locus is associated with more complex diseases than any other locus in the human genome. In many diseases, HLA explains more heritability than all other known loci combined. In silico HLA imputation methods enable rapid and accurate estimation of HLA alleles in the millions of individuals that are already genotyped on microarrays. HLA imputation has been used to define causal variation in autoimmune diseases, such as type I diabetes, and in human immunodeficiency virus infection control. However, there are few guidelines on performing HLA imputation, association testing, and fine mapping. Here, we present a comprehensive tutorial to impute HLA alleles from genotype data. We provide detailed guidance on performing standard quality control measures for input genotyping data and describe options to impute HLA alleles and amino acids either locally or using the web-based Michigan Imputation Server, which hosts a multi-ancestry HLA imputation reference panel. We also offer best practice recommendations to conduct association tests to define the alleles, amino acids, and haplotypes that affect human traits. Along with the pipeline, we provide a step-by-step online guide with scripts and available software ( https://github.com/immunogenomics/HLA_analyses_tutorial ). This tutorial will be broadly applicable to large-scale genotyping data and will contribute to defining the role of HLA in human diseases across global populations.
  • CITATION : Sakaue S, Gurajala S, Curtis M, Luo Y, ...&, Raychaudhuri S. (2023) Tutorial: a statistical genetics guide to identifying HLA alleles driving complex disease Nat. Protoc., 18 (9) 2625-2641. doi:10.1038/s41596-023-00853-4. PMID 37495751
  • JOURNAL_INFO : Nature protocols ; Nat. Protoc. ; 2023 ; 18 ; 9 ; 2625-2641
  • PUBMED_LINK : 37495751

HLA imputation panel

Four-digit Multi-ethnic HLA v1 (2021)

  • NAME : Four-digit Multi-ethnic HLA v1 (2021)
  • SHORT NAME : Four-digit Multi-ethnic HLA v1 (2021)
  • DESCRIPTION : Available on the Michigan Imputation Server
  • URL : https://github.com/immunogenomics/HLA-TAPAS/
  • TITLE : A high-resolution HLA reference panel capturing global population diversity enables multi-ancestry fine-mapping in HIV host response
  • DOI : 10.1038/s41588-021-00935-7
  • ABSTRACT : Fine-mapping to plausible causal variation may be more effective in multi-ancestry cohorts, particularly in the MHC, which has population-specific structure. To enable such studies, we constructed a large (n = 21,546) HLA reference panel spanning five global populations based on whole-genome sequences. Despite population-specific long-range haplotypes, we demonstrated accurate imputation at G-group resolution (94.2%, 93.7%, 97.8% and 93.7% in admixed African (AA), East Asian (EAS), European (EUR) and Latino (LAT) populations). Applying HLA imputation to genome-wide association study data for HIV-1 viral load in three populations (EUR, AA and LAT), we obviated effects of previously reported associations from population-specific HIV studies and discovered a novel association at position 156 in HLA-B. We pinpointed the MHC association to three amino acid positions (97, 67 and 156) marking three consecutive pockets (C, B and D) within the HLA-B peptide-binding groove, explaining 12.9% of trait variance.
  • CITATION : Luo Y, Kanai M, Choi W, Li X, ...&, Raychaudhuri S. (2021) A high-resolution HLA reference panel capturing global population diversity enables multi-ancestry fine-mapping in HIV host response Nat. Genet., 53 (10) 1504-1516. doi:10.1038/s41588-021-00935-7. PMID 34611364
  • JOURNAL_INFO : Nature genetics ; Nat. Genet. ; 2021 ; 53 ; 10 ; 1504-1516
  • PUBMED_LINK : 34611364

Four-digit Multi-ethnic HLA v2 (2022)

  • NAME : Four-digit Multi-ethnic HLA v2 (2022)
  • SHORT NAME : Four-digit Multi-ethnic HLA v2 (2022)
  • DESCRIPTION : Available on the Michigan Imputation Server
  • URL : https://github.com/immunogenomics/HLA-TAPAS/
  • TITLE : A high-resolution HLA reference panel capturing global population diversity enables multi-ancestry fine-mapping in HIV host response
  • DOI : 10.1038/s41588-021-00935-7
  • ABSTRACT : Fine-mapping to plausible causal variation may be more effective in multi-ancestry cohorts, particularly in the MHC, which has population-specific structure. To enable such studies, we constructed a large (n = 21,546) HLA reference panel spanning five global populations based on whole-genome sequences. Despite population-specific long-range haplotypes, we demonstrated accurate imputation at G-group resolution (94.2%, 93.7%, 97.8% and 93.7% in admixed African (AA), East Asian (EAS), European (EUR) and Latino (LAT) populations). Applying HLA imputation to genome-wide association study data for HIV-1 viral load in three populations (EUR, AA and LAT), we obviated effects of previously reported associations from population-specific HIV studies and discovered a novel association at position 156 in HLA-B. We pinpointed the MHC association to three amino acid positions (97, 67 and 156) marking three consecutive pockets (C, B and D) within the HLA-B peptide-binding groove, explaining 12.9% of trait variance.
  • CITATION : Luo Y, Kanai M, Choi W, Li X, ...&, Raychaudhuri S. (2021) A high-resolution HLA reference panel capturing global population diversity enables multi-ancestry fine-mapping in HIV host response Nat. Genet., 53 (10) 1504-1516. doi:10.1038/s41588-021-00935-7. PMID 34611364
  • JOURNAL_INFO : Nature genetics ; Nat. Genet. ; 2021 ; 53 ; 10 ; 1504-1516
  • PUBMED_LINK : 34611364

Han-MHC

  • NAME : Han-MHC
  • SHORT NAME : Han-MHC
  • URL : http://gigadb.org/dataset/100156
  • TITLE : Deep sequencing of the MHC region in the Chinese population contributes to studies of complex disease
  • DOI : 10.1038/ng.3576
  • ABSTRACT : The human major histocompatibility complex (MHC) region has been shown to be associated with numerous diseases. However, it remains a challenge to pinpoint the causal variants for these associations because of the extreme complexity of the region. We thus sequenced the entire 5-Mb MHC region in 20,635 individuals of Han Chinese ancestry (10,689 controls and 9,946 patients with psoriasis) and constructed a Han-MHC database that includes both variants and HLA gene typing results of high accuracy. We further identified multiple independent new susceptibility loci in HLA-C, HLA-B, HLA-DPB1 and BTNL2 and an intergenic variant, rs118179173, associated with psoriasis and confirmed the well-established risk allele HLA-C*06:02. We anticipate that our Han-MHC reference panel built by deep sequencing of a large number of samples will serve as a useful tool for investigating the role of the MHC region in a variety of diseases and thus advance understanding of the pathogenesis of these disorders.
  • CITATION : Zhou F, Cao H, Zuo X, Zhang T, ...&, Zhang X. (2016) Deep sequencing of the MHC region in the Chinese population contributes to studies of complex disease Nat. Genet., 48 (7) 740-746. doi:10.1038/ng.3576. PMID 27213287
  • JOURNAL_INFO : Nature genetics ; Nat. Genet. ; 2016 ; 48 ; 7 ; 740-746
  • PUBMED_LINK : 27213287
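
The panels in this section report classical alleles such as HLA-C*06:02 at "four-digit" resolution, which corresponds to the first two colon-separated fields of the current HLA nomenclature. The sketch below shows one way to parse an allele name and truncate it to that resolution. The names `parse_allele` and `to_two_field` are hypothetical helpers written for this illustration and are not part of any tool listed here; the regular expression is an assumption that covers common name forms, not the full nomenclature.

```python
import re
from typing import NamedTuple, Tuple


class HLAAllele(NamedTuple):
    gene: str                 # e.g. "C" or "DRB1"
    fields: Tuple[str, ...]   # colon-separated fields, e.g. ("06", "02")


# Accepts names such as "HLA-C*06:02" or "DRB1*15:01:01:01".
# The "HLA-" prefix and a single trailing letter suffix are optional.
_ALLELE_RE = re.compile(r"^(?:HLA-)?([A-Z0-9]+)\*(\d+(?::\d+)*)[A-Z]?$")


def parse_allele(name: str) -> HLAAllele:
    """Split an allele name into its gene and its numeric fields."""
    match = _ALLELE_RE.match(name.strip())
    if match is None:
        raise ValueError(f"not a recognised HLA allele name: {name!r}")
    gene, digits = match.groups()
    return HLAAllele(gene, tuple(digits.split(":")))


def to_two_field(name: str) -> str:
    """Truncate an allele name to two fields ("four-digit" resolution)."""
    allele = parse_allele(name)
    if len(allele.fields) < 2:
        raise ValueError(f"{name!r} has fewer than two fields")
    return f"HLA-{allele.gene}*{allele.fields[0]}:{allele.fields[1]}"


if __name__ == "__main__":
    print(to_two_field("C*06:02:01:01"))
    print(parse_allele("HLA-DRB1*15:01"))
```

Truncating every allele to the same number of fields before comparison avoids counting a higher-resolution call and its two-field parent as different alleles.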

HLA imputation pipeline

HLA-TAPAS

  • NAME : HLA-TAPAS
  • SHORT NAME : HLA-TAPAS
  • FULL NAME : HLA-Typing At Protein for Association Studies
  • DESCRIPTION : HLA-TAPAS (HLA-Typing At Protein for Association Studies) is an HLA-focused pipeline that can handle HLA reference panel construction (MakeReference), HLA imputation (SNP2HLA), and HLA association (HLAassoc).
  • URL : https://github.com/immunogenomics/HLA-TAPAS
  • KEYWORDS : HLA pipeline
  • TITLE : A high-resolution HLA reference panel capturing global population diversity enables multi-ancestry fine-mapping in HIV host response
  • DOI : 10.1038/s41588-021-00935-7
  • ABSTRACT : Fine-mapping to plausible causal variation may be more effective in multi-ancestry cohorts, particularly in the MHC, which has population-specific structure. To enable such studies, we constructed a large (n = 21,546) HLA reference panel spanning five global populations based on whole-genome sequences. Despite population-specific long-range haplotypes, we demonstrated accurate imputation at G-group resolution (94.2%, 93.7%, 97.8% and 93.7% in admixed African (AA), East Asian (EAS), European (EUR) and Latino (LAT) populations). Applying HLA imputation to genome-wide association study data for HIV-1 viral load in three populations (EUR, AA and LAT), we obviated effects of previously reported associations from population-specific HIV studies and discovered a novel association at position 156 in HLA-B. We pinpointed the MHC association to three amino acid positions (97, 67 and 156) marking three consecutive pockets (C, B and D) within the HLA-B peptide-binding groove, explaining 12.9% of trait variance.
  • CITATION : Luo Y, Kanai M, Choi W, Li X, ...&, Raychaudhuri S. (2021) A high-resolution HLA reference panel capturing global population diversity enables multi-ancestry fine-mapping in HIV host response Nat. Genet., 53 (10) 1504-1516. doi:10.1038/s41588-021-00935-7. PMID 34611364
  • JOURNAL_INFO : Nature genetics ; Nat. Genet. ; 2021 ; 53 ; 10 ; 1504-1516
  • PUBMED_LINK : 34611364

HLA imputation tool

CookHLA

  • NAME : CookHLA
  • URL : https://github.com/WansonChoi/CookHLA
  • TITLE : Accurate imputation of human leukocyte antigens with CookHLA
  • DOI : 10.1038/s41467-021-21541-5
  • ABSTRACT : The recent development of imputation methods enabled the prediction of human leukocyte antigen (HLA) alleles from intergenic SNP data, allowing studies to fine-map HLA for immune phenotypes. Here we report an accurate HLA imputation method, CookHLA, which has superior imputation accuracy compared to previous methods. CookHLA differs from other approaches in that it locally embeds prediction markers into highly polymorphic exons to account for exonic variability, and in that it adaptively learns the genetic map within MHC from the data to facilitate imputation. Our benchmarking with real datasets shows that our method achieves high imputation accuracy in a wide range of scenarios, including situations where the reference panel is small or ethnically unmatched.
  • COPYRIGHT : https://creativecommons.org/licenses/by/4.0
  • CITATION : Cook S, Choi W, Lim H, Luo Y, ...&, Han B. (2021) Accurate imputation of human leukocyte antigens with CookHLA Nat. Commun., 12 (1) 1264. doi:10.1038/s41467-021-21541-5. PMID 33627654
  • JOURNAL_INFO : Nature communications ; Nat. Commun. ; 2021 ; 12 ; 1 ; 1264
  • PUBMED_LINK : 33627654

DEEP*HLA

  • NAME : DEEP*HLA
  • URL : https://github.com/tatsuhikonaito/DEEP-HLA
  • TITLE : A deep learning method for HLA imputation and trans-ethnic MHC fine-mapping of type 1 diabetes
  • DOI : 10.1038/s41467-021-21975-x
  • ABSTRACT : Conventional human leukocyte antigen (HLA) imputation methods drop their performance for infrequent alleles, which is one of the factors that reduce the reliability of trans-ethnic major histocompatibility complex (MHC) fine-mapping due to inter-ethnic heterogeneity in allele frequency spectra. We develop DEEP*HLA, a deep learning method for imputing HLA genotypes. Through validation using the Japanese and European HLA reference panels (n = 1,118 and 5,122), DEEP*HLA achieves the highest accuracies with significant superiority for low-frequency and rare alleles. DEEP*HLA is less dependent on distance-dependent linkage disequilibrium decay of the target alleles and might capture the complicated region-wide information. We apply DEEP*HLA to type 1 diabetes GWAS data from BioBank Japan (n = 62,387) and UK Biobank (n = 354,459), and successfully disentangle independently associated class I and II HLA variants with shared risk among diverse populations (the top signal at amino acid position 71 of HLA-DRβ1; P = 7.5 × 10⁻¹²⁰). Our study illustrates the value of deep learning in genotype imputation and trans-ethnic MHC fine-mapping.
  • CITATION : Naito T, Suzuki K, Hirata J, Kamatani Y, ...&, Okada Y. (2021) A deep learning method for HLA imputation and trans-ethnic MHC fine-mapping of type 1 diabetes Nat. Commun., 12 (1) 1639. doi:10.1038/s41467-021-21975-x. PMID 33712626
  • JOURNAL_INFO : Nature communications ; Nat. Commun. ; 2021 ; 12 ; 1 ; 1639
  • PUBMED_LINK : 33712626

HIBAG

  • NAME : HIBAG
  • URL : https://github.com/zhengxwen/HIBAG
  • TITLE : HIBAG--HLA genotype imputation with attribute bagging
  • DOI : 10.1038/tpj.2013.18
  • ABSTRACT : Genotyping of classical human leukocyte antigen (HLA) alleles is an essential tool in the analysis of diseases and adverse drug reactions with associations mapping to the major histocompatibility complex (MHC). However, deriving high-resolution HLA types subsequent to whole-genome single-nucleotide polymorphism (SNP) typing or sequencing is often cost prohibitive for large samples. An alternative approach takes advantage of the extended haplotype structure within the MHC to predict HLA alleles using dense SNP genotypes, such as those available from genome-wide SNP panels. Current methods for HLA imputation are difficult to apply or may require the user to have access to large training data sets with SNP and HLA types. We propose HIBAG, HLA Imputation using attribute BAGging, that makes predictions by averaging HLA-type posterior probabilities over an ensemble of classifiers built on bootstrap samples. We assess the performance of HIBAG using our study data (n=2668 subjects of European ancestry) as a training set and HLA data from the British 1958 birth cohort study (n≈1000 subjects) as independent validation samples. Prediction accuracies for HLA-A, B, C, DRB1 and DQB1 range from 92.2% to 98.1% using a set of SNP markers common to the Illumina 1M Duo, OmniQuad, OmniExpress, 660K and 550K platforms. HIBAG performed well compared with the other two leading methods, HLA*IMP and BEAGLE. This method is implemented in a freely available HIBAG R package that includes pre-fit classifiers for European, Asian, Hispanic and African ancestries, providing a readily available imputation approach without the need to have access to large training data sets.
  • CITATION : Zheng X, Shen J, Cox C, Wakefield JC, ...&, Weir BS. (2014) HIBAG--HLA genotype imputation with attribute bagging Pharmacogenomics J., 14 (2) 192-200. doi:10.1038/tpj.2013.18. PMID 23712092
  • JOURNAL_INFO : The pharmacogenomics journal ; Pharmacogenomics J. ; 2014 ; 14 ; 2 ; 192-200
  • PUBMED_LINK : 23712092
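
The HIBAG abstract describes predictions made by averaging HLA-type posterior probabilities over an ensemble of classifiers. The sketch below illustrates only that averaging step, assuming each classifier has already produced a posterior distribution over unordered genotypes for one sample. It is not the HIBAG R package API: `average_posteriors` and `best_call` are hypothetical names, and the probabilities in the example are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Genotype = Tuple[str, str]          # an unordered pair of alleles
Posterior = Dict[Genotype, float]   # genotype -> posterior probability


def average_posteriors(per_classifier: List[Posterior]) -> Posterior:
    """Average genotype posteriors across the classifiers of an ensemble."""
    if not per_classifier:
        raise ValueError("need at least one classifier output")
    totals: Dict[Genotype, float] = defaultdict(float)
    for posterior in per_classifier:
        for genotype, prob in posterior.items():
            # Sort the pair so (a, b) and (b, a) count as the same genotype.
            a, b = sorted(genotype)
            totals[(a, b)] += prob
    n = len(per_classifier)
    # A genotype missing from a classifier's output contributes zero.
    return {genotype: total / n for genotype, total in totals.items()}


def best_call(per_classifier: List[Posterior]) -> Tuple[Genotype, float]:
    """Return the genotype with the highest averaged posterior."""
    averaged = average_posteriors(per_classifier)
    genotype = max(averaged, key=lambda g: averaged[g])
    return genotype, averaged[genotype]


if __name__ == "__main__":
    # Made-up outputs from three classifiers for a single sample.
    ensemble = [
        {("A*01:01", "A*02:01"): 0.70, ("A*01:01", "A*03:01"): 0.30},
        {("A*02:01", "A*01:01"): 0.60, ("A*01:01", "A*03:01"): 0.40},
        {("A*01:01", "A*02:01"): 0.45, ("A*01:01", "A*03:01"): 0.55},
    ]
    print(best_call(ensemble))
```

The averaged posterior of the best call can also serve as a per-sample confidence score for filtering low-confidence predictions, with the threshold left to the analyst.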

HLARIMNT

  • NAME : HLARIMNT
  • SHORT NAME : HLARIMNT
  • FULL NAME : HLA Reliable IMputatioN by Transformer
  • URL : https://github.com/seitalab/HLARIMNT
  • KEYWORDS : HLA, imputation
  • CITATION : Tanaka, K., Kato, K., Nonaka, N., & Seita, J. (2022). Efficient HLA imputation from sequential SNPs data by Transformer. arXiv preprint arXiv:2211.06430.

SNP2HLA

  • NAME : SNP2HLA
  • URL : http://software.broadinstitute.org/mpg/snp2hla/
  • TITLE : Imputing amino acid polymorphisms in human leukocyte antigens
  • DOI : 10.1371/journal.pone.0064683
  • ABSTRACT : DNA sequence variation within human leukocyte antigen (HLA) genes mediate susceptibility to a wide range of human diseases. The complex genetic structure of the major histocompatibility complex (MHC) makes it difficult, however, to collect genotyping data in large cohorts. Long-range linkage disequilibrium between HLA loci and SNP markers across the major histocompatibility complex (MHC) region offers an alternative approach through imputation to interrogate HLA variation in existing GWAS data sets. Here we describe a computational strategy, SNP2HLA, to impute classical alleles and amino acid polymorphisms at class I (HLA-A, -B, -C) and class II (-DPA1, -DPB1, -DQA1, -DQB1, and -DRB1) loci. To characterize performance of SNP2HLA, we constructed two European ancestry reference panels, one based on data collected in HapMap-CEPH pedigrees (90 individuals) and another based on data collected by the Type 1 Diabetes Genetics Consortium (T1DGC, 5,225 individuals). We imputed HLA alleles in an independent data set from the British 1958 Birth Cohort (N = 918) with gold standard four-digit HLA types and SNPs genotyped using the Affymetrix GeneChip 500 K and Illumina Immunochip microarrays. We demonstrate that the sample size of the reference panel, rather than SNP density of the genotyping platform, is critical to achieve high imputation accuracy. Using the larger T1DGC reference panel, the average accuracy at four-digit resolution is 94.7% using the low-density Affymetrix GeneChip 500 K, and 96.7% using the high-density Illumina Immunochip. For amino acid polymorphisms within HLA genes, we achieve 98.6% and 99.3% accuracy using the Affymetrix GeneChip 500 K and Illumina Immunochip, respectively. Finally, we demonstrate how imputation and association testing at amino acid resolution can facilitate fine-mapping of primary MHC association signals, giving a specific example from type 1 diabetes.
  • CITATION : Jia X, Han B, Onengut-Gumuscu S, Chen WM, ...&, de Bakker PI. (2013) Imputing amino acid polymorphisms in human leukocyte antigens PLoS One, 8 (6) e64683. doi:10.1371/journal.pone.0064683. PMID 23762245
  • JOURNAL_INFO : PloS one ; PLoS One ; 2013 ; 8 ; 6 ; e64683
  • PUBMED_LINK : 23762245
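
The SNP2HLA abstract reports imputation accuracy against gold standard four-digit HLA types. The sketch below shows one common way to compute such a figure, assuming accuracy is defined as the fraction of correctly imputed alleles, with each individual contributing two alleles and phase ignored. This definition is an assumption made for illustration and may differ from the one used in the paper; `allele_accuracy` is a hypothetical helper and the sample data are made up.

```python
from collections import Counter
from typing import Dict, Tuple

Genotype = Tuple[str, str]  # the two alleles carried at one HLA locus


def allele_accuracy(imputed: Dict[str, Genotype],
                    truth: Dict[str, Genotype]) -> float:
    """Fraction of alleles imputed correctly across the shared samples."""
    shared = imputed.keys() & truth.keys()
    if not shared:
        raise ValueError("no samples in common")
    matched = 0
    for sample in shared:
        # Multiset intersection counts matching alleles regardless of order.
        overlap = Counter(imputed[sample]) & Counter(truth[sample])
        matched += sum(overlap.values())
    return matched / (2 * len(shared))


if __name__ == "__main__":
    # Made-up calls for two samples at HLA-A.
    imputed = {
        "s1": ("A*01:01", "A*02:01"),
        "s2": ("A*03:01", "A*03:01"),
    }
    truth = {
        "s1": ("A*02:01", "A*01:01"),
        "s2": ("A*03:01", "A*24:02"),
    }
    print(allele_accuracy(imputed, truth))
```

Both inputs should be expressed at the same resolution before comparison, otherwise identical alleles written with different numbers of fields are counted as mismatches.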