Heritability and genetic correlation

Curated listings of heritability and genetic correlation tools under the GWAS Tools tab.

Summary Table

NAME	CATEGORY	Main citation	YEAR
S-LDXR	Genetic correlation	Shi H et al., Nat Commun, 2021	2021
cross-trait LDSC	Genetic correlation	Bulik-Sullivan B et al., Nat Genet, 2015	2015
popcorn	Genetic correlation	Brown BC et al., Am J Hum Genet, 2016	2016
BayesRR-RC	Heritability	Patxot M et al., Nat Commun, 2021	2021
GCTA-GREML-Binary	Heritability	Lee SH et al., Am J Hum Genet, 2011	2011
GCTA-GREML-Bivariate	Heritability	Lee SH et al., Bioinformatics, 2012	2012
GCTA-GREML-LDMS	Heritability	Yang J et al., Nat Genet, 2015	2015
GCTA-GREML-Partition	Heritability	Yang J et al., Nat Genet, 2011	2011
GCTA-GREML-Quantitative	Heritability	Yang J et al., Nat Genet, 2010	2010
HDL	Heritability	Ning Z et al., Nat Genet, 2020	2020
LDAK	Heritability	Speed D et al., Am J Hum Genet, 2012	2012
LDSC	Heritability	Bulik-Sullivan BK et al., Nat Genet, 2015	2015
PHBC	Heritability	Zhao Y et al., Nat Genet, 2026	2026
SumHer	Heritability	Speed D et al., Nat Genet, 2019	2019
TetraHer	Heritability	Speed D et al., Am J Hum Genet, 2024	2024
GNOVA	Local heritability/genetic correlation	Lu Q et al., Am J Hum Genet, 2017	2017
HDL-L	Local heritability/genetic correlation	Li Y et al., Nat Genet, 2025	2025
HEELS	Local heritability/genetic correlation	Li H et al., Nat Commun, 2023	2023
HESS	Local heritability/genetic correlation	Shi H et al., Am J Hum Genet, 2016	2016
LAVA	Local heritability/genetic correlation	Werme J et al., Nat Genet, 2022	2022
Logica	Local heritability/genetic correlation	NA	NA
SUPERGNOVA	Local heritability/genetic correlation	Zhang Y et al., Genome Biol, 2021	2021

Genetic correlation

S-LDXR

Tool

PUBMED_LINK

33597505

DESCRIPTION

S-LDXR is software for estimating enrichment of stratified squared trans-ethnic genetic correlation across genomic annotations from GWAS summary statistics data.

URL

https://huwenboshi.github.io/s-ldxr/

KEYWORDS

trans-ethnic, stratified, functional categories

TITLE

Population-specific causal disease effect sizes in functionally important regions impacted by selection.

Main citation

Shi H, Gazal S, Kanai M, Koch EM, ...&, Price AL. (2021) Population-specific causal disease effect sizes in functionally important regions impacted by selection. Nat Commun, 12 (1) 1098. doi:10.1038/s41467-021-21286-1. PMID 33597505

ABSTRACT

Many diseases exhibit population-specific causal effect sizes with trans-ethnic genetic correlations significantly less than 1, limiting trans-ethnic polygenic risk prediction. We develop a new method, S-LDXR, for stratifying squared trans-ethnic genetic correlation across genomic annotations, and apply S-LDXR to genome-wide summary statistics for 31 diseases and complex traits in East Asians (average N = 90K) and Europeans (average N = 267K) with an average trans-ethnic genetic correlation of 0.85. We determine that squared trans-ethnic genetic correlation is 0.82× (s.e. 0.01) depleted in the top quintile of background selection statistic, implying more population-specific causal effect sizes. Accordingly, causal effect sizes are more population-specific in functionally important regions, including conserved and regulatory regions. In regions surrounding specifically expressed genes, causal effect sizes are most population-specific for skin and immune genes, and least population-specific for brain genes. Our results could potentially be explained by stronger gene-environment interaction at loci impacted by selection, particularly positive selection.

DOI

10.1038/s41467-021-21286-1

cross-trait LDSC

Tool

PUBMED_LINK

26414676

FULL NAME

cross-trait LD Score Regression

DESCRIPTION

ldsc is a command line tool for estimating heritability and genetic correlation from GWAS summary statistics. ldsc also computes LD Scores.

URL

https://github.com/bulik/ldsc

KEYWORDS

cross-trait, LD score regression

TITLE

An atlas of genetic correlations across human diseases and traits.

Main citation

Bulik-Sullivan B, Finucane HK, Anttila V, Gusev A, ...&, Neale BM. (2015) An atlas of genetic correlations across human diseases and traits. Nat Genet, 47 (11) 1236-41. doi:10.1038/ng.3406. PMID 26414676

ABSTRACT

Identifying genetic correlations between complex traits and diseases can provide useful etiological insights and help prioritize likely causal relationships. The major challenges preventing estimation of genetic correlation from genome-wide association study (GWAS) data with current methods are the lack of availability of individual-level genotype data and widespread sample overlap among meta-analyses. We circumvent these difficulties by introducing a technique, cross-trait LD Score regression, for estimating genetic correlation that requires only GWAS summary statistics and is not biased by sample overlap. We use this method to estimate 276 genetic correlations among 24 traits. The results include genetic correlations between anorexia nervosa and schizophrenia, anorexia and obesity, and educational attainment and several diseases. These results highlight the power of genome-wide analyses, as there currently are no significantly associated SNPs for anorexia nervosa and only three for educational attainment.

DOI

10.1038/ng.3406
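
The figure of 276 genetic correlations among 24 traits quoted in the cross-trait LDSC abstract above is simply the number of unordered trait pairs. A minimal Python check follows; `n_trait_pairs` is an illustrative helper name and is not part of ldsc.

```python
from math import comb


def n_trait_pairs(n_traits: int) -> int:
    """Number of unordered trait pairs, i.e. distinct pairwise genetic correlations."""
    return comb(n_traits, 2)


print(n_trait_pairs(24))  # 276, matching the abstract
```

The same count explains why pairwise analyses grow quickly: doubling the number of traits roughly quadruples the number of correlations to estimate.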

popcorn

Tool

PUBMED_LINK

27321947

DESCRIPTION

Popcorn is a program for estimating the correlation of causal variant effect sizes across populations in GWAS. This is the python3 version of Popcorn and is still under development.

URL

https://github.com/brielin/Popcorn

KEYWORDS

trans-ethnic

TITLE

Transethnic Genetic-Correlation Estimates from Summary Statistics.

Main citation

Brown BC, Asian Genetic Epidemiology Network Type 2 Diabetes Consortium, Ye CJ, Price AL, ...&, Zaitlen N. (2016) Transethnic Genetic-Correlation Estimates from Summary Statistics. Am J Hum Genet, 99 (1) 76-88. doi:10.1016/j.ajhg.2016.05.001. PMID 27321947

ABSTRACT

The increasing number of genetic association studies conducted in multiple populations provides an unprecedented opportunity to study how the genetic architecture of complex phenotypes varies between populations, a problem important for both medical and population genetics. Here, we have developed a method for estimating the transethnic genetic correlation: the correlation of causal-variant effect sizes at SNPs common in populations. This method takes advantage of the entire spectrum of SNP associations and uses only summary-level data from genome-wide association studies. This avoids the computational costs and privacy concerns associated with genotype-level information while remaining scalable to hundreds of thousands of individuals and millions of SNPs. We applied our method to data on gene expression, rheumatoid arthritis, and type 2 diabetes and overwhelmingly found that the genetic correlation was significantly less than 1. Our method is implemented in a Python package called Popcorn.

DOI

10.1016/j.ajhg.2016.05.001

Heritability

BayesRR-RC

Tool

PUBMED_LINK

34848700

DESCRIPTION

gmrm is hybrid-parallel software for a Bayesian grouped mixture of regressions model for genome-wide association studies (GWAS). It is written in C++ using extensive optimisations and code vectorisation. It relies on plink's .bed format. It can handle multiple traits simultaneously.

URL

https://github.com/medical-genomics-group/gmrm

TITLE

Probabilistic inference of the genetic architecture underlying functional enrichment of complex traits.

Main citation

Patxot M, Banos DT, Kousathanas A, Orliac EJ, ...&, Robinson MR. (2021) Probabilistic inference of the genetic architecture underlying functional enrichment of complex traits. Nat Commun, 12 (1) 6972. doi:10.1038/s41467-021-27258-9. PMID 34848700

ABSTRACT

We develop a Bayesian model (BayesRR-RC) that provides robust SNP-heritability estimation, an alternative to marker discovery, and accurate genomic prediction, taking 22 seconds per iteration to estimate 8.4 million SNP-effects and 78 SNP-heritability parameters in the UK Biobank. We find that only ≤10% of the genetic variation captured for height, body mass index, cardiovascular disease, and type 2 diabetes is attributable to proximal regulatory regions within 10kb upstream of genes, while 12-25% is attributed to coding regions, 32-44% to introns, and 22-28% to distal 10-500kb upstream regions. Up to 24% of all cis and coding regions of each chromosome are associated with each trait, with over 3,100 independent exonic and intronic regions and over 5,400 independent regulatory regions having ≥95% probability of contributing ≥0.001% to the genetic variance of these four traits. Our open-source software (GMRM) provides a scalable alternative to current approaches for biobank data.

DOI

10.1038/s41467-021-27258-9

GCTA-GREML-Binary (GREML)

Tool

PUBMED_LINK

21376301

FULL NAME

Genome-wide complex trait analysis (GCTA) Genome-based restricted maximum likelihood (GREML)

DESCRIPTION

(case-control)

URL

https://yanglab.westlake.edu.cn/software/gcta/#GREML

TITLE

Estimating missing heritability for disease from genome-wide association studies.

Main citation

Lee SH, Wray NR, Goddard ME, Visscher PM. (2011) Estimating missing heritability for disease from genome-wide association studies. Am J Hum Genet, 88 (3) 294-305. doi:10.1016/j.ajhg.2011.02.002. PMID 21376301

ABSTRACT

Genome-wide association studies are designed to discover SNPs that are associated with a complex trait. Employing strict significance thresholds when testing individual SNPs avoids false positives at the expense of increasing false negatives. Recently, we developed a method for quantitative traits that estimates the variation accounted for when fitting all SNPs simultaneously. Here we develop this method further for case-control studies. We use a linear mixed model for analysis of binary traits and transform the estimates to a liability scale by adjusting both for scale and for ascertainment of the case samples. We show by theory and simulation that the method is unbiased. We apply the method to data from the Wellcome Trust Case Control Consortium and show that a substantial proportion of variation in liability for Crohn disease, bipolar disorder, and type I diabetes is tagged by common SNPs.

DOI

10.1016/j.ajhg.2011.02.002
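
The transformation to the liability scale mentioned in the GCTA-GREML-Binary abstract above, which adjusts for both scale and case ascertainment, can be sketched in Python. The formula used is the commonly cited form, h2_liability = h2_observed × [K(1 − K)]² / [z² × P(1 − P)], where K is the population prevalence, P is the proportion of cases in the sample, and z is the standard normal density at the liability threshold. The function and parameter names are illustrative assumptions, and this is not GCTA code.

```python
from statistics import NormalDist


def observed_to_liability_h2(h2_observed: float, prevalence: float, case_fraction: float) -> float:
    """Convert observed-scale h2 from a case-control sample to the liability scale.

    prevalence    -- K, assumed population prevalence of the disease
    case_fraction -- P, proportion of cases in the analysed sample
    """
    std_normal = NormalDist()
    threshold = std_normal.inv_cdf(1.0 - prevalence)  # liability threshold t
    z = std_normal.pdf(threshold)                     # normal density at t
    scale = (prevalence * (1.0 - prevalence)) ** 2
    return h2_observed * scale / (z ** 2 * case_fraction * (1.0 - case_fraction))


# Illustrative inputs only, not values from the paper.
print(round(observed_to_liability_h2(0.2, prevalence=0.01, case_fraction=0.5), 3))
```

When the sample is not ascertained (P equals K), the expression reduces to the plain scale change h2_observed × K(1 − K) / z².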

GCTA-GREML-Bivariate (GREML)

Tool

PUBMED_LINK

22843982

FULL NAME

Genome-wide complex trait analysis (GCTA) Genome-based restricted maximum likelihood (GREML)

DESCRIPTION

(Bivariate GREML)

URL

https://yanglab.westlake.edu.cn/software/gcta/#GREML

KEYWORDS

bivariate

TITLE

Estimation of pleiotropy between complex diseases using single-nucleotide polymorphism-derived genomic relationships and restricted maximum likelihood.

Main citation

Lee SH, Yang J, Goddard ME, Visscher PM, ...&, Wray NR. (2012) Estimation of pleiotropy between complex diseases using single-nucleotide polymorphism-derived genomic relationships and restricted maximum likelihood. Bioinformatics, 28 (19) 2540-2. doi:10.1093/bioinformatics/bts474. PMID 22843982

ABSTRACT

SUMMARY: Genetic correlations are the genome-wide aggregate effects of causal variants affecting multiple traits. Traditionally, genetic correlations between complex traits are estimated from pedigree studies, but such estimates can be confounded by shared environmental factors. Moreover, for diseases, low prevalence rates imply that even if the true genetic correlation between disorders was high, co-aggregation of disorders in families might not occur or could not be distinguished from chance. We have developed and implemented statistical methods based on linear mixed models to obtain unbiased estimates of the genetic correlation between pairs of quantitative traits or pairs of binary traits of complex diseases using population-based case-control studies with genome-wide single-nucleotide polymorphism data. The method is validated in a simulation study and applied to estimate genetic correlation between various diseases from Wellcome Trust Case Control Consortium data in a series of bivariate analyses. We estimate a significant positive genetic correlation between risk of Type 2 diabetes and hypertension of ~0.31 (SE 0.14, P = 0.024). AVAILABILITY: Our methods, appropriate for both quantitative and binary traits, are implemented in the freely available software GCTA (http://www.complextraitgenomics.com/software/gcta/reml_bivar.html). CONTACT: hong.lee@uq.edu.au SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

DOI

10.1093/bioinformatics/bts474
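
The genetic correlation discussed in the bivariate GREML abstract above is the genetic covariance between two traits standardised by their genetic variances. A minimal Python sketch of that definition follows, assuming the three variance components have already been estimated by a bivariate mixed model. The function name is illustrative and this is not GCTA's implementation.

```python
from math import sqrt


def genetic_correlation(cov_g: float, var_g1: float, var_g2: float) -> float:
    """r_g = cov_g / sqrt(var_g1 * var_g2) for two traits."""
    if var_g1 <= 0 or var_g2 <= 0:
        raise ValueError("genetic variances must be positive")
    return cov_g / sqrt(var_g1 * var_g2)


# Illustrative variance components, not estimates from the paper.
print(genetic_correlation(cov_g=0.1, var_g1=0.25, var_g2=0.16))
```

Because the estimate divides by genetic variances, it becomes unstable when either trait's genetic variance is close to zero, which is why the sketch rejects non-positive inputs.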

GCTA-GREML-LDMS

Tool

PUBMED_LINK

26323059

DESCRIPTION

(GREML-LDMS)

URL

https://yanglab.westlake.edu.cn/software/gcta/#GREML

TITLE

Genetic variance estimation with imputed variants finds negligible missing heritability for human height and body mass index.

Main citation

Yang J, Bakshi A, Zhu Z, Hemani G, ...&, Visscher PM. (2015) Genetic variance estimation with imputed variants finds negligible missing heritability for human height and body mass index. Nat Genet, 47 (10) 1114-20. doi:10.1038/ng.3390. PMID 26323059

ABSTRACT

We propose a method (GREML-LDMS) to estimate heritability for human complex traits in unrelated individuals using whole-genome sequencing data. We demonstrate using simulations based on whole-genome sequencing data that ∼97% and ∼68% of variation at common and rare variants, respectively, can be captured by imputation. Using the GREML-LDMS method, we estimate from 44,126 unrelated individuals that all ∼17 million imputed variants explain 56% (standard error (s.e.) = 2.3%) of variance for height and 27% (s.e. = 2.5%) of variance for body mass index (BMI), and we find evidence that height- and BMI-associated variants have been under natural selection. Considering the imperfect tagging of imputation and potential overestimation of heritability from previous family-based studies, heritability is likely to be 60-70% for height and 30-40% for BMI. Therefore, the missing heritability is small for both traits. For further discovery of genes associated with complex traits, a study design with SNP arrays followed by imputation is more cost-effective than whole-genome sequencing at current prices.

DOI

10.1038/ng.3390

GCTA-GREML-Partition (GREML)

Tool

PUBMED_LINK

21552263

FULL NAME

Genome-wide complex trait analysis (GCTA) Genome-based restricted maximum likelihood (GREML)

DESCRIPTION

(partition the genetic variance into individual chromosomes and genomic segments)

URL

https://yanglab.westlake.edu.cn/software/gcta/#GREML

TITLE

Genome partitioning of genetic variation for complex traits using common SNPs.

Main citation

Yang J, Manolio TA, Pasquale LR, Boerwinkle E, ...&, Visscher PM. (2011) Genome partitioning of genetic variation for complex traits using common SNPs. Nat Genet, 43 (6) 519-25. doi:10.1038/ng.823. PMID 21552263

ABSTRACT

We estimate and partition genetic variation for height, body mass index (BMI), von Willebrand factor and QT interval (QTi) using 586,898 SNPs genotyped on 11,586 unrelated individuals. We estimate that ∼45%, ∼17%, ∼25% and ∼21% of the variance in height, BMI, von Willebrand factor and QTi, respectively, can be explained by all autosomal SNPs and a further ∼0.5-1% can be explained by X chromosome SNPs. We show that the variance explained by each chromosome is proportional to its length, and that SNPs in or near genes explain more variation than SNPs between genes. We propose a new approach to estimate variation due to cryptic relatedness and population stratification. Our results provide further evidence that a substantial proportion of heritability is captured by common SNPs, that height, BMI and QTi are highly polygenic traits, and that the additive variation explained by a part of the genome is approximately proportional to the total length of DNA contained within genes therein.

DOI

10.1038/ng.823

GCTA-GREML-Quantitative (GREML)

Tool

PUBMED_LINK

20562875

FULL NAME

Genome-wide complex trait analysis (GCTA) Genome-based restricted maximum likelihood (GREML)

DESCRIPTION

GCTA-GREML analysis: estimating the variance explained by the SNPs / GCTA-GREML analysis for a case-control study

URL

https://yanglab.westlake.edu.cn/software/gcta/#GREML

TITLE

Common SNPs explain a large proportion of the heritability for human height.

Main citation

Yang J, Benyamin B, McEvoy BP, Gordon S, ...&, Visscher PM. (2010) Common SNPs explain a large proportion of the heritability for human height. Nat Genet, 42 (7) 565-9. doi:10.1038/ng.608. PMID 20562875

ABSTRACT

SNPs discovered by genome-wide association studies (GWASs) account for only a small fraction of the genetic variation of complex traits in human populations. Where is the remaining heritability? We estimated the proportion of variance for human height explained by 294,831 SNPs genotyped on 3,925 unrelated individuals using a linear model analysis, and validated the estimation method with simulations based on the observed genotype data. We show that 45% of variance can be explained by considering all SNPs simultaneously. Thus, most of the heritability is not missing but has not previously been detected because the individual effects are too small to pass stringent significance tests. We provide evidence that the remaining heritability is due to incomplete linkage disequilibrium between causal variants and genotyped SNPs, exacerbated by causal variants having lower minor allele frequency than the SNPs explored to date.

DOI

10.1038/ng.608

HDL

Tool

PUBMED_LINK

32601477

FULL NAME

High-Definition Likelihood

DESCRIPTION

High-Definition Likelihood (HDL) is a likelihood-based method for estimating genetic correlation using GWAS summary statistics. Compared to LD Score regression (LDSC), it reduces the variance of a genetic correlation estimate by about 60%.

URL

https://github.com/zhenin/HDL/

TITLE

High-definition likelihood inference of genetic correlations across human complex traits.

Main citation

Ning Z, Pawitan Y, Shen X. (2020) High-definition likelihood inference of genetic correlations across human complex traits. Nat Genet, 52 (8) 859-864. doi:10.1038/s41588-020-0653-y. PMID 32601477

ABSTRACT

Genetic correlation is a central parameter for understanding shared genetic architecture between complex traits. By using summary statistics from genome-wide association studies (GWAS), linkage disequilibrium score regression (LDSC) was developed for unbiased estimation of genetic correlations. Although easy to use, LDSC only partially utilizes LD information. By fully accounting for LD across the genome, we develop a high-definition likelihood (HDL) method to improve precision in genetic correlation estimation. Compared to LDSC, HDL reduces the variance of genetic correlation estimates by about 60%, equivalent to a 2.5-fold increase in sample size. We apply HDL and LDSC to estimate 435 genetic correlations among 30 behavioral and disease-related phenotypes measured in the UK Biobank (UKBB). In addition to 154 significant genetic correlations observed for both methods, HDL identified another 57 significant genetic correlations, compared to only another 2 significant genetic correlations identified by LDSC. HDL brings more power to genomic analyses and better reveals the underlying connections across human complex traits.

DOI

10.1038/s41588-020-0653-y
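
The HDL abstract above equates a variance reduction of about 60% with a 2.5-fold increase in sample size. The arithmetic can be checked with a short Python sketch, under the assumption (made here for illustration) that the sampling variance of the estimate is inversely proportional to sample size. The function name is hypothetical.

```python
def equivalent_sample_size_fold(variance_reduction: float) -> float:
    """Fold increase in sample size that yields the same variance reduction,
    assuming variance is proportional to 1 / N."""
    if not 0.0 <= variance_reduction < 1.0:
        raise ValueError("variance_reduction must be in [0, 1)")
    return 1.0 / (1.0 - variance_reduction)


print(equivalent_sample_size_fold(0.60))  # approximately 2.5
```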

LDAK

Tool

PUBMED_LINK

23217325

FULL NAME

LD-adjusted kinships

DESCRIPTION

LDAK is a software package for analysing association study data.

URL

http://www.ldak.org/

TITLE

Improved heritability estimation from genome-wide SNPs.

Main citation

Speed D, Hemani G, Johnson MR, Balding DJ. (2012) Improved heritability estimation from genome-wide SNPs. Am J Hum Genet, 91 (6) 1011-21. doi:10.1016/j.ajhg.2012.10.010. PMID 23217325

ABSTRACT

Estimation of narrow-sense heritability, h², from genome-wide SNPs genotyped in unrelated individuals has recently attracted interest and offers several advantages over traditional pedigree-based methods. With the use of this approach, it has been estimated that over half the heritability of human height can be attributed to the ~300,000 SNPs on a genome-wide genotyping array. In comparison, only 5%-10% can be explained by SNPs reaching genome-wide significance. We investigated via simulation the validity of several key assumptions underpinning the mixed-model analysis used in SNP-based h² estimation. Although we found that the method is reasonably robust to violations of four key assumptions, it can be highly sensitive to uneven linkage disequilibrium (LD) between SNPs: contributions to h² are overestimated from causal variants in regions of high LD and are underestimated in regions of low LD. The overall direction of the bias can be up or down depending on the genetic architecture of the trait, but it can be substantial in realistic scenarios. We propose a modified kinship matrix in which SNPs are weighted according to local LD. We show that this correction greatly reduces the bias and increases the precision of h² estimates. We demonstrate the impact of our method on the first seven diseases studied by the Wellcome Trust Case Control Consortium. Our LD adjustment revises downward the h² estimate for immune-related diseases, as expected because of high LD in the major-histocompatibility region, but increases it for some nonimmune diseases. To calculate our revised kinship matrix, we developed LDAK, software for computing LD-adjusted kinships.

DOI

10.1016/j.ajhg.2012.10.010
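
The idea in the LDAK abstract above, a kinship matrix in which SNPs are weighted according to local LD, can be illustrated with a small Python sketch. The per-SNP weights are treated as given inputs: LDAK computes its own weights, and that procedure is not reproduced here. The sketch also assumes the common standardisation of genotypes by sqrt(2p(1 − p)) and that no SNP is monomorphic; all names are hypothetical.

```python
from math import sqrt


def weighted_kinship(genotypes, weights):
    """Weighted kinship from an individuals-by-SNPs matrix of 0/1/2 allele counts.

    Each SNP is centred and scaled by its allele frequency, then contributes
    to the kinship in proportion to its weight.
    """
    n = len(genotypes)
    m = len(weights)
    freqs = [sum(row[j] for row in genotypes) / (2.0 * n) for j in range(m)]
    standardised = [
        [(row[j] - 2.0 * freqs[j]) / sqrt(2.0 * freqs[j] * (1.0 - freqs[j])) for j in range(m)]
        for row in genotypes
    ]
    total_weight = sum(weights)
    return [
        [
            sum(w * a * b for w, a, b in zip(weights, standardised[i], standardised[k])) / total_weight
            for k in range(n)
        ]
        for i in range(n)
    ]


# Toy data: 4 individuals, 3 SNPs; the weights are made up for illustration.
toy_genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [1, 2, 1]]
kinship = weighted_kinship(toy_genotypes, weights=[1.0, 0.5, 0.5])
print(len(kinship), len(kinship[0]))  # 4 4
```

With uniform weights the function reduces to the usual unweighted kinship, and a weight of zero removes a SNP's contribution entirely.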

LDSC

Tool

PUBMED_LINK

25642630

FULL NAME

LD Score Regression

DESCRIPTION

ldsc is a command line tool for estimating heritability and genetic correlation from GWAS summary statistics. ldsc also computes LD Scores.

URL

https://github.com/bulik/ldsc

TITLE

LD Score regression distinguishes confounding from polygenicity in genome-wide association studies.

Main citation

Bulik-Sullivan BK, Loh PR, Finucane HK, Ripke S, ...&, Neale BM. (2015) LD Score regression distinguishes confounding from polygenicity in genome-wide association studies. Nat Genet, 47 (3) 291-5. doi:10.1038/ng.3211. PMID 25642630

ABSTRACT

Both polygenicity (many small genetic effects) and confounding biases, such as cryptic relatedness and population stratification, can yield an inflated distribution of test statistics in genome-wide association studies (GWAS). However, current methods cannot distinguish between inflation from a true polygenic signal and bias. We have developed an approach, LD Score regression, that quantifies the contribution of each by examining the relationship between test statistics and linkage disequilibrium (LD). The LD Score regression intercept can be used to estimate a more powerful and accurate correction factor than genomic control. We find strong evidence that polygenicity accounts for the majority of the inflation in test statistics in many GWAS of large sample size.

DOI

10.1038/ng.3211
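
The relationship between test statistics and LD that the LDSC abstract above refers to is, in the univariate model, E[χ²_j] = N·h²·ℓ_j / M + N·a + 1, where ℓ_j is the LD Score of SNP j, N the sample size, M the number of SNPs, and the intercept (N·a + 1) absorbs confounding. The Python sketch below fits that line by ordinary least squares on made-up, noise-free inputs. It is a simplification: the actual ldsc tool uses a weighted regression with block-jackknife standard errors, neither of which is reproduced here, and the function name is hypothetical.

```python
def ld_score_regression(chi2, ld_scores, n_samples, n_snps):
    """Unweighted least-squares fit of chi2 on LD Score.

    Returns (h2, intercept), where h2 = slope * M / N.
    """
    k = len(chi2)
    mean_l = sum(ld_scores) / k
    mean_c = sum(chi2) / k
    sxx = sum((l - mean_l) ** 2 for l in ld_scores)
    sxy = sum((l - mean_l) * (c - mean_c) for l, c in zip(ld_scores, chi2))
    slope = sxy / sxx
    intercept = mean_c - slope * mean_l
    return slope * n_snps / n_samples, intercept


# Made-up inputs generated from the model itself with h2 = 0.3, intercept = 1.02.
N, M = 50_000, 1_000_000
scores = [10.0, 50.0, 100.0, 200.0]
stats = [N * 0.3 * l / M + 1.02 for l in scores]
h2_hat, intercept_hat = ld_score_regression(stats, scores, N, M)
print(round(h2_hat, 3), round(intercept_hat, 3))
```

An intercept above 1 with a positive slope is the pattern the abstract describes: part of the inflation comes from confounding, and the part that scales with LD Score reflects polygenic signal.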

PHBC

Tool

PUBMED_LINK

40585167

FULL NAME

Pleiotropic Shared Heritability with Bias Correction

DESCRIPTION

PHBC (pleiotropic shared heritability with bias correction) estimates the liability-scale genetic variance of a target disease that is shared with a specific set of auxiliary diseases. It uses GWAS summary statistics and a genetic correlation matrix with Monte-Carlo bias correction to account for sampling noise. Available as the R package pleioh2g on CRAN.

URL

https://cran.r-project.org/package=pleioh2g

TITLE

Pleiotropic shared heritability quantifies the shared genetic variance of common diseases.

Main citation

Zhao Y, Strober B, Hou K, Kerner G, Danesh J, Gazal S, Cheng W, Inouye M, Price AL, Jiang X. (2026) Pleiotropic shared heritability quantifies the shared genetic variance of common diseases. Nat Genet. doi:10.1038/s41588-026-02607-w. PMID 40585167

ABSTRACT

The overall contribution of pleiotropy to disease architectures is unknown, as most studies estimate genetic correlations with each auxiliary disease in turn. Here we propose a method—pleiotropic shared heritability with bias correction (PHBC)—to estimate the liability-scale genetic variance of a target disease that is shared with a specific set of auxiliary diseases (h2_pleio). PHBC estimates h2_pleio from a genetic correlation matrix using a Monte Carlo bias correction procedure to account for sampling noise. The average ratio of h2_pleio to total SNP heritability (h2_pleio/h2) across 15 UK Biobank diseases (spanning seven disease categories) was 27 +/- 3%, increasing to 48 +/- 5% when expanding to 62 auxiliary diseases/traits. h2_pleio/h2 was broadly distributed across disease categories, decreasing only modestly when removing the most informative auxiliary disease categories. The average h2_pleio/h2 was 1.51 +/- 0.16-times larger than the proportion of total phenotypic variance explained by auxiliary diseases, implying higher pleiotropy for genetic effects. In summary, roughly half of common disease heritability is pleiotropic with a broad range of diseases.

DOI

10.1038/s41588-026-02607-w

ARROW_SUMMARY

GWAS summary statistics for target + auxiliary diseases → Cross-trait LDSC genetic correlation matrix → Monte Carlo bias correction → PHBC h²_pleio estimate → Quantify proportion of heritability shared with auxiliary diseases/traits

SumHer

Tool

PUBMED_LINK

30510236

URL

http://www.ldak.org/

TITLE

SumHer better estimates the SNP heritability of complex traits from summary statistics.

Main citation

Speed D, Balding DJ. (2019) SumHer better estimates the SNP heritability of complex traits from summary statistics. Nat Genet, 51 (2) 277-284. doi:10.1038/s41588-018-0279-5. PMID 30510236

ABSTRACT

We present SumHer, software for estimating confounding bias, SNP heritability, enrichments of heritability and genetic correlations using summary statistics from genome-wide association studies. The key difference between SumHer and the existing software LD Score Regression (LDSC) is that SumHer allows the user to specify the heritability model. We apply SumHer to results from 24 large-scale association studies (average sample size 121,000) using our recommended heritability model. We show that these studies tended to substantially over-correct for confounding, and as a result the number of genome-wide significant loci was under-reported by about a quarter. We also estimate enrichments for 24 categories of SNPs defined by functional annotations. A previous study using LDSC reported that conserved regions were 13-fold enriched, and found a further six categories with above threefold enrichment. By contrast, our analysis using SumHer finds that none of the categories have enrichment above twofold. SumHer provides an improved understanding of the genetic architecture of complex traits, which enables more efficient analysis of future genetic data.

DOI

10.1038/s41588-018-0279-5

TetraHer

Tool

PUBMED_LINK

38490208

DESCRIPTION

A method for estimating the liability heritability of binary phenotypes.

URL

http://www.ldak.org/

TITLE

Estimating disease heritability from complex pedigrees allowing for ascertainment and covariates.

Main citation

Speed D, Evans DM. (2024) Estimating disease heritability from complex pedigrees allowing for ascertainment and covariates. Am J Hum Genet, 111 (4) 680-690. doi:10.1016/j.ajhg.2024.02.010. PMID 38490208

ABSTRACT

We propose TetraHer, a method for estimating the liability heritability of binary phenotypes. TetraHer has five key features. First, it can be applied to data from complex pedigrees that contain multiple types of relationships. Second, it can correct for ascertainment of cases or controls. Third, it can accommodate covariates. Fourth, it can model the contribution of common environment. Fifth, it produces a likelihood that can be used for significance testing. We first demonstrate the validity of TetraHer on simulated data. We then use TetraHer to estimate liability heritability for 229 codes from the tenth International Classification of Diseases (ICD-10). We identify 107 codes with significant heritability (p < 0.05/229), which can be used in future analyses for investigating the genetic architecture of human diseases.

DOI

10.1016/j.ajhg.2024.02.010
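
The significance threshold quoted in the TetraHer abstract above, p < 0.05/229, is a Bonferroni-style correction for testing 229 ICD-10 codes. A one-line Python check of the resulting cut-off follows; the function name is illustrative and is not part of LDAK.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


threshold = bonferroni_threshold(0.05, 229)
print(f"{threshold:.2e}")  # 2.18e-04
```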

Local heritability/genetic correlation

GNOVA

Tool

PUBMED_LINK

29220677

FULL NAME

GeNetic cOVariance Analyzer

DESCRIPTION

A principled framework to estimate annotation-stratified genetic covariance using GWAS summary statistics.

URL

https://github.com/xtonyjiang/GNOVA

TITLE

A Powerful Approach to Estimating Annotation-Stratified Genetic Covariance via GWAS Summary Statistics.

Main citation

Lu Q, Li B, Ou D, Erlendsdottir M, ...&, Zhao H. (2017) A Powerful Approach to Estimating Annotation-Stratified Genetic Covariance via GWAS Summary Statistics. Am J Hum Genet, 101 (6) 939-964. doi:10.1016/j.ajhg.2017.11.001. PMID 29220677

ABSTRACT

Despite the success of large-scale genome-wide association studies (GWASs) on complex traits, our understanding of their genetic architecture is far from complete. Jointly modeling multiple traits' genetic profiles has provided insights into the shared genetic basis of many complex traits. However, large-scale inference sets a high bar for both statistical power and biological interpretability. Here we introduce a principled framework to estimate annotation-stratified genetic covariance between traits using GWAS summary statistics. Through theoretical and numerical analyses, we demonstrate that our method provides accurate covariance estimates, thereby enabling researchers to dissect both the shared and distinct genetic architecture across traits to better understand their etiologies. Among 50 complex traits with publicly accessible GWAS summary statistics (N_total ≈ 4.5 million), we identified more than 170 pairs with statistically significant genetic covariance. In particular, we found strong genetic covariance between late-onset Alzheimer disease (LOAD) and amyotrophic lateral sclerosis (ALS), two major neurodegenerative diseases, in single-nucleotide polymorphisms (SNPs) with high minor allele frequencies and in SNPs located in the predicted functional genome. Joint analysis of LOAD, ALS, and other traits highlights LOAD's correlation with cognitive traits and hints at an autoimmune component for ALS.

DOI

10.1016/j.ajhg.2017.11.001

HDL-L

Tool

PUBMED_LINK

40065165

FULL NAME

high-definition likelihood (local)

DESCRIPTION

High-Definition Likelihood (HDL) is a likelihood-based method for estimating genetic correlation using GWAS summary statistics. Compared to LD Score regression (LDSC), it reduces the variance of a genetic correlation estimate by about 60%. Here, we provide an R-based computational tool HDL to implement our method.

URL

https://github.com/zhenin/HDL/

KEYWORDS

likelihood-based inference

TITLE

An enhanced framework for local genetic correlation analysis.

Main citation

Li Y, Pawitan Y, Shen X. (2025) An enhanced framework for local genetic correlation analysis. Nat Genet, 57 (4) 1053-1058. doi:10.1038/s41588-025-02123-3. PMID 40065165

ABSTRACT

Genetic correlation is a key parameter in the joint genetic model of complex traits, but it is usually estimated on a global genomic scale. Understanding local genetic correlations provides more detailed insight into the shared genetic architecture of complex traits. However, a state-of-the-art tool for local genetic correlation analysis, LAVA, is prone to false inference. Here we extend the high-definition likelihood (HDL) method to a local version, HDL-L, which performs genetic correlation analysis in small, approximately independent linkage disequilibrium blocks. HDL-L allows a more granular estimation of genetic variances and covariances. Simulations show that HDL-L offers more consistent heritability estimates and more efficient genetic correlation estimates compared with LAVA. HDL-L demonstrated robust performance across a wide range of simulations conducted under varying parameter settings. In the analysis of 30 phenotypes from the UK Biobank, HDL-L identified 109 significant local genetic correlations and showed a notable computational advantage. HDL-L proves to be a powerful tool for uncovering the detailed genetic landscape that underlies complex human traits, offering both accuracy and computational efficiency.

DOI

10.1038/s41588-025-02123-3

HEELS

Tool

PUBMED_LINK

38040712

FULL NAME

Heritability Estimation with high Efficiency using LD and association Summary Statistics

DESCRIPTION

HEELS is a Python-based command-line tool that produces accurate and precise local heritability estimates using summary-level statistics (marginal association test statistics along with the empirical (in-sample) LD statistics).

URL

https://github.com/huilisabrina/HEELS

TITLE

Accurate and efficient estimation of local heritability using summary statistics and the linkage disequilibrium matrix.

Main citation

Li H, Mazumder R, Lin X. (2023) Accurate and efficient estimation of local heritability using summary statistics and the linkage disequilibrium matrix. Nat Commun, 14 (1) 7954. doi:10.1038/s41467-023-43565-9. PMID 38040712

ABSTRACT

Existing SNP-heritability estimators that leverage summary statistics from genome-wide association studies (GWAS) are much less efficient (i.e., have larger standard errors) than the restricted maximum likelihood (REML) estimators which require access to individual-level data. We introduce a new method for local heritability estimation, Heritability Estimation with high Efficiency using LD and association Summary Statistics (HEELS), that significantly improves the statistical efficiency of summary-statistics-based heritability estimators and attains statistical efficiency comparable to that of REML (with a relative statistical efficiency >92%). Moreover, we propose representing the empirical LD matrix as the sum of a low-rank matrix and a banded matrix. We show that this way of modeling the LD can not only reduce the storage and memory cost, but also improve the computational efficiency of heritability estimation. We demonstrate the statistical efficiency of HEELS and the advantages of our proposed LD approximation strategies both in simulations and through empirical analyses of the UK Biobank data.
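
The abstract describes approximating the empirical LD matrix as a banded matrix plus a low-rank matrix. The following is a minimal NumPy sketch of that general idea, not the HEELS implementation: the function name `banded_plus_low_rank`, the choice to keep the band first and then take the leading eigen-components of the remainder, and the parameters `bandwidth` and `rank` are all assumptions made for illustration.

```python
import numpy as np

def banded_plus_low_rank(ld, bandwidth, rank):
    """Split a symmetric LD matrix into a banded part and a low-rank part.

    Illustrative only (hypothetical helper, not the HEELS code):
    entries within `bandwidth` of the diagonal are kept exactly, and the
    remaining off-band part is approximated by its `rank` eigen-components
    with the largest absolute eigenvalues.
    """
    ld = np.asarray(ld, dtype=float)
    idx = np.arange(ld.shape[0])
    in_band = np.abs(idx[:, None] - idx[None, :]) <= bandwidth
    banded = np.where(in_band, ld, 0.0)

    # The off-band residual is symmetric, so eigh applies.
    residual = ld - banded
    eigvals, eigvecs = np.linalg.eigh(residual)
    top = np.argsort(np.abs(eigvals))[::-1][:rank]
    low_rank = (eigvecs[:, top] * eigvals[top]) @ eigvecs[:, top].T
    return banded, low_rank

# Toy LD matrix with correlation decaying with distance between SNPs.
p = 50
pos = np.arange(p)
toy_ld = np.exp(-np.abs(pos[:, None] - pos[None, :]) / 5.0)
banded, low_rank = banded_plus_low_rank(toy_ld, bandwidth=3, rank=5)
print("approximation error:", np.linalg.norm(toy_ld - banded - low_rank))
```

For p SNPs, the banded part needs on the order of p × bandwidth stored values and the low-rank part on the order of p × rank, rather than the p × p entries of the full matrix, which is the kind of storage saving the abstract refers to.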

DOI

10.1038/s41467-023-43565-9

HESS

Tool

PUBMED_LINK

27346688

FULL NAME

Heritability Estimation from Summary Statistics

DESCRIPTION

HESS (Heritability Estimation from Summary Statistics) is a software package for estimating and visualizing local SNP-heritability and genetic covariance (correlation) from GWAS summary association data.

URL

https://huwenboshi.github.io/hess/

TITLE

Contrasting the Genetic Architecture of 30 Complex Traits from Summary Association Data.

Main citation

Shi H, Kichaev G, Pasaniuc B. (2016) Contrasting the Genetic Architecture of 30 Complex Traits from Summary Association Data. Am J Hum Genet, 99 (1) 139-53. doi:10.1016/j.ajhg.2016.05.013. PMID 27346688

ABSTRACT

Variance-component methods that estimate the aggregate contribution of large sets of variants to the heritability of complex traits have yielded important insights into the genetic architecture of common diseases. Here, we introduce methods that estimate the total trait variance explained by the typed variants at a single locus in the genome (local SNP heritability) from genome-wide association study (GWAS) summary data while accounting for linkage disequilibrium among variants. We applied our estimator to ultra-large-scale GWAS summary data of 30 common traits and diseases to gain insights into their local genetic architecture. First, we found that common SNPs have a high contribution to the heritability of all studied traits. Second, we identified traits for which the majority of the SNP heritability can be confined to a small percentage of the genome. Third, we identified GWAS risk loci where the entire locus explains significantly more variance in the trait than the GWAS reported variants. Finally, we identified loci that explain a significant amount of heritability across multiple traits.
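
To make "local SNP heritability from summary data while accounting for linkage disequilibrium" concrete, here is a minimal sketch of a quadratic-form estimator of the kind used for this purpose: (n · βᵀV⁺β − q) / (n − q), where β holds standardized marginal effects at one locus, V is the LD matrix, n is the sample size, and q is the number of eigen-components retained. The function name `local_h2`, the parameter `k`, and the eigenvalue cutoff are assumptions for illustration; the exact estimator and regularization used by HESS should be checked against the paper and software.

```python
import numpy as np

def local_h2(beta, ld, n, k=None):
    """Quadratic-form estimate of local SNP heritability (illustrative).

    beta : standardized marginal effect estimates for SNPs at one locus
    ld   : LD (correlation) matrix for the same SNPs
    n    : GWAS sample size
    k    : number of leading eigen-components to keep (None keeps all
           components with non-negligible eigenvalues)
    """
    beta = np.asarray(beta, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(np.asarray(ld, dtype=float))
    order = np.argsort(eigvals)[::-1]          # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    keep = eigvals > 1e-8 * eigvals[0]         # drop near-null directions
    if k is not None:
        keep[k:] = False
    q = int(keep.sum())

    # beta^T V^+ beta computed through the retained eigen-components.
    proj = eigvecs[:, keep].T @ beta
    quad = np.sum(proj**2 / eigvals[keep])
    return (n * quad - q) / (n - q)

# Toy locus with two correlated SNPs (made-up numbers).
toy_ld = np.array([[1.0, 0.5], [0.5, 1.0]])
print(local_h2([0.03, 0.02], toy_ld, n=50_000))
```

Subtracting q corrects for the sampling noise that each retained component adds to the quadratic form, so a locus with no true signal yields an estimate near zero rather than a positive bias.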

DOI

10.1016/j.ajhg.2016.05.013

LAVA

Tool

PUBMED_LINK

35288712

FULL NAME

Local Analysis of [co]Variant Association

DESCRIPTION

LAVA is a tool to conduct genome-wide, local genetic correlation analysis on multiple traits, using GWAS summary statistics as input.

URL

https://ctg.cncr.nl/software/lava

TITLE

An integrated framework for local genetic correlation analysis.

Main citation

Werme J, van der Sluis S, Posthuma D, de Leeuw CA. (2022) An integrated framework for local genetic correlation analysis. Nat Genet, 54 (3) 274-282. doi:10.1038/s41588-022-01017-y. PMID 35288712

ABSTRACT

Genetic correlation (rg) analysis is used to identify phenotypes that may have a shared genetic basis. Traditionally, rg is studied globally, considering only the average of the shared signal across the genome, although this approach may fail when the rg is confined to particular genomic regions or in opposing directions at different loci. Current tools for local rg analysis are restricted to analysis of two phenotypes. Here we introduce LAVA, an integrated framework for local rg analysis that, in addition to testing the standard bivariate local rgs between two phenotypes, can evaluate local heritabilities and analyze conditional genetic relations between several phenotypes using partial correlation and multiple regression. Applied to 25 behavioral and health phenotypes, we show considerable heterogeneity in the bivariate local rgs across the genome, which is often masked by the global rg patterns, and demonstrate how our conditional approaches can elucidate more complex, multivariate genetic relations.
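
The conditional analyses mentioned in the abstract rest on partial correlation. The sketch below shows only the standard partial-correlation calculation, assuming a local genetic covariance matrix among phenotypes has already been estimated for one region; it does not reproduce how LAVA estimates that matrix from summary statistics or how it performs inference. The function name `partial_rg` and the toy matrix are hypothetical.

```python
import numpy as np

def partial_rg(omega, x, y, given):
    """Partial correlation between phenotypes x and y given others.

    omega : local genetic covariance (or correlation) matrix among
            phenotypes, assumed already estimated for one region
    x, y  : row/column indices of the two phenotypes of interest
    given : indices of the phenotypes to condition on (may be empty)
    """
    omega = np.asarray(omega, dtype=float)
    idx = [x, y] + list(given)
    # Standard identity: partial correlations are the negated,
    # rescaled off-diagonal entries of the precision matrix.
    precision = np.linalg.inv(omega[np.ix_(idx, idx)])
    return -precision[0, 1] / np.sqrt(precision[0, 0] * precision[1, 1])

# Toy local genetic correlation matrix for three phenotypes (made up).
toy_omega = np.array([
    [1.0, 0.5, 0.6],
    [0.5, 1.0, 0.7],
    [0.6, 0.7, 1.0],
])
print("marginal rg(0,1):", partial_rg(toy_omega, 0, 1, given=[]))
print("partial rg(0,1 | 2):", partial_rg(toy_omega, 0, 1, given=[2]))
```

With an empty `given`, the function reduces to the ordinary bivariate correlation, so the two printed values show how much of the shared local signal between two phenotypes remains after accounting for a third.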

DOI

10.1038/s41588-022-01017-y

Logica

Tool

FULL NAME

LOcal GenetIc Correlation across Ancestries

DESCRIPTION

Logica (LOcal GenetIc Correlation across Ancestries) is a method specifically designed to estimate local genetic correlations across ancestries. Logica employs a bivariate linear mixed model that explicitly accounts for diverse LD patterns across ancestries, operates on GWAS summary statistics, and utilizes a maximum likelihood framework for robust inference. Logica is implemented as an open-source R package.

URL

https://github.com/borangao/Logica

SUPERGNOVA

Tool

PUBMED_LINK

34493297

FULL NAME

SUPER GeNetic cOVariance Analyzer

DESCRIPTION

SUPERGNOVA (SUPER GeNetic cOVariance Analyzer) is a statistical framework to perform local genetic covariance analysis.

URL

https://github.com/qlu-lab/SUPERGNOVA

TITLE

SUPERGNOVA: local genetic correlation analysis reveals heterogeneous etiologic sharing of complex traits.

Main citation

Zhang Y, Lu Q, Ye Y, Huang K, ...&, Zhao H. (2021) SUPERGNOVA: local genetic correlation analysis reveals heterogeneous etiologic sharing of complex traits. Genome Biol, 22 (1) 262. doi:10.1186/s13059-021-02478-w. PMID 34493297

ABSTRACT

Local genetic correlation quantifies the genetic similarity of complex traits in specific genomic regions. However, accurate estimation of local genetic correlation remains challenging, due to linkage disequilibrium in local genomic regions and sample overlap across studies. We introduce SUPERGNOVA, a statistical framework to estimate local genetic correlations using summary statistics from genome-wide association studies. We demonstrate that SUPERGNOVA outperforms existing methods through simulations and analyses of 30 complex traits. In particular, we show that the positive yet paradoxical genetic correlation between autism spectrum disorder and cognitive performance could be explained by two etiologically distinct genetic signatures with bidirectional local genetic correlations.

DOI

10.1186/s13059-021-02478-w