Heritability_and_genetic_correlation

Summary Table

NAME	CATEGORY	CITATION	YEAR
S-LDXR	Genetic correlation	Shi H, Gazal S, Kanai M, Koch EM, ...&, Price AL. (2021) Population-specific causal disease effect sizes in functionally important regions impacted by selection Nat. Commun., 12 (1) 1098. doi:10.1038/s41467-021-21286-1. PMID 33597505	2021
cross-trait LDSC	Genetic correlation	Bulik-Sullivan B, Finucane HK, Anttila V, Gusev A, ...&, Neale BM. (2015) An atlas of genetic correlations across human diseases and traits Nat. Genet., 47 (11) 1236-1241. doi:10.1038/ng.3406. PMID 26414676	2015
popcorn	Genetic correlation	Brown BC, Asian Genetic Epidemiology Network Type 2 Diabetes Consortium, Ye CJ, Price AL, ...&, Zaitlen N. (2016) Transethnic Genetic-Correlation Estimates from Summary Statistics Am. J. Hum. Genet., 99 (1) 76-88. doi:10.1016/j.ajhg.2016.05.001. PMID 27321947	2016
BayesRR-RC	Heritability	Patxot M, Banos DT, Kousathanas A, Orliac EJ, ...&, Robinson MR. (2021) Probabilistic inference of the genetic architecture underlying functional enrichment of complex traits Nat. Commun., 12 (1) 6972. doi:10.1038/s41467-021-27258-9. PMID 34848700	2021
GCTA-GREML-Binary	Heritability	Lee SH, Wray NR, Goddard ME, Visscher PM. (2011) Estimating missing heritability for disease from genome-wide association studies Am. J. Hum. Genet., 88 (3) 294-305. doi:10.1016/j.ajhg.2011.02.002. PMID 21376301	2011
GCTA-GREML-Bivariate	Heritability	Lee SH, Yang J, Goddard ME, Visscher PM, ...&, Wray NR. (2012) Estimation of pleiotropy between complex diseases using single-nucleotide polymorphism-derived genomic relationships and restricted maximum likelihood Bioinformatics, 28 (19) 2540-2542. doi:10.1093/bioinformatics/bts474. PMID 22843982	2012
GCTA-GREML-LDMS	Heritability	Yang J, Bakshi A, Zhu Z, Hemani G, ...&, Visscher PM. (2015) Genetic variance estimation with imputed variants finds negligible missing heritability for human height and body mass index Nat. Genet., 47 (10) 1114-1120. doi:10.1038/ng.3390. PMID 26323059	2015
GCTA-GREML-Partition	Heritability	Yang J, Manolio TA, Pasquale LR, Boerwinkle E, ...&, Visscher PM. (2011) Genome partitioning of genetic variation for complex traits using common SNPs Nat. Genet., 43 (6) 519-525. doi:10.1038/ng.823. PMID 21552263	2011
GCTA-GREML-Quantitative	Heritability	Yang J, Benyamin B, McEvoy BP, Gordon S, ...&, Visscher PM. (2010) Common SNPs explain a large proportion of the heritability for human height Nat. Genet., 42 (7) 565-569. doi:10.1038/ng.608. PMID 20562875	2010
HDL	Heritability	Ning Z, Pawitan Y, Shen X. (2020) High-definition likelihood inference of genetic correlations across human complex traits Nat. Genet., 52 (8) 859-864. doi:10.1038/s41588-020-0653-y. PMID 32601477	2020
LDAK	Heritability	Speed D, Hemani G, Johnson MR, Balding DJ. (2012) Improved heritability estimation from genome-wide SNPs Am. J. Hum. Genet., 91 (6) 1011-1021. doi:10.1016/j.ajhg.2012.10.010. PMID 23217325	2012
LDSC	Heritability	Bulik-Sullivan B, Loh PR, Finucane HK, Ripke S, ...&, O'Donovan MC. (2015) LD Score regression distinguishes confounding from polygenicity in genome-wide association studies Nat. Genet., 47 (3) 291-295. doi:10.1038/ng.3211. PMID 25642630	2015
SumHer	Heritability	Speed D, Balding DJ. (2019) SumHer better estimates the SNP heritability of complex traits from summary statistics Nat. Genet., 51 (2) 277-284. doi:10.1038/s41588-018-0279-5. PMID 30510236	2019
TetraHer	Heritability	Speed D, Evans DM. (2024) Estimating disease heritability from complex pedigrees allowing for ascertainment and covariates Am. J. Hum. Genet., 111 (4) 680-690. doi:10.1016/j.ajhg.2024.02.010. PMID 38490208	2024
GNOVA	Local heritability/genetic correlation	Lu Q, Li B, Ou D, Erlendsdottir M, ...&, Zhao H. (2017) A powerful approach to estimating annotation-stratified genetic covariance via GWAS summary statistics Am. J. Hum. Genet., 101 (6) 939-964. doi:10.1016/j.ajhg.2017.11.001. PMID 29220677	2017
HDL-L	Local heritability/genetic correlation	Li Y, Pawitan Y, Shen X. (2025) An enhanced framework for local genetic correlation analysis Nat. Genet., 1–6. PMID 40065165	2025
HEELS	Local heritability/genetic correlation	Li H, Mazumder R, Lin X. (2023) Accurate and efficient estimation of local heritability using summary statistics and the linkage disequilibrium matrix Nat. Commun., 14 (1) 7954. doi:10.1038/s41467-023-43565-9. PMID 38040712	2023
HESS	Local heritability/genetic correlation	Shi H, Kichaev G, Pasaniuc B. (2016) Contrasting the genetic architecture of 30 complex traits from summary association data Am. J. Hum. Genet., 99 (1) 139-153. doi:10.1016/j.ajhg.2016.05.013. PMID 27346688	2016
LAVA	Local heritability/genetic correlation	Werme J, van der Sluis S, Posthuma D, de Leeuw CA. (2022) An integrated framework for local genetic correlation analysis Nat. Genet., 54 (3) 274-282. doi:10.1038/s41588-022-01017-y. PMID 35288712	2022
SUPERGNOVA	Local heritability/genetic correlation	Zhang Y, Lu Q, Ye Y, Huang K, ...&, Zhao H. (2021) SUPERGNOVA: local genetic correlation analysis reveals heterogeneous etiologic sharing of complex traits Genome Biol., 22 (1) 262. doi:10.1186/s13059-021-02478-w. PMID 34493297	2021

Genetic correlation

S-LDXR

NAME : S-LDXR
SHORT NAME : S-LDXR
FULL NAME : S-LDXR
DESCRIPTION : S-LDXR is software for estimating the enrichment of stratified squared trans-ethnic genetic correlation across genomic annotations from GWAS summary statistics.
URL : https://huwenboshi.github.io/s-ldxr/
KEYWORDS : trans-ethnic, stratified, functional categories
TITLE : Population-specific causal disease effect sizes in functionally important regions impacted by selection
DOI : 10.1038/s41467-021-21286-1
ABSTRACT : Many diseases exhibit population-specific causal effect sizes with trans-ethnic genetic correlations significantly less than 1, limiting trans-ethnic polygenic risk prediction. We develop a new method, S-LDXR, for stratifying squared trans-ethnic genetic correlation across genomic annotations, and apply S-LDXR to genome-wide summary statistics for 31 diseases and complex traits in East Asians (average N = 90K) and Europeans (average N = 267K) with an average trans-ethnic genetic correlation of 0.85. We determine that squared trans-ethnic genetic correlation is 0.82× (s.e. 0.01) depleted in the top quintile of background selection statistic, implying more population-specific causal effect sizes. Accordingly, causal effect sizes are more population-specific in functionally important regions, including conserved and regulatory regions. In regions surrounding specifically expressed genes, causal effect sizes are most population-specific for skin and immune genes, and least population-specific for brain genes. Our results could potentially be explained by stronger gene-environment interaction at loci impacted by selection, particularly positive selection.
CITATION : Shi H, Gazal S, Kanai M, Koch EM, ...&, Price AL. (2021) Population-specific causal disease effect sizes in functionally important regions impacted by selection Nat. Commun., 12 (1) 1098. doi:10.1038/s41467-021-21286-1. PMID 33597505
JOURNAL_INFO : Nature communications ; Nat. Commun. ; 2021 ; 12 ; 1 ; 1098
PUBMED_LINK : 33597505

cross-trait LDSC

NAME : cross-trait LDSC
SHORT NAME : cross-trait LDSC
FULL NAME : cross-trait LD Score Regression
DESCRIPTION : ldsc is a command line tool for estimating heritability and genetic correlation from GWAS summary statistics. ldsc also computes LD Scores.
URL : https://github.com/bulik/ldsc
KEYWORDS : cross-trait, LD score regression
TITLE : An atlas of genetic correlations across human diseases and traits
DOI : 10.1038/ng.3406
ABSTRACT : Identifying genetic correlations between complex traits and diseases can provide useful etiological insights and help prioritize likely causal relationships. The major challenges preventing estimation of genetic correlation from genome-wide association study (GWAS) data with current methods are the lack of availability of individual-level genotype data and widespread sample overlap among meta-analyses. We circumvent these difficulties by introducing a technique, cross-trait LD Score regression, for estimating genetic correlation that requires only GWAS summary statistics and is not biased by sample overlap. We use this method to estimate 276 genetic correlations among 24 traits. The results include genetic correlations between anorexia nervosa and schizophrenia, anorexia and obesity, and educational attainment and several diseases. These results highlight the power of genome-wide analyses, as there currently are no significantly associated SNPs for anorexia nervosa and only three for educational attainment.
CITATION : Bulik-Sullivan B, Finucane HK, Anttila V, Gusev A, ...&, Neale BM. (2015) An atlas of genetic correlations across human diseases and traits Nat. Genet., 47 (11) 1236-1241. doi:10.1038/ng.3406. PMID 26414676
JOURNAL_INFO : Nature genetics ; Nat. Genet. ; 2015 ; 47 ; 11 ; 1236-1241
PUBMED_LINK : 26414676
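
The quantities mentioned in the abstract above can be illustrated with a short Python sketch. It assumes the standard definition of genetic correlation (genetic covariance divided by the geometric mean of the two heritabilities) and checks that 24 traits give 276 unordered pairs. The function name and the input values are hypothetical and chosen only for illustration; this is not the ldsc implementation.

```python
from math import comb, sqrt


def genetic_correlation(cov_g: float, h2_trait1: float, h2_trait2: float) -> float:
    """Return r_g = cov_g / sqrt(h2_1 * h2_2) (standard definition, assumed here)."""
    if h2_trait1 <= 0 or h2_trait2 <= 0:
        raise ValueError("heritabilities must be positive")
    return cov_g / sqrt(h2_trait1 * h2_trait2)


# Number of unordered trait pairs among 24 traits, as reported in the abstract.
n_pairs = comb(24, 2)

# Hypothetical inputs: genetic covariance 0.06, heritabilities 0.20 and 0.45.
rg_example = genetic_correlation(0.06, 0.20, 0.45)

print(n_pairs, round(rg_example, 3))
```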

popcorn

NAME : popcorn
SHORT NAME : popcorn
FULL NAME : popcorn
DESCRIPTION : Popcorn is a program for estimating the correlation of causal variant effect sizes across populations in GWAS. This is the python3 version of Popcorn and is still under development.
URL : https://github.com/brielin/Popcorn
KEYWORDS : trans-ethnic
TITLE : Transethnic Genetic-Correlation Estimates from Summary Statistics
DOI : 10.1016/j.ajhg.2016.05.001
ABSTRACT : The increasing number of genetic association studies conducted in multiple populations provides an unprecedented opportunity to study how the genetic architecture of complex phenotypes varies between populations, a problem important for both medical and population genetics. Here, we have developed a method for estimating the transethnic genetic correlation: the correlation of causal-variant effect sizes at SNPs common in populations. This method takes advantage of the entire spectrum of SNP associations and uses only summary-level data from genome-wide association studies. This avoids the computational costs and privacy concerns associated with genotype-level information while remaining scalable to hundreds of thousands of individuals and millions of SNPs. We applied our method to data on gene expression, rheumatoid arthritis, and type 2 diabetes and overwhelmingly found that the genetic correlation was significantly less than 1. Our method is implemented in a Python package called Popcorn.
CITATION : Brown BC, Asian Genetic Epidemiology Network Type 2 Diabetes Consortium, Ye CJ, Price AL, ...&, Zaitlen N. (2016) Transethnic Genetic-Correlation Estimates from Summary Statistics Am. J. Hum. Genet., 99 (1) 76-88. doi:10.1016/j.ajhg.2016.05.001. PMID 27321947
JOURNAL_INFO : American journal of human genetics ; Am. J. Hum. Genet. ; 2016 ; 99 ; 1 ; 76-88
PUBMED_LINK : 27321947

Heritability

BayesRR-RC

NAME : BayesRR-RC
SHORT NAME : BayesRR-RC
FULL NAME : BayesRR-RC
DESCRIPTION : gmrm is hybrid-parallel software for a Bayesian grouped mixture of regressions model for genome-wide association studies (GWAS). It is written in C++ using extensive optimisations and code vectorisation. It relies on plink's .bed format. It can handle multiple traits simultaneously.
URL : https://github.com/medical-genomics-group/gmrm
TITLE : Probabilistic inference of the genetic architecture underlying functional enrichment of complex traits
DOI : 10.1038/s41467-021-27258-9
ABSTRACT : We develop a Bayesian model (BayesRR-RC) that provides robust SNP-heritability estimation, an alternative to marker discovery, and accurate genomic prediction, taking 22 seconds per iteration to estimate 8.4 million SNP-effects and 78 SNP-heritability parameters in the UK Biobank. We find that only ≤10% of the genetic variation captured for height, body mass index, cardiovascular disease, and type 2 diabetes is attributable to proximal regulatory regions within 10kb upstream of genes, while 12-25% is attributed to coding regions, 32-44% to introns, and 22-28% to distal 10-500kb upstream regions. Up to 24% of all cis and coding regions of each chromosome are associated with each trait, with over 3,100 independent exonic and intronic regions and over 5,400 independent regulatory regions having ≥95% probability of contributing ≥0.001% to the genetic variance of these four traits. Our open-source software (GMRM) provides a scalable alternative to current approaches for biobank data.
COPYRIGHT : https://creativecommons.org/licenses/by/4.0
CITATION : Patxot M, Banos DT, Kousathanas A, Orliac EJ, ...&, Robinson MR. (2021) Probabilistic inference of the genetic architecture underlying functional enrichment of complex traits Nat. Commun., 12 (1) 6972. doi:10.1038/s41467-021-27258-9. PMID 34848700
JOURNAL_INFO : Nature communications ; Nat. Commun. ; 2021 ; 12 ; 1 ; 6972
PUBMED_LINK : 34848700

GCTA-GREML-Binary

NAME : GCTA-GREML-Binary
SHORT NAME : GREML
FULL NAME : Genome-wide complex trait analysis (GCTA) Genome-based restricted maximum likelihood (GREML)
DESCRIPTION : GCTA-GREML analysis for a case-control study (binary traits), with estimates transformed to the liability scale.
URL : https://yanglab.westlake.edu.cn/software/gcta/#GREML
TITLE : Estimating missing heritability for disease from genome-wide association studies
DOI : 10.1016/j.ajhg.2011.02.002
ABSTRACT : Genome-wide association studies are designed to discover SNPs that are associated with a complex trait. Employing strict significance thresholds when testing individual SNPs avoids false positives at the expense of increasing false negatives. Recently, we developed a method for quantitative traits that estimates the variation accounted for when fitting all SNPs simultaneously. Here we develop this method further for case-control studies. We use a linear mixed model for analysis of binary traits and transform the estimates to a liability scale by adjusting both for scale and for ascertainment of the case samples. We show by theory and simulation that the method is unbiased. We apply the method to data from the Wellcome Trust Case Control Consortium and show that a substantial proportion of variation in liability for Crohn disease, bipolar disorder, and type I diabetes is tagged by common SNPs.
COPYRIGHT : http://www.elsevier.com/open-access/userlicense/1.0/
CITATION : Lee SH, Wray NR, Goddard ME, Visscher PM. (2011) Estimating missing heritability for disease from genome-wide association studies Am. J. Hum. Genet., 88 (3) 294-305. doi:10.1016/j.ajhg.2011.02.002. PMID 21376301
JOURNAL_INFO : The American Journal of Human Genetics ; Am. J. Hum. Genet. ; 2011 ; 88 ; 3 ; 294-305
PUBMED_LINK : 21376301
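
The liability-scale adjustment described in the abstract above (correcting for both scale and ascertainment) can be sketched in Python. The sketch assumes the commonly cited transformation h2_liability = h2_observed * [K(1-K)]^2 / [z^2 * P(1-P)], where K is the population prevalence, P is the proportion of cases in the sample, and z is the standard normal density at the liability threshold. The function name and example values are hypothetical, and this is not GCTA code.

```python
from statistics import NormalDist


def observed_to_liability(h2_obs: float, prevalence: float, case_fraction: float) -> float:
    """Transform observed-scale h2 to the liability scale (assumed formula, see lead-in)."""
    if not (0.0 < prevalence < 1.0 and 0.0 < case_fraction < 1.0):
        raise ValueError("prevalence and case_fraction must lie strictly between 0 and 1")
    std_normal = NormalDist()
    # Liability threshold above which an individual is affected.
    threshold = std_normal.inv_cdf(1.0 - prevalence)
    # Height of the standard normal density at that threshold.
    z = std_normal.pdf(threshold)
    k, p = prevalence, case_fraction
    return h2_obs * (k * (1.0 - k)) ** 2 / (z ** 2 * p * (1.0 - p))


# Hypothetical example: observed-scale h2 of 0.2, 1% prevalence, 50% cases in the sample.
h2_liab = observed_to_liability(0.2, prevalence=0.01, case_fraction=0.5)
print(round(h2_liab, 4))
```

When the sample is not ascertained (P equals K), the expression reduces to the scale correction K(1-K)/z^2 alone.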

GCTA-GREML-Bivariate

NAME : GCTA-GREML-Bivariate
SHORT NAME : GREML
FULL NAME : Genome-wide complex trait analysis (GCTA) Genome-based restricted maximum likelihood (GREML)
DESCRIPTION : Bivariate GREML analysis for estimating the genetic correlation between two traits.
URL : https://yanglab.westlake.edu.cn/software/gcta/#GREML
KEYWORDS : bivariate
TITLE : Estimation of pleiotropy between complex diseases using single-nucleotide polymorphism-derived genomic relationships and restricted maximum likelihood
DOI : 10.1093/bioinformatics/bts474
ABSTRACT : SUMMARY: Genetic correlations are the genome-wide aggregate effects of causal variants affecting multiple traits. Traditionally, genetic correlations between complex traits are estimated from pedigree studies, but such estimates can be confounded by shared environmental factors. Moreover, for diseases, low prevalence rates imply that even if the true genetic correlation between disorders was high, co-aggregation of disorders in families might not occur or could not be distinguished from chance. We have developed and implemented statistical methods based on linear mixed models to obtain unbiased estimates of the genetic correlation between pairs of quantitative traits or pairs of binary traits of complex diseases using population-based case-control studies with genome-wide single-nucleotide polymorphism data. The method is validated in a simulation study and applied to estimate genetic correlation between various diseases from Wellcome Trust Case Control Consortium data in a series of bivariate analyses. We estimate a significant positive genetic correlation between risk of Type 2 diabetes and hypertension of ~0.31 (SE 0.14, P = 0.024). AVAILABILITY: Our methods, appropriate for both quantitative and binary traits, are implemented in the freely available software GCTA (http://www.complextraitgenomics.com/software/gcta/reml_bivar.html). CONTACT: hong.lee@uq.edu.au SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
CITATION : Lee SH, Yang J, Goddard ME, Visscher PM, ...&, Wray NR. (2012) Estimation of pleiotropy between complex diseases using single-nucleotide polymorphism-derived genomic relationships and restricted maximum likelihood Bioinformatics, 28 (19) 2540-2542. doi:10.1093/bioinformatics/bts474. PMID 22843982
JOURNAL_INFO : Bioinformatics (Oxford, England) ; Bioinformatics ; 2012 ; 28 ; 19 ; 2540-2542
PUBMED_LINK : 22843982

GCTA-GREML-LDMS

NAME : GCTA-GREML-LDMS
SHORT NAME : GCTA-GREML-LDMS
FULL NAME : GCTA-GREML-LDMS
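DESCRIPTION : GREML-LDMS analysis for estimating heritability in unrelated individuals from whole-genome sequencing or imputed data.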
URL : https://yanglab.westlake.edu.cn/software/gcta/#GREML
TITLE : Genetic variance estimation with imputed variants finds negligible missing heritability for human height and body mass index
DOI : 10.1038/ng.3390
ABSTRACT : We propose a method (GREML-LDMS) to estimate heritability for human complex traits in unrelated individuals using whole-genome sequencing data. We demonstrate using simulations based on whole-genome sequencing data that ∼97% and ∼68% of variation at common and rare variants, respectively, can be captured by imputation. Using the GREML-LDMS method, we estimate from 44,126 unrelated individuals that all ∼17 million imputed variants explain 56% (standard error (s.e.) = 2.3%) of variance for height and 27% (s.e. = 2.5%) of variance for body mass index (BMI), and we find evidence that height- and BMI-associated variants have been under natural selection. Considering the imperfect tagging of imputation and potential overestimation of heritability from previous family-based studies, heritability is likely to be 60-70% for height and 30-40% for BMI. Therefore, the missing heritability is small for both traits. For further discovery of genes associated with complex traits, a study design with SNP arrays followed by imputation is more cost-effective than whole-genome sequencing at current prices.
CITATION : Yang J, Bakshi A, Zhu Z, Hemani G, ...&, Visscher PM. (2015) Genetic variance estimation with imputed variants finds negligible missing heritability for human height and body mass index Nat. Genet., 47 (10) 1114-1120. doi:10.1038/ng.3390. PMID 26323059
JOURNAL_INFO : Nature genetics ; Nat. Genet. ; 2015 ; 47 ; 10 ; 1114-1120
PUBMED_LINK : 26323059

GCTA-GREML-Partition

NAME : GCTA-GREML-Partition
SHORT NAME : GREML
FULL NAME : Genome-wide complex trait analysis (GCTA) Genome-based restricted maximum likelihood (GREML)
DESCRIPTION : GREML analysis that partitions the genetic variance into individual chromosomes and genomic segments.
URL : https://yanglab.westlake.edu.cn/software/gcta/#GREML
TITLE : Genome partitioning of genetic variation for complex traits using common SNPs
DOI : 10.1038/ng.823
ABSTRACT : We estimate and partition genetic variation for height, body mass index (BMI), von Willebrand factor and QT interval (QTi) using 586,898 SNPs genotyped on 11,586 unrelated individuals. We estimate that ∼45%, ∼17%, ∼25% and ∼21% of the variance in height, BMI, von Willebrand factor and QTi, respectively, can be explained by all autosomal SNPs and a further ∼0.5-1% can be explained by X chromosome SNPs. We show that the variance explained by each chromosome is proportional to its length, and that SNPs in or near genes explain more variation than SNPs between genes. We propose a new approach to estimate variation due to cryptic relatedness and population stratification. Our results provide further evidence that a substantial proportion of heritability is captured by common SNPs, that height, BMI and QTi are highly polygenic traits, and that the additive variation explained by a part of the genome is approximately proportional to the total length of DNA contained within genes therein.
CITATION : Yang J, Manolio TA, Pasquale LR, Boerwinkle E, ...&, Visscher PM. (2011) Genome partitioning of genetic variation for complex traits using common SNPs Nat. Genet., 43 (6) 519-525. doi:10.1038/ng.823. PMID 21552263
JOURNAL_INFO : Nature genetics ; Nat. Genet. ; 2011 ; 43 ; 6 ; 519-525
PUBMED_LINK : 21552263

GCTA-GREML-Quantitative

NAME : GCTA-GREML-Quantitative
SHORT NAME : GREML
FULL NAME : Genome-wide complex trait analysis (GCTA) Genome-based restricted maximum likelihood (GREML)
DESCRIPTION : GCTA-GREML analysis: estimating the variance explained by the SNPs / GCTA-GREML analysis for a case-control study
URL : https://yanglab.westlake.edu.cn/software/gcta/#GREML
TITLE : Common SNPs explain a large proportion of the heritability for human height
DOI : 10.1038/ng.608
ABSTRACT : SNPs discovered by genome-wide association studies (GWASs) account for only a small fraction of the genetic variation of complex traits in human populations. Where is the remaining heritability? We estimated the proportion of variance for human height explained by 294,831 SNPs genotyped on 3,925 unrelated individuals using a linear model analysis, and validated the estimation method with simulations based on the observed genotype data. We show that 45% of variance can be explained by considering all SNPs simultaneously. Thus, most of the heritability is not missing but has not previously been detected because the individual effects are too small to pass stringent significance tests. We provide evidence that the remaining heritability is due to incomplete linkage disequilibrium between causal variants and genotyped SNPs, exacerbated by causal variants having lower minor allele frequency than the SNPs explored to date.
CITATION : Yang J, Benyamin B, McEvoy BP, Gordon S, ...&, Visscher PM. (2010) Common SNPs explain a large proportion of the heritability for human height Nat. Genet., 42 (7) 565-569. doi:10.1038/ng.608. PMID 20562875
JOURNAL_INFO : Nature genetics ; Nat. Genet. ; 2010 ; 42 ; 7 ; 565-569
PUBMED_LINK : 20562875

HDL

NAME : HDL
SHORT NAME : HDL
FULL NAME : High-Definition Likelihood
DESCRIPTION : High-Definition Likelihood (HDL) is a likelihood-based method for estimating genetic correlation using GWAS summary statistics. Compared to LD Score regression (LDSC), it reduces the variance of a genetic correlation estimate by about 60%.
URL : https://github.com/zhenin/HDL/
TITLE : High-definition likelihood inference of genetic correlations across human complex traits
DOI : 10.1038/s41588-020-0653-y
ABSTRACT : Genetic correlation is a central parameter for understanding shared genetic architecture between complex traits. By using summary statistics from genome-wide association studies (GWAS), linkage disequilibrium score regression (LDSC) was developed for unbiased estimation of genetic correlations. Although easy to use, LDSC only partially utilizes LD information. By fully accounting for LD across the genome, we develop a high-definition likelihood (HDL) method to improve precision in genetic correlation estimation. Compared to LDSC, HDL reduces the variance of genetic correlation estimates by about 60%, equivalent to a 2.5-fold increase in sample size. We apply HDL and LDSC to estimate 435 genetic correlations among 30 behavioral and disease-related phenotypes measured in the UK Biobank (UKBB). In addition to 154 significant genetic correlations observed for both methods, HDL identified another 57 significant genetic correlations, compared to only another 2 significant genetic correlations identified by LDSC. HDL brings more power to genomic analyses and better reveals the underlying connections across human complex traits.
CITATION : Ning Z, Pawitan Y, Shen X. (2020) High-definition likelihood inference of genetic correlations across human complex traits Nat. Genet., 52 (8) 859-864. doi:10.1038/s41588-020-0653-y. PMID 32601477
JOURNAL_INFO : Nature genetics ; Nat. Genet. ; 2020 ; 52 ; 8 ; 859-864
PUBMED_LINK : 32601477
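
The abstract's statement that a 60% variance reduction is equivalent to a 2.5-fold increase in sample size can be checked with a short Python sketch. It assumes the variance of the estimate scales inversely with sample size, which is a simplifying assumption made here and not a claim about HDL internals. The function name is hypothetical.

```python
from math import comb


def equivalent_sample_size_factor(variance_reduction: float) -> float:
    """Sample-size multiplier matching a given variance reduction, assuming Var ~ 1/N."""
    if not 0.0 <= variance_reduction < 1.0:
        raise ValueError("variance_reduction must be in [0, 1)")
    return 1.0 / (1.0 - variance_reduction)


# A 60% reduction leaves 40% of the variance, so N would need to grow by 1 / 0.4.
factor = equivalent_sample_size_factor(0.60)

# 30 phenotypes give C(30, 2) unordered pairs, matching the 435 correlations reported.
n_pairs = comb(30, 2)

print(round(factor, 2), n_pairs)
```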

LDAK

NAME : LDAK
SHORT NAME : LDAK
FULL NAME : LD-adjusted kinships
DESCRIPTION : LDAK is a software package for analysing association study data.
URL : http://www.ldak.org/
TITLE : Improved heritability estimation from genome-wide SNPs
DOI : 10.1016/j.ajhg.2012.10.010
ABSTRACT : Estimation of narrow-sense heritability, h², from genome-wide SNPs genotyped in unrelated individuals has recently attracted interest and offers several advantages over traditional pedigree-based methods. With the use of this approach, it has been estimated that over half the heritability of human height can be attributed to the ~300,000 SNPs on a genome-wide genotyping array. In comparison, only 5%-10% can be explained by SNPs reaching genome-wide significance. We investigated via simulation the validity of several key assumptions underpinning the mixed-model analysis used in SNP-based h² estimation. Although we found that the method is reasonably robust to violations of four key assumptions, it can be highly sensitive to uneven linkage disequilibrium (LD) between SNPs: contributions to h² are overestimated from causal variants in regions of high LD and are underestimated in regions of low LD. The overall direction of the bias can be up or down depending on the genetic architecture of the trait, but it can be substantial in realistic scenarios. We propose a modified kinship matrix in which SNPs are weighted according to local LD. We show that this correction greatly reduces the bias and increases the precision of h² estimates. We demonstrate the impact of our method on the first seven diseases studied by the Wellcome Trust Case Control Consortium. Our LD adjustment revises downward the h² estimate for immune-related diseases, as expected because of high LD in the major-histocompatibility region, but increases it for some nonimmune diseases. To calculate our revised kinship matrix, we developed LDAK, software for computing LD-adjusted kinships.
COPYRIGHT : http://creativecommons.org/licenses/by/3.0/
CITATION : Speed D, Hemani G, Johnson MR, Balding DJ. (2012) Improved heritability estimation from genome-wide SNPs Am. J. Hum. Genet., 91 (6) 1011-1021. doi:10.1016/j.ajhg.2012.10.010. PMID 23217325
JOURNAL_INFO : The American Journal of Human Genetics ; Am. J. Hum. Genet. ; 2012 ; 91 ; 6 ; 1011-1021
PUBMED_LINK : 23217325

LDSC

NAME : LDSC
SHORT NAME : LDSC
FULL NAME : LD Score Regression
DESCRIPTION : ldsc is a command line tool for estimating heritability and genetic correlation from GWAS summary statistics. ldsc also computes LD Scores.
URL : https://github.com/bulik/ldsc
TITLE : LD Score regression distinguishes confounding from polygenicity in genome-wide association studies
DOI : 10.1038/ng.3211
ABSTRACT : Both polygenicity (many small genetic effects) and confounding biases, such as cryptic relatedness and population stratification, can yield an inflated distribution of test statistics in genome-wide association studies (GWAS). However, current methods cannot distinguish between inflation from a true polygenic signal and bias. We have developed an approach, LD Score regression, that quantifies the contribution of each by examining the relationship between test statistics and linkage disequilibrium (LD). The LD Score regression intercept can be used to estimate a more powerful and accurate correction factor than genomic control. We find strong evidence that polygenicity accounts for the majority of the inflation in test statistics in many GWAS of large sample size.
CITATION : Bulik-Sullivan B, Loh PR, Finucane HK, Ripke S, ...&, O'Donovan MC. (2015) LD Score regression distinguishes confounding from polygenicity in genome-wide association studies Nat. Genet., 47 (3) 291-295. doi:10.1038/ng.3211. PMID 25642630
JOURNAL_INFO : Nature genetics ; Nat. Genet. ; 2015 ; 47 ; 3 ; 291-295
PUBMED_LINK : 25642630
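
The relationship between test statistics and LD described in the abstract above can be sketched as a simple regression in Python. The sketch assumes the usual LD Score regression model, E[chi2_j] = N * h2 * l_j / M + intercept, where l_j is the LD score of SNP j, N the sample size and M the number of SNPs. It fits an unweighted least-squares line to noise-free synthetic data. The real ldsc tool uses a weighted regression with block-jackknife standard errors, which this toy omits. All names and numbers are hypothetical.

```python
def fit_ld_score_regression(ld_scores, chi2, n_samples, n_snps):
    """Unweighted OLS of chi-square statistics on LD scores; returns (h2, intercept)."""
    n = len(ld_scores)
    mean_l = sum(ld_scores) / n
    mean_c = sum(chi2) / n
    sxx = sum((l - mean_l) ** 2 for l in ld_scores)
    sxy = sum((l - mean_l) * (c - mean_c) for l, c in zip(ld_scores, chi2))
    slope = sxy / sxx
    intercept = mean_c - slope * mean_l
    # Under the assumed model the slope equals N * h2 / M.
    h2 = slope * n_snps / n_samples
    return h2, intercept


# Synthetic, noise-free data generated from the assumed model (illustrative values only).
N, M = 50_000, 1_000_000
true_h2, true_intercept = 0.4, 1.05
ld_scores = [5.0, 20.0, 60.0, 110.0, 200.0]
chi2 = [true_intercept + N * true_h2 * l / M for l in ld_scores]

h2_hat, intercept_hat = fit_ld_score_regression(ld_scores, chi2, N, M)
print(round(h2_hat, 3), round(intercept_hat, 3))
```

An intercept above 1 in this model reflects inflation that does not scale with LD (confounding), while the slope reflects polygenic signal.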

SumHer

NAME : SumHer
SHORT NAME : SumHer
FULL NAME : SumHer
URL : http://www.ldak.org/
TITLE : SumHer better estimates the SNP heritability of complex traits from summary statistics
DOI : 10.1038/s41588-018-0279-5
ABSTRACT : We present SumHer, software for estimating confounding bias, SNP heritability, enrichments of heritability and genetic correlations using summary statistics from genome-wide association studies. The key difference between SumHer and the existing software LD Score Regression (LDSC) is that SumHer allows the user to specify the heritability model. We apply SumHer to results from 24 large-scale association studies (average sample size 121,000) using our recommended heritability model. We show that these studies tended to substantially over-correct for confounding, and as a result the number of genome-wide significant loci was under-reported by about a quarter. We also estimate enrichments for 24 categories of SNPs defined by functional annotations. A previous study using LDSC reported that conserved regions were 13-fold enriched, and found a further six categories with above threefold enrichment. By contrast, our analysis using SumHer finds that none of the categories have enrichment above twofold. SumHer provides an improved understanding of the genetic architecture of complex traits, which enables more efficient analysis of future genetic data.
CITATION : Speed D, Balding DJ. (2019) SumHer better estimates the SNP heritability of complex traits from summary statistics Nat. Genet., 51 (2) 277-284. doi:10.1038/s41588-018-0279-5. PMID 30510236
JOURNAL_INFO : Nature genetics ; Nat. Genet. ; 2019 ; 51 ; 2 ; 277-284
PUBMED_LINK : 30510236

TetraHer

NAME : TetraHer
SHORT NAME : TetraHer
FULL NAME : TetraHer
DESCRIPTION : a method for estimating the liability heritability of binary phenotypes
URL : http://www.ldak.org/
TITLE : Estimating disease heritability from complex pedigrees allowing for ascertainment and covariates
DOI : 10.1016/j.ajhg.2024.02.010
ABSTRACT : We propose TetraHer, a method for estimating the liability heritability of binary phenotypes. TetraHer has five key features. First, it can be applied to data from complex pedigrees that contain multiple types of relationships. Second, it can correct for ascertainment of cases or controls. Third, it can accommodate covariates. Fourth, it can model the contribution of common environment. Fifth, it produces a likelihood that can be used for significance testing. We first demonstrate the validity of TetraHer on simulated data. We then use TetraHer to estimate liability heritability for 229 codes from the tenth International Classification of Diseases (ICD-10). We identify 107 codes with significant heritability (p < 0.05/229), which can be used in future analyses for investigating the genetic architecture of human diseases.
CITATION : Speed D, Evans DM. (2024) Estimating disease heritability from complex pedigrees allowing for ascertainment and covariates Am. J. Hum. Genet., 111 (4) 680-690. doi:10.1016/j.ajhg.2024.02.010. PMID 38490208
JOURNAL_INFO : The American Journal of Human Genetics ; Am. J. Hum. Genet. ; 2024 ; 111 ; 4 ; 680-690
PUBMED_LINK : 38490208
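
The significance threshold quoted in the abstract above (p < 0.05/229) is a Bonferroni correction over the 229 ICD-10 codes tested. A minimal Python sketch of that arithmetic follows; the function name is hypothetical and this is not part of LDAK.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


# 229 codes tested at a family-wise alpha of 0.05.
threshold = bonferroni_threshold(0.05, 229)
print(f"{threshold:.2e}")
```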

Local heritability/genetic correlation

GNOVA

NAME : GNOVA
SHORT NAME : GNOVA
FULL NAME : GeNetic cOVariance Analyzer
DESCRIPTION : A principled framework to estimate annotation-stratified genetic covariance using GWAS summary statistics.
URL : https://github.com/xtonyjiang/GNOVA
TITLE : A powerful approach to estimating annotation-stratified genetic covariance via GWAS summary statistics
DOI : 10.1016/j.ajhg.2017.11.001
ABSTRACT : Despite the success of large-scale genome-wide association studies (GWASs) on complex traits, our understanding of their genetic architecture is far from complete. Jointly modeling multiple traits' genetic profiles has provided insights into the shared genetic basis of many complex traits. However, large-scale inference sets a high bar for both statistical power and biological interpretability. Here we introduce a principled framework to estimate annotation-stratified genetic covariance between traits using GWAS summary statistics. Through theoretical and numerical analyses, we demonstrate that our method provides accurate covariance estimates, thereby enabling researchers to dissect both the shared and distinct genetic architecture across traits to better understand their etiologies. Among 50 complex traits with publicly accessible GWAS summary statistics (N_total ≈ 4.5 million), we identified more than 170 pairs with statistically significant genetic covariance. In particular, we found strong genetic covariance between late-onset Alzheimer disease (LOAD) and amyotrophic lateral sclerosis (ALS), two major neurodegenerative diseases, in single-nucleotide polymorphisms (SNPs) with high minor allele frequencies and in SNPs located in the predicted functional genome. Joint analysis of LOAD, ALS, and other traits highlights LOAD's correlation with cognitive traits and hints at an autoimmune component for ALS.
COPYRIGHT : https://www.elsevier.com/open-access/userlicense/1.0/
CITATION : Lu Q, Li B, Ou D, Erlendsdottir M, ...&, Zhao H. (2017) A powerful approach to estimating annotation-stratified genetic covariance via GWAS summary statistics Am. J. Hum. Genet., 101 (6) 939-964. doi:10.1016/j.ajhg.2017.11.001. PMID 29220677
JOURNAL_INFO : The American Journal of Human Genetics ; Am. J. Hum. Genet. ; 2017 ; 101 ; 6 ; 939-964
PUBMED_LINK : 29220677

HDL-L

NAME : HDL-L
SHORT NAME : HDL-L
FULL NAME : high-definition likelihood (local)
DESCRIPTION : High-Definition Likelihood (HDL) is a likelihood-based method for estimating genetic correlation using GWAS summary statistics. Compared to LD Score regression (LDSC), it reduces the variance of a genetic correlation estimate by about 60%. Here, we provide an R-based computational tool, HDL, to implement our method.
URL : https://github.com/zhenin/HDL/
KEYWORDS : likelihood-based inference
CITATION : Li Y, Pawitan Y, Shen X. (2025) An enhanced framework for local genetic correlation analysis Nat. Genet., 1-6. PMID 40065165
PUBMED_LINK : 40065165

HEELS

NAME : HEELS
SHORT NAME : HEELS
FULL NAME : Heritability Estimation with high Efficiency using LD and association Summary Statistics
DESCRIPTION : HEELS is a Python-based command-line tool that produces accurate and precise local heritability estimates from summary-level statistics: marginal association test statistics together with the empirical (in-sample) LD statistics.
URL : https://github.com/huilisabrina/HEELS
TITLE : Accurate and efficient estimation of local heritability using summary statistics and the linkage disequilibrium matrix
DOI : 10.1038/s41467-023-43565-9
ABSTRACT : Existing SNP-heritability estimators that leverage summary statistics from genome-wide association studies (GWAS) are much less efficient (i.e., have larger standard errors) than the restricted maximum likelihood (REML) estimators which require access to individual-level data. We introduce a new method for local heritability estimation, Heritability Estimation with high Efficiency using LD and association Summary Statistics (HEELS), that significantly improves the statistical efficiency of summary-statistics-based heritability estimator and attains comparable statistical efficiency as REML (with a relative statistical efficiency >92%). Moreover, we propose representing the empirical LD matrix as the sum of a low-rank matrix and a banded matrix. We show that this way of modeling the LD can not only reduce the storage and memory cost, but also improve the computational efficiency of heritability estimation. We demonstrate the statistical efficiency of HEELS and the advantages of our proposed LD approximation strategies both in simulations and through empirical analyses of the UK Biobank data.
CITATION : Li H, Mazumder R, Lin X. (2023) Accurate and efficient estimation of local heritability using summary statistics and the linkage disequilibrium matrix Nat. Commun., 14 (1) 7954. doi:10.1038/s41467-023-43565-9. PMID 38040712
JOURNAL_INFO : Nature communications ; Nat. Commun. ; 2023 ; 14 ; 1 ; 7954
PUBMED_LINK : 38040712
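
EXAMPLE : The HEELS abstract describes representing the empirical LD matrix as the sum of a low-rank matrix and a banded matrix. The sketch below illustrates that general idea on a toy matrix with NumPy. It is not the HEELS implementation: the function names, the toy LD matrix, the bandwidth, and the rank are all assumptions chosen for illustration, and the low-rank term is obtained by a plain eigendecomposition of the off-band residual, which may differ from the strategies used in the tool.

```python
import numpy as np


def banded_part(R, bandwidth):
    """Keep entries of R within `bandwidth` of the diagonal; zero the rest."""
    idx = np.arange(R.shape[0])
    mask = np.abs(idx[:, None] - idx[None, :]) <= bandwidth
    return np.where(mask, R, 0.0)


def low_rank_part(M, rank):
    """Rank-`rank` approximation of a symmetric matrix M, built from the
    eigenpairs with the largest absolute eigenvalues."""
    vals, vecs = np.linalg.eigh(M)
    keep = np.argsort(np.abs(vals))[::-1][:rank]
    return (vecs[:, keep] * vals[keep]) @ vecs[:, keep].T


def approximate_ld(R, bandwidth, rank):
    """Split R into a banded matrix B plus a low-rank matrix L (R ~ B + L)."""
    B = banded_part(R, bandwidth)
    L = low_rank_part(R - B, rank)
    return B, L


# Toy LD matrix (hypothetical): correlation decaying with distance plus
# one long-range component, rescaled to have a unit diagonal.
rng = np.random.default_rng(0)
m = 60
idx = np.arange(m)
decay = 0.8 ** np.abs(idx[:, None] - idx[None, :])
u = rng.normal(size=(m, 1))
raw = decay + 0.2 * (u @ u.T)
d = np.sqrt(np.diag(raw))
R = raw / np.outer(d, d)

B, L = approximate_ld(R, bandwidth=5, rank=2)
err_banded = np.linalg.norm(R - B)       # Frobenius error, banded term only
err_both = np.linalg.norm(R - (B + L))   # Frobenius error, banded + low-rank
print(f"banded only: {err_banded:.3f}, banded + low-rank: {err_both:.3f}")
```

In this sketch the banded term needs on the order of m × (2 × bandwidth + 1) stored values and the low-rank term on the order of m × rank, compared with m × m for the dense matrix, which is the kind of storage saving the abstract refers to.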

HESS

NAME : HESS
SHORT NAME : HESS
FULL NAME : Heritability Estimation from Summary Statistics
DESCRIPTION : HESS (Heritability Estimation from Summary Statistics) is a software package for estimating and visualizing local SNP-heritability and genetic covariance (correlation) from GWAS summary association data.
URL : https://huwenboshi.github.io/hess/
TITLE : Contrasting the genetic architecture of 30 complex traits from summary association data
DOI : 10.1016/j.ajhg.2016.05.013
ABSTRACT : Variance-component methods that estimate the aggregate contribution of large sets of variants to the heritability of complex traits have yielded important insights into the genetic architecture of common diseases. Here, we introduce methods that estimate the total trait variance explained by the typed variants at a single locus in the genome (local SNP heritability) from genome-wide association study (GWAS) summary data while accounting for linkage disequilibrium among variants. We applied our estimator to ultra-large-scale GWAS summary data of 30 common traits and diseases to gain insights into their local genetic architecture. First, we found that common SNPs have a high contribution to the heritability of all studied traits. Second, we identified traits for which the majority of the SNP heritability can be confined to a small percentage of the genome. Third, we identified GWAS risk loci where the entire locus explains significantly more variance in the trait than the GWAS reported variants. Finally, we identified loci that explain a significant amount of heritability across multiple traits.
CITATION : Shi H, Kichaev G, Pasaniuc B. (2016) Contrasting the genetic architecture of 30 complex traits from summary association data Am. J. Hum. Genet., 99 (1) 139-153. doi:10.1016/j.ajhg.2016.05.013. PMID 27346688
JOURNAL_INFO : The American Journal of Human Genetics ; Am. J. Hum. Genet. ; 2016 ; 99 ; 1 ; 139-153
PUBMED_LINK : 27346688

LAVA

NAME : LAVA
SHORT NAME : LAVA
FULL NAME : Local Analysis of [co]Variant Association
DESCRIPTION : LAVA is a tool to conduct genome-wide, local genetic correlation analysis on multiple traits, using GWAS summary statistics as input.
URL : https://ctg.cncr.nl/software/lava
TITLE : An integrated framework for local genetic correlation analysis
DOI : 10.1038/s41588-022-01017-y
ABSTRACT : Genetic correlation (rg) analysis is used to identify phenotypes that may have a shared genetic basis. Traditionally, rg is studied globally, considering only the average of the shared signal across the genome, although this approach may fail when the rg is confined to particular genomic regions or in opposing directions at different loci. Current tools for local rg analysis are restricted to analysis of two phenotypes. Here we introduce LAVA, an integrated framework for local rg analysis that, in addition to testing the standard bivariate local rgs between two phenotypes, can evaluate local heritabilities and analyze conditional genetic relations between several phenotypes using partial correlation and multiple regression. Applied to 25 behavioral and health phenotypes, we show considerable heterogeneity in the bivariate local rgs across the genome, which is often masked by the global rg patterns, and demonstrate how our conditional approaches can elucidate more complex, multivariate genetic relations.
CITATION : Werme J, van der Sluis S, Posthuma D, de Leeuw CA. (2022) An integrated framework for local genetic correlation analysis Nat. Genet., 54 (3) 274-282. doi:10.1038/s41588-022-01017-y. PMID 35288712
JOURNAL_INFO : Nature genetics ; Nat. Genet. ; 2022 ; 54 ; 3 ; 274-282
PUBMED_LINK : 35288712
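
EXAMPLE : The LAVA abstract mentions analyzing conditional genetic relations between several phenotypes using partial correlation. As a minimal illustration of that concept only, the sketch below applies the standard first-order partial correlation formula to three pairwise local genetic correlations. The function name and the input values are hypothetical, and the sketch does not reproduce LAVA's own estimation or testing procedure.

```python
import math


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between X and Y after conditioning on Z, computed from
    the three pairwise correlations (standard first-order formula)."""
    denom = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    if denom == 0.0:
        raise ValueError("partial correlation is undefined when |r_xz| or |r_yz| equals 1")
    return (r_xy - r_xz * r_yz) / denom


# Hypothetical local genetic correlations in one genomic region.
r_xy, r_xz, r_yz = 0.5, 0.6, 0.7
r_xy_given_z = partial_correlation(r_xy, r_xz, r_yz)
print(f"rg(X, Y) = {r_xy:.2f}, rg(X, Y | Z) = {r_xy_given_z:.3f}")
```

With these made-up inputs, most of the apparent sharing between X and Y is accounted for by Z, which is the sort of pattern a conditional analysis is meant to expose.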

SUPERGNOVA

NAME : SUPERGNOVA
SHORT NAME : SUPERGNOVA
FULL NAME : SUPER GeNetic cOVariance Analyzer
DESCRIPTION : SUPERGNOVA (SUPER GeNetic cOVariance Analyzer) is a statistical framework to perform local genetic covariance analysis.
URL : https://github.com/qlu-lab/SUPERGNOVA
TITLE : SUPERGNOVA: local genetic correlation analysis reveals heterogeneous etiologic sharing of complex traits
DOI : 10.1186/s13059-021-02478-w
ABSTRACT : Local genetic correlation quantifies the genetic similarity of complex traits in specific genomic regions. However, accurate estimation of local genetic correlation remains challenging, due to linkage disequilibrium in local genomic regions and sample overlap across studies. We introduce SUPERGNOVA, a statistical framework to estimate local genetic correlations using summary statistics from genome-wide association studies. We demonstrate that SUPERGNOVA outperforms existing methods through simulations and analyses of 30 complex traits. In particular, we show that the positive yet paradoxical genetic correlation between autism spectrum disorder and cognitive performance could be explained by two etiologically distinct genetic signatures with bidirectional local genetic correlations.
COPYRIGHT : https://creativecommons.org/licenses/by/4.0
CITATION : Zhang Y, Lu Q, Ye Y, Huang K, ...&, Zhao H. (2021) SUPERGNOVA: local genetic correlation analysis reveals heterogeneous etiologic sharing of complex traits Genome Biol., 22 (1) 262. doi:10.1186/s13059-021-02478-w. PMID 34493297
JOURNAL_INFO : Genome biology ; Genome Biol. ; 2021 ; 22 ; 1 ; 262
PUBMED_LINK : 34493297