Heritability_and_genetic_correlation
Summary Table
NAME | CATEGORY | CITATION | YEAR |
---|---|---|---|
S-LDXR | Genetic correlation | Shi H, Gazal S, Kanai M, Koch EM, ...&, Price AL. (2021) Population-specific causal disease effect sizes in functionally important regions impacted by selection Nat. Commun., 12 (1) 1098. doi:10.1038/s41467-021-21286-1. PMID 33597505 | 2021 |
cross-trait LDSC | Genetic correlation | Bulik-Sullivan B, Finucane HK, Anttila V, Gusev A, ...&, Neale BM. (2015) An atlas of genetic correlations across human diseases and traits Nat. Genet., 47 (11) 1236-1241. doi:10.1038/ng.3406. PMID 26414676 | 2015 |
popcorn | Genetic correlation | Brown BC, Asian Genetic Epidemiology Network Type 2 Diabetes Consortium, Ye CJ, Price AL, ...&, Zaitlen N. (2016) Transethnic Genetic-Correlation Estimates from Summary Statistics Am. J. Hum. Genet., 99 (1) 76-88. doi:10.1016/j.ajhg.2016.05.001. PMID 27321947 | 2016 |
BayesRR-RC | Heritability | Patxot M, Banos DT, Kousathanas A, Orliac EJ, ...&, Robinson MR. (2021) Probabilistic inference of the genetic architecture underlying functional enrichment of complex traits Nat. Commun., 12 (1) 6972. doi:10.1038/s41467-021-27258-9. PMID 34848700 | 2021 |
GCTA-GREML-Binary | Heritability | Lee SH, Wray NR, Goddard ME, Visscher PM. (2011) Estimating missing heritability for disease from genome-wide association studies Am. J. Hum. Genet., 88 (3) 294-305. doi:10.1016/j.ajhg.2011.02.002. PMID 21376301 | 2011 |
GCTA-GREML-Bivariate | Heritability | Lee SH, Yang J, Goddard ME, Visscher PM, ...&, Wray NR. (2012) Estimation of pleiotropy between complex diseases using single-nucleotide polymorphism-derived genomic relationships and restricted maximum likelihood Bioinformatics, 28 (19) 2540-2542. doi:10.1093/bioinformatics/bts474. PMID 22843982 | 2012 |
GCTA-GREML-LDMS | Heritability | Yang J, Bakshi A, Zhu Z, Hemani G, ...&, Visscher PM. (2015) Genetic variance estimation with imputed variants finds negligible missing heritability for human height and body mass index Nat. Genet., 47 (10) 1114-1120. doi:10.1038/ng.3390. PMID 26323059 | 2015 |
GCTA-GREML-Partition | Heritability | Yang J, Manolio TA, Pasquale LR, Boerwinkle E, ...&, Visscher PM. (2011) Genome partitioning of genetic variation for complex traits using common SNPs Nat. Genet., 43 (6) 519-525. doi:10.1038/ng.823. PMID 21552263 | 2011 |
GCTA-GREML-Quantitative | Heritability | Yang J, Benyamin B, McEvoy BP, Gordon S, ...&, Visscher PM. (2010) Common SNPs explain a large proportion of the heritability for human height Nat. Genet., 42 (7) 565-569. doi:10.1038/ng.608. PMID 20562875 | 2010 |
HDL | Heritability | Ning Z, Pawitan Y, Shen X. (2020) High-definition likelihood inference of genetic correlations across human complex traits Nat. Genet., 52 (8) 859-864. doi:10.1038/s41588-020-0653-y. PMID 32601477 | 2020 |
LDAK | Heritability | Speed D, Hemani G, Johnson MR, Balding DJ. (2012) Improved heritability estimation from genome-wide SNPs Am. J. Hum. Genet., 91 (6) 1011-1021. doi:10.1016/j.ajhg.2012.10.010. PMID 23217325 | 2012 |
LDSC | Heritability | Bulik-Sullivan B, Loh PR, Finucane HK, Ripke S, ...&, O'Donovan MC. (2015) LD Score regression distinguishes confounding from polygenicity in genome-wide association studies Nat. Genet., 47 (3) 291-295. doi:10.1038/ng.3211. PMID 25642630 | 2015 |
SumHer | Heritability | Speed D, Balding DJ. (2019) SumHer better estimates the SNP heritability of complex traits from summary statistics Nat. Genet., 51 (2) 277-284. doi:10.1038/s41588-018-0279-5. PMID 30510236 | 2019 |
TetraHer | Heritability | Speed D, Evans DM. (2024) Estimating disease heritability from complex pedigrees allowing for ascertainment and covariates Am. J. Hum. Genet., 111 (4) 680-690. doi:10.1016/j.ajhg.2024.02.010. PMID 38490208 | 2024 |
GNOVA | Local heritability/genetic correlation | Lu Q, Li B, Ou D, Erlendsdottir M, ...&, Zhao H. (2017) A powerful approach to estimating annotation-stratified genetic covariance via GWAS summary statistics Am. J. Hum. Genet., 101 (6) 939-964. doi:10.1016/j.ajhg.2017.11.001. PMID 29220677 | 2017 |
HEELS | Local heritability/genetic correlation | Li H, Mazumder R, Lin X. (2023) Accurate and efficient estimation of local heritability using summary statistics and the linkage disequilibrium matrix Nat. Commun., 14 (1) 7954. doi:10.1038/s41467-023-43565-9. PMID 38040712 | 2023 |
HESS | Local heritability/genetic correlation | Shi H, Kichaev G, Pasaniuc B. (2016) Contrasting the genetic architecture of 30 complex traits from summary association data Am. J. Hum. Genet., 99 (1) 139-153. doi:10.1016/j.ajhg.2016.05.013. PMID 27346688 | 2016 |
LAVA | Local heritability/genetic correlation | Werme J, van der Sluis S, Posthuma D, de Leeuw CA. (2022) An integrated framework for local genetic correlation analysis Nat. Genet., 54 (3) 274-282. doi:10.1038/s41588-022-01017-y. PMID 35288712 | 2022 |
SUPERGNOVA | Local heritability/genetic correlation | Zhang Y, Lu Q, Ye Y, Huang K, ...&, Zhao H. (2021) SUPERGNOVA: local genetic correlation analysis reveals heterogeneous etiologic sharing of complex traits Genome Biol., 22 (1) 262. doi:10.1186/s13059-021-02478-w. PMID 34493297 | 2021 |
Genetic correlation
S-LDXR
- NAME : S-LDXR
- SHORT NAME : S-LDXR
- FULL NAME : S-LDXR
- DESCRIPTION : S-LDXR is a software for estimating enrichment of stratified squared trans-ethnic genetic correlation across genomic annotations from GWAS summary statistics data.
- URL : https://huwenboshi.github.io/s-ldxr/
- KEYWORDS : trans-ethnic, stratified, functional categories
- TITLE : Population-specific causal disease effect sizes in functionally important regions impacted by selection
- DOI : 10.1038/s41467-021-21286-1
- ABSTRACT : Many diseases exhibit population-specific causal effect sizes with trans-ethnic genetic correlations significantly less than 1, limiting trans-ethnic polygenic risk prediction. We develop a new method, S-LDXR, for stratifying squared trans-ethnic genetic correlation across genomic annotations, and apply S-LDXR to genome-wide summary statistics for 31 diseases and complex traits in East Asians (average N = 90K) and Europeans (average N = 267K) with an average trans-ethnic genetic correlation of 0.85. We determine that squared trans-ethnic genetic correlation is 0.82× (s.e. 0.01) depleted in the top quintile of background selection statistic, implying more population-specific causal effect sizes. Accordingly, causal effect sizes are more population-specific in functionally important regions, including conserved and regulatory regions. In regions surrounding specifically expressed genes, causal effect sizes are most population-specific for skin and immune genes, and least population-specific for brain genes. Our results could potentially be explained by stronger gene-environment interaction at loci impacted by selection, particularly positive selection.
- CITATION : Shi H, Gazal S, Kanai M, Koch EM, ...&, Price AL. (2021) Population-specific causal disease effect sizes in functionally important regions impacted by selection Nat. Commun., 12 (1) 1098. doi:10.1038/s41467-021-21286-1. PMID 33597505
- JOURNAL_INFO : Nature communications ; Nat. Commun. ; 2021 ; 12 ; 1 ; 1098
- PUBMED_LINK : 33597505
cross-trait LDSC
- NAME : cross-trait LDSC
- SHORT NAME : cross-trait LDSC
- FULL NAME : cross-trait LD Score Regression
- DESCRIPTION : ldsc is a command line tool for estimating heritability and genetic correlation from GWAS summary statistics. ldsc also computes LD Scores.
- URL : https://github.com/bulik/ldsc
- KEYWORDS : cross-trait, LD score regression
- TITLE : An atlas of genetic correlations across human diseases and traits
- DOI : 10.1038/ng.3406
- ABSTRACT : Identifying genetic correlations between complex traits and diseases can provide useful etiological insights and help prioritize likely causal relationships. The major challenges preventing estimation of genetic correlation from genome-wide association study (GWAS) data with current methods are the lack of availability of individual-level genotype data and widespread sample overlap among meta-analyses. We circumvent these difficulties by introducing a technique, cross-trait LD Score regression, for estimating genetic correlation that requires only GWAS summary statistics and is not biased by sample overlap. We use this method to estimate 276 genetic correlations among 24 traits. The results include genetic correlations between anorexia nervosa and schizophrenia, anorexia and obesity, and educational attainment and several diseases. These results highlight the power of genome-wide analyses, as there currently are no significantly associated SNPs for anorexia nervosa and only three for educational attainment.
- CITATION : Bulik-Sullivan B, Finucane HK, Anttila V, Gusev A, ...&, Neale BM. (2015) An atlas of genetic correlations across human diseases and traits Nat. Genet., 47 (11) 1236-1241. doi:10.1038/ng.3406. PMID 26414676
- JOURNAL_INFO : Nature genetics ; Nat. Genet. ; 2015 ; 47 ; 11 ; 1236-1241
- PUBMED_LINK : 26414676
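
The 276 genetic correlations among 24 traits reported in the abstract correspond to the number of unordered trait pairs, n(n - 1)/2. A minimal check follows; the function name `n_trait_pairs` is introduced here for illustration and is not part of ldsc.

```python
from math import comb


def n_trait_pairs(n_traits: int) -> int:
    """Number of unordered pairs of distinct traits, n * (n - 1) / 2."""
    return comb(n_traits, 2)


# 24 traits give 276 pairwise genetic correlations, as in the abstract.
print(n_trait_pairs(24))
```

The same count explains why the number of pairwise analyses grows quadratically as traits are added.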
popcorn
- NAME : popcorn
- SHORT NAME : popcorn
- FULL NAME : popcorn
- DESCRIPTION : Popcorn is a program for estimating the correlation of causal variant effect sizes across populations in GWAS. This is the python3 version of Popcorn and is still under development.
- URL : https://github.com/brielin/Popcorn
- KEYWORDS : trans-ethnic
- TITLE : Transethnic Genetic-Correlation Estimates from Summary Statistics
- DOI : 10.1016/j.ajhg.2016.05.001
- ABSTRACT : The increasing number of genetic association studies conducted in multiple populations provides an unprecedented opportunity to study how the genetic architecture of complex phenotypes varies between populations, a problem important for both medical and population genetics. Here, we have developed a method for estimating the transethnic genetic correlation: the correlation of causal-variant effect sizes at SNPs common in populations. This method takes advantage of the entire spectrum of SNP associations and uses only summary-level data from genome-wide association studies. This avoids the computational costs and privacy concerns associated with genotype-level information while remaining scalable to hundreds of thousands of individuals and millions of SNPs. We applied our method to data on gene expression, rheumatoid arthritis, and type 2 diabetes and overwhelmingly found that the genetic correlation was significantly less than 1. Our method is implemented in a Python package called Popcorn.
- CITATION : Brown BC, Asian Genetic Epidemiology Network Type 2 Diabetes Consortium, Ye CJ, Price AL, ...&, Zaitlen N. (2016) Transethnic Genetic-Correlation Estimates from Summary Statistics Am. J. Hum. Genet., 99 (1) 76-88. doi:10.1016/j.ajhg.2016.05.001. PMID 27321947
- JOURNAL_INFO : American journal of human genetics ; Am. J. Hum. Genet. ; 2016 ; 99 ; 1 ; 76-88
- PUBMED_LINK : 27321947
Heritability
BayesRR-RC
- NAME : BayesRR-RC
- SHORT NAME : BayesRR-RC
- FULL NAME : BayesRR-RC
- DESCRIPTION : gmrm is hybrid-parallel software for a Bayesian grouped mixture of regressions model for genome-wide association studies (GWAS). It is written in C++ using extensive optimisations and code vectorisation. It relies on plink's .bed format. It can handle multiple traits simultaneously.
- URL : https://github.com/medical-genomics-group/gmrm
- TITLE : Probabilistic inference of the genetic architecture underlying functional enrichment of complex traits
- DOI : 10.1038/s41467-021-27258-9
- ABSTRACT : We develop a Bayesian model (BayesRR-RC) that provides robust SNP-heritability estimation, an alternative to marker discovery, and accurate genomic prediction, taking 22 seconds per iteration to estimate 8.4 million SNP-effects and 78 SNP-heritability parameters in the UK Biobank. We find that only ≤10% of the genetic variation captured for height, body mass index, cardiovascular disease, and type 2 diabetes is attributable to proximal regulatory regions within 10kb upstream of genes, while 12-25% is attributed to coding regions, 32-44% to introns, and 22-28% to distal 10-500kb upstream regions. Up to 24% of all cis and coding regions of each chromosome are associated with each trait, with over 3,100 independent exonic and intronic regions and over 5,400 independent regulatory regions having ≥95% probability of contributing ≥0.001% to the genetic variance of these four traits. Our open-source software (GMRM) provides a scalable alternative to current approaches for biobank data.
- COPYRIGHT : https://creativecommons.org/licenses/by/4.0
- CITATION : Patxot M, Banos DT, Kousathanas A, Orliac EJ, ...&, Robinson MR. (2021) Probabilistic inference of the genetic architecture underlying functional enrichment of complex traits Nat. Commun., 12 (1) 6972. doi:10.1038/s41467-021-27258-9. PMID 34848700
- JOURNAL_INFO : Nature communications ; Nat. Commun. ; 2021 ; 12 ; 1 ; 6972
- PUBMED_LINK : 34848700
GCTA-GREML-Binary
- NAME : GCTA-GREML-Binary
- SHORT NAME : GREML
- FULL NAME : Genome-wide complex trait analysis (GCTA) Genome-based restricted maximum likelihood (GREML)
- DESCRIPTION : GCTA-GREML analysis for a case-control study (binary traits), with estimates transformed to the liability scale.
- URL : https://yanglab.westlake.edu.cn/software/gcta/#GREML
- TITLE : Estimating missing heritability for disease from genome-wide association studies
- DOI : 10.1016/j.ajhg.2011.02.002
- ABSTRACT : Genome-wide association studies are designed to discover SNPs that are associated with a complex trait. Employing strict significance thresholds when testing individual SNPs avoids false positives at the expense of increasing false negatives. Recently, we developed a method for quantitative traits that estimates the variation accounted for when fitting all SNPs simultaneously. Here we develop this method further for case-control studies. We use a linear mixed model for analysis of binary traits and transform the estimates to a liability scale by adjusting both for scale and for ascertainment of the case samples. We show by theory and simulation that the method is unbiased. We apply the method to data from the Wellcome Trust Case Control Consortium and show that a substantial proportion of variation in liability for Crohn disease, bipolar disorder, and type I diabetes is tagged by common SNPs.
- COPYRIGHT : http://www.elsevier.com/open-access/userlicense/1.0/
- CITATION : Lee SH, Wray NR, Goddard ME, Visscher PM. (2011) Estimating missing heritability for disease from genome-wide association studies Am. J. Hum. Genet., 88 (3) 294-305. doi:10.1016/j.ajhg.2011.02.002. PMID 21376301
- JOURNAL_INFO : The American Journal of Human Genetics ; Am. J. Hum. Genet. ; 2011 ; 88 ; 3 ; 294-305
- PUBMED_LINK : 21376301
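
The abstract describes transforming the observed-scale estimate to the liability scale by adjusting for scale and for ascertainment of cases. A minimal sketch of that transformation is given below, assuming the commonly used form h2_liability = h2_observed * K^2 (1 - K)^2 / (z^2 * P (1 - P)), where K is the population prevalence, P is the proportion of cases in the sample, and z is the standard normal density at the liability threshold. The function and argument names are hypothetical and this is not GCTA code.

```python
from statistics import NormalDist


def observed_to_liability(h2_observed: float, prevalence: float, case_fraction: float) -> float:
    """Convert an observed-scale (0/1) heritability estimate to the liability scale.

    prevalence    -- K, the disease prevalence in the population
    case_fraction -- P, the proportion of cases in the (possibly ascertained) sample
    """
    if not (0.0 < prevalence < 1.0 and 0.0 < case_fraction < 1.0):
        raise ValueError("prevalence and case_fraction must lie strictly between 0 and 1")
    std_normal = NormalDist()
    # Liability threshold t such that a fraction K of the population lies above it.
    threshold = std_normal.inv_cdf(1.0 - prevalence)
    # Height of the standard normal density at the threshold.
    z = std_normal.pdf(threshold)
    k, p = prevalence, case_fraction
    return h2_observed * (k * (1.0 - k)) ** 2 / (z ** 2 * p * (1.0 - p))


# Illustrative inputs only: 1% prevalence, balanced case-control sample.
print(observed_to_liability(h2_observed=0.30, prevalence=0.01, case_fraction=0.5))
```

When the sample is not ascertained (P equal to K), the expression reduces to the scale-only correction h2_observed * K (1 - K) / z^2.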
GCTA-GREML-Bivariate
- NAME : GCTA-GREML-Bivariate
- SHORT NAME : GREML
- FULL NAME : Genome-wide complex trait analysis (GCTA) Genome-based restricted maximum likelihood (GREML)
- DESCRIPTION : Bivariate GREML analysis for estimating the genetic correlation between two traits.
- URL : https://yanglab.westlake.edu.cn/software/gcta/#GREML
- KEYWORDS : bivariate
- TITLE : Estimation of pleiotropy between complex diseases using single-nucleotide polymorphism-derived genomic relationships and restricted maximum likelihood
- DOI : 10.1093/bioinformatics/bts474
- ABSTRACT : SUMMARY: Genetic correlations are the genome-wide aggregate effects of causal variants affecting multiple traits. Traditionally, genetic correlations between complex traits are estimated from pedigree studies, but such estimates can be confounded by shared environmental factors. Moreover, for diseases, low prevalence rates imply that even if the true genetic correlation between disorders was high, co-aggregation of disorders in families might not occur or could not be distinguished from chance. We have developed and implemented statistical methods based on linear mixed models to obtain unbiased estimates of the genetic correlation between pairs of quantitative traits or pairs of binary traits of complex diseases using population-based case-control studies with genome-wide single-nucleotide polymorphism data. The method is validated in a simulation study and applied to estimate genetic correlation between various diseases from Wellcome Trust Case Control Consortium data in a series of bivariate analyses. We estimate a significant positive genetic correlation between risk of Type 2 diabetes and hypertension of ~0.31 (SE 0.14, P = 0.024). AVAILABILITY: Our methods, appropriate for both quantitative and binary traits, are implemented in the freely available software GCTA (http://www.complextraitgenomics.com/software/gcta/reml_bivar.html). CONTACT: hong.lee@uq.edu.au SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
- CITATION : Lee SH, Yang J, Goddard ME, Visscher PM, ...&, Wray NR. (2012) Estimation of pleiotropy between complex diseases using single-nucleotide polymorphism-derived genomic relationships and restricted maximum likelihood Bioinformatics, 28 (19) 2540-2542. doi:10.1093/bioinformatics/bts474. PMID 22843982
- JOURNAL_INFO : Bioinformatics (Oxford, England) ; Bioinformatics ; 2012 ; 28 ; 19 ; 2540-2542
- PUBMED_LINK : 22843982
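
For orientation, a genetic correlation is the genetic covariance between two traits scaled by the square root of the product of their genetic variances. The sketch below shows only that final ratio, assuming the three variance components have already been estimated (for example by a bivariate mixed model); the function name and inputs are hypothetical and do not reflect GCTA's interface.

```python
from math import sqrt


def genetic_correlation(cov_g: float, var_g1: float, var_g2: float) -> float:
    """r_g = cov_g / sqrt(var_g1 * var_g2), from already-estimated components."""
    if var_g1 <= 0.0 or var_g2 <= 0.0:
        raise ValueError("genetic variances must be positive")
    return cov_g / sqrt(var_g1 * var_g2)


# Illustrative numbers only, not taken from any study.
print(genetic_correlation(cov_g=0.10, var_g1=0.25, var_g2=0.16))
```

Because the denominator depends on both variance estimates, the ratio becomes unstable when either genetic variance is close to zero.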
GCTA-GREML-LDMS
- NAME : GCTA-GREML-LDMS
- SHORT NAME : GCTA-GREML-LDMS
- FULL NAME : GCTA-GREML-LDMS
- DESCRIPTION : GREML-LDMS: estimating heritability in unrelated individuals from whole-genome sequencing or imputed data.
- URL : https://yanglab.westlake.edu.cn/software/gcta/#GREML
- TITLE : Genetic variance estimation with imputed variants finds negligible missing heritability for human height and body mass index
- DOI : 10.1038/ng.3390
- ABSTRACT : We propose a method (GREML-LDMS) to estimate heritability for human complex traits in unrelated individuals using whole-genome sequencing data. We demonstrate using simulations based on whole-genome sequencing data that ∼97% and ∼68% of variation at common and rare variants, respectively, can be captured by imputation. Using the GREML-LDMS method, we estimate from 44,126 unrelated individuals that all ∼17 million imputed variants explain 56% (standard error (s.e.) = 2.3%) of variance for height and 27% (s.e. = 2.5%) of variance for body mass index (BMI), and we find evidence that height- and BMI-associated variants have been under natural selection. Considering the imperfect tagging of imputation and potential overestimation of heritability from previous family-based studies, heritability is likely to be 60-70% for height and 30-40% for BMI. Therefore, the missing heritability is small for both traits. For further discovery of genes associated with complex traits, a study design with SNP arrays followed by imputation is more cost-effective than whole-genome sequencing at current prices.
- CITATION : Yang J, Bakshi A, Zhu Z, Hemani G, ...&, Visscher PM. (2015) Genetic variance estimation with imputed variants finds negligible missing heritability for human height and body mass index Nat. Genet., 47 (10) 1114-1120. doi:10.1038/ng.3390. PMID 26323059
- JOURNAL_INFO : Nature genetics ; Nat. Genet. ; 2015 ; 47 ; 10 ; 1114-1120
- PUBMED_LINK : 26323059
GCTA-GREML-Partition
- NAME : GCTA-GREML-Partition
- SHORT NAME : GREML
- FULL NAME : Genome-wide complex trait analysis (GCTA) Genome-based restricted maximum likelihood (GREML)
- DESCRIPTION : GREML analysis that partitions the genetic variance into individual chromosomes and genomic segments.
- URL : https://yanglab.westlake.edu.cn/software/gcta/#GREML
- TITLE : Genome partitioning of genetic variation for complex traits using common SNPs
- DOI : 10.1038/ng.823
- ABSTRACT : We estimate and partition genetic variation for height, body mass index (BMI), von Willebrand factor and QT interval (QTi) using 586,898 SNPs genotyped on 11,586 unrelated individuals. We estimate that ∼45%, ∼17%, ∼25% and ∼21% of the variance in height, BMI, von Willebrand factor and QTi, respectively, can be explained by all autosomal SNPs and a further ∼0.5-1% can be explained by X chromosome SNPs. We show that the variance explained by each chromosome is proportional to its length, and that SNPs in or near genes explain more variation than SNPs between genes. We propose a new approach to estimate variation due to cryptic relatedness and population stratification. Our results provide further evidence that a substantial proportion of heritability is captured by common SNPs, that height, BMI and QTi are highly polygenic traits, and that the additive variation explained by a part of the genome is approximately proportional to the total length of DNA contained within genes therein.
- CITATION : Yang J, Manolio TA, Pasquale LR, Boerwinkle E, ...&, Visscher PM. (2011) Genome partitioning of genetic variation for complex traits using common SNPs Nat. Genet., 43 (6) 519-525. doi:10.1038/ng.823. PMID 21552263
- JOURNAL_INFO : Nature genetics ; Nat. Genet. ; 2011 ; 43 ; 6 ; 519-525
- PUBMED_LINK : 21552263
GCTA-GREML-Quantitative
- NAME : GCTA-GREML-Quantitative
- SHORT NAME : GREML
- FULL NAME : Genome-wide complex trait analysis (GCTA) Genome-based restricted maximum likelihood (GREML)
- DESCRIPTION : GCTA-GREML analysis: estimating the variance explained by the SNPs.
- URL : https://yanglab.westlake.edu.cn/software/gcta/#GREML
- TITLE : Common SNPs explain a large proportion of the heritability for human height
- DOI : 10.1038/ng.608
- ABSTRACT : SNPs discovered by genome-wide association studies (GWASs) account for only a small fraction of the genetic variation of complex traits in human populations. Where is the remaining heritability? We estimated the proportion of variance for human height explained by 294,831 SNPs genotyped on 3,925 unrelated individuals using a linear model analysis, and validated the estimation method with simulations based on the observed genotype data. We show that 45% of variance can be explained by considering all SNPs simultaneously. Thus, most of the heritability is not missing but has not previously been detected because the individual effects are too small to pass stringent significance tests. We provide evidence that the remaining heritability is due to incomplete linkage disequilibrium between causal variants and genotyped SNPs, exacerbated by causal variants having lower minor allele frequency than the SNPs explored to date.
- CITATION : Yang J, Benyamin B, McEvoy BP, Gordon S, ...&, Visscher PM. (2010) Common SNPs explain a large proportion of the heritability for human height Nat. Genet., 42 (7) 565-569. doi:10.1038/ng.608. PMID 20562875
- JOURNAL_INFO : Nature genetics ; Nat. Genet. ; 2010 ; 42 ; 7 ; 565-569
- PUBMED_LINK : 20562875
HDL
- NAME : HDL
- SHORT NAME : HDL
- FULL NAME : High-Definition Likelihood
- DESCRIPTION : High-Definition Likelihood (HDL) is a likelihood-based method for estimating genetic correlation using GWAS summary statistics. Compared to LD Score regression (LDSC), it reduces the variance of a genetic correlation estimate by about 60%.
- URL : https://github.com/zhenin/HDL/
- TITLE : High-definition likelihood inference of genetic correlations across human complex traits
- DOI : 10.1038/s41588-020-0653-y
- ABSTRACT : Genetic correlation is a central parameter for understanding shared genetic architecture between complex traits. By using summary statistics from genome-wide association studies (GWAS), linkage disequilibrium score regression (LDSC) was developed for unbiased estimation of genetic correlations. Although easy to use, LDSC only partially utilizes LD information. By fully accounting for LD across the genome, we develop a high-definition likelihood (HDL) method to improve precision in genetic correlation estimation. Compared to LDSC, HDL reduces the variance of genetic correlation estimates by about 60%, equivalent to a 2.5-fold increase in sample size. We apply HDL and LDSC to estimate 435 genetic correlations among 30 behavioral and disease-related phenotypes measured in the UK Biobank (UKBB). In addition to 154 significant genetic correlations observed for both methods, HDL identified another 57 significant genetic correlations, compared to only another 2 significant genetic correlations identified by LDSC. HDL brings more power to genomic analyses and better reveals the underlying connections across human complex traits.
- CITATION : Ning Z, Pawitan Y, Shen X. (2020) High-definition likelihood inference of genetic correlations across human complex traits Nat. Genet., 52 (8) 859-864. doi:10.1038/s41588-020-0653-y. PMID 32601477
- JOURNAL_INFO : Nature genetics ; Nat. Genet. ; 2020 ; 52 ; 8 ; 859-864
- PUBMED_LINK : 32601477
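
The abstract equates a variance reduction of about 60% with a 2.5-fold increase in sample size. The arithmetic can be sketched under the assumption that the sampling variance of the estimate scales as 1/N; the function name is hypothetical and not part of the HDL package.

```python
def equivalent_sample_size_factor(variance_reduction: float) -> float:
    """Sample-size multiplier matching a proportional variance reduction.

    Assumes variance is proportional to 1/N, so shrinking the variance to
    (1 - reduction) of its original value is like multiplying N by
    1 / (1 - reduction).
    """
    if not 0.0 <= variance_reduction < 1.0:
        raise ValueError("variance_reduction must be in [0, 1)")
    return 1.0 / (1.0 - variance_reduction)


# A 60% variance reduction corresponds to a 2.5-fold larger sample.
print(round(equivalent_sample_size_factor(0.60), 2))
```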
LDAK
- NAME : LDAK
- SHORT NAME : LDAK
- FULL NAME : LD-adjusted kinships
- DESCRIPTION : LDAK is a software package for analysing association study data.
- URL : http://www.ldak.org/
- TITLE : Improved heritability estimation from genome-wide SNPs
- DOI : 10.1016/j.ajhg.2012.10.010
- ABSTRACT : Estimation of narrow-sense heritability, h², from genome-wide SNPs genotyped in unrelated individuals has recently attracted interest and offers several advantages over traditional pedigree-based methods. With the use of this approach, it has been estimated that over half the heritability of human height can be attributed to the ~300,000 SNPs on a genome-wide genotyping array. In comparison, only 5%-10% can be explained by SNPs reaching genome-wide significance. We investigated via simulation the validity of several key assumptions underpinning the mixed-model analysis used in SNP-based h² estimation. Although we found that the method is reasonably robust to violations of four key assumptions, it can be highly sensitive to uneven linkage disequilibrium (LD) between SNPs: contributions to h² are overestimated from causal variants in regions of high LD and are underestimated in regions of low LD. The overall direction of the bias can be up or down depending on the genetic architecture of the trait, but it can be substantial in realistic scenarios. We propose a modified kinship matrix in which SNPs are weighted according to local LD. We show that this correction greatly reduces the bias and increases the precision of h² estimates. We demonstrate the impact of our method on the first seven diseases studied by the Wellcome Trust Case Control Consortium. Our LD adjustment revises downward the h² estimate for immune-related diseases, as expected because of high LD in the major-histocompatibility region, but increases it for some nonimmune diseases. To calculate our revised kinship matrix, we developed LDAK, software for computing LD-adjusted kinships.
- COPYRIGHT : http://creativecommons.org/licenses/by/3.0/
- CITATION : Speed D, Hemani G, Johnson MR, Balding DJ. (2012) Improved heritability estimation from genome-wide SNPs Am. J. Hum. Genet., 91 (6) 1011-1021. doi:10.1016/j.ajhg.2012.10.010. PMID 23217325
- JOURNAL_INFO : The American Journal of Human Genetics ; Am. J. Hum. Genet. ; 2012 ; 91 ; 6 ; 1011-1021
- PUBMED_LINK : 23217325
LDSC
- NAME : LDSC
- SHORT NAME : LDSC
- FULL NAME : LD Score Regression
- DESCRIPTION : ldsc is a command line tool for estimating heritability and genetic correlation from GWAS summary statistics. ldsc also computes LD Scores.
- URL : https://github.com/bulik/ldsc
- TITLE : LD Score regression distinguishes confounding from polygenicity in genome-wide association studies
- DOI : 10.1038/ng.3211
- ABSTRACT : Both polygenicity (many small genetic effects) and confounding biases, such as cryptic relatedness and population stratification, can yield an inflated distribution of test statistics in genome-wide association studies (GWAS). However, current methods cannot distinguish between inflation from a true polygenic signal and bias. We have developed an approach, LD Score regression, that quantifies the contribution of each by examining the relationship between test statistics and linkage disequilibrium (LD). The LD Score regression intercept can be used to estimate a more powerful and accurate correction factor than genomic control. We find strong evidence that polygenicity accounts for the majority of the inflation in test statistics in many GWAS of large sample size.
- CITATION : Bulik-Sullivan B, Loh PR, Finucane HK, Ripke S, ...&, O'Donovan MC. (2015) LD Score regression distinguishes confounding from polygenicity in genome-wide association studies Nat. Genet., 47 (3) 291-295. doi:10.1038/ng.3211. PMID 25642630
- JOURNAL_INFO : Nature genetics ; Nat. Genet. ; 2015 ; 47 ; 3 ; 291-295
- PUBMED_LINK : 25642630
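
The relationship the abstract refers to can be illustrated with the basic LD Score regression model, in which the expected chi-square statistic of SNP j is N * h2 * l_j / M + intercept, with l_j the LD Score, N the sample size, and M the number of SNPs. The sketch below fits that line by plain unweighted least squares on noise-free synthetic data. This is a simplifying assumption: the ldsc tool itself uses a weighted regression with block-jackknife standard errors, and all names here are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def ld_score_regression_sketch(ld_scores, chi2, n_samples, n_snps):
    """Return (h2, intercept) from an unweighted regression of chi2 on LD Score."""
    slope, intercept = fit_line(ld_scores, chi2)
    # The slope equals N * h2 / M, so rescale it to recover h2.
    return slope * n_snps / n_samples, intercept


# Synthetic, noise-free example: true h2 = 0.3, intercept = 1.05.
N, M = 50_000, 1_000_000
ld = [10.0, 50.0, 100.0, 200.0]
chi2 = [N * 0.3 / M * l + 1.05 for l in ld]
h2, intercept = ld_score_regression_sketch(ld, chi2, N, M)
print(round(h2, 3), round(intercept, 3))
```

An intercept above 1 is what the abstract interprets as inflation from confounding, while the slope carries the polygenic signal.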
SumHer
- NAME : SumHer
- SHORT NAME : SumHer
- FULL NAME : SumHer
- URL : http://www.ldak.org/
- TITLE : SumHer better estimates the SNP heritability of complex traits from summary statistics
- DOI : 10.1038/s41588-018-0279-5
- ABSTRACT : We present SumHer, software for estimating confounding bias, SNP heritability, enrichments of heritability and genetic correlations using summary statistics from genome-wide association studies. The key difference between SumHer and the existing software LD Score Regression (LDSC) is that SumHer allows the user to specify the heritability model. We apply SumHer to results from 24 large-scale association studies (average sample size 121,000) using our recommended heritability model. We show that these studies tended to substantially over-correct for confounding, and as a result the number of genome-wide significant loci was under-reported by about a quarter. We also estimate enrichments for 24 categories of SNPs defined by functional annotations. A previous study using LDSC reported that conserved regions were 13-fold enriched, and found a further six categories with above threefold enrichment. By contrast, our analysis using SumHer finds that none of the categories have enrichment above twofold. SumHer provides an improved understanding of the genetic architecture of complex traits, which enables more efficient analysis of future genetic data.
- CITATION : Speed D, Balding DJ. (2019) SumHer better estimates the SNP heritability of complex traits from summary statistics Nat. Genet., 51 (2) 277-284. doi:10.1038/s41588-018-0279-5. PMID 30510236
- JOURNAL_INFO : Nature genetics ; Nat. Genet. ; 2019 ; 51 ; 2 ; 277-284
- PUBMED_LINK : 30510236
TetraHer
- NAME : TetraHer
- SHORT NAME : TetraHer
- FULL NAME : TetraHer
- DESCRIPTION : a method for estimating the liability heritability of binary phenotypes
- URL : http://www.ldak.org/
- TITLE : Estimating disease heritability from complex pedigrees allowing for ascertainment and covariates
- DOI : 10.1016/j.ajhg.2024.02.010
- ABSTRACT : We propose TetraHer, a method for estimating the liability heritability of binary phenotypes. TetraHer has five key features. First, it can be applied to data from complex pedigrees that contain multiple types of relationships. Second, it can correct for ascertainment of cases or controls. Third, it can accommodate covariates. Fourth, it can model the contribution of common environment. Fifth, it produces a likelihood that can be used for significance testing. We first demonstrate the validity of TetraHer on simulated data. We then use TetraHer to estimate liability heritability for 229 codes from the tenth International Classification of Diseases (ICD-10). We identify 107 codes with significant heritability (p < 0.05/229), which can be used in future analyses for investigating the genetic architecture of human diseases.
- CITATION : Speed D, Evans DM. (2024) Estimating disease heritability from complex pedigrees allowing for ascertainment and covariates Am. J. Hum. Genet., 111 (4) 680-690. doi:10.1016/j.ajhg.2024.02.010. PMID 38490208
- JOURNAL_INFO : The American Journal of Human Genetics ; Am. J. Hum. Genet. ; 2024 ; 111 ; 4 ; 680-690
- PUBMED_LINK : 38490208
Local heritability/genetic correlation
GNOVA
- NAME : GNOVA
- SHORT NAME : GNOVA
- FULL NAME : GeNetic cOVariance Analyzer
- DESCRIPTION : A principled framework to estimate annotation-stratified genetic covariance using GWAS summary statistics.
- URL : https://github.com/xtonyjiang/GNOVA
- TITLE : A powerful approach to estimating annotation-stratified genetic covariance via GWAS summary statistics
- DOI : 10.1016/j.ajhg.2017.11.001
- ABSTRACT : Despite the success of large-scale genome-wide association studies (GWASs) on complex traits, our understanding of their genetic architecture is far from complete. Jointly modeling multiple traits' genetic profiles has provided insights into the shared genetic basis of many complex traits. However, large-scale inference sets a high bar for both statistical power and biological interpretability. Here we introduce a principled framework to estimate annotation-stratified genetic covariance between traits using GWAS summary statistics. Through theoretical and numerical analyses, we demonstrate that our method provides accurate covariance estimates, thereby enabling researchers to dissect both the shared and distinct genetic architecture across traits to better understand their etiologies. Among 50 complex traits with publicly accessible GWAS summary statistics (N_total ≈ 4.5 million), we identified more than 170 pairs with statistically significant genetic covariance. In particular, we found strong genetic covariance between late-onset Alzheimer disease (LOAD) and amyotrophic lateral sclerosis (ALS), two major neurodegenerative diseases, in single-nucleotide polymorphisms (SNPs) with high minor allele frequencies and in SNPs located in the predicted functional genome. Joint analysis of LOAD, ALS, and other traits highlights LOAD's correlation with cognitive traits and hints at an autoimmune component for ALS.
- COPYRIGHT : https://www.elsevier.com/open-access/userlicense/1.0/
- CITATION : Lu Q, Li B, Ou D, Erlendsdottir M, ...&, Zhao H. (2017) A powerful approach to estimating annotation-stratified genetic covariance via GWAS summary statistics Am. J. Hum. Genet., 101 (6) 939-964. doi:10.1016/j.ajhg.2017.11.001. PMID 29220677
- JOURNAL_INFO : The American Journal of Human Genetics ; Am. J. Hum. Genet. ; 2017 ; 101 ; 6 ; 939-964
- PUBMED_LINK : 29220677
HEELS
- NAME : HEELS
- SHORT NAME : HEELS
- FULL NAME : Heritability Estimation with high Efficiency using LD and association Summary Statistics
- DESCRIPTION : HEELS is a Python-based command-line tool that produces accurate and precise local heritability estimates from summary-level statistics: marginal association test statistics together with empirical (in-sample) LD statistics.
- URL : https://github.com/huilisabrina/HEELS
- TITLE : Accurate and efficient estimation of local heritability using summary statistics and the linkage disequilibrium matrix
- DOI : 10.1038/s41467-023-43565-9
- ABSTRACT : Existing SNP-heritability estimators that leverage summary statistics from genome-wide association studies (GWAS) are much less efficient (i.e., have larger standard errors) than the restricted maximum likelihood (REML) estimators which require access to individual-level data. We introduce a new method for local heritability estimation, Heritability Estimation with high Efficiency using LD and association Summary Statistics (HEELS), that significantly improves the statistical efficiency of summary-statistics-based heritability estimator and attains comparable statistical efficiency as REML (with a relative statistical efficiency >92%). Moreover, we propose representing the empirical LD matrix as the sum of a low-rank matrix and a banded matrix. We show that this way of modeling the LD can not only reduce the storage and memory cost, but also improve the computational efficiency of heritability estimation. We demonstrate the statistical efficiency of HEELS and the advantages of our proposed LD approximation strategies both in simulations and through empirical analyses of the UK Biobank data.
- CITATION : Li H, Mazumder R, Lin X. (2023) Accurate and efficient estimation of local heritability using summary statistics and the linkage disequilibrium matrix Nat. Commun., 14 (1) 7954. doi:10.1038/s41467-023-43565-9. PMID 38040712
- JOURNAL_INFO : Nature communications ; Nat. Commun. ; 2023 ; 14 ; 1 ; 7954
- PUBMED_LINK : 38040712
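The HEELS abstract describes representing the empirical LD matrix as the sum of a low-rank matrix and a banded matrix. The sketch below illustrates that general idea on a toy matrix with NumPy. It is not the HEELS implementation: the function name `banded_plus_low_rank`, its parameters, and the two-step heuristic (keep a band, then take a truncated eigendecomposition of the remainder) are assumptions made for illustration only.

```python
import numpy as np


def banded_plus_low_rank(ld, bandwidth, rank):
    """Split a symmetric LD matrix into a banded part and a low-rank part.

    Illustrative heuristic only (not the HEELS algorithm):
    1. keep entries within `bandwidth` of the diagonal as the banded part;
    2. approximate what is left by its `rank` largest-magnitude eigen-components.
    """
    p = ld.shape[0]
    idx = np.arange(p)
    in_band = np.abs(idx[:, None] - idx[None, :]) <= bandwidth
    banded = np.where(in_band, ld, 0.0)

    residual = ld - banded  # symmetric, zero inside the band
    vals, vecs = np.linalg.eigh(residual)
    top = np.argsort(np.abs(vals))[::-1][:rank]
    low_rank = (vecs[:, top] * vals[top]) @ vecs[:, top].T
    return banded, low_rank


# Toy LD matrix: correlations among 30 simulated variants (hypothetical data).
rng = np.random.default_rng(0)
genotypes = rng.normal(size=(200, 30))
ld = np.corrcoef(genotypes, rowvar=False)

banded, low_rank = banded_plus_low_rank(ld, bandwidth=3, rank=5)
error = np.linalg.norm(ld - (banded + low_rank))
```

The appeal of such a decomposition is storage: a band of width b and a rank-k factor need on the order of p(b + k) numbers rather than p squared for p variants.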
HESS
- NAME : HESS
- SHORT NAME : HESS
- FULL NAME : Heritability Estimation from Summary Statistics
- DESCRIPTION : HESS (Heritability Estimation from Summary Statistics) is a software package for estimating and visualizing local SNP-heritability and genetic covariance (correlation) from GWAS summary association data.
- URL : https://huwenboshi.github.io/hess/
- TITLE : Contrasting the genetic architecture of 30 complex traits from summary association data
- DOI : 10.1016/j.ajhg.2016.05.013
- ABSTRACT : Variance-component methods that estimate the aggregate contribution of large sets of variants to the heritability of complex traits have yielded important insights into the genetic architecture of common diseases. Here, we introduce methods that estimate the total trait variance explained by the typed variants at a single locus in the genome (local SNP heritability) from genome-wide association study (GWAS) summary data while accounting for linkage disequilibrium among variants. We applied our estimator to ultra-large-scale GWAS summary data of 30 common traits and diseases to gain insights into their local genetic architecture. First, we found that common SNPs have a high contribution to the heritability of all studied traits. Second, we identified traits for which the majority of the SNP heritability can be confined to a small percentage of the genome. Third, we identified GWAS risk loci where the entire locus explains significantly more variance in the trait than the GWAS reported variants. Finally, we identified loci that explain a significant amount of heritability across multiple traits.
- CITATION : Shi H, Kichaev G, Pasaniuc B. (2016) Contrasting the genetic architecture of 30 complex traits from summary association data Am. J. Hum. Genet., 99 (1) 139-153. doi:10.1016/j.ajhg.2016.05.013. PMID 27346688
- JOURNAL_INFO : The American Journal of Human Genetics ; Am. J. Hum. Genet. ; 2016 ; 99 ; 1 ; 139-153
- PUBMED_LINK : 27346688
LAVA
- NAME : LAVA
- SHORT NAME : LAVA
- FULL NAME : Local Analysis of [co]Variant Association
- DESCRIPTION : LAVA is a tool to conduct genome-wide, local genetic correlation analysis on multiple traits, using GWAS summary statistics as input.
- URL : https://ctg.cncr.nl/software/lava
- TITLE : An integrated framework for local genetic correlation analysis
- DOI : 10.1038/s41588-022-01017-y
- ABSTRACT : Genetic correlation (rg) analysis is used to identify phenotypes that may have a shared genetic basis. Traditionally, rg is studied globally, considering only the average of the shared signal across the genome, although this approach may fail when the rg is confined to particular genomic regions or in opposing directions at different loci. Current tools for local rg analysis are restricted to analysis of two phenotypes. Here we introduce LAVA, an integrated framework for local rg analysis that, in addition to testing the standard bivariate local rgs between two phenotypes, can evaluate local heritabilities and analyze conditional genetic relations between several phenotypes using partial correlation and multiple regression. Applied to 25 behavioral and health phenotypes, we show considerable heterogeneity in the bivariate local rgs across the genome, which is often masked by the global rg patterns, and demonstrate how our conditional approaches can elucidate more complex, multivariate genetic relations.
- CITATION : Werme J, van der Sluis S, Posthuma D, de Leeuw CA. (2022) An integrated framework for local genetic correlation analysis Nat. Genet., 54 (3) 274-282. doi:10.1038/s41588-022-01017-y. PMID 35288712
- JOURNAL_INFO : Nature genetics ; Nat. Genet. ; 2022 ; 54 ; 3 ; 274-282
- PUBMED_LINK : 35288712
SUPERGNOVA
- NAME : SUPERGNOVA
- SHORT NAME : SUPERGNOVA
- FULL NAME : SUPER GeNetic cOVariance Analyzer
- DESCRIPTION : SUPERGNOVA (SUPER GeNetic cOVariance Analyzer) is a statistical framework to perform local genetic covariance analysis.
- URL : https://github.com/qlu-lab/SUPERGNOVA
- TITLE : SUPERGNOVA: local genetic correlation analysis reveals heterogeneous etiologic sharing of complex traits
- DOI : 10.1186/s13059-021-02478-w
- ABSTRACT : Local genetic correlation quantifies the genetic similarity of complex traits in specific genomic regions. However, accurate estimation of local genetic correlation remains challenging, due to linkage disequilibrium in local genomic regions and sample overlap across studies. We introduce SUPERGNOVA, a statistical framework to estimate local genetic correlations using summary statistics from genome-wide association studies. We demonstrate that SUPERGNOVA outperforms existing methods through simulations and analyses of 30 complex traits. In particular, we show that the positive yet paradoxical genetic correlation between autism spectrum disorder and cognitive performance could be explained by two etiologically distinct genetic signatures with bidirectional local genetic correlations.
- COPYRIGHT : https://creativecommons.org/licenses/by/4.0
- CITATION : Zhang Y, Lu Q, Ye Y, Huang K, ...&, Zhao H. (2021) SUPERGNOVA: local genetic correlation analysis reveals heterogeneous etiologic sharing of complex traits Genome Biol., 22 (1) 262. doi:10.1186/s13059-021-02478-w. PMID 34493297
- JOURNAL_INFO : Genome biology ; Genome Biol. ; 2021 ; 22 ; 1 ; 262
- PUBMED_LINK : 34493297
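Several tools in this section report local genetic covariance, local genetic correlation, or both. The two are linked by the standard normalization: the correlation in a region is the covariance divided by the square root of the product of the two local heritabilities. The sketch below shows only that arithmetic. The function name `local_rg` and the example numbers are hypothetical, and it does not reproduce how any listed tool estimates these quantities or their standard errors.

```python
import math


def local_rg(cov_g, h2_a, h2_b):
    """Convert a local genetic covariance into a local genetic correlation.

    cov_g : estimated local genetic covariance between traits A and B
    h2_a  : estimated local SNP-heritability of trait A in the same region
    h2_b  : estimated local SNP-heritability of trait B in the same region

    Returns NaN when either heritability estimate is not positive, because
    the ratio is undefined there (noisy estimates can be zero or negative).
    """
    if h2_a <= 0 or h2_b <= 0:
        return float("nan")
    return cov_g / math.sqrt(h2_a * h2_b)


# Hypothetical region: covariance 0.0002, local heritability 0.0004 for both traits.
rg = local_rg(0.0002, 0.0004, 0.0004)
```

Because local heritability estimates are small and noisy, the ratio can fall outside [-1, 1] or be undefined in practice, which is one reason some methods report covariance rather than correlation for individual regions.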