Fine_mapping
Summary Table
NAME | CITATION | YEAR |
---|---|---|
CAFEH | Arvanitis M, Tayeb K, Strober BJ, Battle A. (2022) Redefining tissue specificity of genetic regulation of gene expression in the presence of allelic heterogeneity Am. J. Hum. Genet., 109 (2) 223-239. doi:10.1016/j.ajhg.2022.01.002. PMID 35085493 | 2022 |
CAVIARBF | Chen W, Larrabee BR, Ovsyannikova IG, Kennedy RB, ...&, Schaid DJ. (2015) Fine mapping causal variants with an approximate Bayesian method using marginal test statistics Genetics, 200 (3) 719-736. doi:10.1534/genetics.115.176107. PMID 25948564 | 2015 |
CAVIAR | Hormozdiari F, Kostem E, Kang EY, Pasaniuc B, ...&, Eskin E. (2014) Identifying causal variants at loci with multiple signals of association Genetics, 198 (2) 497-508. doi:10.1534/genetics.114.167908. PMID 25104515 | 2014 |
FINEMAP | Benner C, Spencer CC, Havulinna AS, Salomaa V, ...&, Pirinen M. (2016) FINEMAP: efficient variable selection using summary data from genome-wide association studies Bioinformatics, 32 (10) 1493-1501. doi:10.1093/bioinformatics/btw018. PMID 26773131 | 2016 |
JAM | Newcombe PJ, Conti DV, Richardson S. (2016) JAM: A Scalable Bayesian Framework for Joint Analysis of Marginal SNP Effects Genet. Epidemiol., 40 (3) 188-201. doi:10.1002/gepi.21953. PMID 27027514 | 2016 |
MESuSiE | Gao B, Zhou X. (2024) MESuSiE enables scalable and powerful multi-ancestry fine-mapping of causal variants in genome-wide association studies Nat. Genet., 56 (1) 170-179. doi:10.1038/s41588-023-01604-7. PMID 38168930 | 2024 |
MR-MEGA | Mägi R, Horikoshi M, Sofer T, Mahajan A, ...&, Morris AP. (2017) Trans-ethnic meta-regression of genome-wide association studies accounting for ancestry increases power for discovery and improves fine-mapping resolution Hum. Mol. Genet., 26 (18) 3639-3650. doi:10.1093/hmg/ddx280. PMID 28911207 | 2017 |
MsCAVIAR | LaPierre N, Taraszka K, Huang H, He R, ...&, Eskin E. (2021) Identifying causal variants by fine mapping across multiple studies PLoS Genet., 17 (9) e1009733. doi:10.1371/journal.pgen.1009733. PMID 34543273 | 2021 |
MultiSuSiE | Rossen, J. et al. MultiSuSiE improves multi-ancestry fine-mapping in All of Us whole-genome sequencing data. medRxiv 2024.05.13.24307291 (2024) doi:10.1101/2024.05.13.24307291 | NA |
PAINTOR | Kichaev G, Yang WY, Lindstrom S, Hormozdiari F, ...&, Pasaniuc B. (2014) Integrating functional data to prioritize causal variants in statistical fine-mapping studies PLoS Genet., 10 (10) e1004722. doi:10.1371/journal.pgen.1004722. PMID 25357204 | 2014 |
RFR SuSiE-inf FINEMAP-inf | Cui R, Elzur RA, Kanai M, Ulirsch JC, ...&, Finucane HK. (2024) Improving fine-mapping by modeling infinitesimal effects Nat. Genet., 56 (1) 162-169. doi:10.1038/s41588-023-01597-3. PMID 38036779 | 2024 |
SUSIE-RSS | Zou Y, Carbonetto P, Wang G, Stephens M. (2022) Fine-mapping from summary data with the "Sum of Single Effects" model PLoS Genet., 18 (7) e1010299. doi:10.1371/journal.pgen.1010299. PMID 35853082 | 2022 |
SUSIE | Wang G, Sarkar A, Carbonetto P, Stephens M. (2020) A simple new approach to variable selection in regression, with application to genetic fine mapping J. R. Stat. Soc. Series B Stat. Methodol., 82 (5) 1273-1300. doi:10.1111/rssb.12388. PMID 37220626 | 2020 |
SUSIEx | Yuan, K., Longchamps, R. J., Pardiñas, A. F., Yu, M., Chen, T. T., Lin, S. C., ... & Schizophrenia Workgroup of Psychiatric Genomics Consortium. (2023). Fine-mapping across diverse ancestries drives the discovery of putative causal variants underlying human complex traits and diseases. medRxiv. | NA |
SparsePro | Zhang W, Najafabadi H, Li Y. (2023) SparsePro: An efficient fine-mapping method integrating summary statistics and functional annotations PLoS Genet., 19 (12) e1011104. doi:10.1371/journal.pgen.1011104. PMID 38153934 | 2023 |
mJAM | Shen, J., Jiang, L., Wang, K., Wang, A., Chen, F., Newcombe, P. J., ... & Conti, D. V. (2022). Fine-mapping and credible set construction using a multi-population joint analysis of marginal summary statistics from genome-wide association studies. bioRxiv, 2022-12. | NA |
mvSuSiE | Zou, Y., Carbonetto, P., Xie, D., Wang, G., & Stephens, M. (2023). Fast and flexible joint fine-mapping of multiple traits via the Sum of Single Effects model. bioRxiv, 2023-04. | NA |
CAFEH
- NAME : CAFEH
- SHORT NAME : CAFEH
- FULL NAME : colocalization and fine-mapping in the presence of allelic heterogeneity
- DESCRIPTION : CAFEH is a method that performs finemapping and colocalization jointly over multiple phenotypes. CAFEH can be run with 10s of phenotypes and 1000s of variants in a few minutes.
- URL : https://github.com/karltayeb/cafeh
- KEYWORDS : multi-trait, finemapping, colocalization
- TITLE : Redefining tissue specificity of genetic regulation of gene expression in the presence of allelic heterogeneity
- DOI : 10.1016/j.ajhg.2022.01.002
- ABSTRACT : Uncovering the functional impact of genetic variation on gene expression is important in understanding tissue biology and the pathogenesis of complex traits. Despite large efforts to map expression quantitative trait loci (eQTLs) across many human tissues, our ability to translate those findings to understanding human disease has been incomplete, and the majority of disease loci are not explained by association with expression of a target gene. Cell-type specificity and the presence of multiple independent causal variants for many eQTLs are potential confounders contributing to the apparent discrepancy with disease loci. In this study, we investigate the tissue specificity of genetic effects on gene expression and the overlap with disease loci while considering the presence of multiple causal variants within and across tissues. We find evidence of pervasive tissue specificity of eQTLs, often masked by linkage disequilibrium that misleads traditional meta-analytic approaches. We propose CAFEH (colocalization and fine-mapping in the presence of allelic heterogeneity), a Bayesian method that integrates genetic association data across multiple traits, incorporating linkage disequilibrium to identify causal variants. CAFEH outperforms previous approaches in colocalization and fine-mapping. Using CAFEH, we show that genes with highly tissue-specific genetic effects are under greater selection, enriched in differentiation and developmental processes, and more likely to be involved in human disease. Last, we demonstrate that CAFEH can efficiently leverage the widespread allelic heterogeneity in genetic regulation of gene expression to prioritize the target tissue in genome-wide association complex trait loci, thereby improving our ability to interpret complex trait genetics.
- CITATION : Arvanitis M, Tayeb K, Strober BJ, Battle A. (2022) Redefining tissue specificity of genetic regulation of gene expression in the presence of allelic heterogeneity Am. J. Hum. Genet., 109 (2) 223-239. doi:10.1016/j.ajhg.2022.01.002. PMID 35085493
- JOURNAL_INFO : American journal of human genetics ; Am. J. Hum. Genet. ; 2022 ; 109 ; 2 ; 223-239
- PUBMED_LINK : 35085493
CAVIAR
- NAME : CAVIAR
- SHORT NAME : CAVIAR
- FULL NAME : causal variants identification in associated regions
- DESCRIPTION : A statistical framework that quantifies the probability that each variant is causal while allowing an arbitrary number of causal variants.
- URL : http://genetics.cs.ucla.edu/caviar/
- TITLE : Identifying causal variants at loci with multiple signals of association
- DOI : 10.1534/genetics.114.167908
- ABSTRACT : Although genome-wide association studies have successfully identified thousands of risk loci for complex traits, only a handful of the biologically causal variants, responsible for association at these loci, have been successfully identified. Current statistical methods for identifying causal variants at risk loci either use the strength of the association signal in an iterative conditioning framework or estimate probabilities for variants to be causal. A main drawback of existing methods is that they rely on the simplifying assumption of a single causal variant at each risk locus, which is typically invalid at many risk loci. In this work, we propose a new statistical framework that allows for the possibility of an arbitrary number of causal variants when estimating the posterior probability of a variant being causal. A direct benefit of our approach is that we predict a set of variants for each locus that under reasonable assumptions will contain all of the true causal variants with a high confidence level (e.g., 95%) even when the locus contains multiple causal variants. We use simulations to show that our approach provides 20-50% improvement in our ability to identify the causal variants compared to the existing methods at loci harboring multiple causal variants. We validate our approach using empirical data from an expression QTL study of CHI3L2 to identify new causal variants that affect gene expression at this locus. CAVIAR is publicly available online at http://genetics.cs.ucla.edu/caviar/.
- COPYRIGHT : https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model
- CITATION : Hormozdiari F, Kostem E, Kang EY, Pasaniuc B, ...&, Eskin E. (2014) Identifying causal variants at loci with multiple signals of association Genetics, 198 (2) 497-508. doi:10.1534/genetics.114.167908. PMID 25104515
- JOURNAL_INFO : Genetics ; Genetics ; 2014 ; 198 ; 2 ; 497-508
- PUBMED_LINK : 25104515
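
The CAVIAR abstract describes reporting a set of variants that contains the causal variants at a chosen confidence level (e.g., 95%). The sketch below illustrates that idea in its simplest form only: it assumes a single causal variant per locus and per-variant posterior probabilities that sum to 1. It is not CAVIAR's own procedure, which searches over multi-variant causal configurations. The function name `credible_set` and the example probabilities are hypothetical.

```python
def credible_set(posteriors, coverage=0.95):
    """Return the smallest list of variant IDs whose posterior mass reaches `coverage`.

    `posteriors` maps variant ID -> posterior probability of being the single
    causal variant at the locus (assumed to sum to 1 across the locus).
    """
    # Rank variants from most to least probable.
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    selected, cumulative = [], 0.0
    for variant, prob in ranked:
        selected.append(variant)
        cumulative += prob
        if cumulative >= coverage:
            break
    return selected


# Hypothetical posterior probabilities for five variants at one locus.
example = {"rs1": 0.60, "rs2": 0.25, "rs3": 0.11, "rs4": 0.03, "rs5": 0.01}
print(credible_set(example, coverage=0.95))
```

Lowering `coverage` shrinks the set; a locus where one variant dominates yields a set of size one.
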
CAVIARBF
- NAME : CAVIARBF
- SHORT NAME : CAVIARBF
- FULL NAME : CAVIAR Bayes factor
- DESCRIPTION : A fine-mapping method that uses marginal test statistics in a Bayesian framework.
- URL : https://bitbucket.org/Wenan/caviarbf/src/master/
- KEYWORDS : Bayes factor
- TITLE : Fine mapping causal variants with an approximate Bayesian method using marginal test statistics
- DOI : 10.1534/genetics.115.176107
- ABSTRACT : Two recently developed fine-mapping methods, CAVIAR and PAINTOR, demonstrate better performance over other fine-mapping methods. They also have the advantage of using only the marginal test statistics and the correlation among SNPs. Both methods leverage the fact that the marginal test statistics asymptotically follow a multivariate normal distribution and are likelihood based. However, their relationship with Bayesian fine mapping, such as BIMBAM, is not clear. In this study, we first show that CAVIAR and BIMBAM are actually approximately equivalent to each other. This leads to a fine-mapping method using marginal test statistics in the Bayesian framework, which we call CAVIAR Bayes factor (CAVIARBF). Another advantage of the Bayesian framework is that it can answer both association and fine-mapping questions. We also used simulations to compare CAVIARBF with other methods under different numbers of causal variants. The results showed that both CAVIARBF and BIMBAM have better performance than PAINTOR and other methods. Compared to BIMBAM, CAVIARBF has the advantage of using only marginal test statistics and takes about one-quarter to one-fifth of the running time. We applied different methods on two independent cohorts of the same phenotype. Results showed that CAVIARBF, BIMBAM, and PAINTOR selected the same top 3 SNPs; however, CAVIARBF and BIMBAM had better consistency in selecting the top 10 ranked SNPs between the two cohorts. Software is available at https://bitbucket.org/Wenan/caviarbf.
- COPYRIGHT : https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model
- CITATION : Chen W, Larrabee BR, Ovsyannikova IG, Kennedy RB, ...&, Schaid DJ. (2015) Fine mapping causal variants with an approximate Bayesian method using marginal test statistics Genetics, 200 (3) 719-736. doi:10.1534/genetics.115.176107. PMID 25948564
- JOURNAL_INFO : Genetics ; Genetics ; 2015 ; 200 ; 3 ; 719-736
- PUBMED_LINK : 25948564
FINEMAP
- NAME : FINEMAP
- SHORT NAME : FINEMAP
- FULL NAME : FINEMAP
- DESCRIPTION : FINEMAP is a program for (1) identifying causal SNPs, (2) estimating effect sizes of causal SNPs, and (3) estimating the heritability contribution of causal SNPs.
- URL : http://www.christianbenner.com/
- KEYWORDS : Shotgun Stochastic Search (SSS)
- TITLE : FINEMAP: efficient variable selection using summary data from genome-wide association studies
- DOI : 10.1093/bioinformatics/btw018
- ABSTRACT : MOTIVATION: The goal of fine-mapping in genomic regions associated with complex diseases and traits is to identify causal variants that point to molecular mechanisms behind the associations. Recent fine-mapping methods using summary data from genome-wide association studies rely on exhaustive search through all possible causal configurations, which is computationally expensive. RESULTS: We introduce FINEMAP, a software package to efficiently explore a set of the most important causal configurations of the region via a shotgun stochastic search algorithm. We show that FINEMAP produces accurate results in a fraction of processing time of existing approaches and is therefore a promising tool for analyzing growing amounts of data produced in genome-wide association studies and emerging sequencing projects. AVAILABILITY AND IMPLEMENTATION: FINEMAP v1.0 is freely available for Mac OS X and Linux at http://www.christianbenner.com. CONTACT: christian.benner@helsinki.fi or matti.pirinen@helsinki.fi.
- COPYRIGHT : http://creativecommons.org/licenses/by-nc/4.0/
- CITATION : Benner C, Spencer CC, Havulinna AS, Salomaa V, ...&, Pirinen M. (2016) FINEMAP: efficient variable selection using summary data from genome-wide association studies Bioinformatics, 32 (10) 1493-1501. doi:10.1093/bioinformatics/btw018. PMID 26773131
- JOURNAL_INFO : Bioinformatics (Oxford, England) ; Bioinformatics ; 2016 ; 32 ; 10 ; 1493-1501
- PUBMED_LINK : 26773131
JAM
- NAME : JAM
- SHORT NAME : JAM
- FULL NAME : joint analysis of marginal summary statistics
- DESCRIPTION : Bayesian variable selection under a range of likelihoods, including linear regression for continuous outcomes, logistic regression for binary outcomes, Weibull regression for survival outcomes, and the "JAM" model for summary genetic association data.
- URL : https://github.com/pjnewcombe/R2BGLiMS
- TITLE : JAM: A Scalable Bayesian Framework for Joint Analysis of Marginal SNP Effects
- DOI : 10.1002/gepi.21953
- ABSTRACT : Recently, large scale genome-wide association study (GWAS) meta-analyses have boosted the number of known signals for some traits into the tens and hundreds. Typically, however, variants are only analysed one-at-a-time. This complicates the ability of fine-mapping to identify a small set of SNPs for further functional follow-up. We describe a new and scalable algorithm, joint analysis of marginal summary statistics (JAM), for the re-analysis of published marginal summary statistics under joint multi-SNP models. The correlation is accounted for according to estimates from a reference dataset, and models and SNPs that best explain the complete joint pattern of marginal effects are highlighted via an integrated Bayesian penalized regression framework. We provide both enumerated and Reversible Jump MCMC implementations of JAM and present some comparisons of performance. In a series of realistic simulation studies, JAM demonstrated identical performance to various alternatives designed for single region settings. In multi-region settings, where the only multivariate alternative involves stepwise selection, JAM offered greater power and specificity. We also present an application to real published results from MAGIC (meta-analysis of glucose and insulin related traits consortium) - a GWAS meta-analysis of more than 15,000 people. We re-analysed several genomic regions that produced multiple significant signals with glucose levels 2 hr after oral stimulation. Through joint multivariate modelling, JAM was able to formally rule out many SNPs, and for one gene, ADCY5, suggests that an additional SNP, which transpired to be more biologically plausible, should be followed up with equal priority to the reported index.
- CITATION : Newcombe PJ, Conti DV, Richardson S. (2016) JAM: A Scalable Bayesian Framework for Joint Analysis of Marginal SNP Effects Genet. Epidemiol., 40 (3) 188-201. doi:10.1002/gepi.21953. PMID 27027514
- JOURNAL_INFO : Genetic epidemiology ; Genet. Epidemiol. ; 2016 ; 40 ; 3 ; 188-201
- PUBMED_LINK : 27027514
MESuSiE
- NAME : MESuSiE
- SHORT NAME : MESuSiE
- FULL NAME : multi-ancestry sum of the single effects model
- DESCRIPTION : MESuSiE relies on GWAS summary statistics from multiple ancestries, properly accounts for the LD structure of the local genomic region in multiple ancestries, and explicitly models both shared and ancestry-specific causal signals to accommodate causal effect size similarity as well as heterogeneity across ancestries. MESuSiE outputs, for each variant, the posterior inclusion probability of being a shared or an ancestry-specific causal variant.
- URL : https://github.com/borangao/MESuSiE
- KEYWORDS : multi-ancestry, fine-mapping
- TITLE : MESuSiE enables scalable and powerful multi-ancestry fine-mapping of causal variants in genome-wide association studies
- DOI : 10.1038/s41588-023-01604-7
- ABSTRACT : Fine-mapping in genome-wide association studies attempts to identify causal SNPs from a set of candidate SNPs in a local genomic region of interest and is commonly performed in one genetic ancestry at a time. Here, we present multi-ancestry sum of the single effects model (MESuSiE), a probabilistic multi-ancestry fine-mapping method, to improve the accuracy and resolution of fine-mapping by leveraging association information across ancestries. MESuSiE uses summary statistics as input, accounts for the diverse linkage disequilibrium pattern observed in different ancestries, explicitly models both shared and ancestry-specific causal SNPs, and relies on a variational inference algorithm for scalable computation. We evaluated the performance of MESuSiE through comprehensive simulations and multi-ancestry fine-mapping of four lipid traits with both European and African samples. In the real data, MESuSiE improves fine-mapping resolution by 19.0% to 72.0% compared to existing approaches, is an order of magnitude faster, and captures and categorizes shared and ancestry-specific causal signals with enhanced functional enrichment.
- CITATION : Gao B, Zhou X. (2024) MESuSiE enables scalable and powerful multi-ancestry fine-mapping of causal variants in genome-wide association studies Nat. Genet., 56 (1) 170-179. doi:10.1038/s41588-023-01604-7. PMID 38168930
- JOURNAL_INFO : Nature genetics ; Nat. Genet. ; 2024 ; 56 ; 1 ; 170-179
- PUBMED_LINK : 38168930
MR-MEGA
- NAME : MR-MEGA
- SHORT NAME : MR-MEGA
- FULL NAME : Meta-Regression of Multi-AncEstry Genetic Association
- DESCRIPTION : MR-MEGA (Meta-Regression of Multi-AncEstry Genetic Association) is a tool to detect and fine-map complex trait association signals via multi-ancestry meta-regression. This approach uses genome-wide metrics of diversity between populations to derive axes of genetic variation via multi-dimensional scaling [Purcell 2007]. Allelic effects of a variant across GWAS, weighted by their corresponding standard errors, can then be modelled in a linear regression framework, including the axes of genetic variation as covariates. The flexibility of this model enables partitioning of the heterogeneity into components due to ancestry and residual variation, which would be expected to improve fine-mapping resolution.
- URL : https://genomics.ut.ee/en/tools
- KEYWORDS : Multi-AncEstry
- TITLE : Trans-ethnic meta-regression of genome-wide association studies accounting for ancestry increases power for discovery and improves fine-mapping resolution
- DOI : 10.1093/hmg/ddx280
- ABSTRACT : Trans-ethnic meta-analysis of genome-wide association studies (GWAS) across diverse populations can increase power to detect complex trait loci when the underlying causal variants are shared between ancestry groups. However, heterogeneity in allelic effects between GWAS at these loci can occur that is correlated with ancestry. Here, a novel approach is presented to detect SNP association and quantify the extent of heterogeneity in allelic effects that is correlated with ancestry. We employ trans-ethnic meta-regression to model allelic effects as a function of axes of genetic variation, derived from a matrix of mean pairwise allele frequency differences between GWAS, and implemented in the MR-MEGA software. Through detailed simulations, we demonstrate increased power to detect association for MR-MEGA over fixed- and random-effects meta-analysis across a range of scenarios of heterogeneity in allelic effects between ethnic groups. We also demonstrate improved fine-mapping resolution, in loci containing a single causal variant, compared to these meta-analysis approaches and PAINTOR, and equivalent performance to MANTRA at reduced computational cost. Application of MR-MEGA to trans-ethnic GWAS of kidney function in 71,461 individuals indicates stronger signals of association than fixed-effects meta-analysis when heterogeneity in allelic effects is correlated with ancestry. Application of MR-MEGA to fine-mapping four type 2 diabetes susceptibility loci in 22,086 cases and 42,539 controls highlights: (i) strong evidence for heterogeneity in allelic effects that is correlated with ancestry only at the index SNP for the association signal at the CDKAL1 locus; and (ii) 99% credible sets with six or fewer variants for five distinct association signals.
- CITATION : Mägi R, Horikoshi M, Sofer T, Mahajan A, ...&, Morris AP. (2017) Trans-ethnic meta-regression of genome-wide association studies accounting for ancestry increases power for discovery and improves fine-mapping resolution Hum. Mol. Genet., 26 (18) 3639-3650. doi:10.1093/hmg/ddx280. PMID 28911207
- JOURNAL_INFO : Human molecular genetics ; Hum. Mol. Genet. ; 2017 ; 26 ; 18 ; 3639-3650
- PUBMED_LINK : 28911207
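
The MR-MEGA description models per-GWAS allelic effects, weighted by their standard errors, in a linear regression on axes of genetic variation. The sketch below shows that regression step under strong simplifying assumptions: a single axis, inverse-variance weights, and axis coordinates supplied directly rather than derived by multi-dimensional scaling. The function name `weighted_meta_regression` and all input values are hypothetical, and this is not the MR-MEGA software's implementation.

```python
def weighted_meta_regression(betas, ses, axis):
    """Inverse-variance weighted regression of per-study effects on one axis.

    betas: allelic effect estimate from each GWAS
    ses:   corresponding standard errors
    axis:  each GWAS's coordinate on one axis of genetic variation
    Returns (intercept, slope).
    """
    weights = [1.0 / se ** 2 for se in ses]
    w_sum = sum(weights)
    # Weighted means of the covariate and the effects.
    x_bar = sum(w * x for w, x in zip(weights, axis)) / w_sum
    y_bar = sum(w * y for w, y in zip(weights, betas)) / w_sum
    # Weighted sums of squares and cross-products.
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, axis))
    sxy = sum(w * (x - x_bar) * (y - y_bar)
              for w, x, y in zip(weights, axis, betas))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical inputs: four GWAS whose effects vary linearly along the axis.
axis = [-1.0, 0.0, 0.5, 2.0]
ses = [0.02, 0.03, 0.025, 0.04]
betas = [0.10 + 0.05 * x for x in axis]
intercept, slope = weighted_meta_regression(betas, ses, axis)
print(intercept, slope)
```

A slope far from zero corresponds to heterogeneity in allelic effects that is correlated with ancestry, while scatter around the fitted line corresponds to residual heterogeneity.
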
MsCAVIAR
- NAME : MsCAVIAR
- SHORT NAME : MsCAVIAR
- FULL NAME : multiple study causal variants identification in associated regions
- DESCRIPTION : MsCAVIAR is a method for fine-mapping (identifying causal variants among GWAS associated variants) by leveraging information from multiple studies. One important application area is trans-ethnic fine mapping.
- URL : https://github.com/nlapier2/MsCAVIAR
- KEYWORDS : multi-study finemapping
- TITLE : Identifying causal variants by fine mapping across multiple studies
- DOI : 10.1371/journal.pgen.1009733
- ABSTRACT : Increasingly large Genome-Wide Association Studies (GWAS) have yielded numerous variants associated with many complex traits, motivating the development of "fine mapping" methods to identify which of the associated variants are causal. Additionally, GWAS of the same trait for different populations are increasingly available, raising the possibility of refining fine mapping results further by leveraging different linkage disequilibrium (LD) structures across studies. Here, we introduce multiple study causal variants identification in associated regions (MsCAVIAR), a method that extends the popular CAVIAR fine mapping framework to a multiple study setting using a random effects model. MsCAVIAR only requires summary statistics and LD as input, accounts for uncertainty in association statistics using a multivariate normal model, allows for multiple causal variants at a locus, and explicitly models the possibility of different SNP effect sizes in different populations. We demonstrate the efficacy of MsCAVIAR in both a simulation study and a trans-ethnic, trans-biobank fine mapping analysis of High Density Lipoprotein (HDL).
- COPYRIGHT : http://creativecommons.org/licenses/by/4.0/
- CITATION : LaPierre N, Taraszka K, Huang H, He R, ...&, Eskin E. (2021) Identifying causal variants by fine mapping across multiple studies PLoS Genet., 17 (9) e1009733. doi:10.1371/journal.pgen.1009733. PMID 34543273
- JOURNAL_INFO : PLoS genetics ; PLoS Genet. ; 2021 ; 17 ; 9 ; e1009733
- PUBMED_LINK : 34543273
MultiSuSiE
- NAME : MultiSuSiE
- SHORT NAME : MultiSuSiE
- FULL NAME : MultiSuSiE
- DESCRIPTION : MultiSuSiE is a multi-ancestry extension of the Sum of Single Effects model (Wang et al. 2020 J. R. Statist. Soc. B, Zou et al. 2022 PLoS Genet.) implemented in Python.
- URL : https://github.com/jordanero/MultiSuSiE
- KEYWORDS : cross-ancestry, fine-mapping
- CITATION : Rossen, J. et al. MultiSuSiE improves multi-ancestry fine-mapping in All of Us whole-genome sequencing data. medRxiv 2024.05.13.24307291 (2024) doi:10.1101/2024.05.13.24307291
PAINTOR
- NAME : PAINTOR
- SHORT NAME : PAINTOR
- FULL NAME : Probabilistic Annotation INtegraTOR
- DESCRIPTION : Finding causal variants that underlie known risk loci is one of the main post-GWAS challenges. Here we present PAINTOR (Probabilistic Annotation INtegraTOR), a probabilistic framework that integrates association strength with genomic functional annotation data to improve accuracy in selecting plausible causal variants for functional validation. The main output of PAINTOR is a probability for every variant to be causal, which can be used for prioritization in functional assays to establish biological causality.
- URL : https://bogdan.dgsom.ucla.edu/pages/paintor/
- KEYWORDS : Empirical Bayes prior
- TITLE : Integrating functional data to prioritize causal variants in statistical fine-mapping studies
- DOI : 10.1371/journal.pgen.1004722
- ABSTRACT : Standard statistical approaches for prioritization of variants for functional testing in fine-mapping studies either use marginal association statistics or estimate posterior probabilities for variants to be causal under simplifying assumptions. Here, we present a probabilistic framework that integrates association strength with functional genomic annotation data to improve accuracy in selecting plausible causal variants for functional validation. A key feature of our approach is that it empirically estimates the contribution of each functional annotation to the trait of interest directly from summary association statistics while allowing for multiple causal variants at any risk locus. We devise efficient algorithms that estimate the parameters of our model across all risk loci to further increase performance. Using simulations starting from the 1000 Genomes data, we find that our framework consistently outperforms the current state-of-the-art fine-mapping methods, reducing the number of variants that need to be selected to capture 90% of the causal variants from an average of 13.3 to 10.4 SNPs per locus (as compared to the next-best performing strategy). Furthermore, we introduce a cost-to-benefit optimization framework for determining the number of variants to be followed up in functional assays and assess its performance using real and simulation data. We validate our findings using a large scale meta-analysis of four blood lipids traits and find that the relative probability for causality is increased for variants in exons and transcription start sites and decreased in repressed genomic regions at the risk loci of these traits. Using these highly predictive, trait-specific functional annotations, we estimate causality probabilities across all traits and variants, reducing the size of the 90% confidence set from an average of 17.5 to 13.5 variants per locus in this data.
- CITATION : Kichaev G, Yang WY, Lindstrom S, Hormozdiari F, ...&, Pasaniuc B. (2014) Integrating functional data to prioritize causal variants in statistical fine-mapping studies PLoS Genet., 10 (10) e1004722. doi:10.1371/journal.pgen.1004722. PMID 25357204
- JOURNAL_INFO : PLoS genetics ; PLoS Genet. ; 2014 ; 10 ; 10 ; e1004722
- PUBMED_LINK : 25357204
RFR SuSiE-inf FINEMAP-inf
- NAME : RFR SuSiE-inf FINEMAP-inf
- SHORT NAME : RFR
- FULL NAME : Replication Failure Rate
- DESCRIPTION : Replication Failure Rate (RFR) is a metric to assess the consistency of fine-mapping results based on downsampling a large cohort. SuSiE-inf and FINEMAP-inf extend SuSiE and FINEMAP to incorporate a term for infinitesimal effects in addition to a small number of larger causal effects of interest.
- URL : https://github.com/FinucaneLab/fine-mapping-inf
- TITLE : Improving fine-mapping by modeling infinitesimal effects
- DOI : 10.1038/s41588-023-01597-3
- ABSTRACT : Fine-mapping aims to identify causal genetic variants for phenotypes. Bayesian fine-mapping algorithms (for example, SuSiE, FINEMAP, ABF and COJO-ABF) are widely used, but assessing posterior probability calibration remains challenging in real data, where model misspecification probably exists, and true causal variants are unknown. We introduce replication failure rate (RFR), a metric to assess fine-mapping consistency by downsampling. SuSiE, FINEMAP and COJO-ABF show high RFR, indicating potential overconfidence in their output. Simulations reveal that nonsparse genetic architecture can lead to miscalibration, while imputation noise, nonuniform distribution of causal variants and quality control filters have minimal impact. Here we present SuSiE-inf and FINEMAP-inf, fine-mapping methods modeling infinitesimal effects alongside fewer larger causal effects. Our methods show improved calibration, RFR and functional enrichment, competitive recall and computational efficiency. Notably, using our methods' posterior effect sizes substantially increases polygenic risk score accuracy over SuSiE and FINEMAP. Our work improves causal variant identification for complex traits, a fundamental goal of human genetics.
- CITATION : Cui R, Elzur RA, Kanai M, Ulirsch JC, ...&, Finucane HK. (2024) Improving fine-mapping by modeling infinitesimal effects Nat. Genet., 56 (1) 162-169. doi:10.1038/s41588-023-01597-3. PMID 38036779
- JOURNAL_INFO : Nature genetics ; Nat. Genet. ; 2024 ; 56 ; 1 ; 162-169
- PUBMED_LINK : 38036779
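
The RFR entry describes a consistency metric obtained by fine-mapping a downsampled cohort and comparing against the full cohort. A minimal sketch of one way to compute such a rate follows. The thresholds `high` and `low`, the function name, and the example PIPs are assumptions made for illustration; the exact definition should be taken from the paper and the linked repository.

```python
def replication_failure_rate(pip_down, pip_full, high=0.9, low=0.1):
    """Fraction of confident downsampled calls that fail to replicate.

    pip_down: variant ID -> PIP from the downsampled cohort
    pip_full: variant ID -> PIP from the full cohort
    A variant is 'confident' if its downsampled PIP exceeds `high`, and it
    'fails' if its full-cohort PIP is below `low`. Both thresholds are
    illustrative parameters. Returns None if there are no confident calls.
    """
    # Only variants fine-mapped in both analyses can be compared.
    confident = [v for v, p in pip_down.items() if p > high and v in pip_full]
    if not confident:
        return None
    failures = [v for v in confident if pip_full[v] < low]
    return len(failures) / len(confident)


# Hypothetical PIPs for four variants.
down = {"a": 0.95, "b": 0.99, "c": 0.50, "d": 0.92}
full = {"a": 0.97, "b": 0.05, "c": 0.40, "d": 0.50}
print(replication_failure_rate(down, full))
```

In this toy input, three variants are confident in the downsampled analysis and one of them drops below the lower threshold in the full cohort.
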
SUSIE
- NAME : SUSIE
- SHORT NAME : SUSIE
- FULL NAME : sum of single effects
- DESCRIPTION : The susieR package implements a simple new way to perform variable selection in multiple regression (y = Xb + e). The methods implemented here are particularly well-suited to settings where some of the X variables are highly correlated, and the true effects are highly sparse (e.g. <20 non-zero effects in the vector b). One example of this is genetic fine-mapping applications, and this application was a major motivation for developing these methods.
- URL : https://stephenslab.github.io/susieR/index.html
- KEYWORDS : fine-mapping, sum of single-effects (SuSiE) regression, iterative Bayesian stepwise selection (IBSS)
- TITLE : A simple new approach to variable selection in regression, with application to genetic fine mapping
- DOI : 10.1111/rssb.12388
- ABSTRACT : We introduce a simple new approach to variable selection in linear regression, with a particular focus on quantifying uncertainty in which variables should be selected. The approach is based on a new model - the "Sum of Single Effects" (SuSiE) model - which comes from writing the sparse vector of regression coefficients as a sum of "single-effect" vectors, each with one non-zero element. We also introduce a corresponding new fitting procedure - Iterative Bayesian Stepwise Selection (IBSS) - which is a Bayesian analogue of stepwise selection methods. IBSS shares the computational simplicity and speed of traditional stepwise methods, but instead of selecting a single variable at each step, IBSS computes a distribution on variables that captures uncertainty in which variable to select. We provide a formal justification of this intuitive algorithm by showing that it optimizes a variational approximation to the posterior distribution under the SuSiE model. Further, this approximate posterior distribution naturally yields convenient novel summaries of uncertainty in variable selection, providing a Credible Set of variables for each selection. Our methods are particularly well-suited to settings where variables are highly correlated and detectable effects are sparse, both of which are characteristics of genetic fine-mapping applications. We demonstrate through numerical experiments that our methods outperform existing methods for this task, and illustrate their application to fine-mapping genetic variants influencing alternative splicing in human cell-lines. We also discuss the potential and challenges for applying these methods to generic variable selection problems.
- COPYRIGHT : http://creativecommons.org/licenses/by/4.0/
- CITATION : Wang G, Sarkar A, Carbonetto P, Stephens M. (2020) A simple new approach to variable selection in regression, with application to genetic fine mapping J. R. Stat. Soc. Series B Stat. Methodol., 82 (5) 1273-1300. doi:10.1111/rssb.12388. PMID 37220626
- JOURNAL_INFO : Journal of the Royal Statistical Society. Series B, Statistical methodology ; J. R. Stat. Soc. Series B Stat. Methodol. ; 2020 ; 82 ; 5 ; 1273-1300
- PUBMED_LINK : 37220626
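The model summarised in the abstract above can be written compactly as below. This is a sketch in commonly used notation; the symbols $L$ (number of single effects), $\pi$ (prior inclusion weights), $\sigma_{0l}^2$ (prior effect variances) and $\alpha_{lj}$ (posterior probability that effect $l$ is variable $j$) are labels chosen here and do not appear in this entry.

```latex
\begin{aligned}
\mathbf{y} &= \mathbf{X}\mathbf{b} + \mathbf{e}, \qquad \mathbf{e} \sim N_n(\mathbf{0}, \sigma^2 I_n) \\
\mathbf{b} &= \sum_{l=1}^{L} \mathbf{b}_l, \qquad \mathbf{b}_l = \boldsymbol{\gamma}_l \, b_l \\
\boldsymbol{\gamma}_l &\sim \mathrm{Mult}(1, \boldsymbol{\pi}), \qquad b_l \sim N(0, \sigma_{0l}^2) \\
\mathrm{PIP}_j &= 1 - \prod_{l=1}^{L} \left(1 - \alpha_{lj}\right)
\end{aligned}
```

Each $\mathbf{b}_l$ has exactly one non-zero element, so $\mathbf{b}$ has at most $L$ non-zero elements. IBSS fits the $L$ single effects one at a time, each against the residuals left by the others, and a credible set is reported per single effect.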
SUSIE-RSS
- NAME : SUSIE-RSS
- SHORT NAME : SUSIE-RSS
- FULL NAME : sum of single effects regression with summary statistics
- DESCRIPTION : The susieR package implements a simple new way to perform variable selection in multiple regression (y = Xb + e). The methods implemented here are particularly well-suited to settings where some of the X variables are highly correlated, and the true effects are highly sparse (e.g. <20 non-zero effects in the vector b). One example of this is genetic fine-mapping applications, and this application was a major motivation for developing these methods.
- URL : https://stephenslab.github.io/susieR/index.html
- KEYWORDS : fine-mapping, summary statistics
- TITLE : Fine-mapping from summary data with the "Sum of Single Effects" model
- DOI : 10.1371/journal.pgen.1010299
- ABSTRACT : In recent work, Wang et al introduced the "Sum of Single Effects" (SuSiE) model, and showed that it provides a simple and efficient approach to fine-mapping genetic variants from individual-level data. Here we present new methods for fitting the SuSiE model to summary data, for example to single-SNP z-scores from an association study and linkage disequilibrium (LD) values estimated from a suitable reference panel. To develop these new methods, we first describe a simple, generic strategy for extending any individual-level data method to deal with summary data. The key idea is to replace the usual regression likelihood with an analogous likelihood based on summary data. We show that existing fine-mapping methods such as FINEMAP and CAVIAR also (implicitly) use this strategy, but in different ways, and so this provides a common framework for understanding different methods for fine-mapping. We investigate other common practical issues in fine-mapping with summary data, including problems caused by inconsistencies between the z-scores and LD estimates, and we develop diagnostics to identify these inconsistencies. We also present a new refinement procedure that improves model fits in some data sets, and hence improves overall reliability of the SuSiE fine-mapping results. Detailed evaluations of fine-mapping methods in a range of simulated data sets show that SuSiE applied to summary data is competitive, in both speed and accuracy, with the best available fine-mapping methods for summary data.
- CITATION : Zou Y, Carbonetto P, Wang G, Stephens M. (2022) Fine-mapping from summary data with the "Sum of Single Effects" model PLoS Genet., 18 (7) e1010299. doi:10.1371/journal.pgen.1010299. PMID 35853082
- JOURNAL_INFO : PLoS genetics ; PLoS Genet. ; 2022 ; 18 ; 7 ; e1010299
- PUBMED_LINK : 35853082
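As a rough illustration of fine-mapping from z-scores and of credible sets, the sketch below handles only the simplest case: exactly one causal variant with a uniform prior, where LD does not enter the calculation. It is not the SuSiE-RSS algorithm or the susieR API. The function names, the `prior_var` default and the example z-scores are all hypothetical.

```python
import math


def single_effect_pips(z_scores, prior_var=4.0):
    """Posterior inclusion probabilities assuming exactly one causal variant.

    prior_var is an assumed prior variance of the causal effect on the
    z-score scale (illustrative default, not a value from the paper).
    """
    shrink = prior_var / (1.0 + prior_var)
    # Log Bayes factor of "variant j is causal" versus the null, per variant:
    # z ~ N(0, 1 + prior_var) under the alternative, z ~ N(0, 1) under the null.
    log_bf = [0.5 * math.log(1.0 - shrink) + 0.5 * shrink * z * z
              for z in z_scores]
    # Normalise under a uniform prior; subtract the max for numerical stability.
    top = max(log_bf)
    weights = [math.exp(v - top) for v in log_bf]
    total = sum(weights)
    return [w / total for w in weights]


def credible_set(pips, coverage=0.95):
    """Smallest set of variant indices whose PIPs sum to at least `coverage`."""
    order = sorted(range(len(pips)), key=lambda j: pips[j], reverse=True)
    chosen, mass = [], 0.0
    for j in order:
        chosen.append(j)
        mass += pips[j]
        if mass >= coverage:
            break
    return chosen


# Made-up z-scores for four variants at one locus.
example_z = [1.0, 5.2, 4.9, 0.3]
example_pips = single_effect_pips(example_z)
example_cs = credible_set(example_pips)
```

With more than one causal variant the LD matrix is required, which is where methods such as SuSiE-RSS, FINEMAP and CAVIAR come in.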
SUSIEx
- NAME : SUSIEx
- SHORT NAME : SUSIEx
- FULL NAME : SUSIEx
- DESCRIPTION : SuSiEx is a Python-based command line tool that performs cross-ethnic fine-mapping using GWAS summary statistics and LD reference panels. The method is built on the Sum of Single Effects (SuSiE) model.
- URL : https://github.com/getian107/SuSiEx
- KEYWORDS : cross-ancestry, fine-mapping
- CITATION : Yuan K, Longchamps RJ, Pardiñas AF, Yu M, Chen TT, Lin SC, ...&, Schizophrenia Workgroup of Psychiatric Genomics Consortium. (2023) Fine-mapping across diverse ancestries drives the discovery of putative causal variants underlying human complex traits and diseases medRxiv.
SparsePro
- NAME : SparsePro
- SHORT NAME : SparsePro
- FULL NAME : SparsePro
- DESCRIPTION : SparsePro is a command line tool for efficiently conducting genome-wide fine-mapping. The method has two key features. First, by creating a sparse low-dimensional projection of the high-dimensional genotype, it enables a linear search over causal variants instead of the exponential search over causal configurations used by most existing methods. Second, it adopts a probabilistic framework with a highly efficient variational expectation-maximization algorithm to integrate statistical associations and functional priors.
- URL : https://github.com/zhwm/SparsePro
- TITLE : SparsePro: An efficient fine-mapping method integrating summary statistics and functional annotations
- DOI : 10.1371/journal.pgen.1011104
- ABSTRACT : Identifying causal variants from genome-wide association studies (GWAS) is challenging due to widespread linkage disequilibrium (LD) and the possible existence of multiple causal variants in the same genomic locus. Functional annotations of the genome may help to prioritize variants that are biologically relevant and thus improve fine-mapping of GWAS results. Classical fine-mapping methods conducting an exhaustive search of variant-level causal configurations have a high computational cost, especially when the underlying genetic architecture and LD patterns are complex. SuSiE provided an iterative Bayesian stepwise selection algorithm for efficient fine-mapping. In this work, we build connections between SuSiE and a paired mean field variational inference algorithm through the implementation of a sparse projection, and propose effective strategies for estimating hyperparameters and summarizing posterior probabilities. Moreover, we incorporate functional annotations into fine-mapping by jointly estimating enrichment weights to derive functionally-informed priors. We evaluate the performance of SparsePro through extensive simulations using resources from the UK Biobank. Compared to state-of-the-art methods, SparsePro achieved improved power for fine-mapping with reduced computation time. We demonstrate the utility of SparsePro through fine-mapping of five functional biomarkers of clinically relevant phenotypes. In summary, we have developed an efficient fine-mapping method for integrating summary statistics and functional annotations. Our method can have wide utility in understanding the genetics of complex traits and increasing the yield of functional follow-up studies of GWAS. SparsePro software is available on GitHub at https://github.com/zhwm/SparsePro.
- CITATION : Zhang W, Najafabadi H, Li Y. (2023) SparsePro: An efficient fine-mapping method integrating summary statistics and functional annotations PLoS Genet., 19 (12) e1011104. doi:10.1371/journal.pgen.1011104. PMID 38153934
- JOURNAL_INFO : PLoS genetics ; PLoS Genet. ; 2023 ; 19 ; 12 ; e1011104
- PUBMED_LINK : 38153934
mJAM
- NAME : mJAM
- SHORT NAME : mJAM
- FULL NAME : multi-population JAM
- URL : https://github.com/USCbiostats/hJAM/R
- KEYWORDS : multi-population
- CITATION : Shen J, Jiang L, Wang K, Wang A, Chen F, Newcombe PJ, ...&, Conti DV. (2022) Fine-mapping and credible set construction using a multi-population joint analysis of marginal summary statistics from genome-wide association studies bioRxiv, 2022-12.
mvSuSiE
- NAME : mvSuSiE
- SHORT NAME : mvSuSiE
- FULL NAME : mvSuSiE
- DESCRIPTION : Implements a multivariate generalization of the "Sum of Single Effects" (SuSiE) model for variable selection in multivariate linear regression.
- URL : https://github.com/stephenslab/mvsusieR
- KEYWORDS : multi-trait, fine-mapping
- CITATION : Zou Y, Carbonetto P, Xie D, Wang G, Stephens M. (2023) Fast and flexible joint fine-mapping of multiple traits via the Sum of Single Effects model bioRxiv, 2023-04.