Tools Meta and Multi trait

Curation of Meta and Multi trait: listings under the GWAS Tools tab.

Summary Table

NAME	CATEGORY	Main citation	YEAR
REMETA	Gene-based	Joseph TA et al., Nat Genet, 2025	2025
GWAMA	Meta-analysis	Mägi R et al., BMC Bioinformatics, 2010	2010
MANTRA	Meta-analysis	Morris AP, Genet Epidemiol, 2011	2011
METAL	Meta-analysis	Willer CJ et al., Bioinformatics, 2010	2010
MR-MEGA	Meta-analysis	Mägi R et al., Hum Mol Genet, 2017	2017
ASSET	Multi-trait	Bhattacharjee S et al., Am J Hum Genet, 2012	2012
FactorGo	Multi-trait	Zhang Z et al., Am J Hum Genet, 2023	2023
GLEANR	Multi-trait	Omdahl AR et al., Am J Hum Genet, 2025	2025
Galesloot	Multi-trait	Galesloot TE et al., PLoS One, 2014	2014
Genomic-SEM	Multi-trait	Grotzinger AD et al., Nat Hum Behav, 2019	2019
HIPO	Multi-trait	Qi G et al., PLoS Genet, 2018	2018
JASS	Multi-trait	Julienne H et al., NAR Genom Bioinform, 2020	2020
LCP-GWAS	Multi-trait	Ruotsalainen SE et al., Eur J Hum Genet, 2021	2021
MANOVA	Multi-trait	1955	1955
MOSTest	Multi-trait	van der Meer D et al., Nat Commun, 2020	2020
MTAG	Multi-trait	Turley P et al., Nat Genet, 2018	2018
MV-PLINK (MQFAM)	Multi-trait	Ferreira MA et al., Bioinformatics, 2009	2009
MultiPhen	Multi-trait	O'Reilly PF et al., PLoS One, 2012	2012
PCHAT	Multi-trait	Klei L et al., Genet Epidemiol, 2008	2008
Porter	Multi-trait	Porter HF et al., Sci Rep, 2017	2017
Salinas	Multi-trait	Salinas YD et al., Am J Epidemiol, 2018	2018
Stephens	Multi-trait	Stephens M, PLoS One, 2013	2013
TATES	Multi-trait	van der Sluis S et al., PLoS Genet, 2013	2013
Yang	Multi-trait	Yang Q et al., J Probab Stat, 2012	2012
aMAT	Multi-trait	Wu C, Genetics, 2020	2020
condFDR	Multi-trait	Andreassen OA et al., PLoS Genet, 2013	2013
fastASSET	Multi-trait	Qi G et al., Nat Commun, 2024	2024
metaCCA	Multi-trait	Cichonska A et al., Bioinformatics, 2016	2016
metaUSAT/metaMANOVA	Multi-trait	Ray D et al., Genet Epidemiol, 2018	2018
mvGWAMA	Multi-trait	Jansen IE et al., Nat Genet, 2019	2019
Meta-SAIGE	Rare-variant	Park E et al., Nat Genet, 2025	2025
MetaSKAT	Rare-variant	Lee S et al., Am J Hum Genet, 2013	2013
MetaSTAAR	Rare-variant	Li X et al., Nat Genet, 2023	2023
RareMETAL	Rare-variant	Feng S et al., Bioinformatics, 2014	2014
SMMAT	Rare-variant	Chen H et al., Am J Hum Genet, 2019	2019

Gene-based

REMETA

Tool

PUBMED_LINK

41225158

DESCRIPTION

REMETA is a computationally efficient C++ toolkit for meta-analysis of gene-based association tests using single-variant summary statistics from REGENIE-style pipelines, including burden and variance-component tests, with sparse per-study LD references rescaled per phenotype.
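
As a rough illustration of the idea in the description above (not REMETA's actual implementation), the Python sketch below builds a burden score statistic for one gene from single-variant score statistics, approximates the covariance between variants by rescaling a reference LD correlation matrix with per-phenotype score variances, and sums the per-study results. The function names, the default unit weights, and the rescaling rule cov_jk = r_jk * sqrt(v_j * v_k) are assumptions made for this example.

```python
import math

def burden_from_summary(scores, variances, ld_corr, weights=None):
    """Return (U, V) for one gene in one study.

    scores    -- single-variant score statistics U_j
    variances -- their variances v_j for this phenotype
    ld_corr   -- reference LD correlation matrix (list of lists)
    weights   -- per-variant burden weights (assumed 1.0 if omitted)
    """
    m = len(scores)
    if weights is None:
        weights = [1.0] * m  # assumption: unweighted burden
    u = sum(w * s for w, s in zip(weights, scores))
    v = 0.0
    for j in range(m):
        for k in range(m):
            # Assumed rescaling of the reference LD to this phenotype.
            cov_jk = ld_corr[j][k] * math.sqrt(variances[j] * variances[k])
            v += weights[j] * weights[k] * cov_jk
    return u, v

def meta_burden(per_study):
    """Combine per-study (U, V) pairs into a 1-df chi-square test."""
    u_total = sum(u for u, _ in per_study)
    v_total = sum(v for _, v in per_study)
    chi2 = u_total ** 2 / v_total
    # Upper tail of a chi-square with 1 df, via the normal distribution.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Toy inputs, invented for illustration only.
identity = [[1.0, 0.0], [0.0, 1.0]]
study_a = burden_from_summary([2.0, 2.0], [4.0, 4.0], identity)
study_b = burden_from_summary([2.0, 2.0], [4.0, 4.0], identity)
chi2, p_value = meta_burden([study_a, study_b])
```

Summing score statistics and their variances across studies is a common way to meta-analyze score tests; the variance-component tests mentioned above need the full covariance matrix rather than the single quadratic form used here.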

URL

https://github.com/rgcgithub/remeta, https://rgcgithub.github.io/remeta/

KEYWORDS

gene-based test, meta-analysis, summary statistics, REGENIE, burden, SKAT-O

TITLE

Computationally efficient meta-analysis of gene-based tests using summary statistics in large-scale genetic studies.

Main citation

Joseph TA, Mbatchou J, Ghosh A, Marcketta A, ...&, Marchini J. (2025) Computationally efficient meta-analysis of gene-based tests using summary statistics in large-scale genetic studies. Nat Genet, 57 (12) 3193-3200. doi:10.1038/s41588-025-02390-0. PMID 41225158

ABSTRACT

Meta-analysis of gene-based tests using single-variant summary statistics is a powerful strategy for genetic association studies. However, current approaches require sharing the covariance matrix between variants for each study and trait of interest. For large-scale studies with many phenotypes, these matrices can be cumbersome to calculate, store and share. Here, to address this challenge, we present REMETA, an efficient tool for meta-analysis of gene-based tests. REMETA uses a single sparse covariance reference file per study that is rescaled for each phenotype using single-variant summary statistics. We develop new methods for binary traits with case-control imbalance, and to estimate allele frequencies, genotype counts and effect sizes of burden tests. We demonstrate the performance and advantages of our approach through meta-analysis of five traits in 469,376 samples in UK Biobank. The open-source REMETA software will facilitate meta-analysis across large-scale exome sequencing studies from diverse studies that cannot easily be combined.

DOI

10.1038/s41588-025-02390-0

Meta-analysis

GWAMA

Tool

PUBMED_LINK

20509871

FULL NAME

Genome-Wide Association Meta-Analysis

DESCRIPTION

Software tool for meta-analysis of whole-genome association data.

URL

https://genomics.ut.ee/en/tools

TITLE

GWAMA: software for genome-wide association meta-analysis.

Main citation

Mägi R, Morris AP. (2010) GWAMA: software for genome-wide association meta-analysis. BMC Bioinformatics, 11, 288. doi:10.1186/1471-2105-11-288. PMID 20509871

ABSTRACT

BACKGROUND: Despite the recent success of genome-wide association studies in identifying novel loci contributing effects to complex human traits, such as type 2 diabetes and obesity, much of the genetic component of variation in these phenotypes remains unexplained. One way to improve power to detect further novel loci is through meta-analysis of studies from the same population, increasing the sample size over any individual study. Although statistical software analysis packages incorporate routines for meta-analysis, they are ill equipped to meet the challenges of the scale and complexity of data generated in genome-wide association studies. RESULTS: We have developed flexible, open-source software for the meta-analysis of genome-wide association studies. The software incorporates a variety of error trapping facilities, and provides a range of meta-analysis summary statistics. The software is distributed with scripts that allow simple formatting of files containing the results of each association study and generate graphical summaries of genome-wide meta-analysis results. CONCLUSIONS: The GWAMA (Genome-Wide Association Meta-Analysis) software has been developed to perform meta-analysis of summary statistics generated from genome-wide association studies of dichotomous phenotypes or quantitative traits. Software with source files, documentation and example data files are freely available online at http://www.well.ox.ac.uk/GWAMA.

DOI

10.1186/1471-2105-11-288

MANTRA

Tool

PUBMED_LINK

22125221

FULL NAME

Meta-ANalysis of Transethnic Association studies

KEYWORDS

cross-population

TITLE

Transethnic meta-analysis of genomewide association studies.

Main citation

Morris AP. (2011) Transethnic meta-analysis of genomewide association studies. Genet Epidemiol, 35 (8) 809-22. doi:10.1002/gepi.20630. PMID 22125221

ABSTRACT

The detection of loci contributing effects to complex human traits, and their subsequent fine-mapping for the location of causal variants, remains a considerable challenge for the genetics research community. Meta-analyses of genomewide association studies, primarily ascertained from European-descent populations, have made considerable advances in our understanding of complex trait genetics, although much of their heritability is still unexplained. With the increasing availability of genomewide association data from diverse populations, transethnic meta-analysis may offer an exciting opportunity to increase the power to detect novel complex trait loci and to improve the resolution of fine-mapping of causal variants by leveraging differences in local linkage disequilibrium structure between ethnic groups. However, we might also expect there to be substantial genetic heterogeneity between diverse populations, both in terms of the spectrum of causal variants and their allelic effects, which cannot easily be accommodated through traditional approaches to meta-analysis. In order to address this challenge, I propose novel transethnic meta-analysis methodology that takes account of the expected similarity in allelic effects between the most closely related populations, while allowing for heterogeneity between more diverse ethnic groups. This approach yields substantial improvements in performance, compared to fixed-effects meta-analysis, both in terms of power to detect association, and localization of the causal variant, over a range of models of heterogeneity between ethnic groups. Furthermore, when the similarity in allelic effects between populations is well captured by their relatedness, this approach has increased power and mapping resolution over random-effects meta-analysis.

DOI

10.1002/gepi.20630

METAL

Tool

PUBMED_LINK

20616382

DESCRIPTION

METAL is a tool for meta-analysis of genomewide association scans. METAL can combine either (a) test statistics and standard errors or (b) p-values across studies (taking sample size and direction of effect into account). METAL analysis is a convenient alternative to a direct analysis of merged data from multiple studies. It is especially appropriate when data from the individual studies cannot be analyzed together because of differences in ethnicity, phenotype distribution or gender, or because of constraints imposed on the sharing of individual-level data. Meta-analysis results in little or no loss of efficiency compared to analysis of a combined dataset including data from all individual studies.
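
The two combination schemes described above can be sketched in a few lines of Python. This is a minimal illustration of textbook inverse-variance weighting and sample-size-weighted Z-scores, not METAL's source code; the function names and toy numbers are assumptions for this example.

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a standard normal Z-score."""
    return math.erfc(abs(z) / math.sqrt(2.0))

def inverse_variance_meta(betas, ses):
    """Scheme (a): combine effect estimates using weights 1 / SE^2."""
    weights = [1.0 / se ** 2 for se in ses]
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    return beta, se, z, two_sided_p(z)

def sample_size_meta(z_scores, sample_sizes):
    """Scheme (b): combine signed Z-scores using weights sqrt(N).

    Each Z-score is assumed to already carry the direction of effect
    relative to the same reference allele in every study.
    """
    weights = [math.sqrt(n) for n in sample_sizes]
    z = sum(w * zi for w, zi in zip(weights, z_scores))
    z /= math.sqrt(sum(w * w for w in weights))
    return z, two_sided_p(z)

# Toy inputs: two studies with identical estimates.
beta, se, z_a, p_a = inverse_variance_meta([0.1, 0.1], [0.05, 0.05])
z_b, p_b = sample_size_meta([2.0, 2.0], [100, 100])
```

With two equally sized, concordant studies both schemes give the same combined Z-score (each single-study Z of 2 becomes 2 * sqrt(2)), which is the "little or no loss of efficiency" property in miniature.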

URL

https://genome.sph.umich.edu/wiki/METAL_Documentation

TITLE

METAL: fast and efficient meta-analysis of genomewide association scans.

Main citation

Willer CJ, Li Y, Abecasis GR. (2010) METAL: fast and efficient meta-analysis of genomewide association scans. Bioinformatics, 26 (17) 2190-1. doi:10.1093/bioinformatics/btq340. PMID 20616382

ABSTRACT

SUMMARY: METAL provides a computationally efficient tool for meta-analysis of genome-wide association scans, which is a commonly used approach for improving power in complex trait gene-mapping studies. METAL provides a rich scripting interface and implements efficient memory management to allow analyses of very large data sets and to support a variety of input file formats. AVAILABILITY AND IMPLEMENTATION: METAL, including source code, documentation, examples, and executables, is available at http://www.sph.umich.edu/csg/abecasis/metal/.

DOI

10.1093/bioinformatics/btq340

MR-MEGA

Tool

PUBMED_LINK

28911207

FULL NAME

Meta-Regression of Multi-AncEstry Genetic Association

DESCRIPTION

MR-MEGA (Meta-Regression of Multi-AncEstry Genetic Association) is a tool to detect and fine-map complex trait association signals via multi-ancestry meta-regression. This approach uses genome-wide metrics of diversity between populations to derive axes of genetic variation via multi-dimensional scaling [Purcell 2007]. Allelic effects of a variant across GWAS, weighted by their corresponding standard errors, can then be modelled in a linear regression framework, including the axes of genetic variation as covariates. The flexibility of this model enables partitioning of the heterogeneity into components due to ancestry and residual variation, which would be expected to improve fine-mapping resolution.
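
The meta-regression described above can be illustrated with a small weighted least-squares fit. The sketch below is a simplification under stated assumptions: it uses a single axis of genetic variation (supplied directly rather than derived by multi-dimensional scaling), weights each GWAS by 1 / SE^2, and splits Cochran's Q into an ancestry-correlated part and a residual part. The function name and returned keys are hypothetical and do not reflect the MR-MEGA software's interface.

```python
def meta_regression_one_axis(betas, ses, axis):
    """Weighted regression of allelic effects on one ancestry axis.

    betas -- per-GWAS allelic effect estimates
    ses   -- their standard errors
    axis  -- per-GWAS coordinate on one axis of genetic variation
    """
    w = [1.0 / se ** 2 for se in ses]
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, axis)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, betas)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, axis))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar)
              for wi, xi, yi in zip(w, axis, betas))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Heterogeneity around the fixed-effects mean (Cochran's Q).
    q_total = sum(wi * (yi - y_bar) ** 2 for wi, yi in zip(w, betas))
    # Heterogeneity left after accounting for the ancestry axis.
    q_residual = sum(wi * (yi - intercept - slope * xi) ** 2
                     for wi, xi, yi in zip(w, axis, betas))
    return {
        "intercept": intercept,
        "slope": slope,
        "q_ancestry": q_total - q_residual,
        "q_residual": q_residual,
    }

# Toy inputs: effects that vary exactly linearly along the axis.
fit = meta_regression_one_axis([0.1, 0.2, 0.3], [0.05] * 3, [-1.0, 0.0, 1.0])
```

In this toy case all heterogeneity is explained by the axis, so the residual component is essentially zero; with several axes the same partition follows from a multiple weighted regression.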

URL

https://genomics.ut.ee/en/tools

KEYWORDS

cross-population, Meta-Regression

TITLE

Trans-ethnic meta-regression of genome-wide association studies accounting for ancestry increases power for discovery and improves fine-mapping resolution.

Main citation

Mägi R, Horikoshi M, Sofer T, Mahajan A, ...&, Morris AP. (2017) Trans-ethnic meta-regression of genome-wide association studies accounting for ancestry increases power for discovery and improves fine-mapping resolution. Hum Mol Genet, 26 (18) 3639-3650. doi:10.1093/hmg/ddx280. PMID 28911207

ABSTRACT

Trans-ethnic meta-analysis of genome-wide association studies (GWAS) across diverse populations can increase power to detect complex trait loci when the underlying causal variants are shared between ancestry groups. However, heterogeneity in allelic effects between GWAS at these loci can occur that is correlated with ancestry. Here, a novel approach is presented to detect SNP association and quantify the extent of heterogeneity in allelic effects that is correlated with ancestry. We employ trans-ethnic meta-regression to model allelic effects as a function of axes of genetic variation, derived from a matrix of mean pairwise allele frequency differences between GWAS, and implemented in the MR-MEGA software. Through detailed simulations, we demonstrate increased power to detect association for MR-MEGA over fixed- and random-effects meta-analysis across a range of scenarios of heterogeneity in allelic effects between ethnic groups. We also demonstrate improved fine-mapping resolution, in loci containing a single causal variant, compared to these meta-analysis approaches and PAINTOR, and equivalent performance to MANTRA at reduced computational cost. Application of MR-MEGA to trans-ethnic GWAS of kidney function in 71,461 individuals indicates stronger signals of association than fixed-effects meta-analysis when heterogeneity in allelic effects is correlated with ancestry. Application of MR-MEGA to fine-mapping four type 2 diabetes susceptibility loci in 22,086 cases and 42,539 controls highlights: (i) strong evidence for heterogeneity in allelic effects that is correlated with ancestry only at the index SNP for the association signal at the CDKAL1 locus; and (ii) 99% credible sets with six or fewer variants for five distinct association signals.

DOI

10.1093/hmg/ddx280

Multi-trait

ASSET

Tool

PUBMED_LINK

22560090

FULL NAME

association analysis based on subsets

URL

https://github.com/sbstatgen/ASSET

TITLE

A subset-based approach improves power and interpretation for the combined analysis of genetic association studies of heterogeneous traits.

Main citation

Bhattacharjee S, Rajaraman P, Jacobs KB, Wheeler WA, ...&, Chatterjee N. (2012) A subset-based approach improves power and interpretation for the combined analysis of genetic association studies of heterogeneous traits. Am J Hum Genet, 90 (5) 821-35. doi:10.1016/j.ajhg.2012.03.015. PMID 22560090

ABSTRACT

Pooling genome-wide association studies (GWASs) increases power but also poses methodological challenges because studies are often heterogeneous. For example, combining GWASs of related but distinct traits can provide promising directions for the discovery of loci with small but common pleiotropic effects. Classical approaches for meta-analysis or pooled analysis, however, might not be suitable for such analysis because individual variants are likely to be associated with only a subset of the traits or might demonstrate effects in different directions. We propose a method that exhaustively explores subsets of studies for the presence of true association signals that are in either the same direction or possibly opposite directions. An efficient approximation is used for rapid evaluation of p values. We present two illustrative applications, one for a meta-analysis of separate case-control studies of six distinct cancers and another for pooled analysis of a case-control study of glioma, a class of brain tumors that contains heterogeneous subtypes. Both the applications and additional simulation studies demonstrate that the proposed methods offer improved power and more interpretable results when compared to traditional methods for the analysis of heterogeneous traits. The proposed framework has applications beyond genetic association studies.
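
The exhaustive subset search mentioned in the abstract can be sketched as follows. This is an illustrative simplification, not the ASSET package's code: it considers only effects in the same (positive) direction, combines studies with sqrt(N) weights, and returns the subset with the largest combined Z-score. The function name and toy numbers are assumptions for this example.

```python
import math
from itertools import combinations

def best_subset(z_scores, sample_sizes):
    """Return (subset_indices, combined_z) maximizing the one-sided Z."""
    best = None
    indices = range(len(z_scores))
    for size in range(1, len(z_scores) + 1):
        for subset in combinations(indices, size):
            num = sum(math.sqrt(sample_sizes[i]) * z_scores[i]
                      for i in subset)
            den = math.sqrt(sum(sample_sizes[i] for i in subset))
            z = num / den
            if best is None or z > best[1]:
                best = (subset, z)
    return best

# Toy inputs: the variant acts on studies 0 and 2 but not on study 1.
subset, z_max = best_subset([3.0, 0.1, 2.9], [1000, 1000, 1000])
```

Because the maximum is taken over many subsets, the resulting statistic is not standard normal, so its p-value has to account for the search; the abstract notes that an efficient approximation is used for that step, which this sketch does not attempt.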

DOI

10.1016/j.ajhg.2012.03.015

FactorGo

Tool

PUBMED_LINK

37879338

FULL NAME

Factor analysis model in Genetic assOciation

DESCRIPTION

FactorGo is a scalable variational factor analysis model that learns pleiotropic factors using GWAS summary statistics.

URL

https://github.com/mancusolab/FactorGo

KEYWORDS

pleiotropy, factor analysis

TITLE

A scalable approach to characterize pleiotropy across thousands of human diseases and complex traits using GWAS summary statistics.

Main citation

Zhang Z, Jung J, Kim A, Suboc N, ...&, Mancuso N. (2023) A scalable approach to characterize pleiotropy across thousands of human diseases and complex traits using GWAS summary statistics. Am J Hum Genet, 110 (11) 1863-1874. doi:10.1016/j.ajhg.2023.09.015. PMID 37879338

ABSTRACT

Genome-wide association studies (GWASs) across thousands of traits have revealed the pervasive pleiotropy of trait-associated genetic variants. While methods have been proposed to characterize pleiotropic components across groups of phenotypes, scaling these approaches to ultra-large-scale biobanks has been challenging. Here, we propose FactorGo, a scalable variational factor analysis model to identify and characterize pleiotropic components using biobank GWAS summary data. In extensive simulations, we observe that FactorGo outperforms the state-of-the-art (model-free) approach tSVD in capturing latent pleiotropic factors across phenotypes while maintaining a similar computational cost. We apply FactorGo to estimate 100 latent pleiotropic factors from GWAS summary data of 2,483 phenotypes measured in European-ancestry Pan-UK BioBank individuals (N = 420,531). Next, we find that factors from FactorGo are more enriched with relevant tissue-specific annotations than those identified by tSVD (p = 2.58E-10) and validate our approach by recapitulating brain-specific enrichment for BMI and the height-related connection between reproductive system and muscular-skeletal growth. Finally, our analyses suggest shared etiologies between rheumatoid arthritis and periodontal condition in addition to alkaline phosphatase as a candidate prognostic biomarker for prostate cancer. Overall, FactorGo improves our biological understanding of shared etiologies across thousands of GWASs.

DOI

10.1016/j.ajhg.2023.09.015

GLEANR

Tool

PUBMED_LINK

40730164

FULL NAME

GWAS latent embeddings accounting for noise and regularization

DESCRIPTION

GLEANR is a GWAS matrix factorization tool to estimate sparse latent pleiotropic genetic factors. Factors map traits to a distribution of SNP effects that may capture biological pathways or mechanisms shared by these traits.

URL

https://github.com/aomdahl/gleanr

TITLE

Sparse matrix factorization robust to sample sharing across GWASs reveals interpretable genetic components.

Main citation

Omdahl AR, Weinstock JS, Keener R, Chhetri SB, ...&, Battle A. (2025) Sparse matrix factorization robust to sample sharing across GWASs reveals interpretable genetic components. Am J Hum Genet, 112 (9) 2178-2197. doi:10.1016/j.ajhg.2025.07.003. PMID 40730164

ABSTRACT

Complex trait-associated genetic variation is highly pleiotropic. This extensive pleiotropy implies that multi-phenotype analyses are informative for characterizing genetic associations, as they facilitate the discovery of trait-shared and trait-specific variants and pathways ("genetic factors"). Previous efforts have estimated genetic factors using matrix factorization (MF) applied to numerous genome-wide association studies (GWASs). However, existing methods are susceptible to spurious factors arising from residual confounding due to sample sharing in biobank GWASs. Furthermore, MF approaches have historically estimated dense factors, loaded on most traits and variants, that are challenging to map onto interpretable biological pathways. To address these shortcomings, we introduce "GWAS latent embeddings accounting for noise and regularization" (GLEANR), an MF method for detection of sparse genetic factors from summary statistics. GLEANR accounts for sample sharing between studies and uses regularization to estimate a data-driven number of interpretable factors. GLEANR is robust to confounding induced by shared samples and improves the replication of genetic factors derived from distinct biobanks. We used GLEANR to evaluate 137 diverse GWASs from the UK Biobank, identifying 58 factors that decompose the genetic architecture of input traits and have distinct signatures of negative selection and degrees of polygenicity. These sparse factors can be interpreted with respect to disease, cell type, and pathway enrichment. We highlight three such factors that captured platelet-measure phenotypes and were enriched for disease-relevant markers corresponding to distinct stages of platelet differentiation. Overall, GLEANR is a powerful tool for discovering both trait-specific and trait-shared pathways underlying complex traits from GWAS summary statistics.

DOI

10.1016/j.ajhg.2025.07.003

Galesloot

Tool

PUBMED_LINK

24763738

TITLE

A comparison of multivariate genome-wide association methods.

Main citation

Galesloot TE, van Steen K, Kiemeney LA, Janss LL, ...&, Vermeulen SH. (2014) A comparison of multivariate genome-wide association methods. PLoS One, 9 (4) e95923. doi:10.1371/journal.pone.0095923. PMID 24763738

ABSTRACT

Joint association analysis of multiple traits in a genome-wide association study (GWAS), i.e. a multivariate GWAS, offers several advantages over analyzing each trait in a separate GWAS. In this study we directly compared a number of multivariate GWAS methods using simulated data. We focused on six methods that are implemented in the software packages PLINK, SNPTEST, MultiPhen, BIMBAM, PCHAT and TATES, and also compared them to standard univariate GWAS, analysis of the first principal component of the traits, and meta-analysis of univariate results. We simulated data (N = 1000) for three quantitative traits and one bi-allelic quantitative trait locus (QTL), and varied the number of traits associated with the QTL (explained variance 0.1%), minor allele frequency of the QTL, residual correlation between the traits, and the sign of the correlation induced by the QTL relative to the residual correlation. We compared the power of the methods using empirically fixed significance thresholds (α = 0.05). Our results showed that the multivariate methods implemented in PLINK, SNPTEST, MultiPhen and BIMBAM performed best for the majority of the tested scenarios, with a notable increase in power for scenarios with an opposite sign of genetic and residual correlation. All multivariate analyses resulted in a higher power than univariate analyses, even when only one of the traits was associated with the QTL. Hence, use of multivariate GWAS methods can be recommended, even when genetic correlations between traits are weak.

DOI

10.1371/journal.pone.0095923

Genomic-SEM

Tool

PUBMED_LINK

30962613

FULL NAME

genomic structural equation modelling

DESCRIPTION

An R package that allows the user to fit structural equation models based on the summary statistics obtained from genome-wide association studies (GWAS).

URL

https://github.com/GenomicSEM/GenomicSEM

KEYWORDS

SEM

TITLE

Genomic structural equation modelling provides insights into the multivariate genetic architecture of complex traits.

Main citation

Grotzinger AD, Rhemtulla M, de Vlaming R, Ritchie SJ, ...&, Tucker-Drob EM. (2019) Genomic structural equation modelling provides insights into the multivariate genetic architecture of complex traits. Nat Hum Behav, 3 (5) 513-525. doi:10.1038/s41562-019-0566-x. PMID 30962613

ABSTRACT

Genetic correlations estimated from genome-wide association studies (GWASs) reveal pervasive pleiotropy across a wide variety of phenotypes. We introduce genomic structural equation modelling (genomic SEM): a multivariate method for analysing the joint genetic architecture of complex traits. Genomic SEM synthesizes genetic correlations and single-nucleotide polymorphism heritabilities inferred from GWAS summary statistics of individual traits from samples with varying and unknown degrees of overlap. Genomic SEM can be used to model multivariate genetic associations among phenotypes, identify variants with effects on general dimensions of cross-trait liability, calculate more predictive polygenic scores and identify loci that cause divergence between traits. We demonstrate several applications of genomic SEM, including a joint analysis of summary statistics from five psychiatric traits. We identify 27 independent single-nucleotide polymorphisms not previously identified in the contributing univariate GWASs. Polygenic scores from genomic SEM consistently outperform those from univariate GWASs. Genomic SEM is flexible and open ended, and allows for continuous innovation in multivariate genetic analysis.

DOI

10.1038/s41562-019-0566-x

HIPO

Tool

PUBMED_LINK

30289880

FULL NAME

heritability informed power optimization

DESCRIPTION

hipo is an R package that performs heritability informed power optimization (HIPO) for conducting multi-trait association analysis on summary level data.

URL

https://github.com/gqi/hipo

TITLE

Heritability informed power optimization (HIPO) leads to enhanced detection of genetic associations across multiple traits.

Main citation

Qi G, Chatterjee N. (2018) Heritability informed power optimization (HIPO) leads to enhanced detection of genetic associations across multiple traits. PLoS Genet, 14 (10) e1007549. doi:10.1371/journal.pgen.1007549. PMID 30289880

ABSTRACT

Genome-wide association studies have shown that pleiotropy is a common phenomenon that can potentially be exploited for enhanced detection of susceptibility loci. We propose heritability informed power optimization (HIPO) for conducting powerful pleiotropic analysis using summary-level association statistics. We find optimal linear combinations of association coefficients across traits that are expected to maximize non-centrality parameter for the underlying test statistics, taking into account estimates of heritability, sample size variations and overlaps across the traits. Simulation studies show that the proposed method has correct type I error, robust to population stratification and leads to desired genome-wide enrichment of association signals. Application of the proposed method to publicly available data for three groups of genetically related traits, lipids (N = 188,577), psychiatric diseases (Ncase = 33,332, Ncontrol = 27,888) and social science traits (N ranging between 161,460 to 298,420 across individual traits) increased the number of genome-wide significant loci by 12%, 200% and 50%, respectively, compared to those found by analysis of individual traits. Evidence of replication is present for many of these loci in subsequent larger studies for individual traits. HIPO can potentially be extended to high-dimensional phenotypes as a way of dimension reduction to maximize power for subsequent genetic association testing.

DOI

10.1371/journal.pgen.1007549

JASS

Tool

PUBMED_LINK

32002517

FULL NAME

Joint Analysis of Summary Statistics

DESCRIPTION

JASS is a Python package that handles the computation of joint statistics over sets of selected GWAS results and the interactive exploration of the results through a web interface. Joint statistics over a set of selected studies, and static plots to display the results, can be generated from the command line interface. These functionalities can also be accessed through a web application embedded in the Python package, which also enables exploration of the results through a dynamic JavaScript interface. The JASS analysis module handles the data processing, from the import of the data to the computation of the joint statistics and the generation of the various static plots that illustrate the results. Pre-processing of raw GWAS data can be performed through a companion script provided with the JASS package.

URL

https://gitlab.pasteur.fr/statistical-genetics/jass

TITLE

JASS: command line and web interface for the joint analysis of GWAS results.

Main citation

Julienne H, Lechat P, Guillemot V, Lasry C, ...&, Aschard H. (2020) JASS: command line and web interface for the joint analysis of GWAS results. NAR Genom Bioinform, 2 (1) lqaa003. doi:10.1093/nargab/lqaa003. PMID 32002517

ABSTRACT

Genome-wide association study (GWAS) has been the driving force for identifying association between genetic variants and human phenotypes. Thousands of GWAS summary statistics covering a broad range of human traits and diseases are now publicly available. These GWAS have proven their utility for a range of secondary analyses, including in particular the joint analysis of multiple phenotypes to identify new associated genetic variants. However, although several methods have been proposed, there are very few large-scale applications published so far because of challenges in implementing these methods on real data. Here, we present JASS (Joint Analysis of Summary Statistics), a polyvalent Python package that addresses this need. Our package incorporates recently developed joint tests such as the omnibus approach and various weighted sum of Z-score tests while solving all practical and computational barriers for large-scale multivariate analysis of GWAS summary statistics. This includes data cleaning and harmonization tools, an efficient algorithm for fast derivation of joint statistics, an optimized data management process and a web interface for exploration purposes. Both benchmark analyses and real data applications demonstrated the robustness and strong potential of JASS for the detection of new associated genetic variants. Our package is freely available at https://gitlab.pasteur.fr/statistical-genetics/jass.
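
The two families of joint tests named in the abstract can be illustrated with a short Python sketch. It shows a two-trait omnibus statistic and a weighted sum of Z-scores under the assumption that the correlation of the Z-scores under the null is known; the function names are hypothetical and are not part of the JASS API.

```python
import math

def omnibus_two_traits(z1, z2, rho):
    """Omnibus test z' R^-1 z for two traits with null correlation rho.

    The statistic follows a chi-square with 2 df under the null,
    whose upper tail probability is exp(-x / 2).
    """
    stat = (z1 * z1 - 2.0 * rho * z1 * z2 + z2 * z2) / (1.0 - rho * rho)
    return stat, math.exp(-stat / 2.0)

def weighted_sum_z(z_scores, weights, corr):
    """Weighted sum of Z-scores, standardized by its null variance."""
    num = sum(w * z for w, z in zip(weights, z_scores))
    var = sum(weights[j] * weights[k] * corr[j][k]
              for j in range(len(weights))
              for k in range(len(weights)))
    z = num / math.sqrt(var)
    return z, math.erfc(abs(z) / math.sqrt(2.0))

# Toy inputs: opposite-sign effects on two positively correlated traits.
stat, p_omni = omnibus_two_traits(2.0, -2.0, 0.5)
z_sum, p_sum = weighted_sum_z([2.0, -2.0], [1.0, 1.0],
                              [[1.0, 0.5], [0.5, 1.0]])
```

In this toy case the omnibus statistic is large while the equally weighted sum cancels to zero, which shows why the choice of joint test matters when effects differ in sign across traits.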

DOI

10.1093/nargab/lqaa003

LCP-GWAS

Tool

PUBMED_LINK

33110245

FULL NAME

Linear Combination Phenotype GWAS

KEYWORDS

multivariate GWAS follow-up analyses

TITLE

An expanded analysis framework for multivariate GWAS connects inflammatory biomarkers to functional variants and disease.

Main citation

Ruotsalainen SE, Partanen JJ, Cichonska A, Lin J, ...&, Koskela J. (2021) An expanded analysis framework for multivariate GWAS connects inflammatory biomarkers to functional variants and disease. Eur J Hum Genet, 29 (2) 309-324. doi:10.1038/s41431-020-00730-8. PMID 33110245

ABSTRACT

Multivariate methods are known to increase the statistical power to detect associations in the case of shared genetic basis between phenotypes. They have, however, lacked essential analytic tools to follow-up and understand the biology underlying these associations. We developed a novel computational workflow for multivariate GWAS follow-up analyses, including fine-mapping and identification of the subset of traits driving associations (driver traits). Many follow-up tools require univariate regression coefficients which are lacking from multivariate results. Our method overcomes this problem by using Canonical Correlation Analysis to turn each multivariate association into its optimal univariate Linear Combination Phenotype (LCP). This enables an LCP-GWAS, which in turn generates the statistics required for follow-up analyses. We implemented our method on 12 highly correlated inflammatory biomarkers in a Finnish population-based study. Altogether, we identified 11 associations, four of which (F5, ABO, C1orf140 and PDGFRB) were not detected by biomarker-specific analyses. Fine-mapping identified 19 signals within the 11 loci and driver trait analysis determined the traits contributing to the associations. A phenome-wide association study on the 19 representative variants from the signals in 176,899 individuals from the FinnGen study revealed 53 disease associations (p < 1 × 10⁻⁴). Several reported pQTLs in the 11 loci provided orthogonal evidence for the biologically relevant functions of the representative variants. Our novel multivariate analysis workflow provides a powerful addition to standard univariate GWAS analyses by enabling multivariate GWAS follow-up and thus promoting the advancement of powerful multivariate methods in genomics.
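
The step that turns a multivariate association into a Linear Combination Phenotype can be sketched for the simplest case of one variant and two traits. When one side of a canonical correlation analysis is a single variable, the canonical weights on the other side are proportional to the inverse trait correlation matrix times the trait-genotype correlations. The code below applies that general result under the assumption that traits and genotype are standardized; the function names are hypothetical and this is not the authors' workflow.

```python
import math

def lcp_weights_two_traits(r12, r1g, r2g):
    """Canonical weights for two traits against one genotype.

    r12 -- correlation between trait 1 and trait 2
    r1g -- correlation between trait 1 and the genotype
    r2g -- correlation between trait 2 and the genotype
    Returns ((a1, a2), canonical_correlation).
    """
    det = 1.0 - r12 * r12
    a1 = (r1g - r12 * r2g) / det
    a2 = (r2g - r12 * r1g) / det
    canonical_r = math.sqrt(a1 * r1g + a2 * r2g)
    return (a1, a2), canonical_r

def linear_combination_phenotype(trait1, trait2, weights):
    """Build the per-individual LCP from two standardized traits."""
    return [weights[0] * y1 + weights[1] * y2
            for y1, y2 in zip(trait1, trait2)]

# Toy inputs: uncorrelated traits with modest genotype correlations.
weights, canonical_r = lcp_weights_two_traits(0.0, 0.3, 0.4)
lcp = linear_combination_phenotype([1.0, -1.0], [0.5, 0.5], weights)
```

The resulting phenotype can then be analyzed with any univariate GWAS tool, which is what makes standard follow-up analyses such as fine-mapping applicable.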

DOI

10.1038/s41431-020-00730-8

MANOVA

Tool

FULL NAME

multivariate analysis of variance

MOSTest

Tool

PUBMED_LINK

32665545

FULL NAME

Multivariate Omnibus Statistical Test

DESCRIPTION

MOSTest is a tool for joint genetic analysis of multiple traits, using multivariate analysis to boost the power of discovering associated loci.

URL

https://github.com/precimed/mostest

TITLE

Understanding the genetic determinants of the brain with MOSTest.

Main citation

van der Meer D, Frei O, Kaufmann T, Shadrin AA, ...&, Dale AM. (2020) Understanding the genetic determinants of the brain with MOSTest. Nat Commun, 11 (1) 3512. doi:10.1038/s41467-020-17368-1. PMID 32665545

ABSTRACT

Regional brain morphology has a complex genetic architecture, consisting of many common polymorphisms with small individual effects. This has proven challenging for genome-wide association studies (GWAS). Due to the distributed nature of genetic signal across brain regions, multivariate analysis of regional measures may enhance discovery of genetic variants. Current multivariate approaches to GWAS are ill-suited for complex, large-scale data of this kind. Here, we introduce the Multivariate Omnibus Statistical Test (MOSTest), with an efficient computational design enabling rapid and reliable inference, and apply it to 171 regional brain morphology measures from 26,502 UK Biobank participants. At the conventional genome-wide significance threshold of α = 5 × 10⁻⁸, MOSTest identifies 347 genomic loci associated with regional brain morphology, more than any previous study, improving upon the discovery of established GWAS approaches more than threefold. Our findings implicate more than 5% of all protein-coding genes and provide evidence for gene sets involved in neuron development and differentiation.

DOI

10.1038/s41467-020-17368-1

MTAG

Tool

PUBMED_LINK

29292387

FULL NAME

Multi-Trait Analysis of GWAS

DESCRIPTION

mtag is a Python-based command-line tool for jointly analyzing multiple sets of GWAS summary statistics, as described by Turley et al. (2018). It can also be used to meta-analyze GWAS results.

URL

https://github.com/JonJala/mtag

KEYWORDS

Multi-trait

TITLE

Multi-trait analysis of genome-wide association summary statistics using MTAG.

Main citation

Turley P, Walters RK, Maghzian O, Okbay A, ...&, Benjamin DJ. (2018) Multi-trait analysis of genome-wide association summary statistics using MTAG. Nat Genet, 50 (2) 229-237. doi:10.1038/s41588-017-0009-4. PMID 29292387

ABSTRACT

We introduce multi-trait analysis of GWAS (MTAG), a method for joint analysis of summary statistics from genome-wide association studies (GWAS) of different traits, possibly from overlapping samples. We apply MTAG to summary statistics for depressive symptoms (N_eff = 354,862), neuroticism (N = 168,105), and subjective well-being (N = 388,538). As compared to the 32, 9, and 13 genome-wide significant loci identified in the single-trait GWAS (most of which are themselves novel), MTAG increases the number of associated loci to 64, 37, and 49, respectively. Moreover, association statistics from MTAG yield more informative bioinformatics analyses and increase the variance explained by polygenic scores by approximately 25%, matching theoretical expectations.

DOI

10.1038/s41588-017-0009-4

MV-PLINK (MQFAM)

Tool

PUBMED_LINK

19019849

TITLE

A multivariate test of association.

Main citation

Ferreira MA, Purcell SM. (2009) A multivariate test of association. Bioinformatics, 25 (1) 132-3. doi:10.1093/bioinformatics/btn563. PMID 19019849

ABSTRACT

Although genetic association studies often test multiple, related phenotypes, few formal multivariate tests of association are available. We describe a test of association that can be efficiently applied to large population-based designs. AVAILABILITY: A C++ implementation can be obtained from the authors.

DOI

10.1093/bioinformatics/btn563

MultiPhen

Tool

PUBMED_LINK

22567092

DESCRIPTION

Performs genetic association tests between SNPs (one at a time) and multiple phenotypes (separately or in a joint model).

URL

https://cran.r-project.org/web/packages/MultiPhen/index.html

TITLE

MultiPhen: joint model of multiple phenotypes can increase discovery in GWAS.

Main citation

O'Reilly PF, Hoggart CJ, Pomyen Y, Calboli FC, ...&, Coin LJ. (2012) MultiPhen: joint model of multiple phenotypes can increase discovery in GWAS. PLoS One, 7 (5) e34861. doi:10.1371/journal.pone.0034861. PMID 22567092

ABSTRACT

The genome-wide association study (GWAS) approach has discovered hundreds of genetic variants associated with diseases and quantitative traits. However, despite clinical overlap and statistical correlation between many phenotypes, GWAS are generally performed one-phenotype-at-a-time. Here we compare the performance of modelling multiple phenotypes jointly with that of the standard univariate approach. We introduce a new method and software, MultiPhen, that models multiple phenotypes simultaneously in a fast and interpretable way. By performing ordinal regression, MultiPhen tests the linear combination of phenotypes most associated with the genotypes at each SNP, and thus potentially captures effects hidden to single phenotype GWAS. We demonstrate via simulation that this approach provides a dramatic increase in power in many scenarios. There is a boost in power for variants that affect multiple phenotypes and for those that affect only one phenotype. While other multivariate methods have similar power gains, we describe several benefits of MultiPhen over these. In particular, we demonstrate that other multivariate methods that assume the genotypes are normally distributed, such as canonical correlation analysis (CCA) and MANOVA, can have highly inflated type-1 error rates when testing case-control or non-normal continuous phenotypes, while MultiPhen produces no such inflation. To test the performance of MultiPhen on real data we applied it to lipid traits in the Northern Finland Birth Cohort 1966 (NFBC1966). In these data MultiPhen discovers 21% more independent SNPs with known associations than the standard univariate GWAS approach, while applying MultiPhen in addition to the standard approach provides 37% increased discovery. The most associated linear combinations of the lipids estimated by MultiPhen at the leading SNPs accurately reflect the Friedewald Formula, suggesting that MultiPhen could be used to refine the definition of existing phenotypes or uncover novel heritable phenotypes.

DOI

10.1371/journal.pone.0034861

PCHAT

Tool

PUBMED_LINK

17922480

FULL NAME

principal component of heritability association test

TITLE

Pleiotropy and principal components of heritability combine to increase power for association analysis.

Main citation

Klei L, Luca D, Devlin B, Roeder K. (2008) Pleiotropy and principal components of heritability combine to increase power for association analysis. Genet Epidemiol, 32 (1) 9-19. doi:10.1002/gepi.20257. PMID 17922480

ABSTRACT

When many correlated traits are measured the potential exists to discover the coordinated control of these traits via genotyped polymorphisms. A common statistical approach to this problem involves assessing the relationship between each phenotype and each single nucleotide polymorphism (SNP) individually (PHN); and taking a Bonferroni correction for the effective number of independent tests conducted. Alternatively, one can apply a dimension reduction technique, such as estimation of principal components, and test for an association with the principal components of the phenotypes (PCP) rather than the individual phenotypes. Building on the work of Lange and colleagues we develop an alternative method based on the principal component of heritability (PCH). For each SNP the PCH approach reduces the phenotypes to a single trait that has a higher heritability than any other linear combination of the phenotypes. As a result, the association between a SNP and derived trait is often easier to detect than an association with any of the individual phenotypes or the PCP. When applied to unrelated subjects, PCH has a drawback. For each SNP it is necessary to estimate the vector of loadings that maximize the heritability over all phenotypes. We develop a method of iterated sample splitting that uses one portion of the data for training and the remainder for testing. This cross-validation approach maintains the type I error control and yet utilizes the data efficiently, resulting in a powerful test for association.

DOI

10.1002/gepi.20257

Porter

Tool

PUBMED_LINK

28287610

TITLE

Multivariate simulation framework reveals performance of multi-trait GWAS methods.

Main citation

Porter HF, O'Reilly PF. (2017) Multivariate simulation framework reveals performance of multi-trait GWAS methods. Sci Rep, 7, 38837. doi:10.1038/srep38837. PMID 28287610

ABSTRACT

Burgeoning availability of genome-wide association study (GWAS) results and national biobank data has led to growing interest in performing multi-trait genetic analyses. Numerous multi-trait GWAS methods that exploit either summary statistics or individual-level data have been developed, but their relative performance is unclear. Here we develop a simulation framework to model the complex networks underlying multivariate genetic epidemiology, enabling the vast model space of genetic effects on multiple correlated traits to be explored systematically. We perform a comprehensive comparison of the leading multi-trait GWAS methods, finding: (1) method performance is highly sensitive to the specific combination of genetic effects and phenotypic correlations, (2) most of the current multivariate methods have remarkably similar statistical power, and (3) multivariate methods may offer a substantial increase in the discovery of genetic variants over the standard univariate approach. We believe our findings offer the clearest picture to date of the relative performance of multi-trait GWAS methods and act as a guide for method selection. We provide a web application and open-source software program implementing our simulation framework, for: (i) further benchmarking of multivariate GWAS methods, (ii) power calculations for multivariate genetic studies, and (iii) generating data for testing any multivariate method in genetic epidemiology.

DOI

10.1038/srep38837

Salinas

Tool

PUBMED_LINK

29020254

TITLE

Statistical Analysis of Multiple Phenotypes in Genetic Epidemiologic Studies: From Cross-Phenotype Associations to Pleiotropy.

Main citation

Salinas YD, Wang Z, DeWan AT. (2018) Statistical Analysis of Multiple Phenotypes in Genetic Epidemiologic Studies: From Cross-Phenotype Associations to Pleiotropy. Am J Epidemiol, 187 (4) 855-863. doi:10.1093/aje/kwx296. PMID 29020254

ABSTRACT

In the context of genetics, pleiotropy refers to the phenomenon in which a single genetic locus affects more than 1 trait or disease. Genetic epidemiologic studies have identified loci associated with multiple phenotypes, and these cross-phenotype associations are often incorrectly interpreted as examples of pleiotropy. Pleiotropy is only one possible explanation for cross-phenotype associations. Cross-phenotype associations may also arise due to issues related to study design, confounder bias, or nongenetic causal links between the phenotypes under analysis. Therefore, it is necessary to dissect cross-phenotype associations carefully to uncover true pleiotropic loci. In this review, we describe statistical methods that can be used to identify robust statistical evidence of pleiotropy. First, we provide an overview of univariate and multivariate methods for discovery of cross-phenotype associations and highlight important considerations for choosing among available methods. Then, we describe how to dissect cross-phenotype associations by using mediation analysis. Pleiotropic loci provide insights into the mechanistic underpinnings of disease comorbidity, and they may serve as novel targets for interventions that simultaneously treat multiple diseases. Discerning between different types of cross-phenotype associations is necessary to realize the public health potential of pleiotropic loci.

DOI

10.1093/aje/kwx296

Stephens

Tool

PUBMED_LINK

23861737

TITLE

A unified framework for association analysis with multiple related phenotypes.

Main citation

Stephens M. (2013) A unified framework for association analysis with multiple related phenotypes. PLoS One, 8 (7) e65245. doi:10.1371/journal.pone.0065245. PMID 23861737

ABSTRACT

We consider the problem of assessing associations between multiple related outcome variables, and a single explanatory variable of interest. This problem arises in many settings, including genetic association studies, where the explanatory variable is genotype at a genetic variant. We outline a framework for conducting this type of analysis, based on Bayesian model comparison and model averaging for multivariate regressions. This framework unifies several common approaches to this problem, and includes both standard univariate and standard multivariate association tests as special cases. The framework also unifies the problems of testing for associations and explaining associations - that is, identifying which outcome variables are associated with genotype. This provides an alternative to the usual, but conceptually unsatisfying, approach of resorting to univariate tests when explaining and interpreting significant multivariate findings. The method is computationally tractable genome-wide for modest numbers of phenotypes (e.g. 5-10), and can be applied to summary data, without access to raw genotype and phenotype data. We illustrate the methods on both simulated examples, and to a genome-wide association study of blood lipid traits where we identify 18 potential novel genetic associations that were not identified by univariate analyses of the same data.

DOI

10.1371/journal.pone.0065245

TATES

Tool

PUBMED_LINK

23359524

FULL NAME

Trait-based Association Test that uses Extended Simes procedure

TITLE

TATES: efficient multivariate genotype-phenotype analysis for genome-wide association studies.

Main citation

van der Sluis S, Posthuma D, Dolan CV. (2013) TATES: efficient multivariate genotype-phenotype analysis for genome-wide association studies. PLoS Genet, 9 (1) e1003235. doi:10.1371/journal.pgen.1003235. PMID 23359524

ABSTRACT

To date, the genome-wide association study (GWAS) is the primary tool to identify genetic variants that cause phenotypic variation. As GWAS analyses are generally univariate in nature, multivariate phenotypic information is usually reduced to a single composite score. This practice often results in loss of statistical power to detect causal variants. Multivariate genotype-phenotype methods do exist but attain maximal power only in special circumstances. Here, we present a new multivariate method that we refer to as TATES (Trait-based Association Test that uses Extended Simes procedure), inspired by the GATES procedure proposed by Li et al (2011). For each component of a multivariate trait, TATES combines p-values obtained in standard univariate GWAS to acquire one trait-based p-value, while correcting for correlations between components. Extensive simulations, probing a wide variety of genotype-phenotype models, show that TATES's false positive rate is correct, and that TATES's statistical power to detect causal variants explaining 0.5% of the variance can be 2.5-9 times higher than the power of univariate tests based on composite scores and 1.5-2 times higher than the power of the standard MANOVA. Unlike other multivariate methods, TATES detects both genetic variants that are common to multiple phenotypes and genetic variants that are specific to a single phenotype, i.e. TATES provides a more complete view of the genetic architecture of complex traits. As the actual causal genotype-phenotype model is usually unknown and probably phenotypically and genetically complex, TATES, available as an open source program, constitutes a powerful new multivariate strategy that allows researchers to identify novel causal variants, while the complexity of traits is no longer a limiting factor.

DOI

10.1371/journal.pgen.1003235
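
The TATES abstract above describes combining per-trait p-values from standard univariate GWAS into one trait-based p-value with an extended Simes procedure that corrects for correlation between traits. The following is a minimal sketch of that combination rule only, not the TATES program. It assumes the caller supplies the effective numbers of independent p-values (deriving them from trait correlations is not reproduced here); the function name and the parameters `m_eff_all` and `m_eff_top` are hypothetical. With the defaults it reduces to the plain Simes test.

```python
def extended_simes(pvalues, m_eff_all=None, m_eff_top=None):
    """Combine per-trait p-values for one SNP with a Simes-type rule.

    pvalues   : univariate GWAS p-values for the same SNP, one per trait.
    m_eff_all : assumed effective number of independent p-values among all
                traits (hypothetical input; defaults to the plain count m).
    m_eff_top : assumed list whose entry j-1 is the effective number among
                the j smallest p-values (hypothetical input; defaults to j).
    """
    ordered = sorted(pvalues)
    m = len(ordered)
    if m == 0:
        raise ValueError("need at least one p-value")
    if m_eff_all is None:
        m_eff_all = float(m)
    if m_eff_top is None:
        m_eff_top = [float(j) for j in range(1, m + 1)]
    if len(m_eff_top) != m:
        raise ValueError("m_eff_top must have one entry per p-value")
    # Scale the j-th smallest p-value by m_eff_all / m_eff_top[j-1]
    # and keep the smallest scaled value, capped at 1.
    scaled = [m_eff_all * p / m_eff_top[j] for j, p in enumerate(ordered)]
    return min(1.0, min(scaled))


if __name__ == "__main__":
    # Independent traits: identical to the plain Simes test.
    print(extended_simes([0.01, 0.04, 0.03]))
    # Fully redundant traits: every effective number is 1.
    print(extended_simes([0.01, 0.04, 0.03], 1.0, [1.0, 1.0, 1.0]))
```

When the traits carry fully redundant information, every effective number is 1 and the combined value is simply the smallest p-value; when they are independent, the smallest p-value is multiplied by the number of traits unless a later ordered p-value gives a smaller scaled value.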

Yang

Tool

PUBMED_LINK

24748889

TITLE

Methods for Analyzing Multivariate Phenotypes in Genetic Association Studies.

Main citation

Yang Q, Wang Y. (2012) Methods for Analyzing Multivariate Phenotypes in Genetic Association Studies. J Probab Stat, 2012, 652569. doi:10.1155/2012/652569. PMID 24748889

ABSTRACT

Multivariate phenotypes are frequently encountered in genetic association studies. The purpose of analyzing multivariate phenotypes usually includes discovery of novel genetic variants of pleiotropy effects, that is, affecting multiple phenotypes, and the ultimate goal of uncovering the underlying genetic mechanism. In recent years, there have been new method development and application of existing statistical methods to such phenotypes. In this paper, we provide a review of the available methods for analyzing association between a single marker and a multivariate phenotype consisting of the same type of components (e.g., all continuous or all categorical) or different types of components (e.g., some are continuous and others are categorical). We also reviewed causal inference methods designed to test whether the detected association with the multivariate phenotype is truly pleiotropy or the genetic marker exerts its effects on some phenotypes through affecting the others.

DOI

10.1155/2012/652569

aMAT

Tool

PUBMED_LINK

32540950

FULL NAME

adaptive multi-trait association test

TITLE

Multi-trait Genome-Wide Analyses of the Brain Imaging Phenotypes in UK Biobank.

Main citation

Wu C. (2020) Multi-trait Genome-Wide Analyses of the Brain Imaging Phenotypes in UK Biobank. Genetics, 215 (4) 947-958. doi:10.1534/genetics.120.303242. PMID 32540950

ABSTRACT

Many genetic variants identified in genome-wide association studies (GWAS) are associated with multiple, sometimes seemingly unrelated, traits. This motivates multi-trait association analyses, which have successfully identified novel associated loci for many complex diseases. While appealing, most existing methods focus on analyzing a relatively small number of traits, and may yield inflated Type 1 error rates when a large number of traits need to be analyzed jointly. As deep phenotyping data are becoming rapidly available, we develop a novel method, referred to as aMAT (adaptive multi-trait association test), for multi-trait analysis of any number of traits. We applied aMAT to GWAS summary statistics for a set of 58 volumetric imaging derived phenotypes from the UK Biobank. aMAT had a genomic inflation factor of 1.04, indicating the Type 1 error rate was well controlled. More important, aMAT identified 24 distinct risk loci, 13 of which were ignored by standard GWAS. In comparison, the competing methods either had a suspicious genomic inflation factor or identified much fewer risk loci. Finally, four additional sets of traits have been analyzed and provided similar conclusions.

DOI

10.1534/genetics.120.303242

condFDR

Tool

PUBMED_LINK

23637625

FULL NAME

pleiotropy-informed conditional false discovery rate

DESCRIPTION

Uses GWAS summary statistics from two related traits to estimate conditional false discovery rates from conditional Q–Q curves, boosting discovery of variants that may fall below standard genome-wide thresholds in single-trait scans. The framework includes conjunction FDR for loci associated with both traits.

URL

https://pmc.ncbi.nlm.nih.gov/articles/PMC3636100/

KEYWORDS

Pleiotropy, conditional FDR, conjunction FDR, summary statistics, multi-trait

TITLE

Improved detection of common variants associated with schizophrenia and bipolar disorder using pleiotropy-informed conditional false discovery rate.

Main citation

Andreassen OA, Thompson WK, Schork AJ, Ripke S, Mattingsdal M, Kelsoe JR, Kendler KS, O'Donovan MC, Rujescu D, Werge T, Sklar P, et al. (2013) Improved detection of common variants associated with schizophrenia and bipolar disorder using pleiotropy-informed conditional false discovery rate. PLoS Genet, 9 (4) e1003455. doi:10.1371/journal.pgen.1003455. PMID 23637625

ABSTRACT

Several lines of evidence suggest that genome-wide association studies (GWAS) have the potential to explain more of the "missing heritability" of common complex phenotypes. However, reliable methods to identify a larger proportion of single nucleotide polymorphisms (SNPs) that impact disease risk are currently lacking. Here, we use a genetic pleiotropy-informed conditional false discovery rate (FDR) method on GWAS summary statistics data to identify new loci associated with schizophrenia (SCZ) and bipolar disorders (BD), two highly heritable disorders with significant missing heritability. Epidemiological and clinical evidence suggest similar disease characteristics and overlapping genes between SCZ and BD. Here, we computed conditional Q-Q curves of data from the Psychiatric Genome Consortium (SCZ; n = 9,379 cases and n = 7,736 controls; BD: n = 6,990 cases and n = 4,820 controls) to show enrichment of SNPs associated with SCZ as a function of association with BD and vice versa with a corresponding reduction in FDR. Applying the conditional FDR method, we identified 58 loci associated with SCZ and 35 loci associated with BD below the conditional FDR level of 0.05. Of these, 14 loci were associated with both SCZ and BD (conjunction FDR). Together, these findings show the feasibility of genetic pleiotropy-informed methods to improve gene discovery in SCZ and BD and indicate overlapping genetic mechanisms between these two disorders.

DOI

10.1371/journal.pgen.1003455
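
The condFDR entry above describes estimating a conditional false discovery rate for one trait given association with a second trait, plus a conjunction FDR for loci associated with both. The following is a naive empirical sketch of that idea, not the published pipeline: it assumes the conditional FDR of a SNP can be approximated by its primary-trait p-value divided by the empirical conditional distribution of primary p-values among SNPs whose second-trait p-value is at least as small, and it takes the conjunction FDR as the larger of the two conditional values. The function names are hypothetical, and the Q-Q curve construction mentioned in the description is not reproduced.

```python
def conditional_fdr(p_primary, p_conditional):
    """Naive empirical conditional FDR for each SNP (illustrative only).

    For SNP i, restrict to the stratum of SNPs whose conditional-trait
    p-value is <= that of SNP i, then divide the primary p-value by the
    fraction of the stratum with a primary p-value <= that of SNP i.
    """
    n = len(p_primary)
    if n != len(p_conditional):
        raise ValueError("both traits need one p-value per SNP")
    result = []
    for i in range(n):
        stratum = [k for k in range(n) if p_conditional[k] <= p_conditional[i]]
        hits = sum(1 for k in stratum if p_primary[k] <= p_primary[i])
        # SNP i is always inside its own stratum, so hits >= 1.
        result.append(min(1.0, p_primary[i] * len(stratum) / hits))
    return result


def conjunction_fdr(p_a, p_b):
    """Larger of the two conditional FDR values, per SNP."""
    a_given_b = conditional_fdr(p_a, p_b)
    b_given_a = conditional_fdr(p_b, p_a)
    return [max(x, y) for x, y in zip(a_given_b, b_given_a)]


if __name__ == "__main__":
    trait_a = [0.001, 0.5, 0.2, 0.01]   # toy p-values, not real data
    trait_b = [0.01, 0.8, 0.9, 0.02]
    print(conditional_fdr(trait_a, trait_b))
    print(conjunction_fdr(trait_a, trait_b))
```

This quadratic-time loop is only meant for reading. A genome-wide run would need sorted or binned lookups and some handling of linkage disequilibrium between SNPs.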

fastASSET

Tool

PUBMED_LINK

39143063

URL

https://github.com/gqi/fastASSET

TITLE

Genome-wide large-scale multi-trait analysis characterizes global patterns of pleiotropy and unique trait-specific variants.

Main citation

Qi G, Chhetri SB, Ray D, Dutta D, ...&, Chatterjee N. (2024) Genome-wide large-scale multi-trait analysis characterizes global patterns of pleiotropy and unique trait-specific variants. Nat Commun, 15 (1) 6985. doi:10.1038/s41467-024-51075-5. PMID 39143063

ABSTRACT

Genome-wide association studies (GWAS) have found widespread evidence of pleiotropy, but characterization of global patterns of pleiotropy remain highly incomplete due to insufficient power of current approaches. We develop fastASSET, a method that allows efficient detection of variant-level pleiotropic association across many traits. We analyze GWAS summary statistics of 116 complex traits of diverse types collected from the GRASP repository and large GWAS Consortia. We identify 2293 independent loci and find that the lead variants in nearly all these loci (~99%) to be associated with ≥ 2 traits (median = 6). We observe that degree of pleiotropy estimated from our study predicts that observed in the UK Biobank for a much larger number of traits (K = 4114) (correlation = 0.43, p-value < 2.2 × 10⁻¹⁶). Follow-up analyzes of 21 trait-specific variants indicate their link to the expression in trait-related tissues for a small number of genes involved in relevant biological processes. Our findings provide deeper insight into the nature of pleiotropy and leads to identification of highly trait-specific susceptibility variants.

DOI

10.1038/s41467-024-51075-5

metaCCA

Tool

PUBMED_LINK

27153689

FULL NAME

meta canonical correlation analysis

DESCRIPTION

metaCCA performs multivariate analysis of a single or multiple GWAS based on univariate regression coefficients. It allows multivariate representation of both phenotype and genotype. metaCCA extends the statistical technique of canonical correlation analysis to the setting where original individual-level records are not available, and employs a covariance shrinkage algorithm to achieve robustness.

URL

https://github.com/aalto-ics-kepaco/metaCCA-matlab

TITLE

metaCCA: summary statistics-based multivariate meta-analysis of genome-wide association studies using canonical correlation analysis.

Main citation

Cichonska A, Rousu J, Marttinen P, Kangas AJ, ...&, Pirinen M. (2016) metaCCA: summary statistics-based multivariate meta-analysis of genome-wide association studies using canonical correlation analysis. Bioinformatics, 32 (13) 1981-9. doi:10.1093/bioinformatics/btw052. PMID 27153689

ABSTRACT

MOTIVATION: A dominant approach to genetic association studies is to perform univariate tests between genotype-phenotype pairs. However, analyzing related traits together increases statistical power, and certain complex associations become detectable only when several variants are tested jointly. Currently, modest sample sizes of individual cohorts, and restricted availability of individual-level genotype-phenotype data across the cohorts limit conducting multivariate tests. RESULTS: We introduce metaCCA, a computational framework for summary statistics-based analysis of a single or multiple studies that allows multivariate representation of both genotype and phenotype. It extends the statistical technique of canonical correlation analysis to the setting where original individual-level records are not available, and employs a covariance shrinkage algorithm to achieve robustness. Multivariate meta-analysis of two Finnish studies of nuclear magnetic resonance metabolomics by metaCCA, using standard univariate output from the program SNPTEST, shows an excellent agreement with the pooled individual-level analysis of original data. Motivated by strong multivariate signals in the lipid genes tested, we envision that multivariate association testing using metaCCA has a great potential to provide novel insights from already published summary statistics from high-throughput phenotyping technologies. AVAILABILITY AND IMPLEMENTATION: Code is available at https://github.com/aalto-ics-kepaco CONTACTS: anna.cichonska@helsinki.fi or matti.pirinen@helsinki.fi SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

DOI

10.1093/bioinformatics/btw052

metaUSAT/metaMANOVA

Tool

PUBMED_LINK

29226385

FULL NAME

unified score-based association test

DESCRIPTION

metaUSAT is a data-adaptive statistical approach for testing genetic associations of multiple traits from single/multiple studies using univariate GWAS summary statistics. This multivariate meta-analysis method can appropriately account for overlapping samples (if any) and can potentially test binary and/or continuous traits.

URL

https://github.com/RayDebashree/metaUSAT

TITLE

Methods for meta-analysis of multiple traits using GWAS summary statistics.

Main citation

Ray D, Boehnke M. (2018) Methods for meta-analysis of multiple traits using GWAS summary statistics. Genet Epidemiol, 42 (2) 134-145. doi:10.1002/gepi.22105. PMID 29226385

ABSTRACT

Genome-wide association studies (GWAS) for complex diseases have focused primarily on single-trait analyses for disease status and disease-related quantitative traits. For example, GWAS on risk factors for coronary artery disease analyze genetic associations of plasma lipids such as total cholesterol, LDL-cholesterol, HDL-cholesterol, and triglycerides (TGs) separately. However, traits are often correlated and a joint analysis may yield increased statistical power for association over multiple univariate analyses. Recently several multivariate methods have been proposed that require individual-level data. Here, we develop metaUSAT (where USAT is unified score-based association test), a novel unified association test of a single genetic variant with multiple traits that uses only summary statistics from existing GWAS. Although the existing methods either perform well when most correlated traits are affected by the genetic variant in the same direction or are powerful when only a few of the correlated traits are associated, metaUSAT is designed to be robust to the association structure of correlated traits. metaUSAT does not require individual-level data and can test genetic associations of categorical and/or continuous traits. One can also use metaUSAT to analyze a single trait over multiple studies, appropriately accounting for overlapping samples, if any. metaUSAT provides an approximate asymptotic P-value for association and is computationally efficient for implementation at a genome-wide level. Simulation experiments show that metaUSAT maintains proper type-I error at low error levels. It has similar and sometimes greater power to detect association across a wide array of scenarios compared to existing methods, which are usually powerful for some specific association scenarios only. When applied to plasma lipids summary data from the METSIM and the T2D-GENES studies, metaUSAT detected genome-wide significant loci beyond the ones identified by univariate analyses. Evidence from larger studies suggest that the variants additionally detected by our test are, indeed, associated with lipid levels in humans. In summary, metaUSAT can provide novel insights into the genetic architecture of a common disease or traits.

DOI

10.1002/gepi.22105

mvGWAMA

Tool

PUBMED_LINK

30617256

FULL NAME

Multivariate Genome-Wide Association Meta-Analysis

DESCRIPTION

mvGWAMA is a Python script that performs a GWAS meta-analysis when there is sample overlap between the input studies.

URL

https://github.com/Kyoko-wtnb/mvGWAMA

TITLE

Genome-wide meta-analysis identifies new loci and functional pathways influencing Alzheimer's disease risk.

Main citation

Jansen IE, Savage JE, Watanabe K, Bryois J, ...&, Posthuma D. (2019) Genome-wide meta-analysis identifies new loci and functional pathways influencing Alzheimer's disease risk. Nat Genet, 51 (3) 404-413. doi:10.1038/s41588-018-0311-9. PMID 30617256

ABSTRACT

Alzheimer's disease (AD) is highly heritable and recent studies have identified over 20 disease-associated genomic loci. Yet these only explain a small proportion of the genetic variance, indicating that undiscovered loci remain. Here, we performed a large genome-wide association study of clinically diagnosed AD and AD-by-proxy (71,880 cases, 383,378 controls). AD-by-proxy, based on parental diagnoses, showed strong genetic correlation with AD (rg = 0.81). Meta-analysis identified 29 risk loci, implicating 215 potential causative genes. Associated genes are strongly expressed in immune-related tissues and cell types (spleen, liver, and microglia). Gene-set analyses indicate biological mechanisms involved in lipid-related processes and degradation of amyloid precursor proteins. We show strong genetic correlations with multiple health-related outcomes, and Mendelian randomization results suggest a protective effect of cognitive ability on AD risk. These results are a step forward in identifying the genetic factors that contribute to AD risk and add novel insights into the neurobiology of AD.

DOI

10.1038/s41588-018-0311-9

Rare-variant

Meta-SAIGE

Tool

PUBMED_LINK

41266648

DESCRIPTION

Meta-SAIGE performs scalable cohort-level rare-variant meta-analysis from study-level outputs, emphasizing accurate null calibration (including low-prevalence binary traits), computational efficiency via reuse of LD structure across phenotypes, and power close to pooled individual-level analysis with SAIGE-GENE+.

URL

https://meta-saige.leelabsg.org/, https://github.com/weizhouUMICH/SAIGE

KEYWORDS

rare variant, meta-analysis, SAIGE, summary statistics, type I error

TITLE

Scalable and accurate rare variant meta-analysis with Meta-SAIGE.

Main citation

Park E, Nam K, Jeong S, Keat K, ...&, Lee S. (2025) Scalable and accurate rare variant meta-analysis with Meta-SAIGE. Nat Genet, 57 (12) 3185-3192. doi:10.1038/s41588-025-02403-y. PMID 41266648

ABSTRACT

Meta-analysis enhances the power of rare variant association tests by combining summary statistics across several cohorts. However, existing methods often fail to control type I error for low-prevalence binary traits and are computationally intensive. Here we introduce Meta-SAIGE, a scalable method for rare variant meta-analysis that accurately estimates the null distribution to control type I error and reuses the linkage disequilibrium matrix across phenotypes to boost computational efficiency in phenome-wide analyses. Simulations using UK Biobank whole-exome sequencing data show that Meta-SAIGE effectively controls type I error and achieves power comparable to pooled individual-level analysis with SAIGE-GENE+. Applying Meta-SAIGE to 83 low-prevalence phenotypes in UK Biobank and All of Us whole-exome sequencing data identified 237 gene-trait associations. Notably, 80 of these associations were not significant in either dataset alone, underscoring the power of our meta-analysis.

DOI

10.1038/s41588-025-02403-y

MetaSKAT

Tool

PUBMED_LINK

23768515

DESCRIPTION

MetaSKAT is an R package for multiple-marker meta-analysis. It can carry out meta-analysis of SKAT, SKAT-O and burden tests using either individual-level genotype data or gene-level summary statistics.

URL

https://www.hsph.harvard.edu/skat/metaskat/

TITLE

General framework for meta-analysis of rare variants in sequencing association studies.

Main citation

Lee S, Teslovich TM, Boehnke M, Lin X. (2013) General framework for meta-analysis of rare variants in sequencing association studies. Am J Hum Genet, 93 (1) 42-53. doi:10.1016/j.ajhg.2013.05.010. PMID 23768515

ABSTRACT

We propose a general statistical framework for meta-analysis of gene- or region-based multimarker rare variant association tests in sequencing association studies. In genome-wide association studies, single-marker meta-analysis has been widely used to increase statistical power by combining results via regression coefficients and standard errors from different studies. In analysis of rare variants in sequencing studies, region-based multimarker tests are often used to increase power. We propose meta-analysis methods for commonly used gene- or region-based rare variants tests, such as burden tests and variance component tests. Because estimation of regression coefficients of individual rare variants is often unstable or not feasible, the proposed method avoids this difficulty by calculating score statistics instead that only require fitting the null model for each study and then aggregating these score statistics across studies. Our proposed meta-analysis rare variant association tests are conducted based on study-specific summary statistics, specifically score statistics for each variant and between-variant covariance-type (linkage disequilibrium) relationship statistics for each gene or region. The proposed methods are able to incorporate different levels of heterogeneity of genetic effects across studies and are applicable to meta-analysis of multiple ancestry groups. We show that the proposed methods are essentially as powerful as joint analysis by directly pooling individual level genotype data. We conduct extensive simulations to evaluate the performance of our methods by varying levels of heterogeneity across studies, and we apply the proposed methods to meta-analysis of rare variant effects in a multicohort study of the genetics of blood lipid levels.

DOI

10.1016/j.ajhg.2013.05.010
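
The MetaSKAT abstract above describes combining per-study score statistics and between-variant covariance matrices instead of regression coefficients. The following is a minimal sketch of that idea for a burden test only, assuming homogeneous effects across studies; the function name `meta_burden_test`, its argument layout, and the unit weights are hypothetical and do not reflect the MetaSKAT R API.

```python
import math


def meta_burden_test(study_scores, study_covs, weights):
    """Combine per-study score statistics into one burden test (illustrative).

    study_scores: list of per-study score vectors U_k (one entry per variant)
    study_covs:   list of per-study covariance matrices Phi_k (m x m nested lists)
    weights:      per-variant weights w (hypothetical choice, e.g. all 1.0)
    """
    m = len(weights)
    # Sum score vectors and covariance matrices across studies
    # (assumes the same genetic effects in every study).
    U = [sum(s[j] for s in study_scores) for j in range(m)]
    Phi = [[sum(c[i][j] for c in study_covs) for j in range(m)] for i in range(m)]
    # Burden statistic (w'U)^2 / (w' Phi w): chi-square with 1 df under the null.
    num = sum(weights[j] * U[j] for j in range(m)) ** 2
    den = sum(weights[i] * Phi[i][j] * weights[j]
              for i in range(m) for j in range(m))
    stat = num / den
    # Upper tail of a chi-square(1) variable: P(X > x) = erfc(sqrt(x / 2)).
    pval = math.erfc(math.sqrt(stat / 2.0))
    return stat, pval
```

Each study only needs to share its score vector and covariance matrix for the region, so no individual-level genotypes leave the study. SKAT-type variance component tests use the same inputs but require a mixture-of-chi-squares p-value, which this sketch omits.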

MetaSTAAR

Tool

PUBMED_LINK

36564505

DESCRIPTION

MetaSTAAR is an R package for performing the Meta-analysis of variant-Set Test for Association using Annotation infoRmation (MetaSTAAR) procedure in whole-genome sequencing (WGS) studies. MetaSTAAR enables functionally-informed rare variant meta-analysis of large WGS studies using an efficient, sparse matrix approach for storing summary statistics, while protecting data privacy of study participants and avoiding sharing subject-level data. MetaSTAAR accounts for relatedness and population structure of continuous and dichotomous traits, and boosts the power of rare variant meta-analysis by incorporating multiple variant functional annotations.

URL

https://github.com/xihaoli/MetaSTAAR

TITLE

Powerful, scalable and resource-efficient meta-analysis of rare variant associations in large whole genome sequencing studies.

Main citation

Li X, Quick C, Zhou H, Gaynor SM, ...&, Lin X. (2023) Powerful, scalable and resource-efficient meta-analysis of rare variant associations in large whole genome sequencing studies. Nat Genet, 55 (1) 154-164. doi:10.1038/s41588-022-01225-6. PMID 36564505

ABSTRACT

Meta-analysis of whole genome sequencing/whole exome sequencing (WGS/WES) studies provides an attractive solution to the problem of collecting large sample sizes for discovering rare variants associated with complex phenotypes. Existing rare variant meta-analysis approaches are not scalable to biobank-scale WGS data. Here we present MetaSTAAR, a powerful and resource-efficient rare variant meta-analysis framework for large-scale WGS/WES studies. MetaSTAAR accounts for relatedness and population structure, can analyze both quantitative and dichotomous traits and boosts the power of rare variant tests by incorporating multiple variant functional annotations. Through meta-analysis of four lipid traits in 30,138 ancestrally diverse samples from 14 studies of the Trans Omics for Precision Medicine (TOPMed) Program, we show that MetaSTAAR performs rare variant meta-analysis at scale and produces results comparable to using pooled data. Additionally, we identified several conditionally significant rare variant associations with lipid traits. We further demonstrate that MetaSTAAR is scalable to biobank-scale cohorts through meta-analysis of TOPMed WGS data and UK Biobank WES data of ~200,000 samples.

DOI

10.1038/s41588-022-01225-6

RareMETAL

Tool

PUBMED_LINK

24894501

DESCRIPTION

RAREMETAL is a program that facilitates the meta-analysis of rare variants from genotype arrays or sequencing.

URL

https://genome.sph.umich.edu/wiki/RAREMETAL

KEYWORDS

rare variants

TITLE

RAREMETAL: fast and powerful meta-analysis for rare variants.

Main citation

Feng S, Liu D, Zhan X, Wing MK, ...&, Abecasis GR. (2014) RAREMETAL: fast and powerful meta-analysis for rare variants. Bioinformatics, 30 (19) 2828-9. doi:10.1093/bioinformatics/btu367. PMID 24894501

ABSTRACT

SUMMARY: RAREMETAL is a computationally efficient tool for meta-analysis of rare variants genotyped using sequencing or arrays. RAREMETAL facilitates analyses of individual studies, accommodates a variety of input file formats, handles related and unrelated individuals, executes both single variant and burden tests and performs conditional association analyses. AVAILABILITY AND IMPLEMENTATION: http://genome.sph.umich.edu/wiki/RAREMETAL for executables, source code, documentation and tutorial.

DOI

10.1093/bioinformatics/btu367

SMMAT

Tool

PUBMED_LINK

30639324

FULL NAME

Variant-Set Mixed Model Association Tests

DESCRIPTION

For rare variant analysis from sequencing association studies, GMMAT performs the variant Set Mixed Model Association Tests (SMMAT) as proposed in Chen et al. (2019), including the burden test, the sequence kernel association test (SKAT), SKAT-O and an efficient hybrid test of the burden test and SKAT, based on user-defined variant sets.

URL

https://github.com/hanchenphd/GMMAT

TITLE

Efficient Variant Set Mixed Model Association Tests for Continuous and Binary Traits in Large-Scale Whole-Genome Sequencing Studies.

Main citation

Chen H, Huffman JE, Brody JA, Wang C, ...&, Lin X. (2019) Efficient Variant Set Mixed Model Association Tests for Continuous and Binary Traits in Large-Scale Whole-Genome Sequencing Studies. Am J Hum Genet, 104 (2) 260-274. doi:10.1016/j.ajhg.2018.12.012. PMID 30639324

ABSTRACT

With advances in whole-genome sequencing (WGS) technology, more advanced statistical methods for testing genetic association with rare variants are being developed. Methods in which variants are grouped for analysis are also known as variant-set, gene-based, and aggregate unit tests. The burden test and sequence kernel association test (SKAT) are two widely used variant-set tests, which were originally developed for samples of unrelated individuals and later have been extended to family data with known pedigree structures. However, computationally efficient and powerful variant-set tests are needed to make analyses tractable in large-scale WGS studies with complex study samples. In this paper, we propose the variant-set mixed model association tests (SMMAT) for continuous and binary traits using the generalized linear mixed model framework. These tests can be applied to large-scale WGS studies involving samples with population structure and relatedness, such as in the National Heart, Lung, and Blood Institute's Trans-Omics for Precision Medicine (TOPMed) program. SMMATs share the same null model for different variant sets, and a virtue of this null model, which includes covariates only, is that it needs to be fit only once for all tests in each genome-wide analysis. Simulation studies show that all the proposed SMMATs correctly control type I error rates for both continuous and binary traits in the presence of population structure and relatedness. We also illustrate our tests in a real data example of analysis of plasma fibrinogen levels in the TOPMed program (n = 23,763), using the Analysis Commons, a cloud-based computing platform.

DOI

10.1016/j.ajhg.2018.12.012
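
The SMMAT abstract above notes that the null model includes covariates only, so it is fit once and shared by every variant set. The sketch below illustrates that reuse pattern under strong simplifying assumptions: an intercept-only linear model on unrelated samples stands in for the generalized linear mixed model, and only a burden-type score test is shown. The function names `fit_null_model` and `burden_score_test` are hypothetical and are not part of the GMMAT package.

```python
import math


def fit_null_model(y):
    """Fit an intercept-only null model once; return residuals and variance.

    Simplification: SMMAT fits a GLMM with covariates and relatedness;
    here the null model is just the phenotype mean.
    """
    n = len(y)
    mean_y = sum(y) / n
    resid = [v - mean_y for v in y]
    sigma2 = sum(r * r for r in resid) / (n - 1)
    return resid, sigma2


def burden_score_test(genotypes, resid, sigma2, weights=None):
    """Burden score test for one variant set, reusing the null-model fit.

    genotypes: one list of allele counts per sample (samples x variants).
    Assumes the burden score is not constant across samples.
    """
    m = len(genotypes[0])
    w = weights if weights is not None else [1.0] * m
    # Collapse each sample's variants into a single weighted burden score.
    burden = [sum(w[j] * g[j] for j in range(m)) for g in genotypes]
    mean_b = sum(burden) / len(burden)
    # Score statistic and its null variance (burden centered for the intercept).
    U = sum(b * r for b, r in zip(burden, resid))
    V = sigma2 * sum((b - mean_b) ** 2 for b in burden)
    stat = U * U / V
    # Chi-square(1) upper tail: P(X > x) = erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2.0))
```

In a genome-wide scan, `fit_null_model` would be called once and its output passed to `burden_score_test` for every variant set, which is where the computational saving described in the abstract comes from.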