Sumstats General

MAIN ANCESTRY

EUR

FinnGen R10 (December 18 2023)

Summary statistics

PUBMED_LINK

DESCRIPTION

FinnGen data freeze R10 (18 Dec 2023) GWAS summary statistics; flagship FinnGen resource described in Kurki et al., Nature 2023.

URL

https://r10.finngen.fi/

TITLE

FinnGen provides genetic insights from a well-phenotyped isolated population.

Main citation

Kurki MI, Karjalainen J, Palta P, Sipilä TP, ... & Palotie A. (2023) FinnGen provides genetic insights from a well-phenotyped isolated population. Nature, 613 (7944) 508-518. doi:10.1038/s41586-022-05473-8. PMID 36653562

ABSTRACT

Population isolates such as those in Finland benefit genetic research because deleterious alleles are often concentrated on a small number of low-frequency variants (0.1% ≤ minor allele frequency < 5%). These variants survived the founding bottleneck rather than being distributed over a large number of ultrarare variants. Although this effect is well established in Mendelian genetics, its value in common disease genetics is less explored [1,2]. FinnGen aims to study the genome and national health register data of 500,000 Finnish individuals. Given the relatively high median age of participants (63 years) and the substantial fraction of hospital-based recruitment, FinnGen is enriched for disease end points. Here we analyse data from 224,737 participants from FinnGen and study 15 diseases that have previously been investigated in large genome-wide association studies (GWASs). We also include meta-analyses of biobank data from Estonia and the United Kingdom. We identified 30 new associations, primarily low-frequency variants, enriched in the Finnish population. A GWAS of 1,932 diseases also identified 2,733 genome-wide significant associations (893 phenome-wide significant (PWS), P < 2.6 × 10^-11) at 2,496 (771 PWS) independent loci with 807 (247 PWS) end points. Among these, fine-mapping implicated 148 (73 PWS) coding variants associated with 83 (42 PWS) end points. Moreover, 91 (47 PWS) had an allele frequency of <5% in non-Finnish European individuals, of which 62 (32 PWS) were enriched by more than twofold in Finland. These findings demonstrate the power of bottlenecked populations to find entry points into the biology of common diseases through low-frequency, high impact variants.

DOI

10.1038/s41586-022-05473-8

RELATED_BIOBANK

MAIN ANCESTRY

EUR

FinnGen R10-UKBB meta-analysis

Summary statistics

PUBMED_LINK

https://public-metaresults-fg-ukbb.finngen.fi

DESCRIPTION

Meta-analysis of FinnGen R10 with UK Biobank GWAS summary statistics (FinnGen distribution).

URL

TITLE

FinnGen provides genetic insights from a well-phenotyped isolated population.

Main citation

Kurki MI, Karjalainen J, Palta P, Sipilä TP, ... & Palotie A. (2023) FinnGen provides genetic insights from a well-phenotyped isolated population. Nature, 613 (7944) 508-518. doi:10.1038/s41586-022-05473-8. PMID 36653562

DOI

10.1038/s41586-022-05473-8

RELATED_BIOBANK

UK Biobank, FinnGen

MAIN ANCESTRY

EUR

FinnGen R11 (June 24 2024)

Summary statistics

PUBMED_LINK

DESCRIPTION

FinnGen data freeze R11 (24 Jun 2024) GWAS summary statistics; resource overview in Kurki et al., Nature 2023.

URL

https://r11.finngen.fi/

TITLE

FinnGen provides genetic insights from a well-phenotyped isolated population.

Main citation

Kurki MI, Karjalainen J, Palta P, Sipilä TP, ... & Palotie A. (2023) FinnGen provides genetic insights from a well-phenotyped isolated population. Nature, 613 (7944) 508-518. doi:10.1038/s41586-022-05473-8. PMID 36653562

DOI

10.1038/s41586-022-05473-8

RELATED_BIOBANK

MAIN ANCESTRY

EUR

FinnGen R12 (November 4 2024)

Summary statistics

PUBMED_LINK

DESCRIPTION

FinnGen data freeze R12 (4 Nov 2024) GWAS summary statistics; resource overview in Kurki et al., Nature 2023.

URL

https://r12.finngen.fi/

TITLE

FinnGen provides genetic insights from a well-phenotyped isolated population.

Main citation

Kurki MI, Karjalainen J, Palta P, Sipilä TP, ... & Palotie A. (2023) FinnGen provides genetic insights from a well-phenotyped isolated population. Nature, 613 (7944) 508-518. doi:10.1038/s41586-022-05473-8. PMID 36653562

DOI

10.1038/s41586-022-05473-8

RELATED_BIOBANK

MAIN ANCESTRY

EUR

FinnGen R12-UKBB meta-analysis

Summary statistics

PUBMED_LINK

https://metaresults-ukbb.finngen.fi/

DESCRIPTION

Meta-analysis of FinnGen R12 with UK Biobank GWAS summary statistics (FinnGen distribution).

URL

TITLE

FinnGen provides genetic insights from a well-phenotyped isolated population.

Main citation

Kurki MI, Karjalainen J, Palta P, Sipilä TP, ... & Palotie A. (2023) FinnGen provides genetic insights from a well-phenotyped isolated population. Nature, 613 (7944) 508-518. doi:10.1038/s41586-022-05473-8. PMID 36653562

DOI

10.1038/s41586-022-05473-8

RELATED_BIOBANK

UK Biobank, FinnGen

MAIN ANCESTRY

EUR

FinnGen R4 (November 30 2020)

Summary statistics

PUBMED_LINK

https://r4.finngen.fi/about

DESCRIPTION

FinnGen data freeze R4 (30 Nov 2020) GWAS summary statistics; resource overview in Kurki et al., Nature 2023.

URL

TITLE

FinnGen provides genetic insights from a well-phenotyped isolated population.

Main citation

Kurki MI, Karjalainen J, Palta P, Sipilä TP, ... & Palotie A. (2023) FinnGen provides genetic insights from a well-phenotyped isolated population. Nature, 613 (7944) 508-518. doi:10.1038/s41586-022-05473-8. PMID 36653562

DOI

10.1038/s41586-022-05473-8

RELATED_BIOBANK

MAIN ANCESTRY

EUR

FinnGen R5 (May 11 2021)

Summary statistics

PUBMED_LINK

https://r5.finngen.fi/about

DESCRIPTION

FinnGen data freeze R5 (11 May 2021) GWAS summary statistics; resource overview in Kurki et al., Nature 2023.

URL

TITLE

FinnGen provides genetic insights from a well-phenotyped isolated population.

Main citation

Kurki MI, Karjalainen J, Palta P, Sipilä TP, ... & Palotie A. (2023) FinnGen provides genetic insights from a well-phenotyped isolated population. Nature, 613 (7944) 508-518. doi:10.1038/s41586-022-05473-8. PMID 36653562

DOI

10.1038/s41586-022-05473-8

RELATED_BIOBANK

MAIN ANCESTRY

EUR

FinnGen R6 (January 24 2022)

Summary statistics

PUBMED_LINK

https://r6.finngen.fi/about

DESCRIPTION

FinnGen data freeze R6 (24 Jan 2022) GWAS summary statistics; resource overview in Kurki et al., Nature 2023.

URL

TITLE

FinnGen provides genetic insights from a well-phenotyped isolated population.

Main citation

Kurki MI, Karjalainen J, Palta P, Sipilä TP, ... & Palotie A. (2023) FinnGen provides genetic insights from a well-phenotyped isolated population. Nature, 613 (7944) 508-518. doi:10.1038/s41586-022-05473-8. PMID 36653562

DOI

10.1038/s41586-022-05473-8

RELATED_BIOBANK

MAIN ANCESTRY

EUR

FinnGen R7 (June 1 2022)

Summary statistics

PUBMED_LINK

https://r7.finngen.fi/about

DESCRIPTION

FinnGen data freeze R7 (1 Jun 2022) GWAS summary statistics; resource overview in Kurki et al., Nature 2023.

URL

TITLE

FinnGen provides genetic insights from a well-phenotyped isolated population.

Main citation

Kurki MI, Karjalainen J, Palta P, Sipilä TP, ... & Palotie A. (2023) FinnGen provides genetic insights from a well-phenotyped isolated population. Nature, 613 (7944) 508-518. doi:10.1038/s41586-022-05473-8. PMID 36653562

DOI

10.1038/s41586-022-05473-8

RELATED_BIOBANK

MAIN ANCESTRY

EUR

FinnGen R8 (December 1 2022)

Summary statistics

PUBMED_LINK

https://r8.finngen.fi/about

DESCRIPTION

FinnGen data freeze R8 (1 Dec 2022) GWAS summary statistics; resource overview in Kurki et al., Nature 2023.

URL

TITLE

FinnGen provides genetic insights from a well-phenotyped isolated population.

Main citation

Kurki MI, Karjalainen J, Palta P, Sipilä TP, ... & Palotie A. (2023) FinnGen provides genetic insights from a well-phenotyped isolated population. Nature, 613 (7944) 508-518. doi:10.1038/s41586-022-05473-8. PMID 36653562

DOI

10.1038/s41586-022-05473-8

RELATED_BIOBANK

MAIN ANCESTRY

EUR

FinnGen R9 (May 11 2023)

Summary statistics

PUBMED_LINK

https://r9.finngen.fi/about

DESCRIPTION

FinnGen data freeze R9 (11 May 2023) GWAS summary statistics; resource overview in Kurki et al., Nature 2023.

URL

TITLE

FinnGen provides genetic insights from a well-phenotyped isolated population.

Main citation

Kurki MI, Karjalainen J, Palta P, Sipilä TP, ... & Palotie A. (2023) FinnGen provides genetic insights from a well-phenotyped isolated population. Nature, 613 (7944) 508-518. doi:10.1038/s41586-022-05473-8. PMID 36653562

DOI

10.1038/s41586-022-05473-8

RELATED_BIOBANK

https://datashare.ed.ac.uk/handle/10283/844

MAIN ANCESTRY

EUR

Generation Scotland

Summary statistics

DESCRIPTION

Generation Scotland cohort GWAS summary statistics and related downloads.

URL

MAIN ANCESTRY

EUR

Global Biobank

Summary statistics

PUBMED_LINK

36777996

DESCRIPTION

Global Biobank Meta-analysis Initiative (GBMI) harmonized GWAS across many biobanks.

URL

http://results.globalbiobankmeta.org/

TITLE

Global Biobank Meta-analysis Initiative: Powering genetic discovery across human disease.

Main citation

Zhou W, Kanai M, Wu KH, Rasheed H, ... & Neale BM. (2022) Global Biobank Meta-analysis Initiative: Powering genetic discovery across human disease. Cell Genom, 2 (10) 100192. doi:10.1016/j.xgen.2022.100192. PMID 36777996

ABSTRACT

Biobanks facilitate genome-wide association studies (GWASs), which have mapped genomic loci across a range of human diseases and traits. However, most biobanks are primarily composed of individuals of European ancestry. We introduce the Global Biobank Meta-analysis Initiative (GBMI), a collaborative network of 23 biobanks from 4 continents representing more than 2.2 million consented individuals with genetic data linked to electronic health records. GBMI meta-analyzes summary statistics from GWASs generated using harmonized genotypes and phenotypes from member biobanks for 14 exemplar diseases and endpoints. This strategy validates that GWASs conducted in diverse biobanks can be integrated despite heterogeneity in case definitions, recruitment strategies, and baseline characteristics. This collaborative effort improves GWAS power for diseases, benefits understudied diseases, and improves risk prediction while also enabling the nomination of disease genes and drug candidates by incorporating gene and protein expression data and providing insight into the underlying biology of human diseases and traits.

DOI

10.1016/j.xgen.2022.100192

MAIN ANCESTRY

ALL

KoGES PheWeb

Summary statistics

DESCRIPTION

PheWeb instance for KoGES (Korean Genome and Epidemiology Study) GWAS summary statistics.

URL

https://koges.leelabsg.org/

MAIN ANCESTRY

EAS

KoreanChip

Summary statistics

PUBMED_LINK

30718733

DESCRIPTION

GWAS summary statistics based on the Korea Biobank Array (KoreanChip / KoGES).

URL

https://www.koreanchip.org/downloads

TITLE

The Korea Biobank Array: Design and Identification of Coding Variants Associated with Blood Biochemical Traits.

Main citation

Moon S, Kim YJ, Han S, Hwang MY, ... & Kim BJ. (2019) The Korea Biobank Array: Design and Identification of Coding Variants Associated with Blood Biochemical Traits. Sci Rep, 9 (1) 1382. doi:10.1038/s41598-018-37832-9. PMID 30718733

ABSTRACT

We introduce the design and implementation of a new array, the Korea Biobank Array (referred to as KoreanChip), optimized for the Korean population and demonstrate findings from GWAS of blood biochemical traits. KoreanChip comprised >833,000 markers including >247,000 rare-frequency or functional variants estimated from >2,500 sequencing data in Koreans. Of the 833 K markers, 208 K functional markers were directly genotyped. Particularly, >89 K markers were presented in East Asians. KoreanChip achieved higher imputation performance owing to the excellent genomic coverage of 95.38% for common and 73.65% for low-frequency variants. From GWAS (Genome-wide association study) using 6,949 individuals, 28 associations were successfully recapitulated. Moreover, 9 missense variants were newly identified, of which we identified new associations between a common population-specific missense variant, rs671 (p.Glu457Lys) of ALDH2, and two traits including aspartate aminotransferase (P = 5.20 × 10^-13) and alanine aminotransferase (P = 4.98 × 10^-8). Furthermore, two novel missense variants of GPT with rare frequency in East Asians but extreme rarity in other populations were associated with alanine aminotransferase (rs200088103; p.Arg133Trp, P = 2.02 × 10^-9 and rs748547625; p.Arg143Cys, P = 1.41 × 10^-6). These variants were successfully replicated in 6,000 individuals (P = 5.30 × 10^-8 and P = 1.24 × 10^-6). GWAS results suggest the promising utility of KoreanChip with a substantial number of damaging variants to identify new population-specific disease-associated rare/functional variants.

DOI

10.1038/s41598-018-37832-9

MAIN ANCESTRY

EAS

MANE PheWeb

Summary statistics

PUBMED_LINK

39389017

DESCRIPTION

MANE PheWeb — Chinese maternal cohort GWAS summary statistics browser.

URL

https://db.cngb.org/MANE.PheWeb/

TITLE

Genetic analyses of 104 phenotypes in 20,900 Chinese pregnant women reveal pregnancy-specific discoveries.

Main citation

Xiao H, Li L, Yang M, Zhang X, ... & Jin X. (2024) Genetic analyses of 104 phenotypes in 20,900 Chinese pregnant women reveal pregnancy-specific discoveries. Cell Genom, 4 (10) 100633. doi:10.1016/j.xgen.2024.100633. PMID 39389017

ABSTRACT

Monitoring biochemical phenotypes during pregnancy is vital for maternal and fetal health, allowing early detection and management of pregnancy-related conditions to ensure safety for both. Here, we conducted a genetic analysis of 104 pregnancy phenotypes in 20,900 Chinese women. The genome-wide association study (GWAS) identified a total of 410 trait-locus associations, with 71.71% reported previously. Among the 116 novel hits for 45 phenotypes, 83 were successfully replicated. Among them, 31 were defined as potentially pregnancy-specific associations, including creatine and HELLPAR and neutrophils and ESR1, with subsequent analysis revealing enrichments in estrogen-related pathways and female reproductive tissues. The partitioning heritability underscored the significant roles of fetal blood, embryoid bodies, and female reproductive organs in pregnancy hematology and birth outcomes. Pathway analysis confirmed the intricate interplay of hormone and immune regulation, metabolism, and cell cycle during pregnancy. This study contributes to the understanding of genetic influences on pregnancy phenotypes and their implications for maternal health.

DOI

10.1016/j.xgen.2024.100633

MAIN ANCESTRY

EAS

MGI 1

Summary statistics

DESCRIPTION

Michigan Genomics Initiative PheWeb freeze 1 — GWAS summary statistics.

URL

https://pheweb.org/MGI-freeze1/

MAIN ANCESTRY

EUR

MGI 2

Summary statistics

DESCRIPTION

Michigan Genomics Initiative PheWeb freeze 2 — GWAS summary statistics.

URL

https://pheweb.org/MGI-freeze2/

MAIN ANCESTRY

EUR

MGI BioVU

Summary statistics

DESCRIPTION

Michigan Genomics Initiative PheWeb BioVU freeze: GWAS summary statistics.

URL

https://pheweb.org/MGI-BioVU/

MAIN ANCESTRY

EUR

MVP-Finngen-UKBB meta-analysis

Summary statistics

PUBMED_LINK

39974076

DESCRIPTION

Cross-biobank GWAS meta-analysis across MVP, FinnGen, and UK Biobank (phenome-wide association resource).

URL

https://mvp-ukbb.finngen.fi/

TITLE

Prevalence and disease risks for male and female sex chromosome trisomies: a registry-based phenome-wide association study in 1.5 million participants of MVP, FinnGen, and UK Biobank.

Main citation

Davis SM, Liu A, Teerlink CC, Lapato DM, ...&, Hauger RL. (2025) Prevalence and disease risks for male and female sex chromosome trisomies: a registry-based phenome-wide association study in 1.5 million participants of MVP, FinnGen, and UK Biobank. medRxiv. doi:10.1101/2025.01.31.25321488. PMID 39974076

ABSTRACT

Sex chromosome trisomies (SCT) are the most common whole chromosome aneuploidy in humans. Yet, our understanding of the prevalence and associated health outcomes is largely driven by observational studies of clinically diagnosed cases, resulting in a disproportionate focus on 47,XXY and associated hypogonadism. We analyzed microarray intensity data of sex chromosomes for 1.5 million individuals enrolled in three large cohorts (Million Veteran Program, FinnGen, and UK Biobank) to identify individuals with 47,XXY, 47,XYY, and 47,XXX. We examined disease conditions associated with SCTs by performing phenome-wide association studies (PheWAS) using electronic health records (EHR) data for each cohort, followed by meta-analysis across cohorts. Association results are presented for each SCT and also stratified by presence or absence of a documented clinical diagnosis for 47,XXY. We identified 2,769 individuals with SCT (47,XXY: 1,319; 47,XYY: 1,108; 47,XXX: 342), most of whom had no documented clinical diagnosis (47,XXY: 73.8%; 47,XYY: 98.6%; 47,XXX: 93.6%). The identified phenotypic associations with SCT spanned all PheWAS disease categories except neoplasms. Many associations are shared among three SCT subtypes, particularly for vascular diseases (e.g., chronic venous insufficiency (OR [95% CI] for 47,XXY 4.7 [3.9,5.8]; 47,XYY 5.6 [4.5,7.0]; 47,XXX 4.6 [2.7,7.6]), venous thromboembolism (47,XXY 4.6 [3.7-5.6]; 47,XYY 4.1 [3.3-5.0]; 47,XXX 8.1 [4.2-15.4]), and glaucoma (47,XXY 2.5 [2.1-2.9]; 47,XYY 2.4 [2.0-2.8]; 47,XXX 2.3 [1.4-3.5])). A third sex chromosome confers an increased risk for systemic comorbidities, even if the SCT is not documented. SCT phenotypes largely overlap, suggesting one or more X/Y homolog genes may underlie pathophysiology and comorbidities across SCTs.

DOI

10.1101/2025.01.31.25321488

MAIN ANCESTRY

EUR

PLATLAS

Summary statistics

PUBMED_LINK

40313291

FULL NAME

PLeiotropic ATLAS

DESCRIPTION

PLATLAS — pleiotropy atlas with GWAS summary statistics across >1000 phenotypes (multi-biobank).

URL

https://platlas.cels.anl.gov/

TITLE

Genome-Wide Assessment of Pleiotropy Across >1000 Traits from Global Biobanks.

Main citation

Levin MG, Koyama S, Woerner J, Zhang DY, ...&, Natarajan P. (2025) Genome-Wide Assessment of Pleiotropy Across >1000 Traits from Global Biobanks. medRxiv. doi:10.1101/2025.04.18.25326074. PMID 40313291

ABSTRACT

Large-scale genetic association studies have identified thousands of trait-associated risk loci, establishing the polygenic basis for common complex traits and diseases. Although prior studies suggest that many trait-associated loci are pleiotropic, the extent to which this pleiotropy reflects shared causal variants or confounding by linkage disequilibrium remains poorly characterized. To define a set of candidate loci with potentially pleiotropic associations, we performed genome-wide association study (GWAS) meta-analyses of up to 1,167 clinically relevant traits and diseases across 1,789,365 diverse individuals genetically similar to Admixed American (AMR, N_max = 60,756), African (AFR, N_max = 128,361), East Asian (EAS, N_max = 307,465), European (EUR, N_max = 1,283,907), and South Asian (SAS, N_max = 8,876) reference populations from the VA Million Veteran Program (MVP), UK Biobank (UKB), FinnGen, Biobank Japan (BBJ), Tohoku Medical Megabank (ToMMo), and Korean Genome and Epidemiology Study (KoGES). We identified 27,193 genome-wide significant locus-trait pairs (1MB region with P_GWAMA < 5 × 10⁻⁸) in within-population analysis and 29,139 in multi-population analysis (P_MR-MEGA < 5 × 10⁻⁸). Among these, 11.5% (n = 3,149) of locus-trait pairs in population-wise and 6.4% (n = 1,875) in multi-population analyses did not reach genome-wide significance in previously published GWAS. In aggregate, the genome-wide significant loci fell within 2,624 non-overlapping autosomal genomic windows on average ~600kb in size. Each locus contained genome-wide significant signals for a median of 6 traits (IQR 2 to 18), including 2,110 (80%) pleiotropic loci associated with >1 trait. Multi-trait colocalization identified 1,902 (72%) loci with high-confidence (posterior probability > 0.9) evidence of a shared causal variant across two or more traits.
Variants in pleiotropic loci were significantly enriched for a broad spectrum of functional annotations compared to non-pleiotropic counterparts. Polygenic scores (PGS) developed from these data generally improved prediction compared to existing PGS, and were broadly associated with both primary and pleiotropic phenotypes. These results provide a contemporary map of genetic pleiotropy across the spectrum of human traits/diseases and diverse genetic backgrounds.

DOI

10.1101/2025.04.18.25326074

MAIN ANCESTRY

ALL

Pan-UKB

Summary statistics

PUBMED_LINK

40968291

DESCRIPTION

Pan-UK Biobank — multi-ancestry GWAS in UK Biobank across thousands of phenotypes.

URL

https://pan.ukbb.broadinstitute.org/

TITLE

Pan-UK Biobank genome-wide association analyses enhance discovery and resolution of ancestry-enriched effects.

Main citation

Karczewski KJ, Gupta R, Kanai M, Lu W, ...&, Martin AR. (2025) Pan-UK Biobank genome-wide association analyses enhance discovery and resolution of ancestry-enriched effects. Nat Genet, 57 (10) 2408-2417. doi:10.1038/s41588-025-02335-7. PMID 40968291

ABSTRACT

Large biobanks, such as the UK Biobank (UKB), enable massive phenome by genome-wide association studies that elucidate genetic etiology of complex traits. However, people from diverse genetic ancestry groups are often excluded from association analyses due to concerns about population structure introducing false positive associations. Here we generate mixed model associations and meta-analyses across genetic ancestry groups, inclusive of a larger fraction of the UK Biobank than previous efforts, to produce freely available summary statistics for 7,266 traits. We build a quality control and analysis framework informed by genetic architecture. Overall, we identify 14,676 significant loci (P < 5 × 10⁻⁸) in the meta-analysis that were not found in the EUR genetic ancestry group alone, including new associations, for example between CAMK2D and triglycerides. We also highlight associations from ancestry-enriched variation, including a known pleiotropic missense variant in G6PD associated with several biomarker traits. We release these results publicly alongside frequently asked questions that describe caveats for interpretation of results, enhancing available resources for interpretation of risk variants across diverse populations.

DOI

10.1038/s41588-025-02335-7

MAIN ANCESTRY

EUR

TPMI PheWeb

Summary statistics

PUBMED_LINK

41092961

DESCRIPTION

Taiwan Precision Medicine Initiative PheWeb: cohort GWAS summary statistics.

URL

https://pheweb.ibms.sinica.edu.tw/

TITLE

The Taiwan Precision Medicine Initiative provides a cohort for large-scale studies.

Main citation

Yang HC, Kwok PY, Li LH, Liu YM, ...&, Wu JY. (2025) The Taiwan Precision Medicine Initiative provides a cohort for large-scale studies. Nature, 648 (8092) 117-127. doi:10.1038/s41586-025-09680-x. PMID 41092961

ABSTRACT

Han Chinese people comprise nearly 20% of the global population but remain under-represented in genetic studies [1,2], so there is an urgent need for large-scale cohorts to advance precision medicine. Here we present the Taiwan Precision Medicine Initiative (TPMI), established by Academia Sinica in collaboration with 16 major medical centres around Taiwan, which has recruited 565,390 participants who consent to provide DNA samples for genetic profiling and grant access to their electronic medical records (EMRs) for research. EMR access is both retrospective and prospective, allowing longitudinal studies. Genetic profiling is done with population-optimized arrays of single-nucleotide polymorphisms for people of Han Chinese ancestry, which enable genome-wide association [3,4], phenome-wide association [5,6] and polygenic risk score [7,8] studies to be performed to evaluate common disease risk and pharmacogenetic response. Participants also agreed to be re-contacted for future research and receive personalized genetic risk profiles with health management recommendations. The TPMI has established the TPMI Data Access Platform, a central database and analysis platform that both safeguards the security of the data and facilitates academic research. As a large cohort of individuals with non-European ancestry that merges genetic profiles with EMR data and enables longitudinal follow-up, TPMI provides a unique resource that could be used to validate genetic risk prediction models, perform clinical trials of risk-based health management and inform health policies. Ultimately, the TPMI cohort will contribute to global genetic research and serve as a model for population-based precision medicine.

DOI

10.1038/s41586-025-09680-x

RELATED_BIOBANK

Taiwan Precision Medicine Initiative

MAIN ANCESTRY

EAS

Taiwan BioBank Pheweb

Summary statistics

PUBMED_LINK

29149267

DESCRIPTION

Taiwan Biobank PheWeb — GWAS summary statistics for Taiwanese participants.

URL

https://taiwanview.twbiobank.org.tw/pheweb.php

TITLE

Taiwan Biobank: making cross-database convergence possible in the Big Data era.

Main citation

Lin JC, Fan CT, Liao CC, Chen YS. (2018) Taiwan Biobank: making cross-database convergence possible in the Big Data era. Gigascience, 7 (1) 1-4. doi:10.1093/gigascience/gix110. PMID 29149267

ABSTRACT

The Taiwan Biobank (TWB) is a biomedical research database of biopsy data from 200 000 participants. Access to this database has been granted to research communities taking part in the development of precision medicines; however, this has raised issues surrounding TWB's access to electronic medical records (EMRs). The Personal Data Protection Act of Taiwan restricts access to EMRs for purposes not covered by patients' original consent. This commentary explores possible legal solutions to help ensure that the access TWB has to EMR abides with legal obligations, and with governance frameworks associated with ethical, legal, and social implications. We suggest utilizing "hash function" algorithms to create nonretrospective, anonymized data for the purpose of cross-transmission and/or linkage with EMR.

DOI

10.1093/gigascience/gix110

RELATED_BIOBANK

Taiwan Biobank

MAIN ANCESTRY

EAS

Tohoku Medical Megabank (TMM) Jmorp

Summary statistics

PUBMED_LINK

37930845

DESCRIPTION

Tohoku Medical Megabank / jMorp multi-omics reference and GWAS-related summary data portal.

URL

https://jmorp.megabank.tohoku.ac.jp/202109/gwas/

TITLE

jMorp: Japanese Multi-Omics Reference Panel update report 2023.

Main citation

Tadaka S, Kawashima J, Hishinuma E, Saito S, ...&, Kinoshita K. (2024) jMorp: Japanese Multi-Omics Reference Panel update report 2023. Nucleic Acids Res, 52 (D1) D622-D632. doi:10.1093/nar/gkad978. PMID 37930845

ABSTRACT

Modern medicine is increasingly focused on personalized medicine, and multi-omics data is crucial in understanding biological phenomena and disease mechanisms. Each ethnic group has its unique genetic background with specific genomic variations influencing disease risk and drug response. Therefore, multi-omics data from specific ethnic populations are essential for the effective implementation of personalized medicine. Various prospective cohort studies, such as the UK Biobank, All of Us and Lifelines, have been conducted worldwide. The Tohoku Medical Megabank project was initiated after the Great East Japan Earthquake in 2011. It collects biological specimens and conducts genome and omics analyses to build a basis for personalized medicine. Summary statistical data from these analyses are available in the jMorp web database (https://jmorp.megabank.tohoku.ac.jp), which provides a multidimensional approach to the diversity of the Japanese population. jMorp was launched in 2015 as a public database for plasma metabolome and proteome analyses and has been continuously updated. The current update will significantly expand the scale of the data (metabolome, genome, transcriptome, and metagenome). In addition, the user interface and backend server implementations were rewritten to improve the connectivity between the items stored in jMorp. This paper provides an overview of the new version of the jMorp.

DOI

10.1093/nar/gkad978

RELATED_BIOBANK

Tohoku Medical Megabank

MAIN ANCESTRY

EAS

UKB Neale

Summary statistics

DESCRIPTION

Neale lab UK Biobank GWAS summary statistics (round-2 style phenome-wide results via PheWeb).

URL

https://pheweb.org/UKB-Neale/

MAIN ANCESTRY

EUR

UKB TOPMed

Summary statistics

PUBMED_LINK

33568819

DESCRIPTION

UK Biobank GWAS using TOPMed-imputed genotypes (multi-ancestry imputation panel).

URL

https://pheweb.org/UKB-TOPMed/

TITLE

Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program.

Main citation

Taliun D, Harris DN, Kessler MD, Carlson J, ...&, Abecasis GR. (2021) Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program. Nature, 590 (7845) 290-299. doi:10.1038/s41586-021-03205-y. PMID 33568819

ABSTRACT

The Trans-Omics for Precision Medicine (TOPMed) programme seeks to elucidate the genetic architecture and biology of heart, lung, blood and sleep disorders, with the ultimate goal of improving diagnosis, treatment and prevention of these diseases. The initial phases of the programme focused on whole-genome sequencing of individuals with rich phenotypic data and diverse backgrounds. Here we describe the TOPMed goals and design as well as the available resources and early insights obtained from the sequence data. The resources include a variant browser, a genotype imputation server, and genomic and phenotypic data that are available through dbGaP (Database of Genotypes and Phenotypes) [1]. In the first 53,831 TOPMed samples, we detected more than 400 million single-nucleotide and insertion or deletion variants after alignment with the reference genome. Additional previously undescribed variants were detected through assembly of unmapped reads and customized analysis in highly variable loci. Among the more than 400 million detected variants, 97% have frequencies of less than 1% and 46% are singletons that are present in only one individual (53% among unrelated individuals). These rare variants provide insights into mutational processes and recent human evolutionary history. The extensive catalogue of genetic variation in TOPMed studies provides unique opportunities for exploring the contributions of rare and noncoding sequence variants to phenotypic variation. Furthermore, combining TOPMed haplotypes with modern imputation methods improves the power and reach of genome-wide association studies to include variants down to a frequency of approximately 0.01%.

DOI

10.1038/s41586-021-03205-y

MAIN ANCESTRY

EUR

UKB exome

Summary statistics

PUBMED_LINK

34375979

DESCRIPTION

UK Biobank exome sequence-based GWAS summary statistics (gene- and variant-level association resource).

URL

https://azphewas.com/

TITLE

Rare variant contribution to human disease in 281,104 UK Biobank exomes.

Main citation

Wang Q, Dhindsa RS, Carss K, Harper AR, ...&, Petrovski S. (2021) Rare variant contribution to human disease in 281,104 UK Biobank exomes. Nature, 597 (7877) 527-532. doi:10.1038/s41586-021-03855-y. PMID 34375979

ABSTRACT

Genome-wide association studies have uncovered thousands of common variants associated with human disease, but the contribution of rare variants to common disease remains relatively unexplored. The UK Biobank contains detailed phenotypic data linked to medical records for approximately 500,000 participants, offering an unprecedented opportunity to evaluate the effect of rare variation on a broad collection of traits [1,2]. Here we study the relationships between rare protein-coding variants and 17,361 binary and 1,419 quantitative phenotypes using exome sequencing data from 269,171 UK Biobank participants of European ancestry. Gene-based collapsing analyses revealed 1,703 statistically significant gene-phenotype associations for binary traits, with a median odds ratio of 12.4. Furthermore, 83% of these associations were undetectable via single-variant association tests, emphasizing the power of gene-based collapsing analysis in the setting of high allelic heterogeneity. Gene-phenotype associations were also significantly enriched for loss-of-function-mediated traits and approved drug targets. Finally, we performed ancestry-specific and pan-ancestry collapsing analyses using exome sequencing data from 11,933 UK Biobank participants of African, East Asian or South Asian ancestry. Our results highlight a significant contribution of rare variants to common disease. Summary statistics are publicly available through an interactive portal (http://azphewas.com/).

DOI

10.1038/s41586-021-03855-y

MAIN ANCESTRY

EUR

UKB fastgwa (Imputation)

Summary statistics

PUBMED_LINK

31768069

DESCRIPTION

UK Biobank GWAS from fastGWA on imputed genotype data (continuous and binary traits).

URL

https://yanglab.westlake.edu.cn/data/ukb_fastgwa/imp/

TITLE

A resource-efficient tool for mixed model association analysis of large-scale data.

Main citation

Jiang L, Zheng Z, Qi T, Kemper KE, ...&, Yang J. (2019) A resource-efficient tool for mixed model association analysis of large-scale data. Nat Genet, 51 (12) 1749-1755. doi:10.1038/s41588-019-0530-8. PMID 31768069

ABSTRACT

The genome-wide association study (GWAS) has been widely used as an experimental design to detect associations between genetic variants and a phenotype. Two major confounding factors, population stratification and relatedness, could potentially lead to inflated GWAS test statistics and hence to spurious associations. Mixed linear model (MLM)-based approaches can be used to account for sample structure. However, genome-wide association (GWA) analyses in biobank samples such as the UK Biobank (UKB) often exceed the capability of most existing MLM-based tools especially if the number of traits is large. Here, we develop an MLM-based tool (fastGWA) that controls for population stratification by principal components and for relatedness by a sparse genetic relationship matrix for GWA analyses of biobank-scale data. We demonstrate by extensive simulations that fastGWA is reliable, robust and highly resource-efficient. We then apply fastGWA to 2,173 traits on array-genotyped and imputed samples from 456,422 individuals and to 2,048 traits on whole-exome-sequenced samples from 46,191 individuals in the UKB.

DOI

10.1038/s41588-019-0530-8

MAIN ANCESTRY

EUR

UKB fastgwa (WES)

Summary statistics

PUBMED_LINK

31768069

DESCRIPTION

UK Biobank GWAS from fastGWA on whole-exome sequence data.

URL

https://yanglab.westlake.edu.cn/data/ukb_fastgwa/wes/

TITLE

A resource-efficient tool for mixed model association analysis of large-scale data.

Main citation

Jiang L, Zheng Z, Qi T, Kemper KE, ...&, Yang J. (2019) A resource-efficient tool for mixed model association analysis of large-scale data. Nat Genet, 51 (12) 1749-1755. doi:10.1038/s41588-019-0530-8. PMID 31768069

ABSTRACT

The genome-wide association study (GWAS) has been widely used as an experimental design to detect associations between genetic variants and a phenotype. Two major confounding factors, population stratification and relatedness, could potentially lead to inflated GWAS test statistics and hence to spurious associations. Mixed linear model (MLM)-based approaches can be used to account for sample structure. However, genome-wide association (GWA) analyses in biobank samples such as the UK Biobank (UKB) often exceed the capability of most existing MLM-based tools especially if the number of traits is large. Here, we develop an MLM-based tool (fastGWA) that controls for population stratification by principal components and for relatedness by a sparse genetic relationship matrix for GWA analyses of biobank-scale data. We demonstrate by extensive simulations that fastGWA is reliable, robust and highly resource-efficient. We then apply fastGWA to 2,173 traits on array-genotyped and imputed samples from 456,422 individuals and to 2,048 traits on whole-exome-sequenced samples from 46,191 individuals in the UKB.

DOI

10.1038/s41588-019-0530-8

MAIN ANCESTRY

EUR

UKB fastgwa-glmm (Binary)

Summary statistics

PUBMED_LINK

34737426

DESCRIPTION

UK Biobank binary-trait GWAS from SAIGE-style GLMM analysis (fastGWA-glmm pipeline).

URL

https://yanglab.westlake.edu.cn/data/ukb_fastgwa/imp_binary/

TITLE

A generalized linear mixed model association tool for biobank-scale data.

Main citation

Jiang L, Zheng Z, Fang H, Yang J. (2021) A generalized linear mixed model association tool for biobank-scale data. Nat Genet, 53 (11) 1616-1621. doi:10.1038/s41588-021-00954-4. PMID 34737426

ABSTRACT

Compared with linear mixed model-based genome-wide association (GWA) methods, generalized linear mixed model (GLMM)-based methods have better statistical properties when applied to binary traits but are computationally much slower. In the present study, leveraging efficient sparse matrix-based algorithms, we developed a GLMM-based GWA tool, fastGWA-GLMM, that is severalfold to orders of magnitude faster than the state-of-the-art tools when applied to the UK Biobank (UKB) data and scalable to cohorts with millions of individuals. We show by simulation that the fastGWA-GLMM test statistics of both common and rare variants are well calibrated under the null, even for traits with extreme case-control ratios. We applied fastGWA-GLMM to the UKB data of 456,348 individuals, 11,842,647 variants and 2,989 binary traits (full summary statistics available at http://fastgwa.info/ukbimpbin), and identified 259 rare variants associated with 75 traits, demonstrating the use of imputed genotype data in a large cohort to discover rare variants for binary complex traits.

DOI

10.1038/s41588-021-00954-4

MAIN ANCESTRY

EUR

UKB gene-based (Genebass)

Summary statistics

PUBMED_LINK

36778668

DESCRIPTION

UK Biobank gene-based association results from the Genebass / exome analysis resource.

URL

https://genebass.org/

TITLE

Systematic single-variant and gene-based association testing of thousands of phenotypes in 394,841 UK Biobank exomes.

Main citation

Karczewski KJ, Solomonson M, Chao KR, Goodrich JK, ...&, Neale BM. (2022) Systematic single-variant and gene-based association testing of thousands of phenotypes in 394,841 UK Biobank exomes. Cell Genom, 2 (9) 100168. doi:10.1016/j.xgen.2022.100168. PMID 36778668

ABSTRACT

Genome-wide association studies have successfully discovered thousands of common variants associated with human diseases and traits, but the landscape of rare variations in human disease has not been explored at scale. Exome-sequencing studies of population biobanks provide an opportunity to systematically evaluate the impact of rare coding variations across a wide range of phenotypes to discover genes and allelic series relevant to human health and disease. Here, we present results from systematic association analyses of 4,529 phenotypes using single-variant and gene tests of 394,841 individuals in the UK Biobank with exome-sequence data. We find that the discovery of genetic associations is tightly linked to frequency and is correlated with metrics of deleteriousness and natural selection. We highlight biological findings elucidated by these data and release the dataset as a public resource alongside the Genebass browser for rapidly exploring rare-variant association results.

DOI

10.1016/j.xgen.2022.100168

MAIN ANCESTRY

EUR

UKB saige

Summary statistics

PUBMED_LINK

30104761

DESCRIPTION

UK Biobank GWAS with SAIGE (mixed-model association for biobank-scale binary and quantitative traits).

URL

https://pheweb.org/UKB-SAIGE/

TITLE

Efficiently controlling for case-control imbalance and sample relatedness in large-scale genetic association studies.

Main citation

Zhou W, Nielsen JB, Fritsche LG, Dey R, ...&, Lee S. (2018) Efficiently controlling for case-control imbalance and sample relatedness in large-scale genetic association studies. Nat Genet, 50 (9) 1335-1341. doi:10.1038/s41588-018-0184-y. PMID 30104761

ABSTRACT

In genome-wide association studies (GWAS) for thousands of phenotypes in large biobanks, most binary traits have substantially fewer cases than controls. Both of the widely used approaches, the linear mixed model and the recently proposed logistic mixed model, perform poorly; they produce large type I error rates when used to analyze unbalanced case-control phenotypes. Here we propose a scalable and accurate generalized mixed model association test that uses the saddlepoint approximation to calibrate the distribution of score test statistics. This method, SAIGE (Scalable and Accurate Implementation of GEneralized mixed model), provides accurate P values even when case-control ratios are extremely unbalanced. SAIGE uses state-of-art optimization strategies to reduce computational costs; hence, it is applicable to GWAS for thousands of phenotypes by large biobanks. Through the analysis of UK Biobank data of 408,961 samples from white British participants with European ancestry for > 1,400 binary phenotypes, we show that SAIGE can efficiently analyze large sample data, controlling for unbalanced case-control ratios and sample relatedness.

DOI

10.1038/s41588-018-0184-y

RELATED_BIOBANK