Biobank / cohort

Catalog entries using this tag (links open the entry card on its page):

Entries

45 and Up Study (45 Up)

Biobank / cohort
DESCRIPTION
Australian population cohort of adults ≥45 with long-term linkage to health administrative data; supports large observational and genetic studies including GWAS when genotype data are available to approved projects.
URL
https://www.saxinstitute.org.au/our-work/45-up-study/
MAIN ANCESTRY
AMR,EAS,EUR,SAS
PARTICIPANTS
Australian adults aged 45 and over (NSW)
SAMPLE SIZE
~250k+

All of Us (AoU)

Biobank / cohort
DESCRIPTION
The All of Us Research Program is a historic effort to collect and study data from one million or more people living in the United States. The goal of the program is better health for all of us. The program began national enrollment in 2018 and is expected to last at least 10 years.
URL
https://allofus.nih.gov/
MAIN ANCESTRY
AFR,AMR,EAS,EUR,SAS
PARTICIPANTS
US residents (nationally diverse recruitment)
SAMPLE SIZE
~413k

ALSPAC (Children of the 90s) (ALSPAC)

Biobank / cohort
DESCRIPTION
Avon Longitudinal Study of Parents and Children: UK birth cohort with deep phenotyping from pregnancy onward; genotype and sequence data support developmental and complex-trait GWAS.
URL
https://www.bristol.ac.uk/alspac/
MAIN ANCESTRY
EUR
PARTICIPANTS
UK pregnant women and children (Bristol area; Children of the 90s)
SAMPLE SIZE
~14k children / ~30k total enrolled

ARIC

Biobank / cohort
DESCRIPTION
Atherosclerosis Risk in Communities: four U.S. community cohorts with longitudinal cardio-metabolic phenotyping and genome-wide genotype data; foundational for many cardiovascular and multi-ancestry GWAS.
URL
https://sites.cscc.unc.edu/aric/
MAIN ANCESTRY
AFR,EUR
PARTICIPANTS
US Black and white adults (community-based cohorts)
SAMPLE SIZE
~15.7k

AWI-Gen

Biobank / cohort
DESCRIPTION
Africa Wits–INDEPTH Partnership for Genomic Research: harmonized obesity/diabetes/T2D-related phenotypes and genome-wide data across rural and urban sites; widely cited in African GWAS.
URL
https://www.awigen.org/
MAIN ANCESTRY
AFR
PARTICIPANTS
Sub-Saharan African adults (multi-site East/West/South)
SAMPLE SIZE
~12k

Biobank Graz

Biobank / cohort
DESCRIPTION
Biobank Graz is one of the largest and most well-known clinical biobanks in the world. Around 20 million individual specimens of body fluids and human tissue are stored here. Biobank Graz allows access to these specimens and associated data for scientific research purposes. The common goal is to develop approaches to diagnosing and treating disease.
URL
https://biobank.medunigraz.at/en/
MAIN ANCESTRY
EUR
PARTICIPANTS
Austrian clinical and research participants (Graz)
SAMPLE SIZE
~1200k

BioBank Japan (BBJ)

Biobank / cohort
DESCRIPTION
In 2003, BioBank Japan (BBJ) started developing one of the world’s largest disease biobanks, creating a foundation for research aimed at achieving medical care tailored to the individual traits of each patient. From a total of 260,000 patients representing 440,000 cases of 51 primarily multifactorial (common) diseases, BBJ has collected DNA, serum, medical records (clinical information), etc. with their consent. No less than 5,800 items of screened information are available for research, including the patients’ survival information, with 95% of the patients tracked over an average of 10 years. In addition to large-scale genomic analyses, omics analyses including whole genome sequencing and metabolome/proteome analyses have been performed on the DNA, serum and other biological samples collected, producing significant research findings. The genomic information acquired through the analyses continues to be used as data. The biological samples and data are widely distributed and used by researchers.
URL
https://biobankjp.org/
MAIN ANCESTRY
EAS
PARTICIPANTS
Japanese adults (hospital and population network)
SAMPLE SIZE
~270k

Biobank of the Americas

Biobank / cohort
DESCRIPTION
Biobank of consented samples with associated clinical data from diverse populations throughout the United States and Latin America, collected via healthcare and biopharma partnerships.
URL
https://www.galatea.bio/#main-biobank
MAIN ANCESTRY
AMR,EUR
PARTICIPANTS
US and Latin American clinical / biopharma-linked participants
SAMPLE SIZE
~20k

Biobank Russia (BBRU)

Biobank / cohort
DESCRIPTION
BioBank Russia (BBRU) is a prospective biobank, managed by V. A. Almazov National Medical Research Center.
URL
https://biobank.almazovcentre.ru/
Main citation
Usoltsev, D., Kolosov, N., Rotar, O., Loboda, A., Boyarinova, M., Moguchaya, E., ... & Artomov, M. (2023). Understanding Complex Trait Susceptibilities and Ethnical Diversity in a Sample of 4,145 Russians Through Analysis of Clinical and Genetic Data. bioRxiv, 2023-03.
MAIN ANCESTRY
EUR
PARTICIPANTS
Russian adults (Almazov Centre biobank)
SAMPLE SIZE
~4K

BioMe

Biobank / cohort
DESCRIPTION
The Institute for Personalized Medicine at the Icahn School of Medicine at Mount Sinai is leading the movement toward diagnosis and classification of disease according to the patient’s molecular profile. This approach accommodates differences at all possible levels of exposure (genome, environment, and lifestyle) and at all stages of the process, from prevention to post-treatment follow-up. At the center of this effort is BioMe, an electronic medical record-linked biobank that enables researchers to rapidly and efficiently conduct genetic, epidemiologic, molecular, and genomic studies on large collections of research specimens linked with medical information.
URL
https://www.vumc.org/dbmi/biovu
MAIN ANCESTRY
AFR,AMR,EAS,EUR,SAS
PARTICIPANTS
New York City area health-system patients
SAMPLE SIZE
~32k

BioPortal

Biobank / cohort
DESCRIPTION
BioPortal is a unique research platform at the Jewish General Hospital (JGH)/Lady Davis Institute in Montreal built in partnership with the CERC Chair in Genomic Medicine at McGill.
URL
https://www.mcgill.ca/genepi/bioportal
MAIN ANCESTRY
AMR,EUR
PARTICIPANTS
Montreal-area Canadian research participants
SAMPLE SIZE
N/A (multi-study platform)

BioVU

Biobank / cohort
DESCRIPTION
Planning for BioVU began in mid-2004 and the first samples were collected in February 2007. Prior to collecting DNA samples, all aspects of the BioVU project were extensively tested. BioVU now accrues 500-1000 samples per week, totaling more than 275,000 DNA samples as of January 2022. Vanderbilt clinic patients may sign the BioVU Consent Form if they wish to donate their excess blood samples, or not sign the form if they do not wish to participate.
URL
https://www.vumc.org/dbmi/biovu
MAIN ANCESTRY
EUR
PARTICIPANTS
Vanderbilt University Medical Center patients
SAMPLE SIZE
~120k

Born in Guangzhou Cohort Study (BIGCS)

Biobank / cohort
DESCRIPTION
The Born in Guangzhou Cohort Study (BIGCS) is a large-scale prospective observational study investigating the role of social, biological and environmental influences on pregnancy and child health and development in an urban setting in southern China.
URL
http://www.bigcs.com.cn/en_index.html
MAIN ANCESTRY
EAS
PARTICIPANTS
Han Chinese mothers and infants (Guangzhou birth cohort)
SAMPLE SIZE
~50K

CanPath - Ontario Health Study

Biobank / cohort
DESCRIPTION
The Ontario Health Study (OHS) is a resource for investigating the ways in which lifestyle, the environment and genetics affect people’s health. It is one of the regional cohorts that collectively form the Canadian Partnership for Tomorrow’s Health (CanPath)—a pan-Canadian cohort with >330 000 participants. The linking of Canada’s rich collection of administrative health data with the cohort’s data represents a powerful means to disseminate high-quality, timely data.
URL
https://canpath.ca/cohort/ontario-health-study/
MAIN ANCESTRY
AMR,EUR
PARTICIPANTS
Canadian adults (Ontario; CanPath)
SAMPLE SIZE
~7k

CARTaGENE biobank

Biobank / cohort
DESCRIPTION
CARTaGENE is a public research platform of the CHU Sainte-Justine aiming to accelerate health research. CARTaGENE is made up of both biological samples and data on the health and lifestyle of 43,000 Quebec men and women between the ages of 40 and 69 at recruitment.
URL
https://cartagene.qc.ca/en/
MAIN ANCESTRY
AMR,EUR
PARTICIPANTS
Quebec residents (CARTaGENE)
SAMPLE SIZE
~30K
DATA ACCESS
https://cartagene.qc.ca/en/researchers.html

China Kadoorie Biobank (CKB)

Biobank / cohort
DESCRIPTION
The China Kadoorie Biobank is one of the world’s largest prospective cohort studies. A long-term collaboration between the UK and China, it aims to generate reliable evidence about the lifestyle, environmental and genetic determinants of a wide range of common diseases that can inform disease prevention, risk prediction and treatment worldwide.
URL
https://www.ckbiobank.org/
MAIN ANCESTRY
EAS
PARTICIPANTS
Chinese adults (10 regions; prospective cohort)
SAMPLE SIZE
~512k

Chinese Millionome Database (CMDB)

Biobank / cohort
DESCRIPTION
The largest and most representative Chinese genome variation database to date. The CMDB contains 9.04 million single nucleotide variants (SNVs) and allele frequency information derived from low-coverage (0.06×–0.1×) WGS data of 141,431 unrelated healthy Chinese individuals.
URL
https://db.cngb.org/cmdb/
MAIN ANCESTRY
EAS
PARTICIPANTS
Chinese individuals (aggregated genome database)
SAMPLE SIZE
~141k

CoLaus / PsyCoLaus (CoLaus)

Biobank / cohort
DESCRIPTION
Swiss population-based cohort (Lausanne) focused on cardiovascular risk (CoLaus) with parallel PsyCoLaus mental-health deep phenotyping; genotype data used in GWAS and omics analyses.
URL
https://www.colaus-psycolaus.ch/
MAIN ANCESTRY
EUR
PARTICIPANTS
Swiss adults (Lausanne population cohort)
SAMPLE SIZE
~6.7k (CoLaus); PsyCoLaus psychiatric sub-study

Colorado Center for Personalized Medicine

Biobank / cohort
DESCRIPTION
Established in 2014 as a partnership between UCHealth and University of Colorado Anschutz Medical Campus, the Colorado Center for Personalized Medicine (CCPM) brings together multiple disciplines and institutions to uncover advancements in genomics that can improve diagnosis and treatment of disease, and identify more tailored approaches to population health management. To facilitate discoveries in personalized medicine, CCPM has created a Biobank that aims to be one of the largest academic medicine biospecimen repositories in the mountain and midwest regions of the U.S. The CCPM Biobank is able to link biospecimens and genotype information with patient health information from electronic medical records in an enterprise data warehouse (Health Data Compass) to support a broad range of research, operational, and clinical quality improvement agendas.
URL
https://medschool.cuanschutz.edu/cobiobank
MAIN ANCESTRY
AFR,AMR,EAS,EUR,SAS
PARTICIPANTS
University of Colorado health-system patients
SAMPLE SIZE
~34k

Danish Blood Donor Study (DBDS)

Biobank / cohort
DESCRIPTION
Prospective cohort of Danish blood donors with biosamples and questionnaire data; large-scale genome-wide genotyping supports trait and omics GWAS.
URL
https://www.dbds.dk/
MAIN ANCESTRY
EUR
PARTICIPANTS
Danish blood donors
SAMPLE SIZE
~150k+ donors

Danish National Biobank (DNB)

Biobank / cohort
DESCRIPTION
Nationwide biobank infrastructure integrating samples from the health system; enables large-scale genetic and epidemiological studies when linked to Danish registers.
URL
https://www.danishnationalbiobank.com/
MAIN ANCESTRY
EUR
PARTICIPANTS
Danish national biobank participants
SAMPLE SIZE
millions of stored samples (registry-linked)

deCODE Genetics

Biobank / cohort
DESCRIPTION
deCODE leads the world in the discovery of genetic risk factors for common diseases. Our gene discovery engine is driven by our unique approach and resources, including detailed genetic and medical information on some 500,000 individuals from around the globe taking part in our discovery work and proprietary statistical algorithms and informatics tools for gathering, analyzing, visualizing and storing large amounts of data.
URL
https://www.decode.com/
MAIN ANCESTRY
EUR
PARTICIPANTS
Icelandic participants (population and genealogy)
SAMPLE SIZE
~250k

East London Genes & Health

Biobank / cohort
DESCRIPTION
Genes & Health is a huge long-term study of 100,000 people of Bangladeshi and Pakistani origin. We will link genes with health records, to study disease and treatments. Some volunteers may be invited for further studies. We are inviting volunteers to take part in two regions of the UK: East London (East London Genes & Health) and Bradford (Bradford Genes & Health).
URL
https://www.genesandhealth.org/
MAIN ANCESTRY
SAS
PARTICIPANTS
British Bangladeshi and Pakistani adults (East London)
SAMPLE SIZE
~100k

ELSA-Brasil (ELSA)

Biobank / cohort
DESCRIPTION
Brazilian Longitudinal Study of Adult Health: six centers with deep phenotyping and biosamples; genome-wide data used for cardiometabolic and population-genetics GWAS in Brazil.
URL
https://elsa.org.br/
MAIN ANCESTRY
AFR,AMR,EUR
PARTICIPANTS
Brazilian civil servants and families (longitudinal aging)
SAMPLE SIZE
~15k

Estonian Biobank

Biobank / cohort
DESCRIPTION
The Estonian Biobank has established a population-based biobank of Estonia with a current cohort size of more than 200,000 individuals (genotyped with genome-wide arrays), reflecting the age, sex and geographical distribution of the adult Estonian population. Considering the fact that about 20% of Estonia's adult population has joined the programme, it is indeed a database that is very important for the development of medical science both domestically and internationally.
URL
https://genomics.ut.ee/en/content/estonian-biobank
MAIN ANCESTRY
EUR
PARTICIPANTS
Estonian adults (national biobank)
SAMPLE SIZE
~200k

Fenland Study

Biobank / cohort
DESCRIPTION
The Fenland Study investigates the interaction between environmental and genetic factors in determining obesity, type 2 diabetes, and related metabolic disorders. These conditions are a considerable public health concern, but their causes and factors that predict who will be affected by them are not completely understood. What makes the Fenland Study unique is the level of detail it collects about the health and lifestyle of participants, and the objective measurement techniques used in the screening. The first phase of the Fenland Study is now complete, and we are now inviting participants who attended an initial Fenland Study visit between 2005 and 2015 to return for a second visit in Phase 2.
URL
https://www.mrc-epid.cam.ac.uk/research/studies/fenland/
MAIN ANCESTRY
EUR
PARTICIPANTS
English adults (Cambridgeshire Fenland)
SAMPLE SIZE
~12k

FinnGen

Biobank / cohort
DESCRIPTION
The FinnGen study, launched in Finland in the autumn of 2017, is a unique study that combines genome information with digital health care data. It is an unprecedented global research project and one of the largest studies of its type. The project aims to improve human health through genetic research, and ultimately to identify new therapeutic targets and diagnostics for treating numerous diseases. The collaborative nature of the project is exceptional compared to many ongoing studies, and all the partners are working closely together to ensure appropriate transparency, data security and ownership.
URL
https://www.finngen.fi/en
Main citation
Kurki, M. I., Karjalainen, J., Palta, P., Sipilä, T. P., Kristiansson, K., Donner, K., ... & Nelis, M. (2022). FinnGen: Unique genetic insights from combining isolated population and national health register data. medRxiv.
MAIN ANCESTRY
EUR
PARTICIPANTS
Finnish adults (national FinnGen)
SAMPLE SIZE
~500k

Generation Scotland

Biobank / cohort
DESCRIPTION
Generation Scotland is a research study looking at the health and well-being of volunteers and their families. Generation Scotland combines responses to questionnaires of health and well-being from birth through life. We combine this with NHS health records and innovative laboratory science to understand health trajectories. We work closely with researchers and our volunteers to create a rich evidence base for understanding health. Through this rigorous, ethical and safe approach to research, we seek to enable meaningful change in public health.
URL
https://www.ed.ac.uk/generation-scotland
MAIN ANCESTRY
EUR
PARTICIPANTS
Scottish adults and families
SAMPLE SIZE
~24k

H3Africa

Biobank / cohort
DESCRIPTION
Human Heredity and Health in Africa: pan-African genomics and epidemiology initiative producing genome-wide data, reference panels, and phenotype-linked biosamples for disease and population-genetics GWAS.
URL
https://h3africa.org/
MAIN ANCESTRY
AFR
PARTICIPANTS
African participants (H3Africa consortium studies)
SAMPLE SIZE
50k+ (initiative-wide)

HCHS/SOL

Biobank / cohort
DESCRIPTION
Hispanic Community Health Study / Study of Latinos: four U.S. field centers with deep cardiometabolic phenotyping; central to GWAS in admixed Latino populations.
URL
https://www.cscc.unc.edu/hchs/
MAIN ANCESTRY
AMR
PARTICIPANTS
Hispanic/Latino adults (US community sites)
SAMPLE SIZE
~16k

IndiGenomes

Biobank / cohort
DESCRIPTION
IndiGenomes is a curated resource of genetic variants from 1000+ whole genomes from across India, supporting South Asian reference genomics and downstream association and functional studies.
URL
http://clingen.igib.res.in/indigen/
MAIN ANCESTRY
SAS
PARTICIPANTS
Indian subcontinent individuals (genome resource)
SAMPLE SIZE
~1k

INTERVAL Study

Biobank / cohort
DESCRIPTION
Between June 2012 and June 2014, the INTERVAL study recruited about 25,000 men and about 25,000 women at NHS Blood and Transplant (NHSBT) blood donation centres across England. During the study, participants are asked to give blood either at usual donation intervals or more frequently. Men donate every 12, 10 or 8 weeks and women every 16, 14 or 12 weeks.
URL
https://www.intervalstudy.org.uk/
MAIN ANCESTRY
EUR
PARTICIPANTS
English blood donors (INTERVAL)
SAMPLE SIZE
~50k

KoGES

Biobank / cohort
DESCRIPTION
Korean Genome and Epidemiology Study: prospective population-based cohorts (Ansan–Ansung, etc.) with biosamples and follow-up; genome-wide data widely used in Korean and trans-ethnic GWAS.
URL
https://koges.leelabsg.org/about
MAIN ANCESTRY
EAS
PARTICIPANTS
Korean adults (population-based sub-cohorts)
SAMPLE SIZE
~210k (across sub-cohorts)

Lifelines

Biobank / cohort
DESCRIPTION
Lifelines is a large, multigenerational cohort study that includes over 167,000 participants (10%) from the northern population of the Netherlands. We included participants from three generations, who are followed for at least 30 years, to obtain insight into healthy ageing. The aim of Lifelines is to be a resource for the national and international scientific community.
URL
https://www.lifelines.nl/researcher
MAIN ANCESTRY
EUR
PARTICIPANTS
Dutch residents (three-generation cohort)
SAMPLE SIZE
~167k

Massachusetts General Brigham Biobank

Biobank / cohort
DESCRIPTION
The Mass General Brigham Biobank is a large research program designed to help researchers understand how people’s health is affected by their genes, lifestyle, and environment. By participating in the Mass General Brigham Biobank, you can help us better understand, treat, and even prevent the diseases that might affect your health and the health of future generations.
URL
https://www.massgeneralbrigham.org/en/research-and-innovation/participate-in-research/biobank
MAIN ANCESTRY
AFR,AMR,EAS,EUR,SAS
PARTICIPANTS
Mass General Brigham patients
SAMPLE SIZE
~26K

MESA

Biobank / cohort
DESCRIPTION
Multi-Ethnic Study of Atherosclerosis: six U.S. sites with subclinical CVD imaging and omics; genome-wide data widely used in multi-ethnic GWAS and cardiometabolic genetics.
URL
https://www.mesa-nhlbi.org/
MAIN ANCESTRY
AFR,AMR,EAS,EUR
PARTICIPANTS
US adults (Black, white, Hispanic/Latino, Chinese cohorts)
SAMPLE SIZE
~6.8k

Mexico City Prospective Study

Biobank / cohort
DESCRIPTION
Between 1998 and 2004, CTSU, in collaboration with the Mexican Ministry of Health, established a study in Mexico City, in which over 150,000 middle-aged adults (including 100,000 women and 50,000 men) provided information about their lifestyle and disease history, had physical measurements recorded (including weight, waist and hip circumference, blood pressure) and had a blood sample taken.
URL
https://www.ctsu.ox.ac.uk/research/prospective-blood-based-study-of-150-000-individuals-in-mexico
Main citation
Ziyatdinov, A., Torres, J., Alegre-Diaz, J., Backman, J., Mbatchou, J., Turner, M., ... & Tapia-Conyer, R. (2022). Genotyping, sequencing and analysis of 140,000 adults from the Mexico City Prospective Study. bioRxiv.
MAIN ANCESTRY
AMR
PARTICIPANTS
Mexico City metropolitan adults
SAMPLE SIZE
~150k

Michigan Genomics Initiative

Biobank / cohort
DESCRIPTION
The Michigan Genomics Initiative (MGI) is a collaborative research effort among physicians, researchers, and patients at the University of Michigan (U-M) with the goal of combining patient electronic health record (EHR) data with corresponding genetic data to gain novel biomedical insights. There are currently ~84K consented participants through the MGI and partner studies and the addition of ~10K new participants per year is anticipated. Currently, all MGI participants with available genetic data have received care at the University of Michigan Health System.
URL
https://precisionhealth.umich.edu/our-research/michigangenomics/
Main citation
Zawistowski, M., Fritsche, L. G., Pandit, A., Vanderwerff, B., Patil, S., Schmidt, E. M., ... & Zoellner, S. (2021). The Michigan Genomics Initiative: a biobank linking genotypes and electronic clinical records in Michigan Medicine patients. medRxiv.
MAIN ANCESTRY
AFR,AMR,EAS,EUR,SAS
PARTICIPANTS
Michigan Medicine patients
SAMPLE SIZE
~55k

Million Veteran Program (MVP)

Biobank / cohort
DESCRIPTION
The Million Veteran Program (MVP) is a national research program to learn how genes, lifestyle, and military exposures affect health and illness. Since launching in 2011, over 900,000 Veteran partners have joined one of the world's largest programs on genetics and health.
URL
https://www.mvp.va.gov/pwa/
MAIN ANCESTRY
AFR,AMR,EAS,EUR,SAS
PARTICIPANTS
US military veterans (VA healthcare)
SAMPLE SIZE
~900k

NAKO Health Study (NAKO)

Biobank / cohort
DESCRIPTION
German National Cohort (NAKO) is a large prospective study with biosamples and extensive phenotyping; genome-wide genotype data support GWAS and omics analyses under controlled access.
URL
https://www.nako.de/en/
MAIN ANCESTRY
EUR
PARTICIPANTS
German adults (NAKO national cohort)
SAMPLE SIZE
~205k

National Biobank of Korea (NBK)

Biobank / cohort
DESCRIPTION
The NBK is the national control center for the collection, management, and utilization of human bioresources in Korea. The NBK also manages the KBN and contributes to the development of policies related to human bioresources, the standardization of human bioresource management, and the advancement of domestic biobanks by developing and supporting human bioresource technologies. To guarantee fairness in bioresource distribution and to develop an efficient distribution system, the NBK also serves as the human bioresource supply hub that supports national healthcare and medical R&D.
URL
https://nih.go.kr/NIH/cms/content/eng/14/65714_view.html
MAIN ANCESTRY
EAS
PARTICIPANTS
Korean participants (linked with KoGES / national program)
SAMPLE SIZE
~210K
DATA ACCESS
https://koges.leelabsg.org/, https://zenodo.org/record/7042518

National Center Biobank Network (NCBN)

Biobank / cohort
DESCRIPTION
Six National Centers in Japan conduct specialized medical research under the coordination of the National Center Biobank Network (NCBN) and develop therapeutics to improve and protect national health. They actively collaborate to establish a shared biobank and are developing a structure to facilitate industry-academia-government cooperation regarding bioresources through broad joint research. NCBN strives to promote the success of the National Centers and to create a bright future for health and human life.
URL
https://ncbiobank.org/en/home.php
MAIN ANCESTRY
EAS
PARTICIPANTS
Japanese patients (national hospital biobank network)
SAMPLE SIZE
~120K

Nigerian 100K Genome Project

Biobank / cohort
DESCRIPTION
Genomic studies in African populations provide unique opportunities to understand disease aetiology, human genetic diversity and population history in a regional and a global context. To leverage the relative benefits of different strategies, we undertook a combined approach of genotyping and whole-genome sequencing (WGS) in a population-based study of 6,400 individuals from a geographically defined rural community in South-West Uganda. We present data from 4,778 individuals with genotypes for ~2.2 million SNPs from the Uganda GWAS resource (UGWAS), and sequence data on up to 1,978 individuals spanning 41.5M SNPs and 4.5M indels (UG2G); 343 individuals overlap between the two datasets. We highlight the value of the largest sequence panel from Africa to date as a global resource for variant discovery, imputation and understanding the mutational spectrum and its clinical relevance in African populations. Alongside phenotype data, we provide a rich new genomic resource for researchers in Africa and globally.
URL
https://allofus.nih.gov/ ,https://www.researchallofus.org/register/
MAIN ANCESTRY
AFR
PARTICIPANTS
Nigerian participants (national genome project)
SAMPLE SIZE
~100k

NyuWa genome resource

Biobank / cohort
DESCRIPTION
NyuWa, or NüWa, is the mother goddess who created the human population in Chinese mythology. Here we present the NyuWa genome resource, based on high-depth (median 26X) WGS of 2,999 Chinese individuals from 23 out of 34 administrative divisions in China. The NyuWa Genome Resource presented on this website contains two main parts: the NyuWa Chinese Population Variant Database and the NyuWa reference panel server.
URL
http://bigdata.ibp.ac.cn/NyuWa/
MAIN ANCESTRY
EAS
PARTICIPANTS
Han Chinese individuals (NyuWa reference genomes)
SAMPLE SIZE
~3k

PAGE Study (PAGE)

Biobank / cohort
DESCRIPTION
Population Architecture using Genomics and Epidemiology (PAGE): NIH consortium coordinating genome-wide association studies in African American, Hispanic/Latino, Asian, Pacific Islander, and Native American participants across multiple cohorts.
URL
https://www.pagestudy.org/
MAIN ANCESTRY
AFR,AMR,EAS,EUR
PARTICIPANTS
US minority cohorts (NH Black, Hispanic/Latino, Asian, Native American)
SAMPLE SIZE
50k+ (aggregated across studies)

Penn Medicine Biobank

Biobank / cohort
DESCRIPTION
The Penn Medicine BioBank (PMBB) is a research program created to study the causes and treatments of many diseases. Any Penn Medicine patient (age 18 and up) can sign up. The PMBB is a collection of biological samples, such as blood or tissue, that are donated by patient volunteers. These samples are then connected to clinical information, such as diseases or lab measures. These data are then used by researchers to discover new ways to detect, treat, and maybe even prevent or cure disease. Some of these studies may be about how genes affect health and disease. Other studies look at how genes affect response to medicines.
URL
https://pmbb.med.upenn.edu/
MAIN ANCESTRY
AFR,AMR,EAS,EUR,SAS
PARTICIPANTS
University of Pennsylvania Health System patients
SAMPLE SIZE
~40k

Qatar Biobank

Biobank / cohort
DESCRIPTION
Qatar Biobank is a national long-term health research initiative that recruits consenting adults in Qatar, collects biological samples and health data, and supports population and genomic research on diseases relevant to the region (often analyzed together with the Qatar Genome Program).
URL
https://www.qatarbiobank.org.qa/
MAIN ANCESTRY
Middle Eastern (Qatari Arab)
PARTICIPANTS
Consenting adults in Qatar (national biobank)
SAMPLE SIZE
~80K

Qatar Genome Program (QGP)

Biobank / cohort
DESCRIPTION
National sequencing initiative profiling thousands of Qatari genomes (linked with Qatar Biobank) to study migration history and improve imputation and analysis of Arab haplotypes.
URL
https://www.qatargenome.org.qa/
MAIN ANCESTRY
Middle Eastern (Qatari Arab)
PARTICIPANTS
Qatari individuals (national sequencing; linked with Qatar Biobank)
SAMPLE SIZE
~6K

SAPALDIA

Biobank / cohort
DESCRIPTION
Swiss Cohort Study on Air Pollution and Lung and Heart Diseases in Adults; longitudinal respiratory and cardiovascular phenotypes with genetic data for population and environmental health GWAS.
URL
https://www.sapaldia.ch/en/
MAIN ANCESTRY
EUR
PARTICIPANTS
Swiss adults (respiratory population cohort)
SAMPLE SIZE
~10k

SG10K_Health (SG10K)

Biobank / cohort
DESCRIPTION
SG10K_Health is the headline project of the Singapore National Precision Medicine programme (NPM Phase I), comprising 10,000 whole-genome sequences from healthy, consenting Chinese, Indian, and Malay volunteers. SG10K_Health involved a research collaboration across multiple institutions in Singapore, enabling the country to develop the necessary infrastructure and deep capabilities to process, store, and analyse genetic data at the population scale in a safe, secure, and rapid manner. SG10K_Health provides a near-complete assessment of common genetic variants in Singapore’s three major ethnic groups, which can be used by clinicians to better manage Asian patients with genetic disease and as a control data set to compare against disease studies. Work is ongoing to link the SG10K_Health genomic data to research traits (e.g., height, weight, blood pressure) and clinical records.
URL
https://npm.a-star.edu.sg/
MAIN ANCESTRY
EAS,SAS
PARTICIPANTS
Singapore residents (Chinese, Malay, Indian, other)
SAMPLE SIZE
~10k

Taiwan Biobank (TWB)

Biobank / cohort
DESCRIPTION
The Taiwan Biobank (TWB) is an ongoing prospective study of over 150,000 individuals aged 30-70 recruited from across Taiwan beginning in 2012. A comprehensive list of phenotypes was collected for each consented participant at recruitment and follow-up visits through structured interviews and physical measurements. Biomarkers and genetic data were also generated for all participants from blood and urine samples.
URL
https://www.twbiobank.org.tw/
Main citation
Feng, Y. C. A., Chen, C. Y., Chen, T. T., Kuo, P. H., Hsu, Y. H., Yang, H. I., ... & Lin, Y. F. (2021). Taiwan Biobank: a rich biomedical research database of the Taiwanese population. medRxiv.
Feng, Y. C. A., Chen, C. Y., Chen, T. T., Kuo, P. H., Hsu, Y. H., Yang, H. I., ... & Lin, Y. F. (2022). Taiwan Biobank: a rich biomedical research database of the Taiwanese population. Cell Genomics, 100197.
MAIN ANCESTRY
EAS
PARTICIPANTS
Han Chinese adults (general-population recruitment)
SAMPLE SIZE
~150k
DATA ACCESS
https://taiwanview.twbiobank.org.tw/data_appl (application required)

Taiwan Precision Medicine Initiative (TPMI)

Biobank / cohort
DESCRIPTION
The Taiwan Precision Medicine Initiative (TPMI) is a genomic research program designed to advance precision healthcare in Taiwan. This initiative focuses on collecting and analyzing genetic and clinical data from Taiwanese individuals to develop personalized healthcare solutions tailored specifically to Taiwan’s population. With over 500,000 Taiwanese residents already enrolled, TPMI maintains the most comprehensive dataset of genotypes and electronic medical records for Han Chinese populations.

Given that Han Chinese make up about 20% of the global population, TPMI’s findings have the potential to impact healthcare for over 1.4 billion people worldwide, setting a model for precision medicine initiatives tailored to specific populations.
URL
https://tpmi.ibms.sinica.edu.tw/
MAIN ANCESTRY
EAS
PARTICIPANTS
Han Chinese adults (precision medicine initiative)
SAMPLE SIZE
~460k

The Canadian Longitudinal Study on Aging (CLSA)

Biobank / cohort
DESCRIPTION
The Canadian Longitudinal Study on Aging (CLSA) is a large, national, long-term study that will follow approximately 50,000 individuals who are between the ages of 45 and 85 when recruited, for at least 20 years. The CLSA will collect information on the changing biological, medical, psychological, social, lifestyle and economic aspects of people’s lives. These factors will be studied to understand how, individually and in combination, they affect both the maintenance of health and the development of disease and disability as people age.
URL
https://www.clsa-elcv.ca/
MAIN ANCESTRY
AMR,EUR
PARTICIPANTS
Canadian adults (national aging cohort)
SAMPLE SIZE
~50k

The China Metabolic Analytics Project (ChinaMAP)

Biobank / cohort
DESCRIPTION
ChinaMAP is based on three large-scale cohorts: the China Noncommunicable Disease Surveillance 2010, a nationally representative study with 150,000 participants; the Risk Evaluation of cAncers in Chinese diabeTic Individuals: a lONgitudinal (REACTION) study with 250,000 participants; and the Community-based Cardiovascular Risk During Urbanization in Shanghai study with 50,000 participants.
URL
http://www.mbiobank.com/
MAIN ANCESTRY
EAS
PARTICIPANTS
Chinese adults (metabolic disease cohort)
SAMPLE SIZE
~10k

The Egypt Genome Project

Biobank / cohort
DESCRIPTION
The Egypt Center for Research and Regenerative Medicine (ECRRM) is one of the research units associated with the Ministry of Defense. It was established by a presidential decree in 2017 and has a legal entity affiliated with the Ministry of Defense. The center will initiate the Egyptian Genome Reference Project, upon the President's authorization, in collaboration with the Academy of Scientific Research and Technology and the Ministry of Higher Education, to help enhance general health care in Egypt.
URL
https://egp.sci.eg/
Main citation
Elmonem, M.A., Soliman, N.A., Moustafa, A. et al. The Egypt Genome Project. Nat Genet (2024). https://doi.org/10.1038/s41588-024-01739-1
MAIN ANCESTRY
AFR
PARTICIPANTS
Egyptian participants (national reference effort)
SAMPLE SIZE
~100k

The International Agency for Research on Cancer (IARC) Biobank (IARC)

Biobank / cohort
DESCRIPTION
The IARC BioBank (IBB) is one of the largest, most varied and richest international collections of samples in the world. The Biobank is publicly funded (approximately 60% of its budget is provided by IARC Participating States through the regular budget, and the remainder comes from research grants) and hosts over 50 different studies led or coordinated by IARC scientists. The IBB contains both population-based collections from research projects focusing on gene-environment interactions, such as the European Prospective Investigation into Cancer and Nutrition (EPIC) study, and disease-based collections that focus on biomarkers, such as the International Head and Neck Cancer Epidemiology (INHANCE) consortium. Study designs include case series, prevalence studies, case-control studies and cohort studies. The IBB contains 5.1 million biological samples from 562,000 individuals. Four million of the samples are from the EPIC study (over 370,000 individuals), and about one million are from other collections (close to 200,000 individuals). Most of the samples are body fluids, including plasma, serum and urine, as well as extracted DNA samples.
URL
https://ibb.iarc.fr/
MAIN ANCESTRY
Global (ancestry varies by contributing study)
PARTICIPANTS
Participants from many international IARC-led studies
SAMPLE SIZE
~560k

The Japan COVID-19 Task Force study (JCTF)

Biobank / cohort
DESCRIPTION
Multi-institution Japanese consortium formed during the COVID-19 pandemic; contributes multi-omics and GWAS-related data surfaced through resources such as the Japan Omics Browser.
URL
https://japan-omics.jp/
MAIN ANCESTRY
EAS
PARTICIPANTS
Japanese individuals (COVID-19 host genetics)
SAMPLE SIZE
~1.4k

The Japan Prospective Studies Collaboration for Aging and Dementia (JPSC-AD)

Biobank / cohort
DESCRIPTION
The Japan Prospective Studies Collaboration for Aging and Dementia (JPSC-AD) is a collaborative prospective cohort study of approximately 10,000 elderly people from 8 newly established community-based dementia cohort studies in Japan, in which data are collected prospectively using a pre-specified standardized protocol. The purpose of this study is to quantitatively evaluate environmental and genomic risk factors for dementia in Japanese people and to establish effective preventive strategies for dementia, in order to realize a healthy aging society.
URL
https://www.eph.med.kyushu-u.ac.jp/jpsc/en/
MAIN ANCESTRY
EAS
PARTICIPANTS
Japanese adults (aging and dementia collaboration)
SAMPLE SIZE
~11k

The Malaysian Cohort (TMC)

Biobank / cohort
DESCRIPTION
The Malaysian Cohort (TMC) is a prospective study of non-communicable diseases in a multi-ethnic Malaysian population, recruiting adults across rural and urban settings with biosamples and phenotypes for gene–environment and biomarker research.
URL
https://www.ukm.my/mycohort/ms/
MAIN ANCESTRY
EAS,SAS
PARTICIPANTS
Malaysian adults (multi-ethnic national cohort)
SAMPLE SIZE
~100k

The Nagahama Study

Biobank / cohort
DESCRIPTION
The Nagahama Primary Prevention Cohort Project is a joint project based on an agreement between Kyoto University Graduate School of Medicine and Nagahama City, Shiga Prefecture, with the cooperation of approximately 10,000 Nagahama residents. The project conducts follow-up surveys on morbidity and mortality, as well as special tests and surveys on sleep, brain imaging, memory, motor function, skin condition, socioeconomic status, etc., during health checkups and periodic surveys conducted every five years. Furthermore, we have completed a multi-omics analysis focusing on genome analysis of approximately 9,000 people (including whole-genome sequencing of roughly 2,500 people), comprehensive metabolite analysis at three time points, and comprehensive protein analysis of 2,000 people (as of August 2021). Based on these abundant and varied data, we aim to search for health risk factors and elucidate their interactions.
URL
https://zeroji-cohort.com/english/
MAIN ANCESTRY
EAS
PARTICIPANTS
Japanese adults (Nagahama City, Shiga)
SAMPLE SIZE
~10k

The STROMICS genome study (STROMICS)

Biobank / cohort
DESCRIPTION
The Stroke Omics Atlas (STROMICS) is committed to using multi-omics and clinical big data to achieve accurate diagnosis and treatment for stroke patients, reduce treatment costs, and contribute to the health of the people.
Using artificial intelligence and cutting-edge high-throughput omics technologies (genomics, transcriptomics, epigenomics, proteomics, metabolomics, metagenomics, etc.), potential drug targets for stroke can be found on a large scale and with high efficiency, providing strong technical support for clinical transformation.
Relying on the China National Clinical Research Center for Neurological Diseases and the Center of Excellence for Omics Research (CORe), STROMICS has realized the interdisciplinary integration of clinical medicine, bioinformatics, and multi-omics, creating a new paradigm of drug research and development.
URL
http://www.stromics.org.cn/
MAIN ANCESTRY
EAS
PARTICIPANTS
Chinese adults (acute ischemic stroke registry)
SAMPLE SIZE
~10k

The Trøndelag Health Study (HUNT)

Biobank / cohort
DESCRIPTION
HUNT Biobank is an established and modern research biobank with high-technology equipment for storage, analysis, sample handling and delivery of samples. Our samples satisfy high quality standards and are stored in accordance with the Data Inspectorate's laws and regulations. HUNT Biobank handles samples from the Nord-Trøndelag Health Study (HUNT) and the Cohort of Norway (CONOR), and can receive samples from other researchers and research projects for storage, analysis and processing of DNA. We do not store samples from private individuals.
URL
https://www.ntnu.edu/hunt/hunt-biobank
MAIN ANCESTRY
EUR
PARTICIPANTS
Norwegian adults (HUNT, Trøndelag)
SAMPLE SIZE
~229k

Tohoku Medical Megabank (TMM)

Biobank / cohort
DESCRIPTION
Tohoku University Tohoku Medical Megabank Organization was founded to establish an advanced medical system to foster the reconstruction from the Great East Japan Earthquake. The organization has been developing a biobank that combines medical and genome information during the process of rebuilding the community medical system and supporting health and welfare in the Tohoku area. The information from the brand-new biobank will create a new medical system, and, based on the findings of its analysis, the organization aims to attract more medical practitioners from all over the country to the area, promote industry-academic partnerships, create employment in related fields, and restore the medical system in Tohoku.
URL
https://www.megabank.tohoku.ac.jp/english/
MAIN ANCESTRY
EAS
PARTICIPANTS
Japanese adults (Tōhoku region; megabank)
SAMPLE SIZE
~157k
DATA ACCESS
https://jmorp.megabank.tohoku.ac.jp/

TwinsUK

Biobank / cohort
DESCRIPTION
UK adult twin registry with deep multi-omics and phenotyping; genotype and sequence data widely used in GWAS, heritability, and multi-trait genetic analyses.
URL
https://twinsuk.ac.uk/
MAIN ANCESTRY
EUR
PARTICIPANTS
UK adult female twins
SAMPLE SIZE
~14k twins

UCLA Precision Health Biobank

Biobank / cohort
DESCRIPTION
The UCLA ATLAS Precision Health Biobank, under the supervision of the Translational Pathology Core Laboratory (TPCL), collects biological samples from patients who have consented to participate in the UCLA ATLAS Community Health Initiative. As a collaborator with the UCLA ATLAS Community Health Initiative, the UCLA ATLAS Precision Health Biobank manages the collection and distribution of biological samples, removing personally identifiable information before distribution.
URL
https://icahn.mssm.edu/research/ipm/programs/biome-biobank
Main citation
Johnson, R. D., Ding, Y., Bhattacharya, A., Chiu, A., Lajonchere, C., Geschwind, D. H., & Pasaniuc, B. (2022). The UCLA ATLAS Community Health Initiative: promoting precision health research in a diverse biobank. medRxiv.
MAIN ANCESTRY
AFR,AMR,EAS,EUR,SAS
PARTICIPANTS
UCLA Health patients
SAMPLE SIZE
~27k

UK Biobank (UKB)

Biobank / cohort
DESCRIPTION
UK Biobank is a large-scale biomedical database and research resource, containing in-depth genetic and health information from half a million UK participants. The database is regularly augmented with additional data and is globally accessible to approved researchers undertaking vital research into the most common and life-threatening diseases. It is a major contributor to the advancement of modern medicine and treatment and has enabled several scientific discoveries that improve human health.
URL
https://www.ukbiobank.ac.uk/
MAIN ANCESTRY
EUR
PARTICIPANTS
UK adults (aged ~40–69 at baseline)
SAMPLE SIZE
~500k

Westlake BioBank for Chinese (WBBC)

Biobank / cohort
DESCRIPTION
The Westlake BioBank for Chinese (WBBC) cohort is a population-based prospective study whose major purpose is to better understand the effect of genetic and environmental factors on growth and development from youth to old age. The dataset comprises a wide range of demographic and anthropometric measures, serological tests, physical activity, sleep quality, age at menarche and bone mineral density. WBBC is designed as a prospective cohort study and will recruit at least 100,000 Chinese participants. The pilot project of WBBC has recruited a total of 14,726 participants (4,751 males and 9,975 females), and the baseline survey was carried out from 2017 to 2019.
URL
https://wbbc.westlake.edu.cn/
MAIN ANCESTRY
EAS
PARTICIPANTS
Han Chinese adults (Westlake biobank)
SAMPLE SIZE
~14k

Women's Health Initiative (WHI)

Biobank / cohort
DESCRIPTION
Large U.S. women’s health trial and observational study with biospecimens and long follow-up; genotype data support GWAS on cancer, cardiovascular, and aging-related traits.
URL
https://www.whi.org/
MAIN ANCESTRY
AFR,AMR,EAS,EUR
PARTICIPANTS
US postmenopausal women (multi-ethnic)
SAMPLE SIZE
~161k