Admixture

Summary Table

NAME	CITATION	YEAR
ADMIXTURE	Alexander DH, Novembre J, Lange K. (2009) Fast model-based estimation of ancestry in unrelated individuals Genome Res., 19 (9) 1655-1664. doi:10.1101/gr.094052.109. PMID 19648217	2009
OpenADMIXTURE	Ko S, Chu BB, Peterson D, Okenwa C, ...&, Lange KL. (2023) Unsupervised discovery of ancestry-informative markers and genetic admixture proportions in biobank-scale datasets Am. J. Hum. Genet., 110 (2) 314-325. doi:10.1016/j.ajhg.2022.12.008. PMID 36610401	2023

ADMIXTURE

NAME : ADMIXTURE
SHORT NAME : ADMIXTURE
FULL NAME : ADMIXTURE
DESCRIPTION : ADMIXTURE is a software tool for maximum likelihood estimation of individual ancestries from multilocus SNP genotype datasets. It uses the same statistical model as STRUCTURE but calculates estimates much more rapidly using a fast numerical optimization algorithm.
URL : https://dalexander.github.io/admixture/
TITLE : Fast model-based estimation of ancestry in unrelated individuals
DOI : 10.1101/gr.094052.109
ABSTRACT : Population stratification has long been recognized as a confounding factor in genetic association studies. Estimated ancestries, derived from multi-locus genotype data, can be used to perform a statistical correction for population stratification. One popular technique for estimation of ancestry is the model-based approach embodied by the widely applied program structure. Another approach, implemented in the program EIGENSTRAT, relies on Principal Component Analysis rather than model-based estimation and does not directly deliver admixture fractions. EIGENSTRAT has gained in popularity in part owing to its remarkable speed in comparison to structure. We present a new algorithm and a program, ADMIXTURE, for model-based estimation of ancestry in unrelated individuals. ADMIXTURE adopts the likelihood model embedded in structure. However, ADMIXTURE runs considerably faster, solving problems in minutes that take structure hours. In many of our experiments, we have found that ADMIXTURE is almost as fast as EIGENSTRAT. The runtime improvements of ADMIXTURE rely on a fast block relaxation scheme using sequential quadratic programming for block updates, coupled with a novel quasi-Newton acceleration of convergence. Our algorithm also runs faster and with greater accuracy than the implementation of an Expectation-Maximization (EM) algorithm incorporated in the program FRAPPE. Our simulations show that ADMIXTURE's maximum likelihood estimates of the underlying admixture coefficients and ancestral allele frequencies are as accurate as structure's Bayesian estimates. On real-world data sets, ADMIXTURE's estimates are directly comparable to those from structure and EIGENSTRAT. Taken together, our results show that ADMIXTURE's computational speed opens up the possibility of using a much larger set of markers in model-based ancestry estimation and that its estimates are suitable for use in correcting for population stratification in association studies.
CITATION : Alexander DH, Novembre J, Lange K. (2009) Fast model-based estimation of ancestry in unrelated individuals Genome Res., 19 (9) 1655-1664. doi:10.1101/gr.094052.109. PMID 19648217
JOURNAL_INFO : Genome research ; Genome Res. ; 2009 ; 19 ; 9 ; 1655-1664
PUBMED_LINK : 19648217
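EXAMPLE : The abstract above refers to the likelihood model shared with structure and to the EM algorithm used by FRAPPE as a baseline. The following is a minimal sketch of both, written for illustration only and not taken from the ADMIXTURE source. It assumes genotypes coded as 0/1/2 copies of one allele, a matrix Q of per-individual ancestry fractions, and a matrix F of per-population allele frequencies; the function names log_likelihood and em_step, the clamping constant, and the toy data are hypothetical. ADMIXTURE itself replaces the EM update shown here with block relaxation, sequential quadratic programming, and quasi-Newton acceleration, none of which is reproduced below.

```python
import math


def log_likelihood(G, Q, F):
    """Binomial log-likelihood of the admixture model (constant term dropped).

    G[i][j] in {0, 1, 2}: copies of the counted allele for individual i at SNP j.
    Q[i][k]: fraction of individual i's ancestry from population k (rows sum to 1).
    F[k][j]: frequency of the counted allele in population k at SNP j,
             assumed to lie strictly between 0 and 1.
    """
    K = len(F)
    total = 0.0
    for i, row in enumerate(G):
        for j, g in enumerate(row):
            p = sum(Q[i][k] * F[k][j] for k in range(K))
            total += g * math.log(p) + (2 - g) * math.log(1.0 - p)
    return total


def em_step(G, Q, F, eps=1e-9):
    """One EM update of (Q, F); this is the slow baseline, not ADMIXTURE's optimizer.

    Assumes every entry of Q is positive so that no denominator is zero.
    """
    n, m, K = len(G), len(G[0]), len(F)
    num = [[0.0] * m for _ in range(K)]  # expected counted-allele copies from k
    den = [[0.0] * m for _ in range(K)]  # expected total allele copies from k
    Q_new = [[0.0] * K for _ in range(n)]
    for i in range(n):
        for j in range(m):
            g = G[i][j]
            p = sum(Q[i][k] * F[k][j] for k in range(K))
            for k in range(K):
                # Posterior probability that a copy of each allele came from k.
                a = Q[i][k] * F[k][j] / p
                b = Q[i][k] * (1.0 - F[k][j]) / (1.0 - p)
                num[k][j] += g * a
                den[k][j] += g * a + (2 - g) * b
                Q_new[i][k] += (g * a + (2 - g) * b) / (2 * m)
    # Clamp frequencies away from 0 and 1 so later logarithms stay finite.
    F_new = [
        [min(max(num[k][j] / den[k][j], eps), 1.0 - eps) for j in range(m)]
        for k in range(K)
    ]
    return Q_new, F_new


# Toy data: 4 individuals, 3 SNPs, K = 2 ancestral populations.
G = [[0, 1, 2], [2, 1, 0], [1, 1, 1], [2, 2, 1]]
Q0 = [[0.6, 0.4], [0.3, 0.7], [0.5, 0.5], [0.8, 0.2]]
F0 = [[0.3, 0.5, 0.7], [0.6, 0.4, 0.2]]

Q1, F1 = em_step(G, Q0, F0)
print(log_likelihood(G, Q0, F0), log_likelihood(G, Q1, F1))
```

Each EM update keeps the rows of Q summing to 1 and cannot decrease the log-likelihood, but it typically needs many iterations to converge, which is the cost the abstract says ADMIXTURE's accelerated scheme is designed to avoid.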

OpenADMIXTURE

NAME : OpenADMIXTURE
SHORT NAME : OpenADMIXTURE
FULL NAME : OpenADMIXTURE
DESCRIPTION : This software package is an open-source Julia reimplementation of the ADMIXTURE package. It estimates ancestry by maximum likelihood from large SNP genotype datasets in which individuals are assumed to be unrelated.
URL : https://github.com/OpenMendel/OpenADMIXTURE.jl
TITLE : Unsupervised discovery of ancestry-informative markers and genetic admixture proportions in biobank-scale datasets
DOI : 10.1016/j.ajhg.2022.12.008
ABSTRACT : Admixture estimation plays a crucial role in ancestry inference and genome-wide association studies (GWASs). Computer programs such as ADMIXTURE and STRUCTURE are commonly employed to estimate the admixture proportions of sample individuals. However, these programs can be overwhelmed by the computational burdens imposed by the 10^5 to 10^6 samples and millions of markers commonly found in modern biobanks. An attractive strategy is to run these programs on a set of ancestry-informative SNP markers (AIMs) that exhibit substantially different frequencies across populations. Unfortunately, existing methods for identifying AIMs require knowing ancestry labels for a subset of the sample. This supervised learning approach creates a chicken and the egg scenario. In this paper, we present an unsupervised, scalable framework that seamlessly carries out AIM selection and likelihood-based estimation of admixture proportions. Our simulated and real data examples show that this approach is scalable to modern biobank datasets. OpenADMIXTURE, our Julia implementation of the method, is open source and available for free.
COPYRIGHT : http://creativecommons.org/licenses/by-nc-nd/4.0/
CITATION : Ko S, Chu BB, Peterson D, Okenwa C, ...&, Lange KL. (2023) Unsupervised discovery of ancestry-informative markers and genetic admixture proportions in biobank-scale datasets Am. J. Hum. Genet., 110 (2) 314-325. doi:10.1016/j.ajhg.2022.12.008. PMID 36610401
JOURNAL_INFO : The American Journal of Human Genetics ; Am. J. Hum. Genet. ; 2023 ; 110 ; 2 ; 314-325
PUBMED_LINK : 36610401