Tools Population Genetics Selection

Curated listings for the Selection topic within Population Genetics, under the GWAS Tools tab.

Summary Table

NAME	Main citation	YEAR
AGES	Akbari A et al., Nature, 2026	2026
CMS	Grossman SR et al., Science, 2010	2010
EHH	Klassmann A et al., PLoS One, 2022	2022
GeneBayes	Zeng T et al., Nat Genet, 2024	2024
HWE	Wigginton JE et al., Am J Hum Genet, 2005	2005
Review-Fst	Holsinger KE et al., Nat Rev Genet, 2009	2009
SDS	Field Y et al., Science, 2016	2016
XP-EHH	Klassmann A et al., PLoS One, 2022	2022
f	Moon S et al., Genome Res, 2016	2016
iHS	Voight BF et al., PLoS Biol, 2006	2006

AGES

Tool

FULL NAME

Ancient GEnome Selection

DESCRIPTION

Akbari, A., Perry, A., Barton, A. R., Kariminejad, M., Gazal, S., Li, Z., ... & Reich, D. (2026). Ancient DNA reveals pervasive directional selection across West Eurasia. Nature.

URL

http://reich-ages.rc.hms.harvard.edu

KEYWORDS

ancient DNA, directional selection, time series, allele frequency

USE

AGES detects directional selection in ancient DNA time-series data by testing whether allele frequencies show consistent temporal trends after accounting for structure and migration-related confounding.

TITLE

Ancient DNA reveals pervasive directional selection across West Eurasia.

Main citation

Akbari A, Perry A, Barton AR, Kariminejad M, Gazal S, Li Z, ... Reich D. (2026) Ancient DNA reveals pervasive directional selection across West Eurasia. Nature. doi:10.1038/s41586-026-10358-1.

ABSTRACT

The study introduces a method for detecting directional selection from ancient DNA time-series by testing for consistent allele-frequency changes over time across 15,836 West Eurasians. The framework estimates selection coefficients genome-wide and helps distinguish sustained adaptive change from shifts caused by migration, structure, or non-adaptive forces.

DOI

10.1038/s41586-026-10358-1

CMS

Tool

PUBMED_LINK

20056855

FULL NAME

Composite of multiple signals

DESCRIPTION

Grossman, S. R., Shlyakhter, I., Karlsson, E. K., Byrne, E. H., Morales, S., Frieden, G., ... & Sabeti, P. C. (2010). A composite of multiple signals distinguishes causal variants in regions of positive selection. Science, 327(5967), 883-886.

TITLE

A composite of multiple signals distinguishes causal variants in regions of positive selection.

Main citation

Grossman SR, Shlyakhter I, Karlsson EK, Byrne EH, ... Sabeti PC. (2010) A composite of multiple signals distinguishes causal variants in regions of positive selection. Science, 327 (5967) 883-6. doi:10.1126/science.1183863. PMID 20056855

ABSTRACT

The human genome contains hundreds of regions whose patterns of genetic variation indicate recent positive natural selection, yet for most the underlying gene and the advantageous mutation remain unknown. We developed a method, composite of multiple signals (CMS), that combines tests for multiple signals of selection and increases resolution by up to 100-fold. By applying CMS to candidate regions from the International Haplotype Map, we localized population-specific selective signals to 55 kilobases (median), identifying known and novel causal variants. CMS can not just identify individual loci but implicates precise variants selected by evolution.

DOI

10.1126/science.1183863

EHH

Tool

PUBMED_LINK

35041674

FULL NAME

Extended haplotype homozygosity

DESCRIPTION

Sabeti, P. C., Reich, D. E., Higgins, J. M., Levine, H. Z., Richter, D. J., Schaffner, S. F., ... & Lander, E. S. (2002). Detecting recent positive selection in the human genome from haplotype structure. Nature, 419(6909), 832-837.

TITLE

Detecting selection using extended haplotype homozygosity (EHH)-based statistics in unphased or unpolarized data.

Main citation

Klassmann A, Gautier M. (2022) Detecting selection using extended haplotype homozygosity (EHH)-based statistics in unphased or unpolarized data. PLoS One, 17 (1) e0262024. doi:10.1371/journal.pone.0262024. PMID 35041674

ABSTRACT

Analysis of population genetic data often includes a search for genomic regions with signs of recent positive selection. One of such approaches involves the concept of extended haplotype homozygosity (EHH) and its associated statistics. These statistics typically require phased haplotypes, and some of them necessitate polarized variants. Here, we unify and extend previously proposed modifications to loosen these requirements. We compare the modified versions with the original ones by measuring the false discovery rate in simulated whole-genome scans and by quantifying the overlap of inferred candidate regions in empirical data. We find that phasing information is indispensable for accurate estimation of within-population statistics (for all but very large samples) and of cross-population statistics for small samples. Ancestry information, in contrast, is of lesser importance for both types of statistic. Our publicly available R package rehh incorporates the modified statistics presented here.
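
As a rough illustration of the EHH statistic this abstract builds on, the sketch below computes EHH for phased haplotypes using the definition from Sabeti et al. (2002): among chromosomes carrying a chosen core allele, the probability that two drawn at random are identical over the whole interval between the focal marker and a second marker. The function name `ehh`, its arguments, and the toy haplotypes are assumptions made for illustration. This is not the rehh implementation, and it does not handle unphased data or missing genotypes.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, focal, allele, marker):
    """EHH between marker indices `focal` and `marker` (hypothetical helper).

    haplotypes: equal-length sequences of alleles, one per phased chromosome.
    focal:      index of the core marker.
    allele:     core allele that defines which chromosomes are compared.
    marker:     index of the marker up to which homozygosity is evaluated.
    """
    lo, hi = sorted((focal, marker))
    # Keep only chromosomes carrying the core allele, cut to the interval.
    carriers = [tuple(h[lo:hi + 1]) for h in haplotypes if h[focal] == allele]
    n = len(carriers)
    if n < 2:
        return float("nan")  # undefined with fewer than two carriers
    # Count identical pairs within each group of identical haplotypes.
    groups = Counter(carriers)
    return sum(comb(c, 2) for c in groups.values()) / comb(n, 2)


# Toy data: five phased chromosomes, four markers, alleles coded "0"/"1".
haps = ["1100", "1100", "1101", "1011", "0011"]
decay = [ehh(haps, focal=0, allele="1", marker=m) for m in range(4)]
print(decay)
```

Evaluating the function at increasing distances from the focal marker traces the EHH decay curve. Statistics such as iHS and XP-EHH are built from the area under such curves.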

DOI

10.1371/journal.pone.0262024

GeneBayes

Tool

FULL NAME

Bayesian estimation of gene constraint from an evolutionary model with gene features

DESCRIPTION

Zeng, T., Spence, J. P., Mostafavi, H., & Pritchard, J. K. (2024). Bayesian estimation of gene constraint from an evolutionary model with gene features. Nature Genetics, 56, 1632-1643.

URL

https://github.com/tkzeng/GeneBayes

KEYWORDS

gene constraint, Bayesian inference, selection coefficient, loss-of-function, s_het

USE

GeneBayes estimates gene-level selective constraint (s_het) by combining an evolutionary population genetics model with machine learning on gene features, improving constraint inference for short genes.

TITLE

Bayesian estimation of gene constraint from an evolutionary model with gene features.

Main citation

Zeng T, Spence JP, Mostafavi H, Pritchard JK. (2024) Bayesian estimation of gene constraint from an evolutionary model with gene features. Nature Genetics, 56, 1632-1643. doi:10.1038/s41588-024-01820-9.

ABSTRACT

This study introduces GeneBayes, a framework that integrates an evolutionary model with gene features to estimate gene-level selective constraint. The method improves inference of the interpretable constraint metric s_het, especially for short genes, and outperforms existing metrics for prioritizing genes relevant to essentiality and human disease.

DOI

10.1038/s41588-024-01820-9

HWE

Tool

PUBMED_LINK

15789306

FULL NAME

Exact Tests of Hardy-Weinberg Equilibrium

DESCRIPTION

Wigginton, J. E., Cutler, D. J., & Abecasis, G. R. (2005). A note on exact tests of Hardy-Weinberg equilibrium. The American Journal of Human Genetics, 76(5), 887-893.

TITLE

A note on exact tests of Hardy-Weinberg equilibrium.

Main citation

Wigginton JE, Cutler DJ, Abecasis GR. (2005) A note on exact tests of Hardy-Weinberg equilibrium. Am J Hum Genet, 76 (5) 887-93. doi:10.1086/429864. PMID 15789306

ABSTRACT

Deviations from Hardy-Weinberg equilibrium (HWE) can indicate inbreeding, population stratification, and even problems in genotyping. In samples of affected individuals, these deviations can also provide evidence for association. Tests of HWE are commonly performed using a simple χ² goodness-of-fit test. We show that this χ² test can have inflated type I error rates, even in relatively large samples (e.g., samples of 1,000 individuals that include approximately 100 copies of the minor allele). On the basis of previous work, we describe exact tests of HWE together with efficient computational methods for their implementation. Our methods adequately control type I error in large and small samples and are computationally efficient. They have been implemented in freely available code that will be useful for quality assessment of genotype data and for the detection of genetic association or population stratification in very large data sets.
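
The exact test mentioned in this abstract can be sketched roughly as follows. Conditional on the sample size and the minor allele count, the code enumerates the relative probability of every possible heterozygote count with a two-step recurrence, then sums the probabilities of all outcomes no more likely than the observed one. This is a minimal sketch of that general approach, not the authors' released code; the function name `hwe_exact_pvalue` and its argument names are assumptions.

```python
def hwe_exact_pvalue(n_het, n_hom1, n_hom2):
    """Two-sided exact HWE p-value for one biallelic marker (illustrative)."""
    if min(n_het, n_hom1, n_hom2) < 0:
        raise ValueError("genotype counts must be non-negative")
    n = n_het + n_hom1 + n_hom2
    if n == 0:
        return 1.0
    n_rare = 2 * min(n_hom1, n_hom2) + n_het   # copies of the rarer allele
    n_common = 2 * n - n_rare

    # Start near the expected heterozygote count, matching the parity of n_rare.
    mid = n_rare * n_common // (2 * n)
    if mid % 2 != n_rare % 2:
        mid += 1
    probs = {mid: 1.0}  # unnormalised probabilities, keyed by heterozygote count

    # Walk down: each step turns two heterozygotes into one of each homozygote.
    het, hom_r = mid, (n_rare - mid) // 2
    hom_c = n - het - hom_r
    while het >= 2:
        probs[het - 2] = probs[het] * het * (het - 1) / (4.0 * (hom_r + 1) * (hom_c + 1))
        het, hom_r, hom_c = het - 2, hom_r + 1, hom_c + 1

    # Walk up: each step turns one of each homozygote into two heterozygotes.
    het, hom_r = mid, (n_rare - mid) // 2
    hom_c = n - het - hom_r
    while het <= n_rare - 2:
        probs[het + 2] = probs[het] * 4.0 * hom_r * hom_c / ((het + 2) * (het + 1))
        het, hom_r, hom_c = het + 2, hom_r - 1, hom_c - 1

    total = sum(probs.values())
    p_obs = probs[n_het]
    # Small tolerance so ties with the observed outcome are included.
    tail = sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9))
    return min(1.0, tail / total)


print(hwe_exact_pvalue(n_het=50, n_hom1=25, n_hom2=25))
```

Working with unnormalised probabilities and dividing by their sum at the end avoids computing large factorials directly.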

DOI

10.1086/429864

Review-Fst

Tool

PUBMED_LINK

19687804

DESCRIPTION

Holsinger, K. E., & Weir, B. S. (2009). Genetics in geographically structured populations: defining, estimating and interpreting F(ST). Nature Reviews Genetics, 10(9), 639-650.

TITLE

Genetics in geographically structured populations: defining, estimating and interpreting F(ST).

Main citation

Holsinger KE, Weir BS. (2009) Genetics in geographically structured populations: defining, estimating and interpreting F(ST). Nat Rev Genet, 10 (9) 639-50. doi:10.1038/nrg2611. PMID 19687804

ABSTRACT

Wright's F-statistics, and especially F(ST), provide important insights into the evolutionary processes that influence the structure of genetic variation within and among populations, and they are among the most widely used descriptive statistics in population and evolutionary genetics. Estimates of F(ST) can identify regions of the genome that have been the target of selection, and comparisons of F(ST) from different parts of the genome can provide insights into the demographic history of populations. For these reasons and others, F(ST) has a central role in population and evolutionary genetics and has wide applications in fields that range from disease association mapping to forensic science. This Review clarifies how F(ST) is defined, how it should be estimated, how it is related to similar statistics and how estimates of F(ST) should be interpreted.
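
To make the quantity concrete, the sketch below computes F(ST) for one biallelic locus from its textbook heterozygosity form, F(ST) = (H_T - H_S) / H_T, where H_T is the expected heterozygosity of the pooled population and H_S is the mean expected heterozygosity within subpopulations. It assumes that subpopulation allele frequencies are known without error and that subpopulations are weighted equally. It therefore omits the sampling corrections that the estimators discussed in the Review apply. The function name `fst_from_frequencies` is hypothetical.

```python
def fst_from_frequencies(freqs):
    """F(ST) for one biallelic locus from subpopulation allele frequencies.

    freqs: frequency of the same allele in each subpopulation (equal weights).
    """
    k = len(freqs)
    if k < 2:
        raise ValueError("need at least two subpopulations")
    p_bar = sum(freqs) / k
    h_total = 2.0 * p_bar * (1.0 - p_bar)                   # pooled heterozygosity
    if h_total == 0.0:
        return float("nan")                                 # locus is monomorphic
    h_sub = sum(2.0 * p * (1.0 - p) for p in freqs) / k     # mean within-subpopulation
    return (h_total - h_sub) / h_total


print(fst_from_frequencies([0.2, 0.8]))
```

Identical frequencies give 0 and fixed differences between two subpopulations give 1, which brackets the range of this simple form.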

DOI

10.1038/nrg2611

SDS

Tool

PUBMED_LINK

27738015

FULL NAME

Singleton density score

DESCRIPTION

Field, Y., Boyle, E. A., Telis, N., Gao, Z., Gaulton, K. J., Golan, D., ... & Pritchard, J. K. (2016). Detection of human adaptation during the past 2000 years. Science, 354(6313), 760-764.

URL

https://github.com/yairf/SDS

KEYWORDS

singleton, recent selection

USE

SDS infers very recent changes in allele frequencies from contemporary genome sequences.

TITLE

Detection of human adaptation during the past 2000 years.

Main citation

Field Y, Boyle EA, Telis N, Gao Z, ... Pritchard JK. (2016) Detection of human adaptation during the past 2000 years. Science, 354 (6313) 760-764. doi:10.1126/science.aag0776. PMID 27738015

ABSTRACT

Detection of recent natural selection is a challenging problem in population genetics. Here we introduce the singleton density score (SDS), a method to infer very recent changes in allele frequencies from contemporary genome sequences. Applied to data from the UK10K Project, SDS reflects allele frequency changes in the ancestors of modern Britons during the past ~2000 to 3000 years. We see strong signals of selection at lactase and the major histocompatibility complex, and in favor of blond hair and blue eyes. For polygenic adaptation, we find that recent selection for increased height has driven allele frequency shifts across most of the genome. Moreover, we identify shifts associated with other complex traits, suggesting that polygenic adaptation has played a pervasive role in shaping genotypic and phenotypic variation in modern humans.

DOI

10.1126/science.aag0776

XP-EHH

Tool

PUBMED_LINK

35041674

FULL NAME

Cross-population extended haplotype homozygosity

DESCRIPTION

Klassmann, A., & Gautier, M. (2022). Detecting selection using extended haplotype homozygosity (EHH)-based statistics in unphased or unpolarized data. PLoS One, 17(1), e0262024.

TITLE

Detecting selection using extended haplotype homozygosity (EHH)-based statistics in unphased or unpolarized data.

Main citation

Klassmann A, Gautier M. (2022) Detecting selection using extended haplotype homozygosity (EHH)-based statistics in unphased or unpolarized data. PLoS One, 17 (1) e0262024. doi:10.1371/journal.pone.0262024. PMID 35041674

ABSTRACT

Analysis of population genetic data often includes a search for genomic regions with signs of recent positive selection. One of such approaches involves the concept of extended haplotype homozygosity (EHH) and its associated statistics. These statistics typically require phased haplotypes, and some of them necessitate polarized variants. Here, we unify and extend previously proposed modifications to loosen these requirements. We compare the modified versions with the original ones by measuring the false discovery rate in simulated whole-genome scans and by quantifying the overlap of inferred candidate regions in empirical data. We find that phasing information is indispensable for accurate estimation of within-population statistics (for all but very large samples) and of cross-population statistics for small samples. Ancestry information, in contrast, is of lesser importance for both types of statistic. Our publicly available R package rehh incorporates the modified statistics presented here.

DOI

10.1371/journal.pone.0262024

f

Tool

PUBMED_LINK

27197222

FULL NAME

Fraction of sites under selection

DESCRIPTION

Moon, S., & Akey, J. M. (2016). A flexible method for estimating the fraction of fitness influencing mutations from large sequencing data sets. Genome Research, 26(6), 834-843.

TITLE

A flexible method for estimating the fraction of fitness influencing mutations from large sequencing data sets.

Main citation

Moon S, Akey JM. (2016) A flexible method for estimating the fraction of fitness influencing mutations from large sequencing data sets. Genome Res, 26 (6) 834-43. doi:10.1101/gr.203059.115. PMID 27197222

ABSTRACT

A continuing challenge in the analysis of massively large sequencing data sets is quantifying and interpreting non-neutrally evolving mutations. Here, we describe a flexible and robust approach based on the site frequency spectrum to estimate the fraction of deleterious and adaptive variants from large-scale sequencing data sets. We applied our method to approximately 1 million single nucleotide variants (SNVs) identified in high-coverage exome sequences of 6515 individuals. We estimate that the fraction of deleterious nonsynonymous SNVs is higher than previously reported; quantify the effects of genomic context, codon bias, chromatin accessibility, and number of protein-protein interactions on deleterious protein-coding SNVs; and identify pathways and networks that have likely been influenced by positive selection. Furthermore, we show that the fraction of deleterious nonsynonymous SNVs is significantly higher for Mendelian versus complex disease loci and in exons harboring dominant versus recessive Mendelian mutations. In summary, as genome-scale sequencing data accumulate in progressively larger sample sizes, our method will enable increasingly high-resolution inferences into the characteristics and determinants of non-neutral variation.
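
The site frequency spectrum that this approach is based on is simply a tally of how many variant sites carry the derived allele on exactly 1, 2, ..., n-1 of the n sampled chromosomes. The sketch below builds that tally from per-site derived allele counts. It illustrates only the input summary, not the estimation procedure of the paper; the function name `site_frequency_spectrum` and the toy counts are assumptions.

```python
def site_frequency_spectrum(derived_counts, n_chromosomes):
    """Unfolded SFS: entry i-1 is the number of sites with derived count i."""
    if n_chromosomes < 2:
        raise ValueError("need at least two chromosomes")
    sfs = [0] * (n_chromosomes - 1)
    for count in derived_counts:
        # Sites fixed for either allele (count 0 or n) are not segregating.
        if 0 < count < n_chromosomes:
            sfs[count - 1] += 1
    return sfs


# Toy data: derived allele counts at six sites in a sample of six chromosomes.
print(site_frequency_spectrum([1, 1, 2, 5, 0, 6], n_chromosomes=6))
```

Comparing the shape of this spectrum between classes of sites, for example nonsynonymous versus putatively neutral variants, is the usual starting point for SFS-based inference about selection.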

DOI

10.1101/gr.203059.115

iHS

Tool

PUBMED_LINK

16494531

FULL NAME

Integrated haplotype score

DESCRIPTION

Voight, B. F., Kudaravalli, S., Wen, X., & Pritchard, J. K. (2006). A map of recent positive selection in the human genome. PLoS Biology, 4(3), e72.

TITLE

A map of recent positive selection in the human genome.

Main citation

Voight BF, Kudaravalli S, Wen X, Pritchard JK. (2006) A map of recent positive selection in the human genome. PLoS Biol, 4 (3) e72. doi:10.1371/journal.pbio.0040072. PMID 16494531

ABSTRACT

The identification of signals of very recent positive selection provides information about the adaptation of modern humans to local conditions. We report here on a genome-wide scan for signals of very recent positive selection in favor of variants that have not yet reached fixation. We describe a new analytical method for scanning single nucleotide polymorphism (SNP) data for signals of recent selection, and apply this to data from the International HapMap Project. In all three continental groups we find widespread signals of recent positive selection. Most signals are region-specific, though a significant excess are shared across groups. Contrary to some earlier low resolution studies that suggested a paucity of recent selection in sub-Saharan Africans, we find that by some measures our strongest signals of selection are from the Yoruba population. Finally, since these signals indicate the existence of genetic variants that have substantially different fitnesses, they must indicate loci that are the source of significant phenotypic variation. Though the relevant phenotypes are generally not known, such loci should be of particular interest in mapping studies of complex traits. For this purpose we have developed a set of SNPs that can be used to tag the strongest approximately 250 signals of recent selection in each population.

DOI

10.1371/journal.pbio.0040072