Selection

Summary Table

NAME	CITATION	YEAR
CMS	Grossman SR, Shlyakhter I, Karlsson EK, Byrne EH, ... & Sabeti PC. (2010) A composite of multiple signals distinguishes causal variants in regions of positive selection Science, 327 (5967) 883-886. doi:10.1126/science.1183863. PMID 20056855	2010
EHH	Klassmann A, Gautier M. (2022) Detecting selection using extended haplotype homozygosity (EHH)-based statistics in unphased or unpolarized data PLoS One, 17 (1) e0262024. doi:10.1371/journal.pone.0262024. PMID 35041674	2022
HWE	Wigginton JE, Cutler DJ, Abecasis GR. (2005) A note on exact tests of Hardy-Weinberg equilibrium Am. J. Hum. Genet., 76 (5) 887-893. doi:10.1086/429864. PMID 15789306	2005
Review-Fst	Holsinger KE, Weir BS. (2009) Genetics in geographically structured populations: defining, estimating and interpreting F(ST) Nat. Rev. Genet., 10 (9) 639-650. doi:10.1038/nrg2611. PMID 19687804	2009
SDS	Field Y, Boyle EA, Telis N, Gao Z, ... & Pritchard JK. (2016) Detection of human adaptation during the past 2000 years Science, 354 (6313) 760-764. doi:10.1126/science.aag0776. PMID 27738015	2016
XP-EHH	Klassmann A, Gautier M. (2022) Detecting selection using extended haplotype homozygosity (EHH)-based statistics in unphased or unpolarized data PLoS One, 17 (1) e0262024. doi:10.1371/journal.pone.0262024. PMID 35041674	2022
f	Moon S, Akey JM. (2016) A flexible method for estimating the fraction of fitness influencing mutations from large sequencing data sets Genome Res., 26 (6) 834-843. doi:10.1101/gr.203059.115. PMID 27197222	2016
iHS	Voight BF, Kudaravalli S, Wen X, Pritchard JK. (2006) A map of recent positive selection in the human genome PLoS Biol., 4 (3) e72. doi:10.1371/journal.pbio.0040072. PMID 16494531	2006

CMS

NAME : CMS
SHORT NAME : CMS
FULL NAME : Composite of multiple signals
TITLE : A composite of multiple signals distinguishes causal variants in regions of positive selection
DOI : 10.1126/science.1183863
ABSTRACT : The human genome contains hundreds of regions whose patterns of genetic variation indicate recent positive natural selection, yet for most the underlying gene and the advantageous mutation remain unknown. We developed a method, composite of multiple signals (CMS), that combines tests for multiple signals of selection and increases resolution by up to 100-fold. By applying CMS to candidate regions from the International Haplotype Map, we localized population-specific selective signals to 55 kilobases (median), identifying known and novel causal variants. CMS can not just identify individual loci but implicates precise variants selected by evolution.
CITATION : Grossman SR, Shlyakhter I, Karlsson EK, Byrne EH, ... & Sabeti PC. (2010) A composite of multiple signals distinguishes causal variants in regions of positive selection Science, 327 (5967) 883-886. doi:10.1126/science.1183863. PMID 20056855
JOURNAL_INFO : Science ; Science ; 2010 ; 327 ; 5967 ; 883-886
PUBMED_LINK : 20056855

EHH

NAME : EHH
SHORT NAME : EHH
FULL NAME : Extended haplotype homozygosity
TITLE : Detecting selection using extended haplotype homozygosity (EHH)-based statistics in unphased or unpolarized data
DOI : 10.1371/journal.pone.0262024
ABSTRACT : Analysis of population genetic data often includes a search for genomic regions with signs of recent positive selection. One of such approaches involves the concept of extended haplotype homozygosity (EHH) and its associated statistics. These statistics typically require phased haplotypes, and some of them necessitate polarized variants. Here, we unify and extend previously proposed modifications to loosen these requirements. We compare the modified versions with the original ones by measuring the false discovery rate in simulated whole-genome scans and by quantifying the overlap of inferred candidate regions in empirical data. We find that phasing information is indispensable for accurate estimation of within-population statistics (for all but very large samples) and of cross-population statistics for small samples. Ancestry information, in contrast, is of lesser importance for both types of statistic. Our publicly available R package rehh incorporates the modified statistics presented here.
CITATION : Klassmann A, Gautier M. (2022) Detecting selection using extended haplotype homozygosity (EHH)-based statistics in unphased or unpolarized data PLoS One, 17 (1) e0262024. doi:10.1371/journal.pone.0262024. PMID 35041674
JOURNAL_INFO : PloS one ; PLoS One ; 2022 ; 17 ; 1 ; e0262024
PUBMED_LINK : 35041674
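
To make the EHH concept in the abstract above concrete, here is a minimal Python sketch of the classic phased, allele-specific form of the statistic: among haplotypes carrying a given core allele, it is the fraction of haplotype pairs that remain identical from the core out to a chosen marker. The function name, argument names, and the 0/1 tuple encoding are assumptions made for illustration; this is not the rehh implementation and does not cover the unphased or unpolarized variants discussed in the paper.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core, allele, upto):
    """EHH of `allele` at marker index `core`, evaluated out to index `upto`.

    haplotypes: sequence of equal-length tuples of 0/1 alleles (phased data).
    Returns the proportion of carrier pairs that are identical on every
    marker between `core` and `upto` (inclusive).
    """
    carriers = [h for h in haplotypes if h[core] == allele]
    n = len(carriers)
    if n < 2:
        return float("nan")  # homozygosity is undefined for fewer than 2 carriers
    lo, hi = sorted((core, upto))
    # Group carriers by their extended haplotype over the interval.
    groups = Counter(tuple(h[lo:hi + 1]) for h in carriers)
    identical_pairs = sum(comb(count, 2) for count in groups.values())
    return identical_pairs / comb(n, 2)


# Hypothetical toy data: five haplotypes, three markers, core at index 0.
toy = [(1, 0, 0), (1, 0, 1), (1, 0, 1), (1, 1, 0), (0, 0, 0)]
decay = [ehh(toy, core=0, allele=1, upto=j) for j in range(3)]
```

In the toy data the value starts at 1 at the core and decays as markers farther away split the carriers into more distinct haplotypes.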

HWE

NAME : HWE
SHORT NAME : HWE
FULL NAME : Exact tests of Hardy-Weinberg equilibrium
TITLE : A note on exact tests of Hardy-Weinberg equilibrium
DOI : 10.1086/429864
ABSTRACT : Deviations from Hardy-Weinberg equilibrium (HWE) can indicate inbreeding, population stratification, and even problems in genotyping. In samples of affected individuals, these deviations can also provide evidence for association. Tests of HWE are commonly performed using a simple χ² goodness-of-fit test. We show that this χ² test can have inflated type I error rates, even in relatively large samples (e.g., samples of 1,000 individuals that include approximately 100 copies of the minor allele). On the basis of previous work, we describe exact tests of HWE together with efficient computational methods for their implementation. Our methods adequately control type I error in large and small samples and are computationally efficient. They have been implemented in freely available code that will be useful for quality assessment of genotype data and for the detection of genetic association or population stratification in very large data sets.
COPYRIGHT : https://www.elsevier.com/open-access/userlicense/1.0/
CITATION : Wigginton JE, Cutler DJ, Abecasis GR. (2005) A note on exact tests of Hardy-Weinberg equilibrium Am. J. Hum. Genet., 76 (5) 887-893. doi:10.1086/429864. PMID 15789306
JOURNAL_INFO : The American Journal of Human Genetics ; Am. J. Hum. Genet. ; 2005 ; 76 ; 5 ; 887-893
PUBMED_LINK : 15789306
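
As a rough illustration of the kind of exact test the abstract above describes, the Python sketch below enumerates every heterozygote count compatible with the observed allele counts, builds their relative probabilities with a recurrence starting near the most likely count, and sums the probabilities of all outcomes no more likely than the observed one. The function name and argument order are assumptions for illustration; this is a sketch for a biallelic marker, not the authors' released code.

```python
def hwe_exact_pvalue(n_het, n_hom1, n_hom2):
    """Two-sided exact HWE p-value for a biallelic marker (genotype counts)."""
    n_hom_rare = min(n_hom1, n_hom2)
    n_hom_common = max(n_hom1, n_hom2)
    n = n_het + n_hom_rare + n_hom_common      # individuals
    if n == 0:
        return 1.0
    n_rare = 2 * n_hom_rare + n_het            # copies of the rarer allele

    probs = [0.0] * (n_rare + 1)
    # Start near the expected heterozygote count, with the same parity as n_rare.
    mid = n_rare * (2 * n - n_rare) // (2 * n)
    if mid % 2 != n_rare % 2:
        mid += 1
    probs[mid] = 1.0
    total = 1.0

    # Walk down to fewer heterozygotes (each step turns 2 hets into 1 + 1 homs).
    het, hom_r = mid, (n_rare - mid) // 2
    hom_c = n - het - hom_r
    while het > 1:
        probs[het - 2] = probs[het] * het * (het - 1) / (4.0 * (hom_r + 1) * (hom_c + 1))
        total += probs[het - 2]
        het, hom_r, hom_c = het - 2, hom_r + 1, hom_c + 1

    # Walk up to more heterozygotes.
    het, hom_r = mid, (n_rare - mid) // 2
    hom_c = n - het - hom_r
    while het <= n_rare - 2:
        probs[het + 2] = probs[het] * 4.0 * hom_r * hom_c / ((het + 2.0) * (het + 1.0))
        total += probs[het + 2]
        het, hom_r, hom_c = het + 2, hom_r - 1, hom_c - 1

    # Sum outcomes at most as likely as the observed one (small float tolerance).
    threshold = probs[n_het] * (1.0 + 1e-9)
    p_value = sum(p for p in probs if p <= threshold) / total
    return min(1.0, p_value)
```

For two individuals with genotypes AA and BB, the only alternative arrangement of the same alleles is two heterozygotes, which is twice as likely, so the sketch returns 1/3 for `hwe_exact_pvalue(0, 1, 1)`.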

Review-Fst

NAME : Review-Fst
TITLE : Genetics in geographically structured populations: defining, estimating and interpreting F(ST)
DOI : 10.1038/nrg2611
ABSTRACT : Wright's F-statistics, and especially F(ST), provide important insights into the evolutionary processes that influence the structure of genetic variation within and among populations, and they are among the most widely used descriptive statistics in population and evolutionary genetics. Estimates of F(ST) can identify regions of the genome that have been the target of selection, and comparisons of F(ST) from different parts of the genome can provide insights into the demographic history of populations. For these reasons and others, F(ST) has a central role in population and evolutionary genetics and has wide applications in fields that range from disease association mapping to forensic science. This Review clarifies how F(ST) is defined, how it should be estimated, how it is related to similar statistics and how estimates of F(ST) should be interpreted.
CITATION : Holsinger KE, Weir BS. (2009) Genetics in geographically structured populations: defining, estimating and interpreting F(ST) Nat. Rev. Genet., 10 (9) 639-650. doi:10.1038/nrg2611. PMID 19687804
JOURNAL_INFO : Nature reviews. Genetics ; Nat. Rev. Genet. ; 2009 ; 10 ; 9 ; 639-650
PUBMED_LINK : 19687804
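
For orientation, the simplest textbook reading of F(ST) at a biallelic site is the proportional reduction in expected heterozygosity within subpopulations relative to the pooled population. The Python sketch below computes that quantity from per-population allele frequencies. The function name is hypothetical, populations are weighted equally, and no sample-size correction is applied, so this is a definitional illustration under those assumptions rather than one of the estimators the Review recommends.

```python
def fst_from_frequencies(freqs):
    """F_ST = (H_T - H_S) / H_T for one biallelic site.

    freqs: allele frequency of the same allele in each subpopulation.
    Populations are weighted equally; finite-sample bias is ignored.
    """
    k = len(freqs)
    p_bar = sum(freqs) / k
    h_total = 2.0 * p_bar * (1.0 - p_bar)              # expected heterozygosity, pooled
    if h_total == 0.0:
        return float("nan")                            # site is monomorphic overall
    h_within = sum(2.0 * p * (1.0 - p) for p in freqs) / k  # mean within-population
    return (h_total - h_within) / h_total


identical = fst_from_frequencies([0.5, 0.5])   # no differentiation
fixed = fst_from_frequencies([0.0, 1.0])       # alternate alleles fixed
```

Identical frequencies give 0 and fixed differences give 1, which matches the usual interpretation of the statistic as a measure of differentiation among populations.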

SDS

NAME : SDS
SHORT NAME : SDS
FULL NAME : Singleton density score
DESCRIPTION : SDS is a method to infer very recent changes in allele frequencies from contemporary genome sequences
URL : https://github.com/yairf/SDS
KEYWORDS : singleton, recent selection
TITLE : Detection of human adaptation during the past 2000 years
DOI : 10.1126/science.aag0776
ABSTRACT : Detection of recent natural selection is a challenging problem in population genetics. Here we introduce the singleton density score (SDS), a method to infer very recent changes in allele frequencies from contemporary genome sequences. Applied to data from the UK10K Project, SDS reflects allele frequency changes in the ancestors of modern Britons during the past ~2000 to 3000 years. We see strong signals of selection at lactase and the major histocompatibility complex, and in favor of blond hair and blue eyes. For polygenic adaptation, we find that recent selection for increased height has driven allele frequency shifts across most of the genome. Moreover, we identify shifts associated with other complex traits, suggesting that polygenic adaptation has played a pervasive role in shaping genotypic and phenotypic variation in modern humans.
CITATION : Field Y, Boyle EA, Telis N, Gao Z, ... & Pritchard JK. (2016) Detection of human adaptation during the past 2000 years Science, 354 (6313) 760-764. doi:10.1126/science.aag0776. PMID 27738015
JOURNAL_INFO : Science ; Science ; 2016 ; 354 ; 6313 ; 760-764
PUBMED_LINK : 27738015

XP-EHH

NAME : XP-EHH
SHORT NAME : XP-EHH
FULL NAME : Cross-population extended haplotype homozygosity
TITLE : Detecting selection using extended haplotype homozygosity (EHH)-based statistics in unphased or unpolarized data
DOI : 10.1371/journal.pone.0262024
ABSTRACT : Analysis of population genetic data often includes a search for genomic regions with signs of recent positive selection. One of such approaches involves the concept of extended haplotype homozygosity (EHH) and its associated statistics. These statistics typically require phased haplotypes, and some of them necessitate polarized variants. Here, we unify and extend previously proposed modifications to loosen these requirements. We compare the modified versions with the original ones by measuring the false discovery rate in simulated whole-genome scans and by quantifying the overlap of inferred candidate regions in empirical data. We find that phasing information is indispensable for accurate estimation of within-population statistics (for all but very large samples) and of cross-population statistics for small samples. Ancestry information, in contrast, is of lesser importance for both types of statistic. Our publicly available R package rehh incorporates the modified statistics presented here.
CITATION : Klassmann A, Gautier M. (2022) Detecting selection using extended haplotype homozygosity (EHH)-based statistics in unphased or unpolarized data PLoS One, 17 (1) e0262024. doi:10.1371/journal.pone.0262024. PMID 35041674
JOURNAL_INFO : PloS one ; PLoS One ; 2022 ; 17 ; 1 ; e0262024
PUBMED_LINK : 35041674

f

NAME : f
SHORT NAME : f
FULL NAME : Fraction of sites under selection
TITLE : A flexible method for estimating the fraction of fitness influencing mutations from large sequencing data sets
DOI : 10.1101/gr.203059.115
ABSTRACT : A continuing challenge in the analysis of massively large sequencing data sets is quantifying and interpreting non-neutrally evolving mutations. Here, we describe a flexible and robust approach based on the site frequency spectrum to estimate the fraction of deleterious and adaptive variants from large-scale sequencing data sets. We applied our method to approximately 1 million single nucleotide variants (SNVs) identified in high-coverage exome sequences of 6515 individuals. We estimate that the fraction of deleterious nonsynonymous SNVs is higher than previously reported; quantify the effects of genomic context, codon bias, chromatin accessibility, and number of protein-protein interactions on deleterious protein-coding SNVs; and identify pathways and networks that have likely been influenced by positive selection. Furthermore, we show that the fraction of deleterious nonsynonymous SNVs is significantly higher for Mendelian versus complex disease loci and in exons harboring dominant versus recessive Mendelian mutations. In summary, as genome-scale sequencing data accumulate in progressively larger sample sizes, our method will enable increasingly high-resolution inferences into the characteristics and determinants of non-neutral variation.
CITATION : Moon S, Akey JM. (2016) A flexible method for estimating the fraction of fitness influencing mutations from large sequencing data sets Genome Res., 26 (6) 834-843. doi:10.1101/gr.203059.115. PMID 27197222
JOURNAL_INFO : Genome research ; Genome Res. ; 2016 ; 26 ; 6 ; 834-843
PUBMED_LINK : 27197222

iHS

NAME : iHS
SHORT NAME : iHS
FULL NAME : Integrated haplotype score
TITLE : A map of recent positive selection in the human genome
DOI : 10.1371/journal.pbio.0040072
ABSTRACT : The identification of signals of very recent positive selection provides information about the adaptation of modern humans to local conditions. We report here on a genome-wide scan for signals of very recent positive selection in favor of variants that have not yet reached fixation. We describe a new analytical method for scanning single nucleotide polymorphism (SNP) data for signals of recent selection, and apply this to data from the International HapMap Project. In all three continental groups we find widespread signals of recent positive selection. Most signals are region-specific, though a significant excess are shared across groups. Contrary to some earlier low resolution studies that suggested a paucity of recent selection in sub-Saharan Africans, we find that by some measures our strongest signals of selection are from the Yoruba population. Finally, since these signals indicate the existence of genetic variants that have substantially different fitnesses, they must indicate loci that are the source of significant phenotypic variation. Though the relevant phenotypes are generally not known, such loci should be of particular interest in mapping studies of complex traits. For this purpose we have developed a set of SNPs that can be used to tag the strongest approximately 250 signals of recent selection in each population.
CITATION : Voight BF, Kudaravalli S, Wen X, Pritchard JK. (2006) A map of recent positive selection in the human genome PLoS Biol., 4 (3) e72. doi:10.1371/journal.pbio.0040072. PMID 16494531
JOURNAL_INFO : PLoS biology ; PLoS Biol. ; 2006 ; 4 ; 3 ; e72
PUBMED_LINK : 16494531
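
The abstract above does not spell out the statistic, so the following Python sketch summarizes the usual recipe under stated assumptions: integrate the EHH decay curve of each allele to get an integrated haplotype homozygosity (iHH), take the log ratio of the ancestral to the derived value, and standardize that ratio among SNPs with similar derived allele frequency. All function names are hypothetical, the EHH curves are taken as precomputed inputs, and the truncation rule (stop at the first EHH value below 0.05) is a simplification.

```python
import math
from statistics import mean, stdev


def integrate_ehh(positions, ehh_values, cutoff=0.05):
    """Trapezoidal area under one side of an EHH curve.

    positions and ehh_values run outward from the core SNP. Integration
    stops at the first point whose EHH falls below `cutoff`. A full iHH
    would add the areas from both sides of the core.
    """
    area = 0.0
    for i in range(1, len(positions)):
        if ehh_values[i] < cutoff:
            break
        width = abs(positions[i] - positions[i - 1])
        area += 0.5 * (ehh_values[i] + ehh_values[i - 1]) * width
    return area


def unstandardized_ihs(ihh_ancestral, ihh_derived):
    """Log ratio of integrated haplotype homozygosity, ancestral over derived."""
    return math.log(ihh_ancestral / ihh_derived)


def standardize(scores):
    """Z-scores for SNPs that fall in one derived-allele-frequency bin."""
    mu, sd = mean(scores), stdev(scores)
    return [(s - mu) / sd for s in scores]


# Hypothetical EHH curve sampled at four positions on one side of the core.
ihh_one_side = integrate_ehh([0, 1, 2, 3], [1.0, 0.5, 0.25, 0.0])
```

With this sign convention, strongly negative scores indicate unusually long haplotypes on the derived allele and strongly positive scores indicate long haplotypes on the ancestral allele.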