Selection

Summary Table

| NAME | CITATION | YEAR |
| --- | --- | --- |
| CMS | Grossman SR, Shlyakhter I, Karlsson EK, Byrne EH, ...&, Sabeti PC. (2010) A composite of multiple signals distinguishes causal variants in regions of positive selection Science, 327 (5967) 883-886. doi:10.1126/science.1183863. PMID 20056855 | 2010 |
| EHH | Klassmann A, Gautier M. (2022) Detecting selection using extended haplotype homozygosity (EHH)-based statistics in unphased or unpolarized data PLoS One, 17 (1) e0262024. doi:10.1371/journal.pone.0262024. PMID 35041674 | 2022 |
| HWE | Wigginton JE, Cutler DJ, Abecasis GR. (2005) A note on exact tests of Hardy-Weinberg equilibrium Am. J. Hum. Genet., 76 (5) 887-893. doi:10.1086/429864. PMID 15789306 | 2005 |
| Review-Fst | Holsinger KE, Weir BS. (2009) Genetics in geographically structured populations: defining, estimating and interpreting F(ST) Nat. Rev. Genet., 10 (9) 639-650. doi:10.1038/nrg2611. PMID 19687804 | 2009 |
| SDS | Field Y, Boyle EA, Telis N, Gao Z, ...&, Pritchard JK. (2016) Detection of human adaptation during the past 2000 years Science, 354 (6313) 760-764. doi:10.1126/science.aag0776. PMID 27738015 | 2016 |
| XP-EHH | Klassmann A, Gautier M. (2022) Detecting selection using extended haplotype homozygosity (EHH)-based statistics in unphased or unpolarized data PLoS One, 17 (1) e0262024. doi:10.1371/journal.pone.0262024. PMID 35041674 | 2022 |
| f | Moon S, Akey JM. (2016) A flexible method for estimating the fraction of fitness influencing mutations from large sequencing data sets Genome Res., 26 (6) 834-843. doi:10.1101/gr.203059.115. PMID 27197222 | 2016 |
| iHS | Voight BF, Kudaravalli S, Wen X, Pritchard JK. (2006) A map of recent positive selection in the human genome PLoS Biol., 4 (3) e72. doi:10.1371/journal.pbio.0040072. PMID 16494531 | 2006 |

CMS

  • NAME : CMS
  • SHORT NAME : CMS
  • FULL NAME : Composite of multiple signals
  • TITLE : A composite of multiple signals distinguishes causal variants in regions of positive selection
  • DOI : 10.1126/science.1183863
  • ABSTRACT : The human genome contains hundreds of regions whose patterns of genetic variation indicate recent positive natural selection, yet for most the underlying gene and the advantageous mutation remain unknown. We developed a method, composite of multiple signals (CMS), that combines tests for multiple signals of selection and increases resolution by up to 100-fold. By applying CMS to candidate regions from the International Haplotype Map, we localized population-specific selective signals to 55 kilobases (median), identifying known and novel causal variants. CMS can not just identify individual loci but implicates precise variants selected by evolution.
  • CITATION : Grossman SR, Shlyakhter I, Karlsson EK, Byrne EH, ...&, Sabeti PC. (2010) A composite of multiple signals distinguishes causal variants in regions of positive selection Science, 327 (5967) 883-886. doi:10.1126/science.1183863. PMID 20056855
  • JOURNAL_INFO : Science ; Science ; 2010 ; 327 ; 5967 ; 883-886
  • PUBMED_LINK : 20056855

EHH

  • NAME : EHH
  • SHORT NAME : EHH
  • FULL NAME : Extended haplotype homozygosity
  • TITLE : Detecting selection using extended haplotype homozygosity (EHH)-based statistics in unphased or unpolarized data
  • DOI : 10.1371/journal.pone.0262024
  • ABSTRACT : Analysis of population genetic data often includes a search for genomic regions with signs of recent positive selection. One of such approaches involves the concept of extended haplotype homozygosity (EHH) and its associated statistics. These statistics typically require phased haplotypes, and some of them necessitate polarized variants. Here, we unify and extend previously proposed modifications to loosen these requirements. We compare the modified versions with the original ones by measuring the false discovery rate in simulated whole-genome scans and by quantifying the overlap of inferred candidate regions in empirical data. We find that phasing information is indispensable for accurate estimation of within-population statistics (for all but very large samples) and of cross-population statistics for small samples. Ancestry information, in contrast, is of lesser importance for both types of statistic. Our publicly available R package rehh incorporates the modified statistics presented here.
  • CITATION : Klassmann A, Gautier M. (2022) Detecting selection using extended haplotype homozygosity (EHH)-based statistics in unphased or unpolarized data PLoS One, 17 (1) e0262024. doi:10.1371/journal.pone.0262024. PMID 35041674
  • JOURNAL_INFO : PloS one ; PLoS One ; 2022 ; 17 ; 1 ; e0262024
  • PUBMED_LINK : 35041674
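
As a rough illustration of the EHH concept named in the abstract, the sketch below computes the classic statistic for phased haplotypes: among the chromosomes carrying a given core allele, the fraction of pairs that are identical at every site between the core and a chosen marker. The function name `ehh`, its arguments, and the toy haplotype strings are assumptions made for this example. It is not the rehh implementation and does not cover the unphased or unpolarized variants the paper addresses.

```python
from collections import Counter
from math import comb, nan


def ehh(haplotypes, core, site, allele):
    """Classic EHH for phased haplotypes (illustrative sketch only).

    haplotypes: sequences of alleles, one per chromosome (hypothetical input format)
    core:       index of the core (focal) site
    site:       index of the marker out to which homozygosity is measured
    allele:     core allele whose carriers are examined
    """
    # Keep only chromosomes that carry the requested allele at the core site.
    carriers = [h for h in haplotypes if h[core] == allele]
    n = len(carriers)
    if n < 2:
        return nan  # no pairs of chromosomes to compare
    lo, hi = sorted((core, site))
    # Group carriers by their full haplotype between core and marker (inclusive).
    counts = Counter(tuple(h[lo:hi + 1]) for h in carriers)
    # Fraction of chromosome pairs that share an identical extended haplotype.
    return sum(comb(c, 2) for c in counts.values()) / comb(n, 2)


# Toy data: four phased chromosomes, four biallelic sites coded "0"/"1".
toy = ["0110", "0111", "0100", "1010"]
decay = [ehh(toy, core=1, site=s, allele="1") for s in range(4)]
```

In this toy example EHH equals 1 at the core itself and falls as the marker moves away from it, which is the decay pattern that EHH-based scans summarize.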

HWE

  • NAME : HWE
  • SHORT NAME : HWE
  • FULL NAME : Exact Tests of Hardy-Weinberg Equilibrium
  • TITLE : A note on exact tests of Hardy-Weinberg equilibrium
  • DOI : 10.1086/429864
  • ABSTRACT : Deviations from Hardy-Weinberg equilibrium (HWE) can indicate inbreeding, population stratification, and even problems in genotyping. In samples of affected individuals, these deviations can also provide evidence for association. Tests of HWE are commonly performed using a simple χ² goodness-of-fit test. We show that this χ² test can have inflated type I error rates, even in relatively large samples (e.g., samples of 1,000 individuals that include approximately 100 copies of the minor allele). On the basis of previous work, we describe exact tests of HWE together with efficient computational methods for their implementation. Our methods adequately control type I error in large and small samples and are computationally efficient. They have been implemented in freely available code that will be useful for quality assessment of genotype data and for the detection of genetic association or population stratification in very large data sets.
  • COPYRIGHT : https://www.elsevier.com/open-access/userlicense/1.0/
  • CITATION : Wigginton JE, Cutler DJ, Abecasis GR. (2005) A note on exact tests of Hardy-Weinberg equilibrium Am. J. Hum. Genet., 76 (5) 887-893. doi:10.1086/429864. PMID 15789306
  • JOURNAL_INFO : The American Journal of Human Genetics ; Am. J. Hum. Genet. ; 2005 ; 76 ; 5 ; 887-893
  • PUBMED_LINK : 15789306
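
To make the idea of an exact test concrete, the following is a minimal sketch for a single biallelic marker. It enumerates every heterozygote count compatible with the observed allele counts, builds their relative probabilities under HWE with a simple recurrence, and sums the probabilities of all outcomes no more likely than the observed one (one common convention for the p-value). The function name `hwe_exact_pvalue` and its argument order are assumptions for this example. This is not the authors' released code.

```python
def hwe_exact_pvalue(n_aa, n_ab, n_bb):
    """Exact HWE p-value for one biallelic marker (illustrative sketch only)."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        return 1.0
    n_rare = 2 * min(n_aa, n_bb) + n_ab  # copies of the rarer allele

    # Start near the expected heterozygote count; its parity must match n_rare.
    mid = n_rare * (2 * n - n_rare) // (2 * n)
    if mid % 2 != n_rare % 2:
        mid += 1
    probs = {mid: 1.0}  # unnormalized probabilities keyed by heterozygote count

    # Walk down: each step trades two heterozygotes for one of each homozygote.
    het, hom_r = mid, (n_rare - mid) // 2
    hom_c = n - het - hom_r
    while het >= 2:
        probs[het - 2] = probs[het] * het * (het - 1) / (4.0 * (hom_r + 1) * (hom_c + 1))
        het, hom_r, hom_c = het - 2, hom_r + 1, hom_c + 1

    # Walk up: each step trades one of each homozygote for two heterozygotes.
    het, hom_r = mid, (n_rare - mid) // 2
    hom_c = n - het - hom_r
    while het <= n_rare - 2:
        probs[het + 2] = probs[het] * 4.0 * hom_r * hom_c / ((het + 2) * (het + 1))
        het, hom_r, hom_c = het + 2, hom_r - 1, hom_c - 1

    total = sum(probs.values())
    p_obs = probs[n_ab]
    # Sum all outcomes that are no more probable than the observed one.
    tail = sum(p for p in probs.values() if p <= p_obs * (1 + 1e-9))
    return min(1.0, tail / total)
```

For example, `hwe_exact_pvalue(1, 0, 1)` covers a sample of two homozygotes of opposite type. Only two configurations are possible (zero or two heterozygotes), and the function returns the probability of the less likely one, 1/3.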

Review-Fst

  • NAME : Review-Fst
  • TITLE : Genetics in geographically structured populations: defining, estimating and interpreting F(ST)
  • DOI : 10.1038/nrg2611
  • ABSTRACT : Wright's F-statistics, and especially F(ST), provide important insights into the evolutionary processes that influence the structure of genetic variation within and among populations, and they are among the most widely used descriptive statistics in population and evolutionary genetics. Estimates of F(ST) can identify regions of the genome that have been the target of selection, and comparisons of F(ST) from different parts of the genome can provide insights into the demographic history of populations. For these reasons and others, F(ST) has a central role in population and evolutionary genetics and has wide applications in fields that range from disease association mapping to forensic science. This Review clarifies how F(ST) is defined, how it should be estimated, how it is related to similar statistics and how estimates of F(ST) should be interpreted.
  • CITATION : Holsinger KE, Weir BS. (2009) Genetics in geographically structured populations: defining, estimating and interpreting F(ST) Nat. Rev. Genet., 10 (9) 639-650. doi:10.1038/nrg2611. PMID 19687804
  • JOURNAL_INFO : Nature reviews. Genetics ; Nat. Rev. Genet. ; 2009 ; 10 ; 9 ; 639-650
  • PUBMED_LINK : 19687804
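
As a small worked illustration of what F(ST) measures, the sketch below applies the basic heterozygosity-based definition, F(ST) = (H_T - H_S) / H_T, to one biallelic locus in two populations. It assumes the allele frequencies are known exactly and weights the populations equally; the function name `fst_two_pops` is hypothetical. It is a definition-level example, not one of the sample-based estimators that a treatment of F(ST) estimation would cover.

```python
from math import nan


def fst_two_pops(p1, p2):
    """Wright's F(ST) for one biallelic locus, two equally weighted populations.

    p1, p2: frequency of the same allele in population 1 and population 2.
    Assumes known frequencies (no sampling correction): illustrative sketch only.
    """
    p_bar = (p1 + p2) / 2.0
    h_t = 2.0 * p_bar * (1.0 - p_bar)                            # expected heterozygosity, pooled
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0  # mean within-population value
    if h_t == 0.0:
        return nan  # locus is monomorphic overall; F(ST) is undefined
    return (h_t - h_s) / h_t
```

Identical frequencies give 0 (no differentiation), and populations fixed for alternative alleles give 1 (complete differentiation), for instance `fst_two_pops(0.0, 1.0)`.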

SDS

  • NAME : SDS
  • SHORT NAME : SDS
  • FULL NAME : Singleton density score
  • DESCRIPTION : SDS is a method to infer very recent changes in allele frequencies from contemporary genome sequences
  • URL : https://github.com/yairf/SDS
  • KEYWORDS : singleton, recent selection
  • TITLE : Detection of human adaptation during the past 2000 years
  • DOI : 10.1126/science.aag0776
  • ABSTRACT : Detection of recent natural selection is a challenging problem in population genetics. Here we introduce the singleton density score (SDS), a method to infer very recent changes in allele frequencies from contemporary genome sequences. Applied to data from the UK10K Project, SDS reflects allele frequency changes in the ancestors of modern Britons during the past ~2000 to 3000 years. We see strong signals of selection at lactase and the major histocompatibility complex, and in favor of blond hair and blue eyes. For polygenic adaptation, we find that recent selection for increased height has driven allele frequency shifts across most of the genome. Moreover, we identify shifts associated with other complex traits, suggesting that polygenic adaptation has played a pervasive role in shaping genotypic and phenotypic variation in modern humans.
  • CITATION : Field Y, Boyle EA, Telis N, Gao Z, ...&, Pritchard JK. (2016) Detection of human adaptation during the past 2000 years Science, 354 (6313) 760-764. doi:10.1126/science.aag0776. PMID 27738015
  • JOURNAL_INFO : Science ; Science ; 2016 ; 354 ; 6313 ; 760-764
  • PUBMED_LINK : 27738015

XP-EHH

  • NAME : XP-EHH
  • SHORT NAME : XP-EHH
  • FULL NAME : Cross-population extended haplotype homozygosity
  • TITLE : Detecting selection using extended haplotype homozygosity (EHH)-based statistics in unphased or unpolarized data
  • DOI : 10.1371/journal.pone.0262024
  • ABSTRACT : Analysis of population genetic data often includes a search for genomic regions with signs of recent positive selection. One of such approaches involves the concept of extended haplotype homozygosity (EHH) and its associated statistics. These statistics typically require phased haplotypes, and some of them necessitate polarized variants. Here, we unify and extend previously proposed modifications to loosen these requirements. We compare the modified versions with the original ones by measuring the false discovery rate in simulated whole-genome scans and by quantifying the overlap of inferred candidate regions in empirical data. We find that phasing information is indispensable for accurate estimation of within-population statistics (for all but very large samples) and of cross-population statistics for small samples. Ancestry information, in contrast, is of lesser importance for both types of statistic. Our publicly available R package rehh incorporates the modified statistics presented here.
  • CITATION : Klassmann A, Gautier M. (2022) Detecting selection using extended haplotype homozygosity (EHH)-based statistics in unphased or unpolarized data PLoS One, 17 (1) e0262024. doi:10.1371/journal.pone.0262024. PMID 35041674
  • JOURNAL_INFO : PloS one ; PLoS One ; 2022 ; 17 ; 1 ; e0262024
  • PUBMED_LINK : 35041674

f

  • NAME : f
  • SHORT NAME : f
  • FULL NAME : Fraction of sites under selection
  • TITLE : A flexible method for estimating the fraction of fitness influencing mutations from large sequencing data sets
  • DOI : 10.1101/gr.203059.115
  • ABSTRACT : A continuing challenge in the analysis of massively large sequencing data sets is quantifying and interpreting non-neutrally evolving mutations. Here, we describe a flexible and robust approach based on the site frequency spectrum to estimate the fraction of deleterious and adaptive variants from large-scale sequencing data sets. We applied our method to approximately 1 million single nucleotide variants (SNVs) identified in high-coverage exome sequences of 6515 individuals. We estimate that the fraction of deleterious nonsynonymous SNVs is higher than previously reported; quantify the effects of genomic context, codon bias, chromatin accessibility, and number of protein-protein interactions on deleterious protein-coding SNVs; and identify pathways and networks that have likely been influenced by positive selection. Furthermore, we show that the fraction of deleterious nonsynonymous SNVs is significantly higher for Mendelian versus complex disease loci and in exons harboring dominant versus recessive Mendelian mutations. In summary, as genome-scale sequencing data accumulate in progressively larger sample sizes, our method will enable increasingly high-resolution inferences into the characteristics and determinants of non-neutral variation.
  • CITATION : Moon S, Akey JM. (2016) A flexible method for estimating the fraction of fitness influencing mutations from large sequencing data sets Genome Res., 26 (6) 834-843. doi:10.1101/gr.203059.115. PMID 27197222
  • JOURNAL_INFO : Genome research ; Genome Res. ; 2016 ; 26 ; 6 ; 834-843
  • PUBMED_LINK : 27197222

iHS

  • NAME : iHS
  • SHORT NAME : iHS
  • FULL NAME : Integrated haplotype score
  • TITLE : A map of recent positive selection in the human genome
  • DOI : 10.1371/journal.pbio.0040072
  • ABSTRACT : The identification of signals of very recent positive selection provides information about the adaptation of modern humans to local conditions. We report here on a genome-wide scan for signals of very recent positive selection in favor of variants that have not yet reached fixation. We describe a new analytical method for scanning single nucleotide polymorphism (SNP) data for signals of recent selection, and apply this to data from the International HapMap Project. In all three continental groups we find widespread signals of recent positive selection. Most signals are region-specific, though a significant excess are shared across groups. Contrary to some earlier low resolution studies that suggested a paucity of recent selection in sub-Saharan Africans, we find that by some measures our strongest signals of selection are from the Yoruba population. Finally, since these signals indicate the existence of genetic variants that have substantially different fitnesses, they must indicate loci that are the source of significant phenotypic variation. Though the relevant phenotypes are generally not known, such loci should be of particular interest in mapping studies of complex traits. For this purpose we have developed a set of SNPs that can be used to tag the strongest approximately 250 signals of recent selection in each population.
  • CITATION : Voight BF, Kudaravalli S, Wen X, Pritchard JK. (2006) A map of recent positive selection in the human genome PLoS Biol., 4 (3) e72. doi:10.1371/journal.pbio.0040072. PMID 16494531
  • JOURNAL_INFO : PLoS biology ; PLoS Biol. ; 2006 ; 4 ; 3 ; e72
  • PUBMED_LINK : 16494531