1000 Genomes

Catalog entries using this tag (links open the entry card on its page):

Entries

HapMap Phase I

STAGE_PERIOD

2003–2005

DESCRIPTION

International HapMap Project first data release: ~1 million SNPs in CEU, YRI, and JPT+CHB; produced the first genome-wide LD and recombination maps and drove early GWAS SNP selection and imputation panels.

URL

https://www.genome.gov/10001688/international-hapmap-project

HapMap Phase II

1000 Genomes HapMap Reference

STAGE_PERIOD

2005–2007

DESCRIPTION

Expanded SNP density (~3.1M SNPs) and haplotype structure across the same core panels; improved tagging coverage and supported finer-scale association and phasing workflows before large-scale resequencing.

URL

https://www.genome.gov/10001688/international-hapmap-project

HapMap Phase III

1000 Genomes HapMap Reference

STAGE_PERIOD

2007–2009

DESCRIPTION

Extended to 11 populations and ~1.6M SNPs; broader ancestry representation and LD maps that informed the design and early phases of the 1000 Genomes Project.

URL

https://www.genome.gov/10001688/international-hapmap-project

High-coverage on GRCh38 (NYGC)

1000 Genomes WGS

STAGE_PERIOD

2019–

DESCRIPTION

High-coverage (~30x) whole-genome sequencing on GRCh38 of the 2,504 Phase 3 samples plus 698 related samples (3,202 in total); improves rare-variant discovery, phasing, and structural-variant catalogs while staying aligned with the 1000 Genomes sample framework.

URL

https://www.internationalgenome.org/

Phase 1

1000 Genomes Reference

STAGE_PERIOD

2010–2011

DESCRIPTION

Expanded low-coverage WGS (1,092 individuals) with exome capture and dense SNP genotyping; primary SNP and indel reference for early imputation panels.

URL

https://www.internationalgenome.org/

Phase 3

1000 Genomes Reference

STAGE_PERIOD

2012–2015

DESCRIPTION

2,504 individuals across 26 populations; GRCh37/38 VCF releases became the standard allele-frequency, LD, and imputation backbone for GWAS and SV pipelines.

URL

https://www.internationalgenome.org/

Pilot

1000 Genomes Reference

STAGE_PERIOD

2008–2010

DESCRIPTION

Proof-of-concept low-coverage whole-genome sequencing and SNP arrays across multiple populations; established protocols and data model for the main project.

URL

https://www.internationalgenome.org/

b37

Reference 1000 Genomes WGS

FULL NAME

Broad Institute Homo_sapiens_assembly19 (b37)

DESCRIPTION

GRCh37-compatible reference FASTA used across Broad Institute and 1000 Genomes workflows: chromosomes 1-22, X, Y, MT, plus GL unlocalized and unplaced contigs and the NC_007605 (EBV) sequence, as in the distributed assembly19 package. Its coordinate system matches the 1KG/b37 ecosystem used by many GWAS imputation and joint-calling pipelines.

URL

https://data.broadinstitute.org/snowman/hg19/

KEYWORDS

GRCh37; 1000 Genomes; Broad; b37; reference FASTA

Main citation

Broad Institute / 1000 Genomes Project. Homo_sapiens_assembly19.fasta (b37). https://data.broadinstitute.org/snowman/hg19/

hs37d5

Reference 1000 Genomes WGS

FULL NAME

1000 Genomes GRCh37 + decoy (hs37d5)

DESCRIPTION

GRCh37 (b37-style) primary chromosomes and contigs plus the hs37d5 decoy sequence set (HuRef/BAC/Fosmid/NA12878-derived sequences) to reduce spurious alignments in short-read mapping. Standard reference for Phase 3-era 1000 Genomes alignment and many imputation and low-pass WGS workflows that target the 1KG coordinate system.

URL

https://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/phase2_reference_assembly_sequence/

KEYWORDS

GRCh37; decoy; 1000 Genomes; alignment; hs37d5

Main citation

1000 Genomes Project / Broad Institute. hs37d5 reference (GRCh37 plus decoy sequences). https://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/phase2_reference_assembly_sequence/

humanG1Kv37

Reference 1000 Genomes WGS

FULL NAME

1000 Genomes human_g1k_v37 reference

DESCRIPTION

GRCh37-based reference FASTA distributed by the 1000 Genomes Project (human_g1k_v37): chromosomes 1-22, X, Y, MT, plus GL unlocalized/unplaced contigs, without separate haplotype scaffolds or EBV. Used as the Phase 1 alignment reference and still commonly chosen when harmonizing with public 1KG VCFs and phased reference panels.

URL

https://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/

KEYWORDS

GRCh37; 1000 Genomes; reference FASTA; human_g1k_v37

Main citation

1000 Genomes Project. human_g1k_v37 reference (GRCh37). https://ftp.1000genomes.ebi.ac.uk/vol1/ftp/technical/reference/
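
The three GRCh37 references listed above share the same primary chromosome names and differ mainly in their extra sequences, so a contig list (for example, the first column of a `.fai` index) is often enough to tell them apart. The following is a minimal sketch under stated assumptions, not an official tool: the function names are hypothetical, and it assumes the decoy contig is named `hs37d5` and the EBV sequence `NC_007605`, consistent with the descriptions in the entries.

```python
def contigs_from_fai(path):
    """Read contig names from the first column of a FASTA .fai index."""
    with open(path) as handle:
        return [line.split("\t", 1)[0] for line in handle if line.strip()]


def classify_grch37_reference(contig_names):
    """Guess which b37-style reference a contig list comes from.

    Assumptions (labels are illustrative, not official identifiers):
      - all three references name primary contigs 1-22, X, Y, MT (no "chr" prefix)
      - a contig named "hs37d5" marks the decoy-augmented reference
      - "NC_007605" (EBV) without the decoy marks Homo_sapiens_assembly19 (b37)
      - neither extra contig marks human_g1k_v37
    """
    names = set(contig_names)
    primary = {str(i) for i in range(1, 23)} | {"X", "Y", "MT"}
    if not primary <= names:
        # Missing b37-style primary contigs (e.g. a "chr"-prefixed hg19 build).
        return "unknown"
    if "hs37d5" in names:
        return "hs37d5"
    if "NC_007605" in names:
        return "b37"
    return "human_g1k_v37"
```

A typical call would be `classify_grch37_reference(contigs_from_fai("reference.fasta.fai"))`, where the path is a placeholder. Because the check relies on names only, it cannot detect a renamed or truncated FASTA; comparing sequence lengths or checksums would be a stricter test.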