Major Databases AMERICA

Curated list of major databases in the AMERICA region (see the Major databases hub).

Summary Table

National Center for Biotechnology Information (NCBI)

Database
DESCRIPTION
The U.S. National Center for Biotechnology Information is NIH’s primary bioinformatics hub, hosting literature (PubMed), sequence databases (GenBank, RefSeq), analysis tools, and major data archives. As an INSDC partner it exchanges nucleotide data with ENA and DDBJ. SRA stores raw sequencing reads; dbGaP provides controlled access to participant-level genotype and phenotype data; GEO archives functional genomics data; ClinVar aggregates variant interpretations for clinical and research use. Datasets routinely cross-link to PubMed and Gene records.
PubMed
https://pubmed.ncbi.nlm.nih.gov/
Bibliographic index of biomedical literature with abstracts and links to full text where available; central entry point for citations tied to NCBI sequence and GEO records.
GenBank
https://www.ncbi.nlm.nih.gov/genbank/
Annotated nucleotide sequence database and INSDC exchange partner—deposits receive accessions mirrored with ENA and DDBJ.
SRA
https://www.ncbi.nlm.nih.gov/sra/
Sequence Read Archive—high-volume store for raw sequencing output (reads, run metadata) with INSDC exchange alongside ENA and DDBJ.
RefSeq
https://www.ncbi.nlm.nih.gov/refseq/
Non-redundant reference sequences (genomes, transcripts, proteins) used for annotation, variant reporting, and stable NM_/NP_ accessions in clinical and research workflows.
ClinVar
https://www.ncbi.nlm.nih.gov/clinvar/
Archive of genomic variant interpretations (pathogenicity, drug response) aggregated from submitters; integrates with dbSNP and clinical testing pipelines.
dbGaP
https://www.ncbi.nlm.nih.gov/gap/
Database of Genotypes and Phenotypes—study-level portal for controlled-access participant genotype, phenotype, and omics data under NIH data-use agreements.
GEO
https://www.ncbi.nlm.nih.gov/geo/
Gene Expression Omnibus—curated submissions of array- and sequencing-based functional genomics and expression profiling, with series and sample records.
URL
https://www.ncbi.nlm.nih.gov/
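
The RefSeq entry above mentions stable NM_/NP_ accessions. As a minimal sketch of how such identifiers are commonly read, the Python below splits an accession into prefix, number, and optional version, and looks the prefix up in a small table. The function name `parse_refseq` and the table `REFSEQ_PREFIXES` are illustrative names, not part of any NCBI tool, and the table covers only a handful of common prefixes rather than the full RefSeq scheme.

```python
import re

# A few common RefSeq accession prefixes and the molecule type each denotes.
# This table is deliberately incomplete; other prefixes exist.
REFSEQ_PREFIXES = {
    "NC_": "genomic (complete molecule)",
    "NG_": "genomic region",
    "NM_": "mRNA transcript",
    "NR_": "non-coding RNA",
    "NP_": "protein",
    "XM_": "predicted mRNA (model)",
    "XR_": "predicted non-coding RNA (model)",
    "XP_": "predicted protein (model)",
}

# Two uppercase letters, an underscore, digits, then an optional ".version".
_ACCESSION = re.compile(r"^([A-Z]{2}_)(\d+)(?:\.(\d+))?$")


def parse_refseq(accession):
    """Split a RefSeq-style accession into prefix, number, and version."""
    match = _ACCESSION.match(accession.strip())
    if match is None:
        raise ValueError(f"not a RefSeq-style accession: {accession!r}")
    prefix, number, version = match.groups()
    return {
        "prefix": prefix,
        # Prefixes missing from the table are reported as such, not guessed.
        "molecule": REFSEQ_PREFIXES.get(prefix, "prefix not in this table"),
        "number": number,
        "version": int(version) if version is not None else None,
    }


if __name__ == "__main__":
    print(parse_refseq("NM_000546.6"))
```

Keeping the version separate from the number matters in practice: the unversioned accession identifies the record, while the version pins a specific sequence, which is what variant reports usually need.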

NCI Genomic Data Commons (GDC)

Database
DESCRIPTION
NCI's cancer genomics data portal, providing harmonized data from TCGA, TARGET, and related programs.
URL
https://portal.gdc.cancer.gov/

CGEn

Database
DESCRIPTION
Canada's national platform for genome sequencing and analysis (research and clinical genomics infrastructure).
URL
https://www.cgen.ca/

To change these entries, edit the JSON files under json/databases/ (folders canada/ and united_states/), then run python main.py from src/.
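
Before running main.py, it can help to confirm that every edited file still parses. The sketch below is a hypothetical pre-check and not part of the project: the function name `check_json_tree` is invented here, and it assumes the entries are stored as `*.json` files directly inside the two folders named above. It makes no assumption about the fields inside each file.

```python
import json
from pathlib import Path


def check_json_tree(root, folders=("canada", "united_states")):
    """Return (path, error message) pairs for JSON files that fail to parse.

    Assumes *.json files sit directly inside each folder under `root`;
    adjust the glob pattern if the real layout is nested.
    """
    problems = []
    for folder in folders:
        for path in sorted(Path(root, folder).glob("*.json")):
            try:
                json.loads(path.read_text(encoding="utf-8"))
            except json.JSONDecodeError as exc:
                problems.append((str(path), str(exc)))
    return problems


if __name__ == "__main__":
    # Path taken from the note above, relative to the repository root.
    for path, message in check_json_tree("json/databases"):
        print(f"{path}: {message}")
```

An empty result means every file found was syntactically valid JSON; it does not check that the content matches whatever structure main.py expects.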