References Protein

Curation of Protein: listings under the References tab.

Summary Table

NAME	CATEGORY	MAIN CITATION	YEAR
BioGRID	Interaction	Oughtred R et al., Protein Sci, 2021	2021
STRING	Interaction	Szklarczyk D et al., Nucleic Acids Res, 2023	2023
Reactome	Pathway	Milacic M et al., Nucleic Acids Res, 2024	2024
UniProt	Protein	UniProt Consortium, Nucleic Acids Res, 2025	2025

Interaction

BioGRID

Reference

PUBMED_LINK

33070389

DESCRIPTION

BioGRID is a biomedical interaction repository with data compiled through comprehensive curation efforts. Our current index is version 4.4.242 and searches 86,339 publications for 2,834,410 protein and genetic interactions, 31,144 chemical interactions and 1,128,339 post-translational modifications from major model organism species. All data are freely provided via our search index and available for download in many standardized formats.

URL

https://thebiogrid.org/

KEYWORDS

Interaction

TITLE

The BioGRID database: A comprehensive biomedical resource of curated protein, genetic, and chemical interactions.

Main citation

Oughtred R, Rust J, Chang C, Breitkreutz BJ, ...&, Tyers M. (2021) The BioGRID database: A comprehensive biomedical resource of curated protein, genetic, and chemical interactions. Protein Sci, 30 (1) 187-200. doi:10.1002/pro.3978. PMID 33070389

ABSTRACT

The BioGRID (Biological General Repository for Interaction Datasets, thebiogrid.org) is an open-access database resource that houses manually curated protein and genetic interactions from multiple species including yeast, worm, fly, mouse, and human. The ~1.93 million curated interactions in BioGRID can be used to build complex networks to facilitate biomedical discoveries, particularly as related to human health and disease. All BioGRID content is curated from primary experimental evidence in the biomedical literature, and includes both focused low-throughput studies and large high-throughput datasets. BioGRID also captures protein post-translational modifications and protein or gene interactions with bioactive small molecules including many known drugs. A built-in network visualization tool combines all annotations and allows users to generate network graphs of protein, genetic and chemical interactions. In addition to general curation across species, BioGRID undertakes themed curation projects in specific aspects of cellular regulation, for example the ubiquitin-proteasome system, as well as specific disease areas, such as for the SARS-CoV-2 virus that causes COVID-19 severe acute respiratory syndrome. A recent extension of BioGRID, named the Open Repository of CRISPR Screens (ORCS, orcs.thebiogrid.org), captures single mutant phenotypes and genetic interactions from published high throughput genome-wide CRISPR/Cas9-based genetic screens. BioGRID-ORCS contains datasets for over 1,042 CRISPR screens carried out to date in human, mouse and fly cell lines. The biomedical research community can freely access all BioGRID data through the web interface, standardized file downloads, or via model organism databases and partner meta-databases.

DOI

10.1002/pro.3978

STRING

Reference

PUBMED_LINK

36370105

DESCRIPTION

STRING is a database of known and predicted protein-protein interactions. The interactions include direct (physical) and indirect (functional) associations; they stem from computational prediction, from knowledge transfer between organisms, and from interactions aggregated from other (primary) databases.

URL

https://string-db.org/

KEYWORDS

Interaction

TITLE

The STRING database in 2023: protein-protein association networks and functional enrichment analyses for any sequenced genome of interest.

Main citation

Szklarczyk D, Kirsch R, Koutrouli M, Nastou K, ...&, von Mering C. (2023) The STRING database in 2023: protein-protein association networks and functional enrichment analyses for any sequenced genome of interest. Nucleic Acids Res, 51 (D1) D638-D646. doi:10.1093/nar/gkac1000. PMID 36370105

ABSTRACT

Much of the complexity within cells arises from functional and regulatory interactions among proteins. The core of these interactions is increasingly known, but novel interactions continue to be discovered, and the information remains scattered across different database resources, experimental modalities and levels of mechanistic detail. The STRING database (https://string-db.org/) systematically collects and integrates protein-protein interactions, both physical interactions as well as functional associations. The data originate from a number of sources: automated text mining of the scientific literature, computational interaction predictions from co-expression, conserved genomic context, databases of interaction experiments and known complexes/pathways from curated sources. All of these interactions are critically assessed, scored, and subsequently automatically transferred to less well-studied organisms using hierarchical orthology information. The data can be accessed via the website, but also programmatically and via bulk downloads. The most recent developments in STRING (version 12.0) are: (i) it is now possible to create, browse and analyze a full interaction network for any novel genome of interest, by submitting its complement of encoded proteins, (ii) the co-expression channel now uses variational auto-encoders to predict interactions, and it covers two new sources, single-cell RNA-seq and experimental proteomics data and (iii) the confidence in each experimentally derived interaction is now estimated based on the detection method used, and communicated to the user in the web-interface. Furthermore, STRING continues to enhance its facilities for functional enrichment analysis, which are now fully available also for user-submitted genomes.

DOI

10.1093/nar/gkac1000

Pathway

Reactome

Reference

PUBMED_LINK

37941124

DESCRIPTION

REACTOME is an open-source, open access, manually curated and peer-reviewed pathway database. Our goal is to provide intuitive bioinformatics tools for the visualization, interpretation and analysis of pathway knowledge to support basic and clinical research, genome analysis, modeling, systems biology and education. Founded in 2003, the Reactome project is led by Lincoln Stein of OICR, Peter D’Eustachio of NYU Langone Health, Henning Hermjakob of EMBL-EBI, and Guanming Wu of OHSU.

URL

https://reactome.org/

KEYWORDS

Pathway

TITLE

The Reactome Pathway Knowledgebase 2024.

Main citation

Milacic M, Beavers D, Conley P, Gong C, ...&, D'Eustachio P. (2024) The Reactome Pathway Knowledgebase 2024. Nucleic Acids Res, 52 (D1) D672-D678. doi:10.1093/nar/gkad1025. PMID 37941124

ABSTRACT

The Reactome Knowledgebase (https://reactome.org), an Elixir and GCBR core biological data resource, provides manually curated molecular details of a broad range of normal and disease-related biological processes. Processes are annotated as an ordered network of molecular transformations in a single consistent data model. Reactome thus functions both as a digital archive of manually curated human biological processes and as a tool for discovering functional relationships in data such as gene expression profiles or somatic mutation catalogs from tumor cells. Here we review progress towards annotation of the entire human proteome, targeted annotation of disease-causing genetic variants of proteins and of small-molecule drugs in a pathway context, and towards supporting explicit annotation of cell- and tissue-specific pathways. Finally, we briefly discuss issues involved in making Reactome more fully interoperable with other related resources such as the Gene Ontology and maintaining the resulting community resource network.

DOI

10.1093/nar/gkad1025

Protein

UniProt

Reference

PUBMED_LINK

39552041

DESCRIPTION

The Universal Protein Resource (UniProt) is a comprehensive resource for protein sequence and annotation data. The UniProt databases are the UniProt Knowledgebase (UniProtKB), the UniProt Reference Clusters (UniRef), and the UniProt Archive (UniParc). The UniProt consortium and host institutions EMBL-EBI, SIB and PIR are committed to the long-term preservation of the UniProt databases.

URL

https://www.uniprot.org/

TITLE

UniProt: the Universal Protein Knowledgebase in 2025.

Main citation

UniProt Consortium. (2025) UniProt: the Universal Protein Knowledgebase in 2025. Nucleic Acids Res, 53 (D1) D609-D617. doi:10.1093/nar/gkae1010. PMID 39552041

ABSTRACT

The aim of the UniProt Knowledgebase (UniProtKB; https://www.uniprot.org/) is to provide users with a comprehensive, high-quality and freely accessible set of protein sequences annotated with functional information. In this publication, we describe ongoing changes to our production pipeline to limit the sequences available in UniProtKB to high-quality, non-redundant reference proteomes. We continue to manually curate the scientific literature to add the latest functional data and use machine learning techniques. We also encourage community curation to ensure key publications are not missed. We provide an update on the automatic annotation methods used by UniProtKB to predict information for unreviewed entries describing unstudied proteins. Finally, updates to the UniProt website are described, including a new tab linking protein to genomic information. In recognition of its value to the scientific community, the UniProt database has been awarded Global Core Biodata Resource status.

DOI

10.1093/nar/gkae1010
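
The PUBMED_LINK and DOI fields above hold bare identifiers, not full links. As a minimal sketch, they could be turned into resolvable URLs along the following lines. This assumes the standard public resolvers at pubmed.ncbi.nlm.nih.gov and doi.org; the names `Reference`, `pubmed_url` and `doi_url` are hypothetical helpers for illustration and are not part of this listing.

```python
from typing import NamedTuple


class Reference(NamedTuple):
    """One entry from the listing: resource name, PMID and DOI (hypothetical type)."""
    name: str
    pmid: str
    doi: str


# Identifiers copied from the PUBMED_LINK and DOI fields of the four entries.
REFERENCES = [
    Reference("BioGRID", "33070389", "10.1002/pro.3978"),
    Reference("STRING", "36370105", "10.1093/nar/gkac1000"),
    Reference("Reactome", "37941124", "10.1093/nar/gkad1025"),
    Reference("UniProt", "39552041", "10.1093/nar/gkae1010"),
]


def pubmed_url(pmid: str) -> str:
    """Build a PubMed record URL from a bare PMID (assumed URL pattern)."""
    return f"https://pubmed.ncbi.nlm.nih.gov/{pmid}/"


def doi_url(doi: str) -> str:
    """Build a resolver URL from a bare DOI (assumed doi.org pattern)."""
    return f"https://doi.org/{doi}"


if __name__ == "__main__":
    # Print one tab-separated line per resource, matching the summary table style.
    for ref in REFERENCES:
        print(f"{ref.name}\t{pubmed_url(ref.pmid)}\t{doi_url(ref.doi)}")
```

Keeping the identifiers as plain strings avoids any risk of altering a DOI suffix, which can contain characters that are easy to mangle when parsed as anything else.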