AI Datasets Clinical EHR

Curation of Clinical EHR within Datasets — listings under the AI tab.

Summary Table

NAME | Main citation | YEAR
MIMIC-IV | Johnson AEW et al., Sci Data, 2023 | 2023
eICU Collaborative Research Database | Pollard TJ et al., Sci Data, 2018 | 2018

MIMIC-IV

AI Datasets Clinical EHR ICU Critical Care PhysioNet MIMIC de-identified EHR
PUBMED_LINK
36596836
FULL NAME
Medical Information Mart for Intensive Care IV
DESCRIPTION
MIMIC-IV is a large, freely available, de-identified clinical database comprising over 300,000 patients admitted to the Beth Israel Deaconess Medical Center (2008-2019). It includes comprehensive ICU and Emergency Department data: demographics, vital signs, laboratory measurements, medications, procedures, diagnoses (ICD codes), imaging reports, nursing notes, and mortality outcomes. The relational database (hosted on BigQuery or built locally in PostgreSQL) links hospital admissions (ADMISSIONS), ICU stays (ICUSTAYS), charted observations (CHARTEVENTS), lab events (LABEVENTS), microbiology data (MICROBIOLOGYEVENTS), prescriptions (PRESCRIPTIONS), and discharge summaries. MIMIC-IV replaces MIMIC-III (2001-2012) with a modernized schema, cleaner data model, and expanded coverage. It is widely used for developing and benchmarking clinical AI models (mortality prediction, sepsis detection, phenotyping, NLP). Access requires PhysioNet credentialing (CITI Data or Specimens Only Research course plus a Data Use Agreement). Supporting datasets include MIMIC-CXR (chest X-ray images) and MIMIC-IV-Note (de-identified clinical notes).
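The table linkage described above can be sketched with a small join. This is an illustrative example only: an in-memory SQLite database stands in for PostgreSQL or BigQuery, every row is synthetic, and the helper name icu_stays_per_admission is hypothetical. The column names (subject_id, hadm_id, stay_id, admittime, intime) are assumed to match the MIMIC-IV schema and should be checked against the official documentation for the release in use.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL/BigQuery (assumption for this sketch).
# All rows below are synthetic and do not come from MIMIC-IV.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE admissions (
    subject_id INTEGER,
    hadm_id    INTEGER PRIMARY KEY,
    admittime  TEXT
);
CREATE TABLE icustays (
    subject_id INTEGER,
    hadm_id    INTEGER,
    stay_id    INTEGER PRIMARY KEY,
    intime     TEXT
);
INSERT INTO admissions VALUES
    (1, 100, '2150-01-01 08:00:00'),
    (1, 101, '2150-03-10 12:00:00'),
    (2, 200, '2151-06-05 09:30:00');
INSERT INTO icustays VALUES
    (1, 100, 9000, '2150-01-01 10:00:00'),
    (1, 100, 9001, '2150-01-04 02:00:00'),
    (2, 200, 9002, '2151-06-05 11:00:00');
""")


def icu_stays_per_admission(connection):
    """Return (hadm_id, n_icu_stays) for every admission.

    A LEFT JOIN keeps admissions that never reached the ICU,
    which then get a count of zero.
    """
    query = """
        SELECT a.hadm_id, COUNT(i.stay_id) AS n_icu_stays
        FROM admissions AS a
        LEFT JOIN icustays AS i ON i.hadm_id = a.hadm_id
        GROUP BY a.hadm_id
        ORDER BY a.hadm_id
    """
    return connection.execute(query).fetchall()


rows = icu_stays_per_admission(conn)
for hadm_id, n_icu_stays in rows:
    print(hadm_id, n_icu_stays)
```

In this toy data, one admission has two ICU stays and one has none, which is why the join is keyed on the admission identifier and not on the patient identifier.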
URL
https://physionet.org/content/mimiciv/
KEYWORDS
EHR, ICU, clinical database, de-identified, Beth Israel, critical care, MIMIC, medical informatics
TITLE
MIMIC-IV, a freely accessible electronic health record dataset.
Main citation
Johnson AEW, Bulgarelli L, Shen L, Gayles A, Shammout A, Horng S, Pollard TJ, Hao S, Moody B, Gow B, Lehman LH, Celi LA, Mark RG. (2023) MIMIC-IV, a freely accessible electronic health record dataset. Scientific Data, 10:1. doi:10.1038/s41597-022-01899-x. PMID 36596836
ABSTRACT
MIMIC-IV is a publicly available database of de-identified electronic health records for patients admitted to the Beth Israel Deaconess Medical Center (BIDMC) in Boston, Massachusetts. The database is updated annually and is freely available to credentialed researchers. MIMIC-IV contains information on patient demographic characteristics, vital signs, laboratory measurements, medications, and diagnoses. We describe the process of creating the database, the structure of the data, and the tools available to users. MIMIC-IV is a valuable resource for researchers in critical care, clinical informatics, and machine learning.
DOI
10.1038/s41597-022-01899-x

eICU Collaborative Research Database (eICU-CRD)

AI Datasets Clinical EHR ICU Critical Care PhysioNet Multi-center Free Access
PUBMED_LINK
30204154
DESCRIPTION
The eICU Collaborative Research Database (eICU-CRD) is a large multi-center intensive care unit database from Philips Healthcare's eICU telehealth program, developed in partnership with the MIT Laboratory for Computational Physiology. It contains de-identified data for over 200,000 admissions from roughly 139,000 unique patients across 335 units at 208 US hospitals (2014-2015). The data include demographics, vital signs, care plan documentation, severity of illness measures (APACHE IV), diagnoses (3,933 unique active problems), laboratory measurements (158 lab types), medications, continuous infusions, intake/output, microbiology, nurse charting, and structured notes. Data access follows the same process as MIMIC: it is free of charge and requires PhysioNet credentialed access (CITI Data or Specimens Only Research course plus a Data Use Agreement). The database complements MIMIC-IV (single-center, Boston) with multi-center US coverage, supporting external validation and generalizability studies of ICU ML models.
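The distinction between admissions, unique patients, and hospitals in the counts above can be illustrated with a minimal sketch. The rows are synthetic, the function name summarize_cohort is hypothetical, and the column names (patientunitstayid, uniquepid, hospitalid) are assumed to follow the eICU-CRD patient table; verify them against the official schema before use.

```python
# Synthetic rows only; nothing here is taken from eICU-CRD.
# Column names are assumed to mirror the eICU-CRD `patient` table.
patient_rows = [
    {"patientunitstayid": 1, "uniquepid": "A", "hospitalid": 10},
    {"patientunitstayid": 2, "uniquepid": "A", "hospitalid": 10},
    {"patientunitstayid": 3, "uniquepid": "B", "hospitalid": 10},
    {"patientunitstayid": 4, "uniquepid": "C", "hospitalid": 20},
]


def summarize_cohort(rows):
    """Count distinct unit stays, patients, and hospitals in a list of rows."""
    return {
        "unit_stays": len({r["patientunitstayid"] for r in rows}),
        "unique_patients": len({r["uniquepid"] for r in rows}),
        "hospitals": len({r["hospitalid"] for r in rows}),
    }


summary = summarize_cohort(patient_rows)
print(summary)
```

Because one patient can have several unit stays, a model evaluation that splits data by stay instead of by patient may leak information between training and test sets; counting both levels first makes that risk visible.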
URL
https://eicu-crd.mit.edu/
KEYWORDS
eICU, critical care, ICU, multicenter, Philips, PhysioNet, vital signs, severity of illness
TITLE
The eICU Collaborative Research Database, a freely available multi-center database for critical care research.
Main citation
Pollard TJ, Johnson AEW, Raffa JD, Celi LA, Mark RG, Badawi O. (2018) The eICU Collaborative Research Database, a freely available multi-center database for critical care research. Scientific Data, 5:180178. doi:10.1038/sdata.2018.178. PMID 30204154
ABSTRACT
Critical care patients are monitored closely through the course of their illness. Philips Healthcare has developed a telehealth system, the eICU Program, which leverages these data to support management of critically ill patients. Here we describe the eICU Collaborative Research Database, a multi-center intensive care unit (ICU) database with high granularity data for over 200,000 admissions to ICUs monitored by eICU Programs across the United States.
DOI
10.1038/sdata.2018.178