Medical Event Data Standard

This organization contains the GitHub repositories for the Medical Event Data Standard (MEDS), a simple dataset schema for machine learning over electronic health record (EHR) data. Unlike existing tools, pipelines, or common data models, MEDS is a minimal standard designed for maximum interoperability across datasets, existing tools, and model architectures. By providing a simple standardization layer between datasets and model-specific code, MEDS can help make machine learning research on EHR data dramatically more reproducible, robust, computationally performant, and collaborative.

Alongside this report, we also release several existing integrations with models, datasets, and tools, and we will work actively with the community on further adoption and use. See our draft proposal for more details, and please leave comments or questions via GitHub issues to help us improve this effort! Find the contribution guidelines here.
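To illustrate the idea of a standardization layer between datasets and model-specific code, here is a minimal sketch in plain Python. It is not the MEDS reference implementation: the `Event` record, its field names (chosen to resemble our understanding of the MEDS schema, which should be checked against the Core MEDS repository), the two source tables, and all helper functions are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Event:
    """One observation about one subject (hypothetical, simplified record)."""
    subject_id: int
    time: datetime
    code: str
    numeric_value: Optional[float] = None  # None when the event carries no number


def from_lab_rows(rows):
    """Adapter for a hypothetical source table of lab results."""
    return [
        Event(
            subject_id=r["patient"],
            time=datetime.fromisoformat(r["drawn_at"]),
            code="LAB//" + r["test"],
            numeric_value=float(r["result"]),
        )
        for r in rows
    ]


def from_admission_rows(rows):
    """Adapter for a hypothetical source table of hospital admissions."""
    return [
        Event(
            subject_id=r["pid"],
            time=datetime.fromisoformat(r["admitted"]),
            code="ADMISSION//" + r["kind"],
        )
        for r in rows
    ]


def count_codes(events):
    """Downstream code: it only ever sees Event records, never source tables."""
    counts = {}
    for e in events:
        counts[e.code] = counts.get(e.code, 0) + 1
    return counts


# Two differently shaped (made-up) source tables.
labs = [{"patient": 1, "drawn_at": "2020-01-01T08:00:00", "test": "glucose", "result": "5.4"}]
admissions = [{"pid": 1, "admitted": "2020-01-01T07:30:00", "kind": "emergency"}]

# Both sources collapse into one time-ordered event stream per subject.
events = sorted(
    from_lab_rows(labs) + from_admission_rows(admissions),
    key=lambda e: (e.subject_id, e.time),
)
print(count_codes(events))
```

In this sketch, only the two adapter functions know anything about the source tables. Code written against `Event` can be reused unchanged on any dataset that has an adapter, which is the kind of interoperability the paragraph above describes.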

Software Ecosystem

Project	Type	Documentation URL	Repository URL	Paper URL	Description
Core MEDS	Core	GitHub	GitHub	OpenReview	A data standard and community for building and sharing EHR machine learning tools
MEDS-Reader	Package	Docs	GitHub	arXiv	An optimized Python package for efficient EHR data processing achieving 10-100x improvements in memory, speed, and disk usage
MEDS-Transforms	Package		GitHub		A set of functions and scripts for extraction to and transformation/pre-processing of MEDS-formatted data.
MEDS-Tab	Package	Docs	GitHub		A library designed for automated tabularization, data preparation with aggregations and time windowing.
ACES	Package	Docs	GitHub	arXiv	A package and configuration language for reproducible extraction of task cohorts for machine learning over event-stream datasets
MEDS-Torch	Package	Docs	GitHub		Advancing healthcare machine learning through flexible, robust, and scalable sequence modeling tools.
MEDS-Evaluation	Package		GitHub		Evaluation pipeline for MEDS.
MEDS-ETL	Package		GitHub		Efficient ETL that supports OMOP, MIMIC, eICU, PyHealth.
FEMR	Package		GitHub		A Python package for manipulating longitudinal EHR data for machine learning, with a focus on supporting the creation of foundation models and verifying their presumed benefits in healthcare.
MEDS-DEV	Benchmark		GitHub		A benchmark for evaluating the performance of machine learning models on MEDS-formatted data.
MEDS-Inspect	Package		GitHub		A package to interactively inspect your MEDS data.

Pretrained Models

CLMBR-T-base: https://huggingface.co/StanfordShahLab/clmbr-t-base
Context Clues (a collection of Mamba, Llama, Hyena, and GPT models with context lengths from 512 to 16,384 tokens): https://huggingface.co/collections/StanfordShahLab/context-clues-6757f893f6a2918c7ab809f1

Datasets / Benchmarks

Dataset	Stays	Version	Frequency	Origin	Originally Published	License	Repository Link	MEDS ETL	Full Dataset Name
AUMCdb	23,000	v1.0.2	up to 1 minute	Netherlands	2019	Not specified	DANS	GitHub	Amsterdam University Medical Center Database
eICU	201,000	v2.0	5 minutes	USA	2017	PhysioNet	PhysioNet	GitHub	eICU Collaborative Research Database
HiRID	34,000	v1.1.1	2 / 5 minutes	Switzerland	2020	PhysioNet	PhysioNet	GitHub	High-Resolution ICU Dataset
INSPIRE	130,000	v1.2	Not specified	South Korea	2024	Korea Credentialed Health Data License	PhysioNet	GitHub	INformative Surgical Patient dataset for Innovative Research Environment
MIMIC-IV	73,000	v3.1	~1 hour	USA	2020	PhysioNet	PhysioNet	GitHub	Medical Information Mart for Intensive Care IV
NWICU	25,000	v0.1.0	Not specified	USA	2023	PhysioNet	PhysioNet	GitHub	Northwestern ICU Database
SICdb	27,350	v1.0.8	1 minute	Austria	2024	PhysioNet	PhysioNet	GitHub	Salzburg Intensive Care Database

EHRSHOT: https://ehrshot.stanford.edu

Coming Soon...

Tools that are planned to be compatible with MEDS:
