The Data and Research Core addresses research priorities and needs by linking and preparing multiple sources and types of research data. To accomplish its mission, it facilitates the extraction and transformation of electronic health record (EHR) data and data on lifestyle contributors to health for research use.
About the Data and Research Core
Expanding Health Data for AI
The mission of the AIM-AHEAD Data and Research Core (DRC) is to broaden the scope of healthcare data in artificial intelligence and machine learning (AI/ML) and expand its availability to health researchers. The DRC is not a single database. Instead, AIM-AHEAD seeks to catalyze an ecosystem of datasets to help enrich the data used in AI/ML models.
AIM-AHEAD Data Partners
Data Set Options for Research Funded by AIM-AHEAD
These data sources are options that project teams may propose for AIM-AHEAD-funded research projects. Applicants may also propose other data sources for their projects. As noted in the right column, AIM-AHEAD data partners provide extra services to facilitate access and mentorship for AIM-AHEAD-approved project teams.
| Source | Description | Data Allowed | Access Notes |
| --- | --- | --- | --- |
| A customized subset from OCHIN Community Health Database | EHR data from community health centers across the US | HIPAA Limited Data Set; individual patient-level data with dates and geographic indicators if needed for research | AIM-AHEAD Data Partner* with facilitated access and concierge services for funded projects. Available through AIM-AHEAD Service Workbench; data use agreement and IRB required (see below). |
| Data Bridge from MedStar Health (curated data from the MedStar Health EHR) | EHR data from hospital system network with broad patient representation | Multiple curated dataset options (further detail on website): pre-curated or custom-curated de-identified EHR, limited dataset, full PHI EHR dataset, imaging, select clinical notes, select genomics data, synthetic data | AIM-AHEAD Data Partner* with facilitated access and concierge services for funded projects. Available through MedStar Health; data use agreement and IRB required (see below). |
| | Selected large-scale cohorts related to heart, lung, blood and sleep disorders. Includes both prospective clinical studies and associated genomic TOPMed data. | De-identified dataset, including individual-level genomic (TOPMed full genomes) and clinical datasets | Available on NHLBI BioData Catalyst infrastructure. Requires approval of a Data Access Request; most datasets require IRB. |
| | A variety of datasets available, including clinical and genomic data | Public data and controlled-access data (depends on dataset) | Available on AIM-AHEAD Service Workbench; access requirements depend on the dataset. |
| | The All of Us Research Program is building one of the largest biomedical data resources of its kind. | The All of Us Research Hub stores health data from a group of participants from across the United States. | Available on All of Us Research Workbench; requires registration and institutional use agreement. |
| | ScHARe is a cloud-based research collaboration platform developed by the NIMHD and the National Institute of Nursing Research. | Google-hosted public datasets; ScHARe-hosted public datasets; ScHARe-hosted project datasets | |


