Data and Research Core (DRC)

Addressing research priorities and needs for AI/ML research that uses electronic health records

About

The mission of the AIM-AHEAD Data and Research Core (DRC) is to broaden the scope of healthcare data in artificial intelligence and machine learning (AI/ML) and expand its availability to health disparities researchers.

The DRC is not a single database. Instead, AIM-AHEAD seeks to catalyze an ecosystem of datasets to help enrich the data used in AI/ML models.

Data Set Options for Research Funded by AIM-AHEAD

Project teams may propose these data sources for AIM-AHEAD-funded research projects. Applicants may also propose other data sources. As noted under Access Notes, AIM-AHEAD data partners provide additional services to facilitate data access and mentorship for AIM-AHEAD-approved project teams.

Source: A customized subset from OCHIN
Description: EHR data from under-resourced communities
Data Allowed: HIPAA Limited Data Set; individual patient-level data with dates and geographic indicators if needed for research
Access Notes: AIM-AHEAD Data Partner* with facilitated access and concierge services for funded projects. Available through the AIM-AHEAD Service Workbench; data use agreement and IRB approval required (see below).

Source: Data Bridge from MedStar Health (curated data from the MedStar Health EHR)
Description: EHR data from a hospital system network with patient data from under-resourced populations
Data Allowed: Multiple curated dataset options (further detail on website): pre-curated or custom-curated de-identified EHR data, Limited Data Set, full PHI EHR dataset, imaging, select clinical notes, select genomics data, synthetic data
Access Notes: AIM-AHEAD Data Partner* with facilitated access and concierge services for funded projects. Available through MedStar Health; data use agreement and IRB approval required (see below).

Source: 60+ studies from NHLBI BioData Catalyst
Description: Selected large-scale cohorts related to heart, lung, blood, and sleep disorders. Includes both prospective clinical studies and associated genomic TOPMed data.
Data Allowed: De-identified data, including individual-level genomic (TOPMed full genomes) and clinical datasets
Access Notes: Available on NHLBI BioData Catalyst infrastructure. Requires an approved Data Access Request; most datasets require IRB approval.

Source: 15 selected open datasets on AWS
Description: A variety of datasets, including clinical and genomic data
Data Allowed: Public data and controlled-access data (depends on the dataset)
Access Notes: Available on the AIM-AHEAD Service Workbench; access requirements depend on the dataset.

Source: NIH All of Us
Description: The All of Us Research Program is building one of the largest biomedical data resources of its kind.
Data Allowed: The All of Us Research Hub stores health data from a group of participants from across the United States.
Access Notes: Available on the All of Us Researcher Workbench; requires registration and an institutional use agreement.

Source: The ScHARe Data Ecosystem
Description: ScHARe is a cloud-based research collaboration platform developed by the NIMHD and the National Institute of Nursing Research.
Data Allowed: Google-hosted public datasets; ScHARe-hosted public datasets; ScHARe-hosted project datasets

The DRC and Infrastructure Core also collaborate to assist AIM-AHEAD awardees in locating other data sources to support their projects. As part of its mission to expand datasets used in AI/ML, AIM-AHEAD has conducted a landscape survey to raise awareness about datasets that may be of interest to the research community. Each dataset has its own governance process and rules for access.

Apply to include a dataset in the data landscape list

View the landscape survey datasets

AIM-AHEAD Data Partners 

AIM-AHEAD-funded projects may apply to receive facilitated access and data concierge services from AIM-AHEAD data partners that emphasize under-resourced populations.

Data Bridge from MedStar Health: MedStar Health and MedStar Health Research Institute (MHRI) include an extensive network of clinical facilities in the mid-Atlantic region.
OCHIN Database: OCHIN is a nonprofit health care innovation center.