
Data and Research Core (DRC)
Address research priorities and needs for conducting AI/ML research targeting the use of electronic health records
About
The mission of the AIM-AHEAD Data and Research Core (DRC) is to broaden the scope of healthcare data in artificial intelligence and machine learning (AI/ML) and expand its availability to health disparities researchers.
The DRC is not a single database. Instead, AIM-AHEAD seeks to catalyze an ecosystem of datasets to help enrich the data used in AI/ML models.
Data Set Options for Research Funded by AIM-AHEAD
These data sources are options that project teams may propose for AIM-AHEAD-funded research projects. Applicants may also propose other data sources for their projects. As noted in the right column, AIM-AHEAD data partners provide extra services to facilitate access and mentorship for AIM-AHEAD-approved project teams.
| Source | Description | Data Allowed | Access Notes |
| --- | --- | --- | --- |
| A customized subset from OCHIN | EHR data from under-resourced communities | HIPAA Limited Data Set: individual patient-level data, with dates and geographic indicators if needed for research | AIM-AHEAD Data Partner* with facilitated access and concierge services for funded projects. Available through the AIM-AHEAD Service Workbench; data use agreement and IRB required (see below). |
| Data Bridge from MedStar Health (curated data from the MedStar Health EHR) | EHR data from a hospital system network with patient data from under-resourced populations | Multiple curated dataset options (further detail on website): pre-curated or custom-curated de-identified EHR, Limited Dataset, full PHI EHR dataset, imaging, select clinical notes, select genomics data, synthetic data | AIM-AHEAD Data Partner* with facilitated access and concierge services for funded projects. Available through MedStar Health; data use agreement and IRB required (see below). |
| | Selected large-scale cohorts related to heart, lung, blood, and sleep disorders. Includes both prospective clinical studies and associated genomic TOPMed data. | De-identified datasets, including individual-level genomic (TOPMed full genomes) and clinical datasets. | Available on NHLBI BioData Catalyst infrastructure. Requires approval of a Data Access Request; most datasets require IRB. |
| | A variety of datasets, including clinical and genomic data | Public and controlled-access data (depends on dataset) | Available on the AIM-AHEAD Service Workbench; access requirements depend on the dataset. |
| | The All of Us Research Program is building one of the largest biomedical data resources of its kind. | The All of Us Research Hub stores health data from a group of participants from across the United States. | Available on the All of Us Research Workbench; requires registration and an institutional use agreement. |
| | ScHARe is a cloud-based research collaboration platform developed by the NIMHD and the National Institute of Nursing Research | Google-hosted public datasets; ScHARe-hosted public datasets; ScHARe-hosted project datasets | |
The DRC and Infrastructure Core also collaborate to assist AIM-AHEAD awardees in locating other data sources to support their projects. As part of its mission to expand datasets used in AI/ML, AIM-AHEAD has conducted a landscape survey to raise awareness about datasets that may be of interest to the research community. Each dataset has its own governance process and rules for access.
AIM-AHEAD Data Partners
AIM-AHEAD-funded projects may apply to receive facilitated access and data concierge services from AIM-AHEAD data partners that emphasize under-resourced populations.