Address research priorities and needs to form an inclusive basis for conducting equity-focused AI/ML research targeting the use of electronic health records
About
The mission of the AIM-AHEAD Data and Research Core (DRC) is to broaden the diversity and representation of healthcare data in artificial intelligence and machine learning (AI/ML) and expand its availability to diverse teams of researchers to address health disparities.
The DRC is not a single database. Instead, AIM-AHEAD seeks to catalyze an ecosystem of datasets to help address the lack of population diversity in data used in AI/ML models.
Data Set Options for Research Funded by AIM-AHEAD
These data sources are options that project teams may propose for AIM-AHEAD-funded research projects. Applicants may also propose other data sources for their projects. As noted in the right column, AIM-AHEAD data partners provide extra services to facilitate access and mentorship for AIM-AHEAD-approved project teams.
HIPAA Limited Data Set, individual-patient level data with dates and geographic indicators if needed for research
AIM-AHEAD Data Partner* with facilitated access and concierge services for funded projects. Available through the AIM-AHEAD Service Workbench; data use agreement and IRB required (see below).
EHR data from hospital system network with 31% African American patient representation
Multiple curated dataset options (further detail on website): pre-curated or custom-curated de-identified EHR, Limited Data Set, full PHI EHR dataset, imaging, select clinical notes, select genomics data, and synthetic data
AIM-AHEAD Data Partner* with facilitated access and concierge services for funded projects. Available through MedStar Health; data use agreement and IRB required (see below).
Selected large-scale cohorts related to heart, lung, blood, and sleep disorders. Includes both prospective clinical studies and associated genomic TOPMed data.
De-identified dataset, including individual-level genomic (TOPMed full genomes) and clinical datasets.
ScHARe is a cloud-based research collaboration platform developed by the National Institute on Minority Health and Health Disparities and the National Institute of Nursing Research.
The DRC and Infrastructure Core also collaborate to assist AIM-AHEAD awardees in locating other data sources to support their projects. As part of its mission to diversify datasets used in AI/ML, AIM-AHEAD has conducted a landscape survey to raise awareness about datasets that may be of interest to the research community. Each dataset has its own governance process and rules for access.
AIM-AHEAD-funded projects may apply to receive facilitated access and data concierge services from AIM-AHEAD data partners that emphasize historically under-resourced and under-represented populations.
Data Bridge from MedStar Health
MedStar Health and MedStar Health Research Institute (MHRI) include an extensive network of clinical facilities in the mid-Atlantic region
OCHIN Community Health Equity Database
OCHIN, a nonprofit health care innovation center with a core mission to advance health equity.
How AIM-AHEAD Data Partners Expand Representation
Race
People who select a single race other than White, or who select more than one race
3,087,377
2,220,068
Ethnicity
People who select an ethnicity other than those listed under the race of White
2,618,129 [1]
1,962,904
Age
<18 years old and 65 years and above
2,623,517
2,374,283
Sexual and Gender Minority
Individuals with sexual orientation other than ‘straight,’ gender identity other than ‘man’ or ‘woman,’ and/or sex other than ‘male’ or ‘female’
326,557
Not well-captured
Income
Annual household income < $25,000
4,911,886
Not well-captured
Education
People without a high school diploma or GED
Not well-captured but FQHCs generally higher than general population
2,560
Access to Care
Needed a medical visit in the past 12 months but cannot readily use the health care system or pay for needed care
Not well-captured but FQHCs generally higher than general population
Not well-captured
Geography
Residents of established rural and non-metropolitan zip codes, based on the HRSA Federal Office of Rural Health Policy data files
1,276,525
83,309 [2]
Disability
People with a physical, functional, cognitive, or other condition that substantially limits one or more life activities
1. People with Hispanic ethnicity at any race
2. Based on rural and suburban hospital discharges
3. Based on ICD codes for disability in the study by Clark et al., including physical, visual, hearing, and intellectual/developmental disabilities