Data and Research Core (DRC)
The mission of the AIM-AHEAD Data and Research Core (DRC) is to broaden the diversity and representation of healthcare data in AI/ML and expand its availability to diverse teams of researchers to address health disparities.
The DRC is not a single database. Instead, AIM-AHEAD seeks to catalyze an ecosystem of datasets to help address the lack of population diversity in data used in AI/ML models.
Data Set Options for AIM-AHEAD-funded Research
The following data sources are options that project teams may propose for AIM-AHEAD-funded research projects. Applicants may also propose other data sources for their projects. As noted in the right column, AIM-AHEAD data partners provide extra services to facilitate access and mentorship for AIM-AHEAD-approved project teams.
Source | Description | Data Allowed | Access Notes |
A customized subset from OCHIN Community Health Equity Database | EHR data from underserved communities | HIPAA Limited Data Set, individual-patient-level data with dates and geographic indicators if needed for research | AIM-AHEAD Data Partner* with facilitated access, concierge services for funded projects. Available through AIM-AHEAD Service Workbench; data use agreement and IRB required (see below). |
Data Bridge from MedStar Health (Curated data from the MedStar Health EHR) | EHR data from hospital system network with 31% African American patient representation | Multiple curated dataset options (further detail on website): pre-curated or custom-curated de-identified EHR, Limited Data Set, full PHI EHR dataset, imaging, select clinical notes, select genomics data, synthetic data | AIM-AHEAD Data Partner* with facilitated access, concierge services for funded projects. Available through MedStar Health; data use agreement and IRB required (see below). |
| Selected large-scale cohorts related to heart, lung, blood and sleep disorders. Includes both prospective clinical studies and associated genomic TOPMed data. | De-identified dataset, including individual-level genomic (TOPMed full genomes) and clinical datasets. | Available on NHLBI BioData Catalyst infrastructure. Requires approval of a Data Access Request; most datasets require IRB. |
| A variety of datasets available, including clinical and genomic data | Public data and controlled-access data (depends on dataset) | Available on AIM-AHEAD Service Workbench; access requirements depend on the dataset. |
| The All of Us Research Program is building one of the largest biomedical data resources of its kind. | The All of Us Research Hub stores health data from a diverse group of participants from across the United States. | Available on All of Us Research Workbench; requires registration and institutional use agreement. |
| ScHARe is a cloud-based research collaboration platform developed by the National Institute on Minority Health and Health Disparities and the National Institute of Nursing Research | Google-hosted Public Datasets; ScHARe-hosted Public Datasets; ScHARe-hosted Project Datasets | |
The DRC and Infrastructure Core also collaborate to assist AIM-AHEAD awardees in locating other data sources to support their projects. As part of its mission to diversify datasets used in AI/ML, AIM-AHEAD has conducted a landscape survey to raise awareness about datasets that may be of interest to the research community. Each dataset has its own governance process and rules for access.
AIM-AHEAD Data Partners
AIM-AHEAD-funded projects may apply to receive facilitated access and data concierge services from AIM-AHEAD data partners that emphasize historically under-resourced and under-represented populations:
How AIM-AHEAD Data Partners Expand Representation
Race | People who select a single race other than White, or who select more than one race | 2,620,875 | 2,467,865 |
Ethnicity | People who select an ethnicity other than those listed under the race of White | 2,224,756¹ | 2,233,438 |
Age | <18 years old and 65 years and above | 1,337,789 | 1,424,082 |
Sexual and Gender Minority | | 249,978² | Not well-captured |
Income | Annual household income < $25,000 | 4,129,925 | Not well-captured |
Education | People without a high school diploma or GED | Not well-captured but FQHCs generally higher than general population | 2,560 |
Access to Care | Needed a medical visit in the past 12 months but cannot readily use the health care system or pay for needed care | Not well-captured but FQHCs generally higher than general population | Not well-captured |
Geography | Residents of established rural and non-metropolitan zip codes, based on the HRSA Federal Office of Rural Health Policy data files | 1,072,666 | 36,345³ |
Disability | People with a physical, functional, cognitive, or other condition that substantially limits one or more life activities | 632,574⁴ | 266,226⁴ |
Source: All of Us reference UBR categories
¹ People with Hispanic ethnicity and any race
² People with ‘other’ sex or non-straight sexual orientation or non-male, non-female gender identity
³ Based on rural and suburban hospital discharges
⁴ Based on ICD codes for disability in study by Clark et al., including physical, visual, hearing, intellectual/developmental disabilities
Data & Research Core (DRC) Office Hours
Discover the power of AIM-AHEAD-provided data. AIM-AHEAD offers diverse data resources from a variety of data providers for your Call for Proposal (CFP) needs. For applicants interested in OCHIN and AIM-AHEAD Data Bridge (AADB) data, our teams are eager to support your research proposals and are offering several office hours leading up to the CFP application date. We have found that applicants in past years who attended office hours had a better understanding of our data offerings and created strong proposals. Our interactive sessions provide an opportunity to ask experienced research analysts detailed questions, explore aggregate data via Cohort Discovery, and strategize scientific approaches to your research. We're also happy to point you toward other valuable AIM-AHEAD resources that might be a better fit for your research proposal. Your success is our priority.
OCHIN’s Office Hour sessions will be tailored to applicants interested in the Research Fellowship Program, but anyone is welcome to attend. For applicants interested in using OCHIN data for other AIM-AHEAD programs, we recommend setting up an individual consult by submitting a DRC Intake form.
The AIM-AHEAD Data Bridge (AADB) Office Hours are open to all researchers interested in any of the AIM-AHEAD programs. We urge all applicants to review the AADB Data Offerings beforehand to make the most of the office hours. Applicants who would like methodological consultation should submit an AADB Request Form.
May 2024
Monday | Tuesday | Wednesday | Thursday | Friday |
| | 1 | 2 | 3 |
6 AADB 1 – 2pm PT / 4 – 5pm ET | 7 | 8 OCHIN 9 – 10am PT; AADB 11am – 12pm PT / 2 – 3pm ET | 9 | 10 |
13 OCHIN 1 – 2pm PT / 4 – 5pm ET | 14 | 15 OCHIN 11am – 12pm PT; AADB 11am – 12pm PT / 2 – 3pm ET | 16 | 17 AADB 7 – 8am PT / 10 – 11am ET |
20 AADB 1 – 2pm PT / 4 – 5pm ET | 21 | 22 AADB 11am – 12pm PT / 2 – 3pm ET | 23 | 24 |
27 | 28 | 29 AADB 11am – 12pm PT / 2 – 3pm ET | 30 | 31 AADB 7 – 8am PT / 10 – 11am ET |
June 2024
Monday | Tuesday | Wednesday | Thursday | Friday |
3 AADB 1 – 2pm PT / 4 – 5pm ET | 4 | 5 AADB 11am – 12pm PT / 2 – 3pm ET | 6 | 7 |
10 AADB 1 – 2pm PT / 4 – 5pm ET | 11 | 12 AADB 11am – 12pm PT / 2 – 3pm ET | 13 | 14 |