Call for Applications (Closed)
AIM-AHEAD All of Us Training Program
Traineeship in Advanced Data Analysis using the All of Us Database
The AIM-AHEAD All of Us Training Program is intended to increase researcher diversity in AI/ML by leveraging the All of Us data and infrastructure (Researcher Workbench). This 8-month training program will engage a diverse group of 25 graduate students, postdocs, early-career faculty, healthcare professionals, and other non-academic professionals from underrepresented populations.
Funding Cycle | 2024-2025 |
Release Date | October 18, 2024 |
Application Due Date | Monday, November 18, 2024. Applications must be received by 11:59 p.m. Eastern Time |
Notification of Award | January 6, 2025 |
Program Start Date | January 15, 2025 |
Informational Webinar Schedule | Thursday, October 31, 2024, 1:00 - 2:00 p.m. Central Time, 2:00 - 3:00 p.m. Eastern Time |
Informational Webinar Recording | Click to watch the October 31, 2024 informational webinar recording; click to open the October 31, 2024 informational webinar slides |
Application Link | Applications are now closed. |
Project Period | 8-month training program |
Stipend | $8,000 stipend and $2,000 allowance to attend the AIM-AHEAD Annual Meeting 2025 |
Mentor(s) | Trainees will receive direct 1:1 support and guidance on career development and research from an experienced AIM-AHEAD mentor |
NIH Biosketch | Applicant NIH biosketch or Curriculum Vitae (CV) is required |
Letters of Support | Minimum of two letters from the applicant’s supervisor(s) and faculty |
Data Usage Agreement | Applicant's institution must hold an active Data Use and Registration Agreement (DURA) with All of Us |
Issued by
Artificial Intelligence/Machine Learning Consortium to Advance Health Equity and Researcher Diversity (AIM-AHEAD) program
Program Description
The application of artificial intelligence and machine learning (AI/ML) to large datasets is dramatically expanding the capacity for hypothesis testing impacting the biomedical and socioeconomic domains. However, underrepresented communities, particularly those at heightened risk of socioeconomic and health disparities, are not receiving AI/ML’s benefits. Training a diverse workforce of researchers proficient in the application of AI/ML represents an opportunity to address a critical unmet need by potentially extending the benefits of AI/ML to underrepresented, at-risk communities.
The central goal of the AIM-AHEAD All of Us Training Program is to promote researcher diversity in AI/ML by training individuals from diverse backgrounds, such as those from underrepresented racial and ethnic groups, who are committed to gaining proficiency in AI/ML data analysis and applying their expertise to benefit communities underrepresented in biomedical research.
The AIM-AHEAD All of Us Training Program Infographic, linked below, offers a visual overview of the first cohort's academic achievements and acquired competencies, along with trainees' evaluations of their personal experiences throughout the training program.
AIM-AHEAD All of Us Training Program Infographic.
Program Objectives
Objective 1: | The trainee will apply R, Python, and/or Jupyter Notebook to analyze All of Us datasets from diverse and underrepresented communities |
Objective 2: | The trainee will formulate hypotheses testable by applying AI/ML and advanced data analyses to All of Us data |
Objective 3: | The trainee will present their project at the 2025 AIM-AHEAD Annual Meeting |
The AIM-AHEAD consortium (Data Science Training Core and Communications Hub), All of Us, and RTI are partnering to offer AIM-AHEAD stakeholders, trainees, mentees, and consortium partners a training opportunity designed to increase researcher diversity in AI/ML by leveraging the All of Us data and infrastructure (Researcher Workbench).
The Researcher Workbench is a cloud-based platform where registered researchers can access Registered and Controlled Tier data. Its powerful tools support data analysis and collaboration. Researchers use the Workbench to access, store, and analyze data for specific research projects. Researchers can perform high-powered queries and analysis within the All of Us datasets using R or Python via the integrated, cloud-based Jupyter Notebook environment.
Using the AIM-AHEAD Connect Platform, this 8-month training program will engage a diverse group of 25 graduate students, postdocs, early-career faculty, healthcare professionals, and other non-academic professionals from underrepresented populations. Trainees will use the Dataset Builder to search, extract, and organize health information from the All of Us database, and use the Cohort Builder to create, review, and annotate data from All of Us human subject cohorts. Trainees will also receive training and technical assistance related to R, Python, Jupyter Notebook, and model development for All of Us data subsets in the Researcher Workbench. Training will include:
- Merging/validating data across All of Us sources
- Building a supervised model
- Splitting data into subsets for model training and testing
- Considering biases that may be present and detected or missed by the model
- Validating the model
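As an illustration only, the training steps listed above can be sketched in plain Python on synthetic data. Everything in this sketch is an assumption made for demonstration: the table names, the `person_id`, `group`, `x`, and `label` fields, and the nearest-centroid classifier are hypothetical and do not reflect the All of Us schema, the Researcher Workbench tools, or the program's actual curriculum.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

# Two hypothetical source tables keyed by person_id (synthetic, not All of Us data).
demographics = {pid: {"group": "A" if pid % 2 == 0 else "B"} for pid in range(400)}
measurements = []
for pid in range(400):
    label = (pid // 2) % 2  # outcome to predict
    # The feature is centered at 0 or 2 depending on the label, plus noise.
    measurements.append(
        {"person_id": pid, "x": random.gauss(2.0 * label, 1.0), "label": label}
    )


def merge_and_validate(demographics, measurements):
    """Join the tables on person_id, rejecting duplicate or unmatched ids."""
    ids = [m["person_id"] for m in measurements]
    if len(ids) != len(set(ids)):
        raise ValueError("duplicate person_id in measurements")
    unmatched = [pid for pid in ids if pid not in demographics]
    if unmatched:
        raise ValueError(f"{len(unmatched)} measurements lack demographics")
    return [{**m, **demographics[m["person_id"]]} for m in measurements]


def train_test_split(records, test_fraction=0.25):
    """Shuffle a copy of the records and hold out a fraction for testing."""
    shuffled = records[:]
    random.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def fit_centroids(train):
    """Supervised step: learn the mean feature value for each class."""
    centroids = {}
    for cls in (0, 1):
        values = [r["x"] for r in train if r["label"] == cls]
        centroids[cls] = sum(values) / len(values)
    return centroids


def predict(centroids, x):
    """Assign the class whose centroid is closest to x."""
    return min(centroids, key=lambda cls: abs(x - centroids[cls]))


def accuracy(centroids, rows):
    """Fraction of rows classified correctly (NaN if there are no rows)."""
    if not rows:
        return float("nan")
    correct = sum(predict(centroids, r["x"]) == r["label"] for r in rows)
    return correct / len(rows)


records = merge_and_validate(demographics, measurements)
train, test = train_test_split(records)
model = fit_centroids(train)

# Validate on held-out data, overall and per subgroup, to surface
# performance gaps that an overall score alone could hide.
overall = accuracy(model, test)
by_group = {
    g: accuracy(model, [r for r in test if r["group"] == g]) for g in ("A", "B")
}
print(f"overall accuracy: {overall:.2f}")
for g, acc in by_group.items():
    print(f"group {g} accuracy: {acc:.2f}")
```

Comparing held-out accuracy across subgroups is only one simple way to look for bias; in practice the choice of metric and of subgroups depends on the research question.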
The training, which uses All of Us data collected from communities historically underrepresented in biomedical research, is directed particularly toward investigators conducting research at the intersection of AI/ML and health disparities. Trainees can work independently, with another trainee who has similar interests, or with a community partner.
Potential research topics that could be addressed include, but are not limited to:
- Examining statistical variation in the social determinants of health and intersectionality
- Examining statistical interactions between family health history and lifestyle factors
- Using statistical analysis to identify socioeconomic, environmental, and heritable determinants of clinically significant diseases and syndromes
Please refer to the following resources to determine if your desired research topic is viable to pursue in the All of Us training program:
All of Us Resource | Resource Links |
---|---|
All of Us data dictionaries: What data fields are available on All of Us? | Data Dictionaries for the Curated Data Repositories (CDRs) – User Support |
Detailed information on the entirety of the All of Us data repository | Getting Started – User Support |
All of Us Data Browser: What survey data, health conditions, and other data types are present in the data (e.g. How many people diagnosed with diabetes are in the dataset?) | https://databrowser.researchallofus.org/ |
Protecting Data and Privacy | Research Projects Directory (All of Us Research Program, NIH) |
Having received advanced practical training in coding, model development, hypothesis testing, and data cleaning and analysis, trainees completing this program will be well prepared to harness AI/ML approaches to conduct hypothesis-driven analysis of complex datasets. The trainee will join the community of AI/ML professionals committed to extending the benefits of AI/ML to communities underrepresented in biomedical research.
Trainee Expectations
- Attend all training sessions, both synchronous and asynchronous, including webinars and seminars
- Work on the program an average of at least 8 hours per week
- Engage with an AIM-AHEAD Mentor (to be assigned through the program)
- Engage in learning communities and peer networking
- Access the All of Us Researcher Workbench
- Complete the provided training related to R, Python, and Jupyter Notebook, available via AIM-AHEAD Connect
- Complete the supervised training on model building for analysis
- Complete the provided training on data splitting for algorithm training and testing, addressing biases in model development, and model validation
- Utilize concierge services on the Workbench and R/Python coding
- Utilize AIM-AHEAD HelpDesk support
- Work individually or in partnership with another trainee to present a work-in-progress research poster at the AIM-AHEAD Annual Meeting in summer 2025
- Work individually or in partnership with another trainee to generate an abstract suitable for submission to a conference, and/or a manuscript suitable for peer-reviewed publication
- Play an active part in the AIM-AHEAD community
Trainees Will Receive
- An $8,000 stipend upon successful completion of trainee milestones
- A $2,000 allowance to attend the AIM-AHEAD Annual Meeting 2025
- Support and guidance from an experienced AIM-AHEAD mentor
- Support from the AIM-AHEAD Data Science Training Core
- Direct 1:1 guidance, virtual office hours, HelpDesk support, and concierge services supporting users of All of Us Researcher Workbench, R, and Python coding
- Training on:
- Data analysis using All of Us Researcher Workbench, R, Python, and Jupyter Notebook
- Use and applications of R, Python, and Jupyter Notebook
- Data merging and validation across All of Us sources
- Building and validating models for analysis
- Data splitting methods for model training vs. testing
- Detecting and addressing biases in model development
- Hypothesis development for testing by analysis of All of Us data
- Assistance with obtaining a Data Use and Registration Agreement (DURA), if needed
- Using the All of Us database of de-identified medical data from >1 million Americans
Trainee Program Stipend
Each trainee will receive a stipend of $8,000 which will be disbursed in four installments based on trainee completion of required milestones. Each trainee will also be provided an allowance of $2,000 to cover the cost of airfare, hotel accommodations, local transportation and per diem to attend the AIM-AHEAD Annual Meeting 2025.
*Trainees may apply to multiple AIM-AHEAD training programs and fellowships, but if accepted into more than one, they must select only one AIM-AHEAD training program to participate in per funding cycle. Additionally, AIM-AHEAD-funded Coordinating Center Personnel or AIM-AHEAD-funded Awardees (MPIs, PIs, Co-Investigators, other personnel) are eligible to apply, but if accepted, they will not receive the stipend and allowance.*
Trainee Mentorship
Each awarded trainee will receive research and career mentorship from experienced, skilled investigators selected from AIM-AHEAD core members. The online mentoring platform AIM-AHEAD Connect (https://connect.aim-ahead.net) will be used to match mentors with awarded trainees and for mentor/trainee engagement and progress tracking.
Eligibility Criteria
- Applicants must be:
- U.S. Citizens, Permanent Residents, or Non-Citizen U.S. Nationals
- U.S. Citizen: An individual who is a citizen of the United States by law, birth or naturalization (https://www.law.cornell.edu/definitions/uscode.php?width=840&height=800&iframe=true&def_id=42-USC-630966247-802284531&term_occur=1&term_src=title:42:chapter:99:section:9102)
- Permanent Resident: An immigrant/non-citizen who can legally reside in the United States in perpetuity (https://www.law.cornell.edu/wex/lawful_permanent_resident_(lpr))
- Non-Citizen National: A person born in an outlying possession of the United States on or after the date of formal acquisition by the United States (https://www.law.cornell.edu/uscode/text/8/1408)
- Able to submit Form W-9 (Request for Taxpayer Identification)
- Affiliated with one of the following entities:
- Higher Education Institutions
- Public/State Controlled Institutions of Higher Education
- Private Institutions of Higher Education
- Hispanic-serving Institution
- Historically Black Colleges and Universities (HBCUs)
- Tribally Controlled Colleges and Universities (TCCUs)
- Alaska Native and Native Hawaiian Serving Institutions
- Asian American Native American Pacific Islander Serving Institutions (AANAPISIs)
- Nonprofits Other Than Institutions of Higher Education
- Nonprofits with 501(c)(3) IRS Status (Other than Institutions of Higher Education)
- Nonprofits without 501(c)(3) IRS Status (Other than Institutions of Higher Education)
- For-Profit Organizations
- Small Businesses
- For-Profit Organizations (Other than Small Businesses)
- From an institution that holds an active Data Use and Registration Agreement (DURA) with All of Us. Confirm DURA.
- Willing to sign the Data User Code of Conduct (DUCC). This agreement outlines the program’s expectations for researchers who use the Researcher Workbench and describes how program data may be used. View the DUCC.
- Education: Applicants can be post-baccalaureate or graduate students, postdoctoral fellows, medical students or residents, allied health trainees, early-career investigators, or early-career employees of non-academic institutions as listed under "Affiliated with one of the following entities" above. Applicants must hold at a minimum a bachelor’s degree from an accredited U.S. institution in one of the following or related fields:
- Physical sciences (e.g. chemistry, physics)
- Biological or life sciences (e.g. biology, zoology, biochemistry, microbiology)
- Mathematics or statistics
- Data science
- Engineering
- Health sciences (e.g. pharmacy, psychology, health information technology, nursing, therapy, social work)
- Public health (epidemiology, biostatistics, health administration, clinical implementation specialists)
- Recommended Applicant Knowledge, Skills, and Experience:
To ensure success in the training program, applicants must already possess certain skills, knowledge and experience. These include:
- Prior programming experience
- Basic understanding of statistics
- A working command of English, as courses are taught in English
- Successfully completed an undergraduate or graduate course in probability and statistics
- Has practical experience in coding/programming with R or Python
- Has experience in data manipulation and management gained through coursework and/or research projects
- If the trainee plans to conduct a project using the All of Us genomics data, the trainee must have:
- Prior experience with genetics or computational biology
- Practical experience with Bayesian analysis and maximum likelihood estimation
AIM-AHEAD Interest in Promoting Diversity
The goal of the AIM-AHEAD Coordinating Center is to promote diversity in the research workforce in AI/ML. Research shows that diverse teams working together and capitalizing on innovative ideas and distinct perspectives outperform homogeneous teams. Scientists and trainees from diverse backgrounds and life experiences bring different perspectives, creativity, and individual enterprise to address complex scientific problems. There are many benefits that flow from a diverse NIH-supported scientific workforce, including: fostering scientific innovation, enhancing global competitiveness, contributing to robust learning environments, improving the quality of research, advancing the likelihood that underserved populations and those that experience health disparities and inequities participate in and benefit from health research, and enhancing public trust. See the NIH Interest in Diversity (NOT-OD-20-031: Notice of NIH's Interest in Diversity). Individuals from diverse backgrounds, including those from the following underrepresented groups, are encouraged to apply for the traineeship:
A. Individuals from racial and ethnic groups that have been shown by the National Science Foundation to be underrepresented in health-related sciences on a national basis (see data at http://www.nsf.gov/statistics/showpub.cfm?TopID=2&SubID=27 and the report Women, Minorities, and Persons with Disabilities in Science and Engineering). The following racial and ethnic groups have been shown to be underrepresented in biomedical research:
- Blacks or African Americans
- Hispanics or Latinos
- American Indians or Alaska Natives
- Native Hawaiians and other Pacific Islanders
- In addition, it is recognized that underrepresentation can vary from setting to setting; individuals from racial or ethnic groups that can be demonstrated convincingly to be underrepresented by the grantee institution should be encouraged to participate in NIH programs to enhance diversity. For more information on racial and ethnic categories and definitions, see the OMB Revisions to the Standards for Classification of Federal Data on Race and Ethnicity (https://www.govinfo.gov/content/pkg/FR-1997-10-30/html/97-28653.htm).
B. Individuals with disabilities, who are defined as those with a physical or mental impairment that substantially limits one or more major life activities, as described in the Americans with Disabilities Act of 1990, as amended. See NSF data at https://www.nsf.gov/statistics/2017/nsf17310/static/data/tab7-5.pdf.
C. Individuals from disadvantaged backgrounds, defined as those who meet two or more of the following criteria:
- Were or currently are homeless, as defined by the McKinney-Vento Homeless Assistance Act (Definition: https://nche.ed.gov/mckinney-vento/);
- Were or currently are in the foster care system, as defined by the Administration for Children and Families (Definition: https://www.acf.hhs.gov/cb/focus-areas/foster-care);
- Were eligible for the Federal Free and Reduced Lunch Program for two or more years (Definition: https://www.fns.usda.gov/school-meals/income-eligibility-guidelines);
- Have/had no parents or legal guardians who completed a bachelor’s degree (see https://nces.ed.gov/pubs2018/2018009.pdf);
- Were or currently are eligible for Federal Pell grants (Definition: https://studentaid.gov/understand-aid/types/grants/pell);
- Received support from the Special Supplemental Nutrition Program for Women, Infants and Children (WIC) as a parent or child (Definition: https://www.fns.usda.gov/wic/wic-eligibility-requirements);
- Grew up in one of the following areas:
- A U.S. rural area, as designated by the Health Resources and Services Administration (HRSA) Rural Health Grants Eligibility Analyzer (https://data.hrsa.gov/tools/rural-health), or
- A Centers for Medicare and Medicaid Services-designated Low-Income and Health Professional Shortage Area (qualifying zip codes are included in HRSA file link above). Either of these two criteria can serve as a criterion for the disadvantaged background designation.
Note: Only one of the two areas listed under "Grew up in one of the following areas" can be used as a criterion for the disadvantaged background definition.
Students from low socioeconomic status (SES) backgrounds have been shown to obtain bachelor’s and advanced degrees at significantly lower rates than students from middle and high SES groups (see https://nces.ed.gov/programs/coe), and are subsequently less likely to be represented in biomedical research. For background, see Department of Education data at https://nces.ed.gov/ and https://www2.ed.gov/rschstat/research/pubs/advancing-diversity-inclusion.pdf.
D. Literature shows that women from the above backgrounds (A, B, and C) face particular challenges in scientific fields at the graduate level and beyond. (See, e.g., From the NIH: A Systems Approach to Increasing the Diversity of Biomedical Research Workforce https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5008902/).
Women are known to be underrepresented in doctorate-granting research institutions at senior faculty levels in most biomedical-relevant disciplines and may also be underrepresented at other faculty levels in some scientific disciplines (See data from the National Science Foundation National Center for Science and Engineering Statistics: Women, Minorities, and Persons with Disabilities in Science and Engineering, special report available at https://www.nsf.gov/statistics/2017/nsf17310/, especially Table 9-23, describing science, engineering, and health doctorate holders employed in universities and 4-year colleges, by broad occupation, sex, years since doctorate, and faculty rank).
Upon review of NSF data, and scientific discipline or field related data, NIH encourages institutions to consider women for faculty-level, diversity-targeted programs to address faculty recruitment, appointment, retention or advancement.
Application Process
Applications are now closed.
Trainee Selection
A review committee composed of AIM-AHEAD Consortium members will apply the following criteria to evaluate and prioritize applications. In assigning priority scores, reviewers will apply the standard NIH 1-9 scoring range to Criteria 1, where a score of 1 indicates the highest enthusiasm and a score of 9 indicates the lowest enthusiasm, based on the NIH Simplified Review Framework (https://grants.nih.gov/grants/guide/notice-files/NOT-OD-24-010.html).
Criteria 1:
- Articulation of Expectations and Reasons for Participation: Evaluate the clarity and depth with which the applicant articulates personal expectations and motivations for joining the program. Consider how convincingly the applicant communicates the necessity and significance of the training for their career or academic ambitions.
- Research Background and Motivation for Training: Assess the extent of the applicant’s relevant background, professional experience, or academic qualifications that support their readiness for this program. Gauge their motivation and potential to actively participate in and derive meaningful benefits from the training.
- Long-term Application of Training: Examine the specificity and feasibility of the applicant’s plans to apply AI/ML training in their research or professional development. Look for detailed strategies that indicate a commitment to integrating the training into their long-term career or academic objectives.
- Support from Supervisors or Mentors: Determine if the letter of support provides strong and unequivocal commitment from the supervisor, faculty, or mentor. It should confirm the provision of sufficient protected time for the applicant to fully engage with the training.
- Reference(s) and Assurance of Success: Critique the letter(s) of reference for their effectiveness in providing a persuasive argument that the applicant is well-prepared and likely to succeed in the program. The reference(s) should highlight relevant skills, accomplishments, and the applicant's capacity for advanced training.
- Community Engagement and Collaboration: Evaluate how well the applicant expresses a readiness to actively engage with the AIM-AHEAD community. Look for a demonstrated commitment to contributing to communal resources, empowering new users, and promoting a culture of diversity and inclusivity within the community.
Criteria 2:
Prior Data Science Experience: Assess applicant data science experience by reviewing education, work history, programming proficiency, project involvement, and understanding of math/statistics applied in data analysis.
To be evaluated by selecting one of the following options from a drop-down menu.
- Beginner
- Intermediate
- Expert
Notification of Awards
Applicants should expect notification of their acceptance status on Monday, January 6, 2025. Accepted applicants will receive an invitation from PaymentWorks requesting:
- a valid tax ID (either an EIN or SSN) via Form W-9 for U.S. vendors
- a Bank Validation file upload for ACH/EFT or Wire Payments (https://community.paymentworks.com/payees/s/article/What-Is-A-Bank-Validation-File)
If an applicant applies and is accepted into more than one AIM-AHEAD training program, they must select only one to participate in per funding cycle. Additionally, AIM-AHEAD-funded Coordinating Center Personnel or AIM-AHEAD-funded Awardees (MPIs, PIs, Co-Investigators, other personnel) are eligible to apply, but if accepted, they will not receive the stipend and allowance.
Informational Webinar
The first informational webinar was held on Thursday, October 31, 2024.
Inquiries
AIM-AHEAD All of Us Training Program Co-Directors
Robert Mallet, PhD, Legand (Lee) Burge, PhD, Toufeeq Syed, PhD
Frequently Asked Questions
Please refer to the FAQs below before submitting a help desk ticket:
AIM-AHEAD All of Us Frequently Asked Questions
Please feel free to submit a help desk ticket if you have any questions: