Call for Applications

AIM-AHEAD All of Us Training Program

Traineeship in Advanced Data Analysis using the All of Us Database

 

The AIM-AHEAD All of Us Training Program is intended to increase researcher diversity in AI/ML by leveraging the All of Us data and infrastructure (Researcher Workbench). This 8-month training program will engage a diverse group of 25 graduate students, postdocs, early-career faculty, healthcare professionals, and other non-academic professionals from underrepresented populations.

Funding Cycle: 2024-2025
Release Date: October 18, 2024
Application Due Date

Monday, November 18, 2024. Applications must be received by 11:59 p.m. Eastern Time

Notification of Award: January 6, 2025
Program Start Date: January 15, 2025
Informational Webinar Schedule

Thursday, October 31, 2024, 1:00 p.m. Central Time/2:00 p.m. Eastern Time

Registration Link: https://signup.aim-ahead.net/event/p/25De223c9F

Informational Webinar Recording

Check back for links to webinar recordings

Application Link

Step 1: Click here to register as a "mentee/learner" on AIM-AHEAD Connect (our Community Building Platform)

Step 2: Click here to submit a traineeship application for review using InfoReady platform

Project Period

8-month training program

Stipend

$8,000 stipend and $2,000 allowance to attend the AIM-AHEAD Annual Meeting 2025

Mentor(s)

Trainees will receive direct 1:1 support and guidance on career development and research from an experienced AIM-AHEAD mentor

NIH Biosketch

Applicant NIH biosketch or Curriculum Vitae (CV) is required

Letters of Support

Minimum of two letters from the applicant’s supervisor(s) and faculty

Data Usage Agreement

Applicant's institution must hold an active Data Use and Registration Agreement (DURA) with All of Us

Issued by

Artificial Intelligence/Machine Learning Consortium to Advance Health Equity and Researcher Diversity (AIM-AHEAD) program

Program Description

The application of artificial intelligence and machine learning (AI/ML) to large datasets is dramatically expanding the capacity for hypothesis testing impacting the biomedical and socioeconomic domains. However, underrepresented communities, particularly those at heightened risk of socioeconomic and health disparities, are not receiving AI/ML’s benefits. Training a diverse workforce of researchers proficient in the application of AI/ML represents an opportunity to address a critical unmet need by potentially extending the benefits of AI/ML to underrepresented, at-risk communities. 

The central goal of the AIM-AHEAD All of Us Training Program is to promote researcher diversity in AI/ML by training individuals from diverse backgrounds, such as those from underrepresented racial and ethnic groups, who are committed to gaining proficiency in AI/ML data analysis and applying their expertise to benefit communities underrepresented in biomedical research.

The AIM-AHEAD All of Us Training Program Infographic, linked below, offers a visual overview of the first cohort's academic achievements, the competencies they acquired, and the trainees' own evaluations of their experiences throughout the program.

AIM-AHEAD All of Us Training Program Infographic.

Program Objectives

Objective 1: The trainee will apply R, Python, and/or Jupyter Notebook to analyze All of Us datasets from diverse and underrepresented communities
Objective 2: The trainee will formulate hypotheses testable by applying AI/ML and advanced data analyses to All of Us data
Objective 3: The trainee will present their project at the 2025 AIM-AHEAD Annual Meeting

The AIM-AHEAD consortium (Data Science Training Core and Communications Hub), All of Us, and RTI are partnering to offer AIM-AHEAD stakeholders, trainees, mentees, and consortium partners a training opportunity designed to increase researcher diversity in AI/ML by leveraging the All of Us data and infrastructure (Researcher Workbench).

The Researcher Workbench is a cloud-based platform where registered researchers can access Registered and Controlled Tier data. Its powerful tools support data analysis and collaboration. Researchers use the Workbench to access, store, and analyze data for specific research projects. Researchers can perform high-powered queries and analysis within the All of Us datasets using R or Python via the integrated, cloud-based Jupyter Notebook environment.  

Using the AIM-AHEAD Connect Platform, this 8-month training program will engage a diverse group of 25 graduate students, postdocs, early-career faculty, healthcare professionals, and other non-academic professionals from underrepresented populations. Trainees will use the Dataset Builder to search, extract, and organize health information from the All of Us database, and use the Cohort Builder to create, review, and annotate data from All of Us human subject cohorts. Trainees will also receive training and technical assistance related to R, Python, Jupyter Notebook, and model development for All of Us data subsets in the Researcher Workbench. Training will include:

    • Merging/validating data across All of Us sources
    • Building a supervised model
    • Splitting data into subsets for model training and testing
    • Considering biases that may be present and detected or missed by the model
    • Validating the model
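The training steps listed above can be sketched end to end in a few lines. The following is a minimal illustration only, written in plain Python so it runs anywhere: it uses synthetic data rather than All of Us data, and every name in it (`person_id`, `group`, `bmi`, `outcome`, the 80/20 split, the one-feature logistic regression) is an assumption made for the example, not part of the program curriculum or the Researcher Workbench.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# 1. Merge and validate two synthetic sources keyed by a hypothetical person_id.
surveys = {pid: {"group": random.choice(["A", "B"])} for pid in range(400)}
measurements = {pid: {"bmi": random.gauss(27.0, 4.0)} for pid in range(20, 420)}

merged = []
for pid in sorted(surveys.keys() & measurements.keys()):  # inner join
    bmi = measurements[pid]["bmi"]
    if not 10.0 < bmi < 60.0:  # validation: drop implausible values
        continue
    # Synthetic outcome whose probability rises with BMI (invented, not real data).
    p_true = 1.0 / (1.0 + math.exp(-(bmi - 27.0) / 2.0))
    outcome = 1 if random.random() < p_true else 0
    merged.append({"person_id": pid, "group": surveys[pid]["group"],
                   "bmi": bmi, "outcome": outcome})

# 2. Split the data into training and testing subsets (80/20).
random.shuffle(merged)
cut = int(0.8 * len(merged))
train, test = merged[:cut], merged[cut:]

# 3. Build a supervised model: one-feature logistic regression by gradient descent.
mean_bmi = sum(r["bmi"] for r in train) / len(train)
w, b = 0.0, 0.0

def predict_proba(bmi):
    """Predicted probability of outcome == 1 under the current weights."""
    return 1.0 / (1.0 + math.exp(-(w * (bmi - mean_bmi) + b)))

for _ in range(500):
    grad_w = grad_b = 0.0
    for r in train:
        err = predict_proba(r["bmi"]) - r["outcome"]
        grad_w += err * (r["bmi"] - mean_bmi)
        grad_b += err
    w -= 0.05 * grad_w / len(train)
    b -= 0.05 * grad_b / len(train)

# 4. Validate on held-out data, overall and per group, to surface possible bias.
def accuracy(rows):
    hits = sum((predict_proba(r["bmi"]) >= 0.5) == (r["outcome"] == 1) for r in rows)
    return hits / len(rows)

test_accuracy = accuracy(test)
by_group = {g: accuracy([r for r in test if r["group"] == g])
            for g in sorted({r["group"] for r in test})}
print(f"test accuracy: {test_accuracy:.2f}", by_group)
```

Comparing held-out accuracy across groups is only one simple way to look for bias; a real analysis would likely use established libraries and additional fairness and calibration checks.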

The training, which utilizes All of Us data collected from communities historically underrepresented in biomedical research, is directed particularly toward investigators conducting research at the intersection of AI/ML and health disparities. Trainees can work independently, with another trainee who has similar interests, or with a community partner.

Potential research topics that could be addressed include, but are not limited to:

    • Examining statistical variation in the social determinants of health and intersectionality
    • Examining statistical interactions between family health history and lifestyle factors
    • Using statistical analysis to identify socioeconomic, environmental, and heritable determinants of clinically significant diseases and syndromes

Please refer to the following resources to determine if your desired research topic is viable to pursue in the All of Us training program:

All of Us resources and links:

    • All of Us data dictionaries (what data fields are available in All of Us?): Data Dictionaries for the Curated Data Repositories (CDRs) – User Support
    • Detailed information on the entirety of the All of Us data repository: Getting Started – User Support
    • All of Us Data Browser (what survey data, health conditions, and other data types are present in the data, e.g., how many people diagnosed with diabetes are in the dataset?): https://databrowser.researchallofus.org/
    • Protecting Data and Privacy: Research Projects Directory | All of Us Research Program | NIH

Having received advanced practical training in coding, model development, hypothesis testing, and data cleaning and analysis, trainees completing this program will be well prepared to harness AI/ML approaches to conduct hypothesis-driven analysis of complex datasets. The trainee will join the community of AI/ML professionals committed to extending the benefits of AI/ML to communities underrepresented in biomedical research.

Trainee Expectations

        • Attend all training sessions, both synchronous and asynchronous, including webinars and seminars
        • Work on the program an average of at least 8 hours per week
        • Engage with an AIM-AHEAD Mentor (to be assigned through the program)
        • Engage in learning communities and peer networking
        • Access the All of Us Researcher Workbench
        • Complete the provided training related to R, Python, and Jupyter Notebook, available via AIM-AHEAD Connect
        • Complete the supervised training on model building for analysis
        • Complete the provided training on data splitting for algorithm training and testing, addressing biases in model development, and model validation
        • Utilize concierge services for the Workbench and R/Python coding
        • Utilize AIM-AHEAD HelpDesk support
        • Work individually or in partnership with another trainee to present a work-in-progress research poster at the AIM-AHEAD Annual Meeting in summer 2025
        • Work individually or in partnership with another trainee to generate an abstract suitable for submission to a conference, and/or a manuscript suitable for peer-reviewed publication
        • Play an active part in the AIM-AHEAD community

Trainees Will Receive

        • An $8,000 stipend upon successful completion of trainee milestones
        • A $2,000 allowance to attend the AIM-AHEAD Annual Meeting 2025
        • Support and guidance from an experienced AIM-AHEAD mentor
        • Support from the AIM-AHEAD Data Science Training Core
        • Direct 1:1 guidance, virtual office hours, HelpDesk support, and concierge services supporting users of All of Us Researcher Workbench, R, and Python coding
        • Training on:
          • Data analysis using All of Us Researcher Workbench, R, Python, and Jupyter Notebook
          • Use and applications of R, Python, and Jupyter Notebook
          • Data merging and validation across All of Us sources
          • Building and validating models for analysis
          • Data splitting methods for model training vs. testing
          • Detecting and addressing biases in model development
          • Hypothesis development for testing by analysis of All of Us data
          • Using the All of Us database of de-identified medical data from >1 million Americans
        • Assistance with obtaining a Data Use and Registration Agreement (DURA), if needed

Trainee Program Stipend

Each trainee will receive a stipend of $8,000 which will be disbursed in four installments based on trainee completion of required milestones. Each trainee will also be provided an allowance of $2,000 to cover the cost of airfare, hotel accommodations, local transportation and per diem to attend the AIM-AHEAD Annual Meeting 2025.

*Trainees may apply to multiple AIM-AHEAD training programs and fellowships, but if accepted into more than one, they must select only one AIM-AHEAD training program to participate in per funding cycle. Additionally, AIM-AHEAD-funded Coordinating Center Personnel (MPIs, PIs, Co-Investigators, other personnel) are eligible to apply, but if accepted, they will not receive the stipend and allowance.*

Trainee Mentorship

Each awarded trainee will receive research and career mentorship from experienced, skilled investigators selected from AIM-AHEAD core members. The online mentoring platform AIM-AHEAD Connect (https://connect.aim-ahead.net) will be used to match mentors with awarded trainees and for mentor/trainee engagement and progress tracking.


Eligibility Criteria

  1. Applicants must be:
    1. U.S. Citizens, Permanent Residents, or Non-Citizen U.S. Nationals
    2. Able to submit Form W-9 (Request for Taxpayer Identification)
    3. Affiliated with one of the following entities:
      • Higher Education Institutions
        • Public/State Controlled Institutions of Higher Education
        • Private Institutions of Higher Education
        Individuals affiliated with the following types of Higher Education Institutions are always encouraged to apply:
        • Hispanic-Serving Institutions (HSIs)
        • Historically Black Colleges and Universities (HBCUs)
        • Tribally Controlled Colleges and Universities (TCCUs)
        • Alaska Native and Native Hawaiian Serving Institutions
        • Asian American Native American Pacific Islander Serving Institutions (AANAPISIs)
      • Nonprofits Other Than Institutions of Higher Education
        • Nonprofits with 501(c)(3) IRS Status (Other than Institutions of Higher Education)
        • Nonprofits without 501(c)(3) IRS Status (Other than Institutions of Higher Education)
      • For-Profit Organizations
        • Small Businesses
        • For-Profit Organizations (Other than Small Businesses)
    4. From an institution that holds an active Data Use and Registration Agreement (DURA) with All of Us. Confirm DURA.
    5. Willing to sign the Data User Code of Conduct (DUCC). This agreement outlines the program’s expectations for researchers who use the Researcher Workbench and describes how program data may be used. View the DUCC.
  2. Education: Applicants can be post-baccalaureate or graduate students, postdoctoral fellows, medical students or residents, allied health trainees, early-career investigators, or early-career employees of non-academic institutions as defined in item 1.3 above. Applicants must hold, at a minimum, a bachelor’s degree from an accredited U.S. institution in one of the following or related fields:
    • Physical sciences (e.g. chemistry, physics)
    • Biological or life sciences (e.g. biology, zoology, biochemistry, microbiology)
    • Mathematics or statistics
    • Data science
    • Engineering
    • Health sciences (e.g. pharmacy, psychology, health information technology, nursing, therapy, social work)
    • Public health (epidemiology, biostatistics, health administration, clinical implementation specialists)
  3. Recommended Applicant Knowledge, Skills, and Experience:
    To ensure success in the training program, applicants must already possess certain skills, knowledge and experience. These include:
    • Prior programming experience
    • Basic understanding of statistics
    • A working command of English, as courses are taught in English
    Additionally, it is strongly recommended that applicants have some of the following skills and experiences to optimize their learning experience and better prepare them for the challenges of the program.
    • Successful completion of an undergraduate or graduate course in probability and statistics
    • Practical experience in coding/programming with R or Python
    • Experience in data manipulation and management gained through coursework and/or research projects
    • If the trainee plans to conduct a project using the All of Us genomics data, the trainee must have
      • Prior experience with genetics or computational biology
      • Practical experience with Bayesian analysis and maximum likelihood estimation
    Introductory or refresher courses on these topics will be available to successful applicants at the start of the traineeship, via the AIM-AHEAD Connect platform.

AIM-AHEAD Interest in Promoting Diversity

The goal of the AIM-AHEAD Coordinating Center is to promote diversity in the research workforce in AI/ML. Research shows that diverse teams working together and capitalizing on innovative ideas and distinct perspectives outperform homogeneous teams. Scientists and trainees from diverse backgrounds and life experiences bring different perspectives, creativity, and individual enterprise to address complex scientific problems. There are many benefits that flow from a diverse NIH-supported scientific workforce, including fostering scientific innovation, enhancing global competitiveness, contributing to robust learning environments, improving the quality of research, advancing the likelihood that underserved populations and those that experience health disparities and inequities participate in and benefit from health research, and enhancing public trust. See the NIH Interest in Diversity (NOT-OD-20-031: Notice of NIH's Interest in Diversity). Individuals from diverse backgrounds, including those from the following underrepresented groups, are encouraged to apply for the traineeship:

A. Individuals from racial and ethnic groups that have been shown by the National Science Foundation to be underrepresented in health-related sciences on a national basis (see data at http://www.nsf.gov/statistics/showpub.cfm?TopID=2&SubID=27 and the report Women, Minorities, and Persons with Disabilities in Science and Engineering). The following racial and ethnic groups have been shown to be underrepresented in biomedical research:

  1. Blacks or African Americans
  2. Hispanics or Latinos
  3. American Indians or Alaska Natives
  4. Native Hawaiians and other Pacific Islanders

In addition, it is recognized that underrepresentation can vary from setting to setting; individuals from racial or ethnic groups that can be demonstrated convincingly to be underrepresented by the grantee institution should be encouraged to participate in NIH programs to enhance diversity. For more information on racial and ethnic categories and definitions, see the OMB Revisions to the Standards for Classification of Federal Data on Race and Ethnicity (https://www.govinfo.gov/content/pkg/FR-1997-10-30/html/97-28653.htm).

B. Individuals with disabilities, who are defined as those with a physical or mental impairment that substantially limits one or more major life activities, as described in the Americans with Disabilities Act of 1990, as amended. See NSF data at https://www.nsf.gov/statistics/2017/nsf17310/static/data/tab7-5.pdf.

C. Individuals from disadvantaged backgrounds, defined as those who meet two or more of the following criteria:

  1. Were or currently are homeless, as defined by the McKinney-Vento Homeless Assistance Act (Definition: https://nche.ed.gov/mckinney-vento/);
  2. Were or currently are in the foster care system, as defined by the Administration for Children and Families (Definition: https://www.acf.hhs.gov/cb/focus-areas/foster-care);
  3. Were eligible for the Federal Free and Reduced Lunch Program for two or more years (Definition: https://www.fns.usda.gov/school-meals/income-eligibility-guidelines);
  4. Have/had no parents or legal guardians who completed a bachelor’s degree (see https://nces.ed.gov/pubs2018/2018009.pdf);
  5. Were or currently are eligible for Federal Pell grants (Definition: https://studentaid.gov/understand-aid/types/grants/pell);
  6. Received support from the Special Supplemental Nutrition Program for Women, Infants and Children (WIC) as a parent or child (Definition: https://www.fns.usda.gov/wic/wic-eligibility-requirements);
  7. Grew up in one of the following areas:
    1. A US rural area, as designated by the Health Resources and Services Administration (HRSA) Rural Health Grants Eligibility Analyzer (https://data.hrsa.gov/tools/rural-health), or
    2. A Centers for Medicare and Medicaid Services-designated Low-Income and Health Professional Shortage Area (qualifying zip codes are included in HRSA file link above). Either of these two criteria can serve as a criterion for the disadvantaged background designation.

Note: Only one of the two areas under item 7 can be used as a criterion for the disadvantaged background definition.
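The counting rule above (two or more criteria, with the two areas under item 7 counting at most once) can be made concrete with a short sketch. This is an illustrative reading of the rule only; the function and parameter names are hypothetical and it is not an official eligibility tool.

```python
def meets_disadvantaged_definition(items_met, rural_area=False, shortage_area=False):
    """Return True if two or more criteria are met.

    items_met: item numbers (1-6) from the list above that apply.
    rural_area / shortage_area: the two areas listed under item 7.
    """
    count = len({i for i in items_met if 1 <= i <= 6})
    # Item 7: either area qualifies, but both together still count only once.
    if rural_area or shortage_area:
        count += 1
    return count >= 2


# One listed criterion plus one item-7 area meets the definition.
print(meets_disadvantaged_definition({5}, rural_area=True))
# Both item-7 areas alone count as a single criterion, so this does not.
print(meets_disadvantaged_definition(set(), rural_area=True, shortage_area=True))
```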

Students from low socioeconomic status (SES) backgrounds have been shown to obtain bachelor’s and advanced degrees at significantly lower rates than students from middle and high SES groups (see https://nces.ed.gov/programs/coe), and are subsequently less likely to be represented in biomedical research. For background, see Department of Education data at https://nces.ed.gov/ and https://www2.ed.gov/rschstat/research/pubs/advancing-diversity-inclusion.pdf.

D. Literature shows that women from the above backgrounds (A, B, and C) face particular challenges in scientific fields at the graduate level and beyond. (See, e.g., From the NIH: A Systems Approach to Increasing the Diversity of Biomedical Research Workforce https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5008902/).

Women are known to be underrepresented in doctorate-granting research institutions at senior faculty levels in most biomedical-relevant disciplines and may also be underrepresented at other faculty levels in some scientific disciplines (See data from the National Science Foundation National Center for Science and Engineering Statistics: Women, Minorities, and Persons with Disabilities in Science and Engineering, special report available at https://www.nsf.gov/statistics/2017/nsf17310/, especially Table 9-23, describing science, engineering, and health doctorate holders employed in universities and 4-year colleges, by broad occupation, sex, years since doctorate, and faculty rank).

Upon review of NSF data, and scientific discipline or field related data, NIH encourages institutions to consider women for faculty-level, diversity-targeted programs to address faculty recruitment, appointment, retention or advancement.


Application Process

Applications must be submitted during the open application period (10/18/24 - 11/18/24). Applications should address the requirements below and any additional questions via the Traineeship Application Form. The application should be understandable to readers from outside the applicant’s field of study and must clearly present the project aims, applicable studies already completed, methods, materials, and AIM-AHEAD engagement plan. 

Applicants must:

    1. Register as a “mentee/learner” on AIM-AHEAD Connect (our Community Building platform). Your AIM-AHEAD Connect profile will allow you to log in to submit your application on the InfoReady platform
    2. Click here to submit an application for review using the InfoReady platform

Please use Chrome, Firefox, or Edge. Please note that both of the above steps must be completed for consideration.

All applications must be received by November 18, 2024, at 11:59 p.m. Eastern Time.

Profile Information

    • Provide your name, organization, department, position title, research area, email address, and profile web page
    • Please answer the profile questions on InfoReady

Letters of Support

    • One signed letter of support from the applicant’s supervisor is required. Letters of support must include the referee’s contact information (full name, position title, organization, email/phone number, and signature)
    • Minimum of one letter of recommendation (one additional letter may be provided) is required from faculty who taught the applicant or from an individual who can attest to the applicant’s preparedness, aptitude, and rationale for advanced data analysis training. Reference(s) should highlight relevant skills and accomplishments

Academic Transcripts

Current students and postgraduates must include academic transcripts (official or a photocopy) from their undergraduate and, if applicable, graduate programs.

Biographical Sketch of the Applicant

The applicant’s NIH biosketch or Curriculum Vitae (CV) is required and should not exceed 5 pages.

Statement of Rationale for Pursuing Training

Provide a personal statement of not more than 900 words. Include each of the following sections in your response. 

    1. Describe what you hope to accomplish through the AIM-AHEAD All of Us Training Program. Provide your rationale and need for training in All of Us data and acquiring these skills.
    2. Provide the research question you plan to address using All of Us data and describe how you generally plan to answer the question within 4-6 months. This section should be a high-level plan that identifies the coding language you plan to use, your hypothesis, what analysis or statistical test you will potentially run, and a general workflow of the research project. The purpose of this section is for reviewers to gauge your understanding of the subject of interest and the potential feasibility of the project. Looking at the All of Us Data Browser can help you decide if the dataset provides the data needed to answer your research question.
    3. Describe your familiarity with, and/or interest in, AI/ML analysis, programming, EHR, clinical or genomic data analysis, biomedical science, public health background, and cloud-based computation.
    4. Explain how you plan to apply the training to achieve your long-term research interests and objectives.
    5. Indicate whether you would like to work independently, with another trainee who has similar interests, or with a community partner on your research project. Stipends will not be provided to community partners.

Trainee Selection

A review committee composed of AIM-AHEAD Consortium members will apply the following criteria to evaluate and prioritize applications. In assigning priority scores, reviewers will apply the standard NIH 1-9 scoring range to Criteria 1, where a score of 1 indicates the highest enthusiasm and a score of 9 indicates the lowest enthusiasm, based on the NIH Simplified Review Framework (https://grants.nih.gov/grants/guide/notice-files/NOT-OD-24-010.html).

Criteria 1: 

    1. Articulation of Expectations and Reasons for Participation: Evaluate the clarity and depth with which the applicant articulates personal expectations and motivations for joining the program. Consider how convincingly the applicant communicates the necessity and significance of the training for their career or academic ambitions.
    2. Research Background and Motivation for Training: Assess the extent of the applicant’s relevant background, professional experience, or academic qualifications that support their readiness for this program. Gauge their motivation and potential to actively participate in and derive meaningful benefits from the training.
    3. Long-term Application of Training: Examine the specificity and feasibility of the applicant’s plans to apply AI/ML training in their research or professional development. Look for detailed strategies that indicate a commitment to integrating the training into their long-term career or academic objectives.
    4. Support from Supervisors or Mentors: Determine if the letter of support provides strong and unequivocal commitment from the supervisor, faculty, or mentor. It should confirm the provision of sufficient protected time for the applicant to fully engage with the training.
    5. Reference(s) and Assurance of Success: Critique the letter(s) of reference for their effectiveness in providing a persuasive argument that the applicant is well-prepared and likely to succeed in the program. The reference(s) should highlight relevant skills, accomplishments, and the applicant's capacity for advanced training.
    6. Community Engagement and Collaboration: Evaluate how well the applicant expresses a readiness to actively engage with the AIM-AHEAD community. Look for a demonstrated commitment to contributing to communal resources, empowering new users, and promoting a culture of diversity and inclusivity within the community.

Criteria 2: 

Prior Data Science Experience: Assess the applicant's data science experience by reviewing education, work history, programming proficiency, project involvement, and understanding of math/statistics applied in data analysis.

Reviewers will evaluate this criterion by selecting one of the following options from a drop-down menu:

    • Beginner
    • Intermediate 
    • Expert

Notification of Awards

Applicants should expect notification of their acceptance status on Monday, January 6, 2025. Accepted applicants will receive an invite from PaymentWorks requesting:

If an applicant applies and is accepted into more than one AIM-AHEAD training program, they must select only one to participate in per funding cycle. Additionally, AIM-AHEAD-funded Coordinating Center Personnel (MPIs, PIs, Co-Investigators, other personnel) are eligible to apply, but if accepted, they will not receive the stipend and allowance.


Informational Webinar

There will be an initial informational webinar on Thursday, October 31, 2024, from 1:00 - 2:00 p.m. Central Time/2:00 - 3:00 p.m. Eastern Time

Registration Link: https://signup.aim-ahead.net/event/p/25De223c9F


Inquiries

AIM-AHEAD All of Us Training Program Co-Directors

Robert Mallet, PhD, Legand (Lee) Burge, PhD, Toufeeq Syed, PhD

Frequently Asked Questions

Please refer to the FAQs below before submitting a help desk ticket:

AIM-AHEAD All of Us Frequently Asked Questions

Please feel free to submit a help desk ticket if you have any questions:

AIM-AHEAD Training Programs HelpDesk
