PROJECT SUMMARY The National Institutes on Aging (NIA) has recommended strengthening research infrastructures to address future aging research questions (2016 Data Infrastructure Review Committee Report and PAR-16-367). In particular, they recommend: 1) integrating biological data into larger population-based studies; 2) increasing use of electronic health record (EHR) data and linking to medical care claims data; and 3) developing new approaches to collecting data to answer important scientific questions about mechanisms of aging. The Rochester Epidemiology Project (REP; NIA R01 AG034676) is a unique infrastructure for studies of aging, because the REP collects longitudinal EHR data on all health conditions that come to medical attention for a large, Midwestern population. Therefore, the REP allows investigators to study all age-related diseases and outcomes. However, the REP has three significant gaps. First, the REP does not include biospecimens. Second, the REP is missing health care delivered outside of the health care institutions that partner with the REP, and it does not include information on filled prescriptions. Third, a significant proportion of EHR data is difficult to access due to two factors: 1) the full text of the EHRs includes extensive clinical notes about aging outcomes and geriatric syndromes, but these notes are not routinely coded for billing, and can only be accessed through laborious manual review; and 2) the REP health care partners use three different EHR systems, making it difficult to apply electronic data extraction tools across all partners. To address these three gaps, we will develop an interdisciplinary collaboration across experts in aging research, epidemiologic methods, biobanking, and medical informatics to create a new, comprehensive research infrastructure (?Bio-REP?) to support aging research. In the R21 phase, we will develop a comprehensive research infrastructure that combines the REP data with Mayo Clinic Biobank biospecimens, medical claims data from the Centers for Medicare and Medicaid Services (CMS; Aim 1), and geriatric syndrome data that are included in the unstructured EHR clinical notes using Natural Language Processing techniques (NLP; Aim 2). In the R33 phase, we will deploy NLP algorithms developed in Aim 2 in the clinical notes from two additional EHR systems (Aim 3), and we will conduct two demonstration projects. First, we will measure associations between novel aging-related biomarkers and aging-related outcomes (Aim 4). Second, we will determine whether two common medications that are hypothesized to impact aging (metformin and angiotensin receptor blockers) modify associations between aging biomarkers and aging outcomes (Aim 5). The new, robust Bio-REP infrastructure will support a wide range of efficient, cost-effective observational studies to characterize associations between aging-related biomarkers and specific diseases, geriatric syndromes, and drug utilization. Such studies are urgently needed to design effective clinical trials to improve the health span of the aging population.