Containerizing tasks to ensure robust AI/ML data curation pipelines to estimate environmental disparities in the rural south
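Basic information
- Grant number: 10842665
- Principal investigator:
- Amount: $343,800
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2022
- Funding country: United States
- Project period: 2022-07-24 to 2024-03-31
- Project status: Completed
- Source: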
- Keywords: Acceleration; Accounting; Address; Administrative Supplement; Adopted; Adoption; Air Pollution; Architecture; Artificial Intelligence; Cities; Code; Collaborations; Communities; Computer Systems; Data; Data Set; Decentralization; Disparity; Documentation; Engineering; Ensure; Environment; Environmental Health; Ethnic Origin; Exposure to; FAIR principles; Foundations; Goals; Health; Heat Waves; High Performance Computing; High temperature of physical object; Hospitalization; Individual; Institution; Machine Learning; Medicare; Older Population; Outcome; Output; Ownership; Parents; Persons; Play; Population; Process; Race; Readiness; Regulation; Reproducibility; Research; Research Personnel; Rural; Rural Health; Rural Population; Socioeconomic Status; Source; TRUST principles; Testing; United States Centers for Medicare and Medicaid Services; Update; Variant; Virginia; Weather; West Virginia; Work; adverse birth outcomes; base; cold temperature; collaborative environment; community engagement; computerized data processing; computing resources; data analysis pipeline; data cleaning; data curation; data privacy; environmental disparity; environmental health disparity; environmental justice; ethnic minority population; extreme temperature; health data; improved; low socioeconomic status; machine learning pipeline; privacy protection; racial minority population; repository; research data dissemination; rural setting; rurality; urban area; urban setting
Project Summary
Our parent R01 addresses a major scientific gap by studying how the health of rural populations, including rural
racial and ethnic minority groups, is affected by air pollution and extreme temperature. As part of the parent
R01, we are 1) generating a data architecture for air pollution, heat, cold, health, socioeconomic status (SES),
urban and rural form, and other factors in Virginia and West Virginia; 2) estimating the disparities in exposure
to air pollution and weather (cold, heat, and heat waves) by race/ethnicity, SES, and rurality, accounting for
variations in rural form; and 3) estimating the disparities in the associations between health and exposure to air
pollution or weather for the very young (adverse birth outcomes) and older populations (hospital admissions
for those >65 years), considering differences by various vulnerability factors (low SES, race/ethnicity) and
urban/rural form.
As part of the parent R01, we have made significant progress in generating a data architecture for air pollution,
heat, cold, health, SES, urban/rural form, and other factors. More specifically, we have obtained health data
from the Centers for Medicare & Medicaid Services (CMS). However, although considerable progress has been
made in developing data processing pipelines for CMS claims data, CMS data privacy and confidentiality
restrictions hinder the sharing of preprocessed datasets, leading to duplicated data cleaning and processing
efforts. This duplication is wasteful, as researchers may not be able to build on each other's work
or collaborate effectively. Although the data cannot be shared, sharing open processing pipelines is crucial to
eliminate duplicated effort and make AI/ML-ready data more readily available. When workflows are shared,
containers are critical to ensure reproducibility.
With this administrative supplement, our goal is to adopt containerized data processing tasks to
enhance the deployment of AI/ML pipelines for CMS data in the parent R01 and the wider research
community. Containers make it easy to exchange or update single components in a
processing pipeline, and those components can be reused across AI/ML pipelines shared by different
investigators in the study team and, more broadly, across research institutions. Adopting containerized
data processing tasks enhances the reproducibility of our parent R01 and allows computational resources
to be optimized on high-performance computing (HPC) systems.
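As a rough illustration of how a containerized pipeline lets one component be exchanged without touching the others, the following is a minimal sketch and not the project's actual tooling. It assumes Docker as the container runtime, and every task name, image reference, script name, and path in it is hypothetical. The code only builds the `docker run` command lines; it does not execute them.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class ContainerTask:
    """One pipeline step, pinned to a specific container image tag."""
    name: str
    image: str                # hypothetical image reference
    command: Tuple[str, ...]  # command executed inside the container


def to_docker_argv(task: ContainerTask, data_dir: str) -> List[str]:
    """Build (but do not run) the `docker run` invocation for one task."""
    return [
        "docker", "run", "--rm",
        "-v", f"{data_dir}:/data",  # mount the host data directory at /data
        task.image,
        *task.command,
    ]


def swap_task(pipeline: List[ContainerTask], name: str, new_image: str) -> List[ContainerTask]:
    """Return a new pipeline in which only the named step uses a new image."""
    if name not in {task.name for task in pipeline}:
        raise KeyError(f"no task named {name!r}")
    return [replace(t, image=new_image) if t.name == name else t for t in pipeline]


# Hypothetical three-step pipeline; none of these images or scripts is from the source.
PIPELINE = [
    ContainerTask("clean", "example.org/claims-clean:1.0", ("python", "clean.py", "/data")),
    ContainerTask("link", "example.org/exposure-link:2.3", ("python", "link.py", "/data")),
    ContainerTask("features", "example.org/ml-features:0.9", ("python", "features.py", "/data")),
]

if __name__ == "__main__":
    # Update only the cleaning step; the other two steps are reused unchanged.
    updated = swap_task(PIPELINE, "clean", "example.org/claims-clean:1.1")
    for task in updated:
        print(" ".join(to_docker_argv(task, "/secure/project")))
```

Because each step is identified by an image tag rather than by a local software installation, the pipeline definition can be shared between groups even when the underlying data cannot, and each group mounts its own data directory.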
The container-based AI/ML pipelines will accelerate research in the parent R01, allowing us to
rigorously estimate the disparities in the associations between health outcomes and exposure to air
pollution and weather. Furthermore, these improvements are crucial for disseminating the workflow
pipelines across the wider research community.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Michelle L Bell
Air Pollution, Heat, Cold, and Health: Disparities in the Rural South
- Grant number: 10670746
- Fiscal year: 2022
- Funding amount: $343,800
- Project category:
Enhancing SPACE, an innovative python package to account for spatial confounding used to estimate climate-sensitive events among older Medicare
- Grant number: 10839707
- Fiscal year: 2022
- Funding amount: $343,800
- Project category:
Air Pollution, Heat, Cold, and Health: Disparities in the Rural South
- Grant number: 10390562
- Fiscal year: 2022
- Funding amount: $343,800
- Project category:
Connecting weather-related health risk and climate change projections in relation to rural health disparities
- Grant number: 10838844
- Fiscal year: 2022
- Funding amount: $343,800
- Project category:
Susceptibility and adverse health outcomes related to climate-sensitive events among older Medicare beneficiaries with Alzheimer and Dementia
- Grant number: 10607424
- Fiscal year: 2022
- Funding amount: $343,800
- Project category:
Environmental Health Disparities in an Older Population
- Grant number: 10196974
- Fiscal year: 2017
- Funding amount: $343,800
- Project category:
Vulnerability to Health Effects of Wildfires under a Changing Climate in Western
- Grant number: 8471704
- Fiscal year: 2012
- Funding amount: $343,800
- Project category:
Vulnerability to Health Effects of Wildfires under a Changing Climate in Western
- Grant number: 8266997
- Fiscal year: 2012
- Funding amount: $343,800
- Project category:
Effects of Fine Particle Composition on Birth Outcomes
- Grant number: 8625750
- Fiscal year: 2011
- Funding amount: $343,800
- Project category:
Effects of Fine Particle Composition on Birth Outcomes
- Grant number: 8828687
- Fiscal year: 2011
- Funding amount: $343,800
- Project category:
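Similar NSFC grants
Research on the dynamic allocation of signing certified public accountants: a perspective on last-minute auditor replacement
- Grant number: 72362023
- Approval year: 2023
- Funding amount: 280,000 CNY
- Project category: Regional Science Fund Project
Integrated governance of accounting firm branch offices and audit risk control from a full life-cycle perspective
- Grant number: 72372064
- Approval year: 2023
- Funding amount: 400,000 CNY
- Project category: General Program
Building digital capabilities in accounting firms: motivations, economic consequences, and mechanisms
- Grant number: 72372028
- Approval year: 2023
- Funding amount: 420,000 CNY
- Project category: General Program
Compensation incentive mechanisms in accounting firms: theoretical framework, tests of incentive effects, and optimized redesign
- Grant number: 72362001
- Approval year: 2023
- Funding amount: 280,000 CNY
- Project category: Regional Science Fund Project
Corporate finance, accounting, and auditing behavior under environmental governance goals
- Grant number: 72332002
- Approval year: 2023
- Funding amount: 1,650,000 CNY
- Project category: Key Program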
Similar overseas grants
NeuroMAP Phase II - Recruitment and Assessment Core
- Grant number: 10711136
- Fiscal year: 2023
- Funding amount: $343,800
- Project category:
Providing Tobacco Treatment to Patients Undergoing Lung Cancer Screening at MedStar Health: A Randomized Trial
- Grant number: 10654115
- Fiscal year: 2023
- Funding amount: $343,800
- Project category:
Center for the Promotion of Cancer Health Equity (CePCHE)
- Grant number: 10557579
- Fiscal year: 2023
- Funding amount: $343,800
- Project category:
Validating Sensor-based Approaches for Monitoring Eating Behavior and Energy Intake by Accounting for Real-World Factors that Impact Accuracy and Acceptability
- Grant number: 10636986
- Fiscal year: 2023
- Funding amount: $343,800
- Project category:
Wisconsin Registry for Alzheimer's Prevention
- Grant number: 10655978
- Fiscal year: 2023
- Funding amount: $343,800
- Project category:
$ 34.38万 - 项目类别: