III: Small: Predictive Modeling from High-Dimensional, Sparsely and Irregularly Sampled, Longitudinal Data
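Basic Information
- Award number: 2226025
- Principal investigator:
- Amount: $599,900
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2022
- Funding country: United States
- Period: 2022-10-01 to 2025-09-30
- Project status: Ongoing
- Source:
- Keywords:
Project Abstract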
Longitudinal data, which result from repeated observations of a set of individuals over time, are commonplace in many fields, including health sciences, learning sciences, social sciences, life sciences, and economics. Such data present unprecedented opportunities to uncover the relationship between the time-varying patterns of measured variables (features or covariates) and outcomes of interest, e.g., economic meltdown, societal unrest, disease onset, or health risk. In real-world settings, the number of variables is often very large, and often only a small subset of them is recorded at any given time, resulting in sparse data with a high proportion of missing observations. Furthermore, such data exhibit complex correlations which, if not properly accounted for, can lead to misleading statistical inferences. Additional complications arise because the data exhibit abrupt discontinuities that are often driven by transitions between states that are not directly observable (e.g., from "healthy" to "infected"). Large data sets demand methods that are scalable. And in high-stakes applications such as healthcare, human interpretability of the predictive models is of paramount importance.
The project will yield substantial advances over the current state of the art in scalable machine learning methods for predictive modeling of longitudinal outcomes from high-dimensional, irregularly sampled, sparse longitudinal health data. The open-source implementations of the predictive modeling tools will find applications in many domains, including behavioral, social, environmental, economic, learning, and health sciences. The project will enhance the research-based training of a diverse group of graduate and undergraduate students in Data Sciences and Computer Science (especially Artificial Intelligence), areas of great national importance. The educational activities associated with the project will help equip a diverse cadre of Data Scientists, AI experts, and researchers in health sciences, social sciences, learning sciences, and related areas with state-of-the-art machine learning tools for predictive modeling from longitudinal data. The project will produce a new graduate course as well as course modules, sample projects, and other materials on predictive modeling from longitudinal data, to be integrated into Data Sciences curricula. The project will help introduce students from diverse backgrounds, including women and underrepresented minorities, to a broad range of educational, research, and career opportunities in Data Sciences. The broader impacts of the project will be further enhanced by broad dissemination of all research results (publications, software, data sets, course materials).
The project will develop a family of scalable deep kernel Gaussian process regression algorithms for interpretable predictive modeling from high-dimensional, sparsely and irregularly sampled longitudinal data with complex, a priori unknown correlation structure. The resulting methods will be able to discover the patterns of transitions between unobserved or hidden states and account for abrupt discontinuities in outcomes. They will be able to explain their predictions by learning the underlying complex correlation structure exhibited by the data and by identifying not only the variables that drive the predictions but also the temporal context in which they do so. The project will rigorously evaluate the resulting methods empirically on simulated longitudinal data (with different correlation structures, missingness mechanisms, and time-dependent variable importance), several benchmark longitudinal data sets, and, most importantly, deidentified longitudinal electronic health records data and socio-demographic data from real-world healthcare applications (in collaboration with clinical experts).
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project Outputs
Journal articles (1)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
A Simple, Fast Algorithm for Continual Learning from High-Dimensional Data
- DOI:
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Ashtekar, Neil; Honavar, Vasant G
- Corresponding author: Honavar, Vasant G
Other grants awarded to Vasant Honavar
Collaborative Research: RI: III: SHF: Small: Multi-Stakeholder Decision Making: Qualitative Preference Languages, Interactive Reasoning, and Explanation
- Award number: 2225824
- Fiscal year: 2022
- Award amount: $599,900
- Project type: Standard Grant
AI Institute: Planning: Institute for AI-Enabled Materials Discovery, Design, and Synthesis
- Award number: 2020243
- Fiscal year: 2020
- Award amount: $599,900
- Project type: Standard Grant
EAGER: Interpreting Black-Box Predictive Models Through Causal Attribution
- Award number: 2041759
- Fiscal year: 2020
- Award amount: $599,900
- Project type: Standard Grant
BD Spokes: SPOKE: NORTHEAST: Collaborative Research: Integration of Environmental Factors and Causal Reasoning Approaches for Large-Scale Observational Health Research
- Award number: 1636795
- Fiscal year: 2017
- Award amount: $599,900
- Project type: Standard Grant
EAGER: Towards a Computational Infrastructure for Analysis of Sensitive Data
- Award number: 1551843
- Fiscal year: 2015
- Award amount: $599,900
- Project type: Standard Grant
SHF: Large: Collaborative Research: Inferring Software Specifications from Open Source Repositories by Leveraging Data and Collective Community Expertise
- Award number: 1518732
- Fiscal year: 2015
- Award amount: $599,900
- Project type: Standard Grant
SGER: Exploratory Investigation of Modular Ontology Languages
- Award number: 0639230
- Fiscal year: 2006
- Award amount: $599,900
- Project type: Standard Grant
ITR: Algorithms and Software for Knowledge Acquisition from Heterogeneous Distributed Data
- Award number: 0219699
- Fiscal year: 2002
- Award amount: $599,900
- Project type: Continuing Grant
RIA: Constructive Neural Network Learning Algorithms for Pattern Classification
- Award number: 9409580
- Fiscal year: 1994
- Award amount: $599,900
- Project type: Continuing Grant
Similar NSFC (National Natural Science Foundation of China) grants
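Mechanism by which huperzine A mediates microglial polarization phenotypes against ischemic stroke, studied at single-cell resolution
- Award number: 82304883
- Approval year: 2023
- Award amount: CNY 300,000
- Project type: Young Scientists Fund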
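Role and mechanism of small cysteine-free proteins in regulating the insecticidal activity of biocontrol fungi
- Award number: 32372613
- Approval year: 2023
- Award amount: CNY 500,000
- Project type: General Program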
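Role and mechanism of theranostic PS-Hc@MB combined with training in mediating rehabilitation from cerebral small vessel disease
- Award number: 82372561
- Approval year: 2023
- Award amount: CNY 490,000
- Project type: General Program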
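Mechanism by which the MECOM/HBB pathway in non-small cell lung cancer mediates abnormal heme metabolism and inhibits ferroptosis of tumor-initiating cells
- Award number: 82373082
- Approval year: 2023
- Award amount: CNY 490,000
- Project type: General Program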
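Role and mechanism of FATP2/HILPDA/SLC7A11 axis-mediated lipid metabolic reprogramming of tumor-associated neutrophils in the radiotherapy immune response of non-small cell lung cancer
- Award number: 82373304
- Approval year: 2023
- Award amount: CNY 490,000
- Project type: General Program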
Similar international grants
Evaluating the efficacy of a novel NASH therapeutic
- Award number: 10698971
- Fiscal year: 2023
- Award amount: $599,900
- Project type:
III: Small: RUI: A Fairness Auditing Framework for Predictive Mobility Models
- Award number: 2304213
- Fiscal year: 2023
- Award amount: $599,900
- Project type: Standard Grant
Mid-sized GDNF Mimics For Neural Regeneration
- Award number: 10811356
- Fiscal year: 2023
- Award amount: $599,900
- Project type:
Modulation of cancer induced immune suppression via inhibition of SCD1
- Award number: 10896572
- Fiscal year: 2022
- Award amount: $599,900
- Project type:
Modulation of cancer induced immune suppression via inhibition of SCD1
- Award number: 10546697
- Fiscal year: 2022
- Award amount: $599,900
- Project type: