Collaborative Research: EnCORE: Institute for Emerging CORE Methods in Data Science


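Basic information

  • Award number:
    2217069
  • Principal investigator:
  • Amount:
    $2.5723 million
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2022
  • Funding country:
    United States
  • Period:
    2022-09-01 to 2027-08-31
  • Project status:
    Ongoing

Project abstract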

The proliferation of data-driven decision making, and its increased popularity, has fueled the rapid emergence of data science as a new scientific discipline. Data science is seen as a key enabler of future businesses, technologies, and healthcare that can transform all aspects of socioeconomic life. Its fast adoption, however, often comes with ad hoc implementation of techniques with suboptimal, and sometimes unfair and potentially harmful, results. The time is ripe to develop principled approaches to lay solid foundations of data science. This is particularly challenging as real-world data is highly complex with intricate structures, unprecedented scale, rapidly evolving characteristics, noise, and implicit biases. Addressing these challenges requires a concerted effort across multiple scientific disciplines such as statistics for robust decision making under uncertainty; mathematics and electrical engineering for enabling data-driven optimization beyond worst case; theoretical computer science and machine learning for new algorithmic paradigms to deal with dynamic and sensitive data in an ethical way; and basic sciences to bring the technical developments to the forefront of health sciences and society.
The proposed institute for emerging CORE methods in data science (EnCORE) brings together a diverse team of researchers spanning the aforementioned disciplines from the University of California San Diego, the University of Texas at Austin, the University of Pennsylvania, and the University of California Los Angeles. It presents an ambitious vision to transform the landscape of the four CORE pillars of data science: C for complexities of data, O for optimization, R for responsible learning, and E for education and engagement. Along with its transformative research vision, the institute fosters a bold plan for outreach and broadening participation by engaging students of diverse backgrounds at all levels from K-12 to postdocs and junior faculty. The project aims to impact a wide demography of students by offering collaborative courses across its partner universities and a flexible co-mentorship plan for truly multidisciplinary research. With regular organization of workshops, summer schools, and seminars, the project aims to engage the entire scientific community to become the new nexus of research and education on foundations of data science. To bring the fruit of theoretical development to practice, EnCORE will continuously work with industry partners and domain scientists, and will forge strong connections with other National Science Foundation Harnessing the Data Revolution institutes across the nation.
EnCORE as an institute embodies intellectual merit that has the potential to lead ground-breaking research to shape the foundations of data science in the United States. Its research mission is organized around three themes. The first theme, on data complexity, addresses the complex characteristics of data such as massive size, huge feature space, rapid changes, variety of sources, implicit dependence structures, arbitrary outliers, and noise. A major overhaul of the core concepts of algorithm design is needed with a holistic view of different computational complexity measures. Faced with noise and outliers, uncertainty estimation is both necessary and, because the data is dynamic and changing, difficult. Data heterogeneity poses major challenges even in basic classification tasks. The structural relationships hidden inside such data are crucial for understanding and processing it, and for downstream data analysis tasks such as those in visualization and neuroscience.
The second theme of EnCORE aims to transform the classical area of optimization, where adaptive methods and human intervention can lead to major advances. It plans to revisit the foundations of distributed optimization to include heterogeneity, robustness, safety, and communication; and to address statistical uncertainty due to distributional shift in dynamic data in control and reinforcement learning.
The third and final theme of EnCORE proposes to build the foundations of responsible learning. Applications of machine learning in human-facing systems are severely hampered when the learned models are hard for users to understand and reproduce, may give biased outcomes, are easily changeable by an adversary, and reveal sensitive information. Thus, interpretability, reproducibility, fairness, privacy, and robustness must be incorporated in any data-driven decision making.
The experience and dedication to mentoring and outreach, collaborative curriculum design, socially aware responsible research program, extensive institute activities, and industrial partnerships would pave the way for a substantial broader impact for EnCORE. Summer schools with year-long mentoring will take place in three states involving a large demography. Joint courses with hybrid and fully online offerings will be developed. Utilizing prior experience of running the Thinkabit Lab, which has impacted over 74,000 K-12 students so far, EnCORE will embark on an ambitious and thoughtful outreach program to improve the representation of under-represented groups and help create a future generation of workforce that is diverse, responsible, and has solid foundations in data science.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Sujay Sanghavi

Stratospheric chlorine activation in the Arctic winters 1995/96–2001/02 derived from GOME OClO measurements
  • DOI:
    10.1016/j.asr.2003.08.069
  • Publication year:
    2004
  • Journal:
  • Impact factor:
    2.6
  • Authors:
    S. Kühl; W. Wilms; S. Beirle; C. Frankenberg; M. Grzegorski; J. Hollwedel; F. Khokhar; Sarit Kraus; U. Platt; Sujay Sanghavi; C. V. Friedeburg; T. Wagner
  • Corresponding author:
    T. Wagner
Geometric Median (GM) Matching for Robust Data Pruning
  • DOI:
  • Publication year:
    2024
  • Journal:
  • Impact factor:
    0
  • Authors:
    Anish Acharya; I. Dhillon; Sujay Sanghavi
  • Corresponding author:
    Sujay Sanghavi
Learning Graphical Models for Hypothesis Testing
In-Context Learning with Transformers: Softmax Attention Adapts to Function Lipschitzness
  • DOI:
    10.48550/arxiv.2402.11639
  • Publication year:
    2024
  • Journal:
  • Impact factor:
    0
  • Authors:
    Liam Collins; Advait Parulekar; Aryan Mokhtari; Sujay Sanghavi; Sanjay Shakkottai
  • Corresponding author:
    Sanjay Shakkottai
Understanding the Training Speedup from Sampling with Approximate Losses
  • DOI:
    10.48550/arxiv.2402.07052
  • Publication year:
    2024
  • Journal:
  • Impact factor:
    0
  • Authors:
    Rudrajit Das; Xi Chen; Bertram Ieong; Parikshit Bansal; Sujay Sanghavi
  • Corresponding author:
    Sujay Sanghavi

Other grants held by Sujay Sanghavi

HDR TRIPODS: UT Austin Institute on the Foundations of Data Science
  • Award number:
    1934932
  • Fiscal year:
    2019
  • Award amount:
    $2.5723 million
  • Project type:
    Continuing Grant
AF: Medium: Dropping Convexity: New Algorithms, Statistical Guarantees and Scalable Software for Non-convex Matrix Estimation
  • Award number:
    1564000
  • Fiscal year:
    2016
  • Award amount:
    $2.5723 million
  • Project type:
    Continuing Grant
CIF: Medium: Collaborative Research: New Approaches to Robustness in High-Dimensions
  • Award number:
    1302435
  • Fiscal year:
    2013
  • Award amount:
    $2.5723 million
  • Project type:
    Continuing Grant
CAREER: Networks and Statistical Inference: New Connections and Algorithms
  • Award number:
    0954059
  • Fiscal year:
    2010
  • Award amount:
    $2.5723 million
  • Project type:
    Continuing Grant
NetSE: Small: Social Networks in the Real World: From Sensing to Structure Analysis
  • Award number:
    1017525
  • Fiscal year:
    2010
  • Award amount:
    $2.5723 million
  • Project type:
    Standard Grant
NeTS: Medium: Collaborative Research: Shaping, Learning and Optimizing Dynamic Networks
  • Award number:
    0964391
  • Fiscal year:
    2010
  • Award amount:
    $2.5723 million
  • Project type:
    Continuing Grant

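Similar National Natural Science Foundation of China (NSFC) grants

Research on highly integrated microwave/millimeter-wave antennas supporting two-dimensional millimeter-wave beam scanning
  • Award number:
    62371263
  • Approval year:
    2023
  • Award amount:
    CNY 520,000
  • Project type:
    General Program
Study of Heck/denitrogenative rearrangement cascade reactions of hydrazones
  • Award number:
    22301211
  • Approval year:
    2023
  • Award amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund
Synergistic performance regulation and dendrite suppression mechanisms in aqueous zinc-ion batteries
  • Award number:
    52364038
  • Approval year:
    2023
  • Award amount:
    CNY 330,000
  • Project type:
    Regional Science Fund
Pathogenic role and mechanism of TSPYL1 mutations in sudden infant death syndrome, studied with a human serotonergic neuron reporter system
  • Award number:
    82371176
  • Approval year:
    2023
  • Award amount:
    CNY 490,000
  • Project type:
    General Program
Mechanism by which FOXO3 m6A methylation-induced trophoblast senescence contributes to kidney-tonifying therapy for spontaneous abortion
  • Award number:
    82305286
  • Approval year:
    2023
  • Award amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund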

Similar overseas grants

Collaborative Research: EnCORE: Institute for Emerging CORE Methods in Data Science
  • Award number:
    2217058
  • Fiscal year:
    2022
  • Award amount:
    $2.5723 million
  • Project type:
    Continuing Grant
Collaborative Research: Empowering Educators to Create Customized, Culturally-Responsive Instructional Materials from Scratch Encore Harmonized with the Interest of Students
  • Award number:
    2201313
  • Fiscal year:
    2022
  • Award amount:
    $2.5723 million
  • Project type:
    Continuing Grant
Collaborative Research: EnCORE: Institute for Emerging CORE Methods in Data Science
  • Award number:
    2217033
  • Fiscal year:
    2022
  • Award amount:
    $2.5723 million
  • Project type:
    Continuing Grant
Collaborative Research: Empowering Educators to Create Customized, Culturally-Responsive Instructional Materials from Scratch Encore Harmonized with the Interest of Students
  • Award number:
    2201312
  • Fiscal year:
    2022
  • Award amount:
    $2.5723 million
  • Project type:
    Continuing Grant
Collaborative Research: EnCORE: Institute for Emerging CORE Methods in Data Science
  • Award number:
    2217062
  • Fiscal year:
    2022
  • Award amount:
    $2.5723 million
  • Project type:
    Continuing Grant