Foundations of Data Science Institute

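Basic Information

  • Award number:
    2023505
  • Principal investigator:
  • Amount:
    $5.9003 million
  • Host institution:
  • Host institution country:
    United States
  • Award type:
    Continuing Grant
  • Fiscal year:
    2020
  • Funding country:
    United States
  • Start and end dates:
    2020-09-01 to 2025-08-31
  • Project status:
    Not yet concluded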

Project Abstract

The Foundations of Data Science Institute (FODSI) brings together a large and diverse team of researchers and educators from UC Berkeley, MIT, Boston University, Bryn Mawr College, Harvard University, Howard University, and Northeastern University, with the aim of advancing the theoretical foundations for the field of data science. Data science has emerged as a central science for the 21st century, a widespread approach to science and technology that exploits the explosion in the availability of data to allow empirical investigations at unprecedented scale and scope. It now plays a central role in diverse domains across all of science, commerce and industry. The development of theoretical foundations for principled approaches to data science is particularly challenging because it requires progress across the full breadth of scientific issues that arise in the rich and complex processes by which data can be used to make decisions. These issues include the specification of the goals of data analysis, the development of models that aim to capture the way in which data may have arisen, the crafting of algorithms that are responsive to the models and goals, an understanding of the impact of misspecifications of these models and goals, an understanding of the effects of interactions, interventions and feedback mechanisms that affect the data and the interpretation of the results, concern about the uncertainty of these results, an understanding of the impact of other decision-makers with competing goals, and concern about the economic, social, and ethical implications of automated data analysis and decision-making. To address these challenges, FODSI brings together experts from many cognate academic disciplines, including computer science, statistics, mathematics, electrical engineering, and economics. 
Institute research outcomes have strong potential to directly impact the many application domains for data science in industry, commerce, science and society, facilitated by mechanisms that directly involve a stream of institute-trained personnel in industrial partners' projects, and by public activities designed to nurture substantive interactions between foundational and use-inspired research communities in data science. The institute also aims to educate and mentor future leaders in data science, through the further development of a pioneering undergraduate program in data science, and by training a diverse cohort of graduate students and postdocs with an innovative approach that emphasizes strong mentorship, flexibility, and breadth of collaboration opportunities. In addition, the institute plans to host an annual summer school that will deliver core curriculum and a taste of foundational research to a diverse group of advanced undergraduates, graduate students, and postdocs. It aims to broaden participation and increase diversity in the data science workforce, bringing the excitement of data science to under-represented groups at the high school level, and targeting diverse participation in the institute's public activities. And it will act as a nexus for research and education in the foundations of data science, by convening public events, such as summer schools and research workshops and other collaborative research opportunities, and by providing models for education, human resource development, and broadening participation.
The scientific focus of the institute will encompass the full range of issues that arise in data science -- modeling issues, inferential issues, computational issues, and societal issues -- and the challenges that emerge from the conflicts between their competing requirements. Its research agenda is organized around eight themes. 
Three of these themes focus on key challenges arising from the rich variety of interactions between a decision maker and its environment, not only the classical view of data that is processed in a batch or a stream, but also sequential interactions with feedback (the control perspective), experimental interactions designed to answer "what if" questions (the causality perspective), and strategic interactions involving other actors with conflicting goals (the economic perspective). The other research themes focus on opportunities for major impacts across disciplinary boundaries: on elucidating the algorithmic landscape of statistical problems, and in particular the computational complexity of statistical estimation problems; on sketching, sampling, and sub-linear time algorithms designed to address issues of scalability in data science problems; on exploiting statistical methodology in the service of algorithms; and on using breakthroughs in applied mathematics to address computational and inferential challenges. Intellectual contributions to societal issues in data science will feature throughout this set of themes. The institute will exploit strong connections with its scientific and industrial partners to ensure that these research directions enjoy a rich engagement with a broad range of commercial, technological and scientific application domains. Its sequence of research workshops and a collaborative research program will serve the broader research community by nurturing additional research in these key challenge areas. The institute will be led by a steering committee that will seek the help of an external advisory board to prioritize its research themes and activities throughout its lifetime. 
Its educational programs will include curriculum development from K-12 through undergraduate, a graduate level visit program, and a postdoc training model, aimed at empowering the next generation of leaders to fluidly work across conventional disciplinary boundaries while being mindful of the full set of scientific issues. The institute will undertake a multi-pronged effort to recruit, engage and support the full range of groups traditionally under-represented in mathematics, computer science and statistics.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal articles (53)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Image-to-Image Regression with Distribution-Free Uncertainty Quantification and Applications in Imaging
  • DOI:
  • Published:
    2022-02-10
  • Journal:
  • Impact factor:
    0
  • Authors:
    Anastasios Nikolas Angelopoulos;Amit Kohli;Stephen Bates;Michael I. Jordan;J. Malik;T. Alshaabi;S. Upadhyayula;Yaniv Romano
  • Corresponding author:
    Yaniv Romano
Markov Persuasion Processes and Reinforcement Learning
Provable Benefits of Actor-Critic Methods for Offline Reinforcement Learning
  • DOI:
  • Published:
    2021-01
  • Journal:
  • Impact factor:
    0
  • Authors:
    Zanette, Andrea;Brunskill, Emma;Wainwright, Martin J.
  • Corresponding author:
    Wainwright, Martin J.
When does gradient descent with logistic loss find interpolating two-layer networks?
  • DOI:
  • Published:
    2021-01
  • Journal:
  • Impact factor:
    6
  • Authors:
    Chatterji, Niladri S.;Long, Philip M.;Bartlett, Peter L.
  • Corresponding author:
    Bartlett, Peter L.
Near Optimal Policy Optimization via REPS
  • DOI:
    10.1016/j.mtcomm.2023.106560
  • Published:
    2021-03-17
  • Journal:
  • Impact factor:
    0
  • Authors:
    Aldo Pacchiano;Jonathan Lee;P. Bartlett;Ofir Nachum
  • Corresponding author:
    Ofir Nachum

Other publications by Peter Bartlett

Defending Against Saddle Point Attack in Byzantine-Robust Distributed Learning Supplementary Material
Mathematical Foundations of Machine Learning
  • DOI:
    10.4171/owr/2021/15
  • Published:
    2022-03-14
  • Journal:
  • Impact factor:
    0
  • Authors:
    Peter Bartlett;Cristina Butucea;Johannes Schmidt
  • Corresponding author:
    Johannes Schmidt
Can a Transformer Represent a Kalman Filter?
  • DOI:
    10.48550/arxiv.2312.06937
  • Published:
    2023-12-12
  • Journal:
  • Impact factor:
    0
  • Authors:
    Gautam Goel;Peter Bartlett
  • Corresponding author:
    Peter Bartlett
Space, the final frontier: outdoor access for people living with dementia
  • DOI:
  • Published:
    2017
  • Journal:
  • Impact factor:
    3.4
  • Authors:
    Elaine Argyle;T. Dening;Peter Bartlett
  • Corresponding author:
    Peter Bartlett
Minimax Fixed-Design Linear Regression
  • DOI:
  • Published:
    2015
  • Journal:
  • Impact factor:
    0
  • Authors:
    Peter Bartlett; Wouter Koolen; Alan Malek; Eiji Takimoto; Manfred Warmuth
  • Corresponding author:
    Manfred Warmuth

Other grants held by Peter Bartlett

Conference: Women-in-Theory Workshop
  • Award number:
    2227705
  • Fiscal year:
    2022
  • Funding amount:
    $5.9003 million
  • Award type:
    Standard Grant
Collaboration on the Theoretical Foundations of Deep Learning
  • Award number:
    2031883
  • Fiscal year:
    2020
  • Funding amount:
    $5.9003 million
  • Award type:
    Continuing Grant
RI: AF: Small: Optimizing probabilities for learning: sampling meets optimization
  • Award number:
    1909365
  • Fiscal year:
    2019
  • Funding amount:
    $5.9003 million
  • Award type:
    Continuing Grant
RI: AF: Small: Deep Learning Theory
  • Award number:
    1619362
  • Fiscal year:
    2016
  • Funding amount:
    $5.9003 million
  • Award type:
    Standard Grant
MCS: AF: Small: Algorithms for Large Scale Prediction Problems
  • Award number:
    1115788
  • Fiscal year:
    2011
  • Funding amount:
    $5.9003 million
  • Award type:
    Standard Grant
Regularization Methods for Online Learning
  • Award number:
    0830410
  • Fiscal year:
    2008
  • Funding amount:
    $5.9003 million
  • Award type:
    Standard Grant
Statistical Methods for Prediction of Individual Sequences
  • Award number:
    0707060
  • Fiscal year:
    2007
  • Funding amount:
    $5.9003 million
  • Award type:
    Continuing Grant
MSPA-MCS: Collaborative Research: Statistical Learning Methods for Complex Decision Problems in Natural Language Processing
  • Award number:
    0434383
  • Fiscal year:
    2004
  • Funding amount:
    $5.9003 million
  • Award type:
    Standard Grant

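Similar NSFC (National Natural Science Foundation of China) grants

A Big-Data-Based Study of the Global Distribution of Basic Research Talent and the Effectiveness of Science Fund Talent Development
  • Award number:
  • Approval year:
    2019
  • Funding amount:
    300,000 yuan
  • Award type:
    Special Fund Project
Fundamental Scientific Problems in Performance Control of Remanufactured Products
  • Award number:
    51535011
  • Approval year:
    2015
  • Funding amount:
    2.9 million yuan
  • Award type:
    Key Program
Foundational Research on the Chinese School of Modern Management Science
  • Award number:
    71071070
  • Approval year:
    2010
  • Funding amount:
    270,000 yuan
  • Award type:
    General Program
Feasibility Study on National Natural Science Foundation Funding for Building Basic Economic and Management Data
  • Award number:
    70940004
  • Approval year:
    2009
  • Funding amount:
    80,000 yuan
  • Award type:
    Special Fund Project
Design and Implementation of a Basic Database for Management Science Research in China
  • Award number:
    G0624010
  • Approval year:
    2006
  • Funding amount:
    73,000 yuan
  • Award type:
    Special Fund Project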

Similar overseas grants

Time series clustering to identify and translate time-varying multipollutant exposures for health studies
  • Award number:
    10749341
  • Fiscal year:
    2024
  • Funding amount:
    $5.9003 million
  • Award type:
Convergent Engineering and Biomolecular Science
  • Award number:
    10557613
  • Fiscal year:
    2023
  • Funding amount:
    $5.9003 million
  • Award type:
Regulation of seizure timing by circadian rhythms and sleep
  • Award number:
    10643189
  • Fiscal year:
    2023
  • Funding amount:
    $5.9003 million
  • Award type:
UNderstanding the Delivery of Low-Value CAre To CHildren and the Barriers to De-Implementation (UN-LATCH)
  • Award number:
    10649811
  • Fiscal year:
    2023
  • Funding amount:
    $5.9003 million
  • Award type:
Social Vulnerability, Sleep, and Early Hypertension Risk in Younger Adults
  • Award number:
    10643145
  • Fiscal year:
    2023
  • Funding amount:
    $5.9003 million
  • Award type: