CIF: Small: Collaborative Research: Inference of Information Measures on Large Alphabets: Fundamental Limits, Fast Algorithms, and Applications
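Basic information
- Grant number: 1528159
- Principal investigator:
- Amount: $250,000
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2015
- Funding country: United States
- Period: 2015-09-01 to 2018-08-31
- Project status: Completed
- Source:
- Keywords:
Project abstract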
A key task in information theory is to characterize fundamental performance limits in compression, communication, and more general operational problems involving the storage, transmission and processing of information. Such characterizations are usually in terms of information measures, among the most fundamental of which are the Shannon entropy and the mutual information. In addition to their prominent operational roles in the traditional realms of information theory, information measures have found numerous applications in many statistical modeling and machine learning tasks. Various modern data-analytic applications deal with data sets naturally viewed as samples from a probability distribution over a large domain. Due to the typically large alphabet size and resource constraints, the practitioner contends with the difficulty of undersampling in applications ranging from corpus linguistics to neuroscience. One of the main goals of this project is the development of a general theory based on a new set of mathematical tools that will facilitate the construction and analysis of optimal estimation of information measures on large alphabets. The other major facet of this project is the incorporation of the new theoretical methodologies into machine learning algorithms, thereby significantly impacting current real-world learning practices. Successful completion of this project will result in enabling technologies and practical schemes - in applications ranging from analysis of neural response data to learning graphical models - that are provably much closer to attaining the fundamental performance limits than existing ones. The findings of this project will enrich existing big data-analytic curricula. A new course dedicated to high-dimensional statistical inference that addresses estimation for large-alphabet data in depth will be created and offered. Workshops on the themes and findings of this project will be organized and held at Stanford and UIUC. 
A comprehensive approximation-theoretic approach to estimating functionals of distributions on large alphabets will be developed via computationally efficient procedures based on best polynomial approximation, with provable essential optimality guarantees. The key observation, rooted in the high-dimensional statistics literature, is that while estimating the distribution itself requires the sample size to scale linearly with the alphabet size, functionals of the distribution, such as entropy or mutual information, can be estimated accurately with sub-linear sample complexity. This requires going beyond the conventional wisdom by developing more sophisticated approaches than maximum likelihood ("plug-in") estimation. The other major facet of this project is translating the new theoretical methodologies into highly scalable and efficient machine learning algorithms, thereby significantly impacting current real-world learning practices and significantly boosting performance in several of the most prevalent machine learning applications that rely on mutual information estimation, such as learning graphical models.
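As a rough illustration of why plug-in estimation struggles when the sample is small relative to the alphabet, the following Python sketch draws samples from a uniform distribution and compares the plug-in entropy estimate with the true entropy. It is not the project's polynomial-approximation estimator; the Miller-Madow correction shown is a classical textbook adjustment included only for comparison, and the function names, alphabet size, and sample size are assumptions chosen for the example.

```python
import math
import random
from collections import Counter


def plugin_entropy(samples):
    """Maximum likelihood ("plug-in") entropy estimate, in nats."""
    n = len(samples)
    counts = Counter(samples)
    # Entropy of the empirical distribution of the sample.
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def miller_madow_entropy(samples):
    """Plug-in estimate plus the classical Miller-Madow bias correction."""
    n = len(samples)
    observed = len(set(samples))  # number of distinct symbols seen
    return plugin_entropy(samples) + (observed - 1) / (2 * n)


def demo(alphabet_size=10_000, n=2_000, seed=0):
    """Undersampled regime: far fewer samples than symbols (illustrative values)."""
    rng = random.Random(seed)
    samples = [rng.randrange(alphabet_size) for _ in range(n)]
    true_entropy = math.log(alphabet_size)  # entropy of the uniform distribution
    return true_entropy, plugin_entropy(samples), miller_madow_entropy(samples)


if __name__ == "__main__":
    truth, plug, mm = demo()
    print(f"true={truth:.3f}  plug-in={plug:.3f}  Miller-Madow={mm:.3f}")
```

The plug-in estimate can never exceed the logarithm of the sample size, so with 2,000 samples over 10,000 symbols it necessarily falls well short of the true entropy. The simple correction narrows the gap only slightly, which is the kind of shortfall that motivates more sophisticated estimators.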
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Tsachy Weissman
Lottery Ticket Adaptation: Mitigating Destructive Interference in LLMs
- DOI:
- Publication date: 2024-06-24
- Journal:
- Impact factor: 0
- Authors: Ashwinee Panda; Berivan Isik; Xiangyu Qi; Sanmi Koyejo; Tsachy Weissman; Prateek Mittal
- Corresponding author: Prateek Mittal
Communication-Efficient Federated Learning through Importance Sampling
- DOI: 10.48550/arxiv.2306.12625
- Publication date: 2023
- Journal:
- Impact factor: 0
- Authors: Berivan Isik; Francesco Pase; Deniz Gündüz; Oluwasanmi Koyejo; Tsachy Weissman; Michele Zorzi
- Corresponding author: Michele Zorzi
Other grants of Tsachy Weissman
Collaborative Research: CIF: Medium: An Information-Theoretic Foundation for Adaptive Bidding in First-Price Auctions
- Grant number: 2106467
- Fiscal year: 2021
- Funding amount: $250,000
- Project type: Standard Grant
CIF: Small: Collaborative Research: Compressed databases for similarity queries: fundamental limits and algorithms
- Grant number: 1321174
- Fiscal year: 2013
- Funding amount: $250,000
- Project type: Standard Grant
EAGER: Action in Information Processing
- Grant number: 1049413
- Fiscal year: 2010
- Funding amount: $250,000
- Project type: Standard Grant
Collaborative Research: The Role of Feedback in Two-Way Communication Networks
- Grant number: 0729119
- Fiscal year: 2007
- Funding amount: $250,000
- Project type: Standard Grant
CAREER: Toward a Unified Approach to Universality in Information Processing
- Grant number: 0546535
- Fiscal year: 2006
- Funding amount: $250,000
- Project type: Continuing Grant
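Similar grants from the National Natural Science Foundation of China
Mechanism by which the small-molecule metabolite catechin interacts with TRPV1 to activate peripheral sensory neurons and mediate uremic pruritus
- Grant number: 82371229
- Approval year: 2023
- Funding amount: CNY 490,000
- Project type: General Program
Mechanism by which DHEA alleviates POCD by inhibiting Fis1 lactylation in microglia
- Grant number: 82301369
- Approval year: 2023
- Funding amount: CNY 300,000
- Project type: Young Scientists Fund
Immunological mechanism by which abnormally activated microglia promote type 1 diabetic retinopathy by upregulating CTSS to suppress expression of the microvascular-specific factor MFSD2A
- Grant number: 82370827
- Approval year: 2023
- Funding amount: CNY 490,000
- Project type: General Program
SETDB1 regulation of microglial function and its involvement in the pathogenesis of Alzheimer's disease
- Grant number: 82371419
- Approval year: 2023
- Funding amount: CNY 490,000
- Project type: General Program
Mechanism by which a PTBP1-driven H4K12la/BRD4/HIF1α complex-PKM2 positive feedback loop promotes glucose metabolic reprogramming in non-small cell lung cancer, and exploration of treatment strategies
- Grant number: 82303616
- Approval year: 2023
- Funding amount: CNY 300,000
- Project type: Young Scientists Fund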
Similar overseas grants
Collaborative Research: NSF-AoF: CIF: Small: AI-assisted Waveform and Beamforming Design for Integrated Sensing and Communication
- Grant number: 2326622
- Fiscal year: 2024
- Funding amount: $250,000
- Project type: Standard Grant
Collaborative Research: NSF-AoF: CIF: Small: AI-assisted Waveform and Beamforming Design for Integrated Sensing and Communication
- Grant number: 2326621
- Fiscal year: 2024
- Funding amount: $250,000
- Project type: Standard Grant
Collaborative Research: CIF: Small: Mathematical and Algorithmic Foundations of Multi-Task Learning
- Grant number: 2343600
- Fiscal year: 2024
- Funding amount: $250,000
- Project type: Standard Grant
Collaborative Research: CIF: Small: Acoustic-Optic Vision - Combining Ultrasonic Sonars with Visible Sensors for Robust Machine Perception
- Grant number: 2326904
- Fiscal year: 2024
- Funding amount: $250,000
- Project type: Standard Grant