Collaborative Research: CIF: Small: Theory for Learning Lossless and Lossy Coding

Basic Information

Project Abstract

An estimated 330,000 billion bytes of data are generated daily in various forms: video, images, and music, but also scientific, economic, and industrial content. This enormous amount of data has already transformed modern life in ways that are transparent (social media) and in ways that are not immediately visible (furthering scientific, business, and economic goals through better modeling, forecasting, and use of data). Data is communicated, often wirelessly, on massive scales in many formats: videos, images, and music, and in real-time applications such as gaming, streaming content, video calls, and telemedicine. To handle this amount of data, it needs to be compressed by algorithms that examine the data to understand the underlying structure and remove redundant descriptions, thereby using fewer bits to represent the same content. Traditional compression methods include the well-known JPEG (Joint Photographic Experts Group) compression for images from smartphones, for example. This is a lossy compression method, as some image quality is lost. Lossless compression, with no quality loss, is typically used for compressing computer files (e.g., with Zip) and for lossless music streaming. In recent years, machine learning has become very powerful and is used to solve many problems, such as autonomous driving, speech recognition, and implementing chatbots. A recent focus is to use machine learning for data compression. The aim of this project is to understand the fundamental theory of machine learning for data compression, for example what types of machine learning algorithms can compress data well and how many samples are needed to learn compression well. Through this fundamental understanding of data compression using machine learning, the aim is to develop more powerful compression methods, leading to more efficient use of wireless spectrum and less energy consumption by mobile devices.

Recently, there has been much effort in developing machine learning methods for source coding by both researchers and high-tech companies. These methods have had some success in beating traditional source coding methods. The project aims to develop fundamental bounds on the performance of learning for both lossless and lossy source coding. The problem is framed in a probably approximately correct (PAC) learning framework, both uniform and non-uniform. The first part of the research considers lossless source coding, which is of interest in itself and also serves as a basis for lossy source coding, and aims to develop bounds for learning. The project investigates what factors influence the convergence of learning. This is extended with an active learning framework, where the algorithms can adapt how much data they need to examine, using more data for more subtle models and less data for simpler models, and determining when the underlying model may be simple by means of what is known as a "stopping rule." The second part of the research considers lossy source coding, in particular almost lossless source coding and lossless coding of real-valued sources. The aim is to understand in what sense source coding can be learned (e.g., uniform vs. non-uniform PAC) and, based on this, to develop performance bounds. Estimation, compression, and learning have always been known to be subtly different, and these nuances translate into quantifiably large implications for problems harnessing them; this research will resolve some of these tangles, particularly for sources with memory. The fundamental understanding of learning for coding developed through this project will in turn result in the development of better coding methods.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
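
As a rough illustration of the abstract's question of how many samples are needed to learn compression well, the sketch below estimates a memoryless source from samples and measures the redundancy (expected ideal code length minus entropy) of the code learned from them. Everything in it is an assumption made for illustration and is not taken from the project: the four-symbol source `p`, the add-one (Laplace) estimator, the ideal code lengths `-log2 q(a)`, and the function names.

```python
import math
import random
from collections import Counter


def entropy(p):
    """Shannon entropy in bits of a distribution given as {symbol: probability}."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)


def learned_code_lengths(samples, alphabet):
    """Ideal code lengths -log2 q(a), with q estimated by add-one smoothing.

    Add-one smoothing is an illustrative choice; it keeps every q(a) > 0,
    so every symbol receives a finite code length.
    """
    counts = Counter(samples)
    total = len(samples) + len(alphabet)
    return {a: -math.log2((counts[a] + 1) / total) for a in alphabet}


def redundancy(p, lengths):
    """Expected code length under the true source p minus the entropy of p."""
    return sum(p[a] * lengths[a] for a in p) - entropy(p)


random.seed(0)
# Hypothetical source, chosen only for this example.
p = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
alphabet = list(p)
for n in (10, 100, 1000, 10000):
    samples = random.choices(alphabet, weights=[p[a] for a in alphabet], k=n)
    r = redundancy(p, learned_code_lengths(samples, alphabet))
    print(f"n={n:6d}  redundancy={r:.4f} bits/symbol")
```

For these ideal code lengths, the redundancy equals the Kullback-Leibler divergence between the true and the estimated distribution, so it is never negative. A PAC-style statement would ask for the sample size at which this redundancy falls below a target with high probability; the sketch only shows single runs, not such a bound.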

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Zixiang Xiong

An optimal packetization scheme for fine granularity scalable bitstream
Channel protection fundamentals
  • DOI:
    10.1016/b978-012088480-3/50008-4
  • Published:
    2007
  • Journal:
  • Impact factor:
    10.4
  • Authors:
    R. Hamzaoui; V. Stanković; Zixiang Xiong; K. Ramchandran; R. Puri; A. Majumdar; J. Chou
  • Corresponding author:
    J. Chou
Optimal rate allocation in progressive joint source-channel coding for image transmission over CDMA networks
Practical rateless cooperation in multiple access channels using multiplexed Raptor codes
Video Multicast over Heterogeneous Networks Based on Distributed Source Coding Principles

Other grants held by Zixiang Xiong

Collaborative Research: CIF: Small: Beyond Compressed Sensing: Analog Coding for Communications
  • Award number:
    2007527
  • Fiscal year:
    2020
  • Award amount:
    $200,000
  • Award type:
    Standard Grant
Collaborative Research: Delay and Energy: Design Tradeoffs in Spectrally Efficient Systems
  • Award number:
    1923803
  • Fiscal year:
    2019
  • Award amount:
    $200,000
  • Award type:
    Standard Grant
CIF: Small: Multiterminal Video Coding: From Theory to Practice
  • Award number:
    1216001
  • Fiscal year:
    2012
  • Award amount:
    $200,000
  • Award type:
    Standard Grant
CIF: Small: Collaborative Research: Minimum Energy Communications in Wireless Networks
  • Award number:
    1017829
  • Fiscal year:
    2010
  • Award amount:
    $200,000
  • Award type:
    Standard Grant
Collaborative Research: Capacity and Coding in Resource-Limited Wireless Networks
  • Award number:
    0729149
  • Fiscal year:
    2007
  • Award amount:
    $200,000
  • Award type:
    Standard Grant
Distributed Source Coding: Theory, Algorithms, and Applications
  • Award number:
    0430720
  • Fiscal year:
    2004
  • Award amount:
    $200,000
  • Award type:
    Standard Grant
Scalable Compression and Transmission of Internet Multimedia
  • Award number:
    0104834
  • Fiscal year:
    2001
  • Award amount:
    $200,000
  • Award type:
    Continuing Grant
CAREER: Progressive coding and transmission of images and video
  • Award number:
    9874444
  • Fiscal year:
    1999
  • Award amount:
    $200,000
  • Award type:
    Standard Grant
CAREER: Progressive coding and transmission of images and video
  • Award number:
    0096070
  • Fiscal year:
    1999
  • Award amount:
    $200,000
  • Award type:
    Standard Grant

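Similar grants from the National Natural Science Foundation of China (NSFC)

Electrodeposited magnesium-hydrogen coatings on titanium-based bone implant surfaces and their osteogenesis-promoting properties
  • Award number:
    52371195
  • Approval year:
    2023
  • Award amount:
    CNY 500,000
  • Award type:
    General Program
Pathogenic mechanism of the CLMP-mediated Connexin45-β-catenin complex in congenital short bowel syndrome
  • Award number:
    82370525
  • Approval year:
    2023
  • Award amount:
    CNY 490,000
  • Award type:
    General Program
Key technologies for highly sensitive sensing with spoof localized surface plasmons and for miniaturizing the sensing system
  • Award number:
    62371132
  • Approval year:
    2023
  • Award amount:
    CNY 490,000
  • Award type:
    General Program
Mechanisms by which preferential flow affects the hydrothermal stability of permafrost along the China-Russia crude oil pipeline
  • Award number:
    42301138
  • Approval year:
    2023
  • Award amount:
    CNY 300,000
  • Award type:
    Young Scientists Fund
Bidirectional regulation of the interfacial layer and electrolyte for stabilizing zinc anodes
  • Award number:
    52302289
  • Approval year:
    2023
  • Award amount:
    CNY 300,000
  • Award type:
    Young Scientists Fund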

Similar overseas grants

Collaborative Research: CIF: Medium: Snapshot Computational Imaging with Metaoptics
  • Award number:
    2403122
  • Fiscal year:
    2024
  • Award amount:
    $200,000
  • Award type:
    Standard Grant
Collaborative Research: CIF-Medium: Privacy-preserving Machine Learning on Graphs
  • Award number:
    2402815
  • Fiscal year:
    2024
  • Award amount:
    $200,000
  • Award type:
    Standard Grant
Collaborative Research: CIF: Small: Mathematical and Algorithmic Foundations of Multi-Task Learning
  • Award number:
    2343599
  • Fiscal year:
    2024
  • Award amount:
    $200,000
  • Award type:
    Standard Grant
Collaborative Research: CIF: Small: Mathematical and Algorithmic Foundations of Multi-Task Learning
  • Award number:
    2343600
  • Fiscal year:
    2024
  • Award amount:
    $200,000
  • Award type:
    Standard Grant
Collaborative Research: CIF: Small: Acoustic-Optic Vision - Combining Ultrasonic Sonars with Visible Sensors for Robust Machine Perception
  • Award number:
    2326905
  • Fiscal year:
    2024
  • Award amount:
    $200,000
  • Award type:
    Standard Grant