EAGER: Novel sampling algorithms for scaling up spectral methods for unsupervised learning
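Basic information
- Award number: 1650080
- Principal investigator:
- Amount: $90,000
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2016
- Funding country: United States
- Duration: 2016-08-15 to 2018-07-31
- Project status: Completed
- Source:
- Keywords: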
Project abstract
In the era of big data, unsupervised learning has become increasingly important. At a high level, unsupervised learning serves to reduce the data size while capturing its important underlying structure. For a powerful and widely used family of unsupervised learning techniques (those based on spectral methods), scaling up to large data sets poses significant computational challenges. This research project will develop extremely simple and lightweight sampling techniques for scaling up this family of unsupervised learning methods. Since big data is ubiquitous, these research advances are likely to be transformative to a range of fields. This project will benefit society through the research team's ongoing collaborations in climate science, agriculture, and finance. The team will also continue to engage the computer science community in this endeavor by training students, developing tutorials, and broadening the participation of women and minorities in computing.
This project will advance machine learning research by scaling up spectral methods for the analysis of large data sets. While spectral methods for the unsupervised learning tasks of clustering and embedding have found wide success in a variety of practical applications, scaling them up to large data sets poses significant computational challenges. In particular, the storage and computation needed to handle the affinity matrix (a matrix of pairwise similarities between data points) can be prohibitive. One promising approach is to approximate this matrix in some sense instead. The goal of this project is to provide simple approximation techniques that manage the tradeoff between their space and time complexity and the quality of the approximation. The proposed approach involves sampling techniques that address this goal by exploiting latent structure in a data set, in order to minimize the amount of information that needs to be stored to (approximately) represent it. This leads to techniques that speed up the computation and reduce the memory requirements of spectral methods, while simultaneously providing better approximations. The project will also continue the team's momentum on leveraging advances in machine learning for data-driven discovery.
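The abstract does not say which sampling scheme the project uses. As a generic illustration of approximating an affinity matrix from a sample, the sketch below uses a standard Nyström-style approximation with uniformly sampled landmark points. It is not the project's method. The function names (`rbf_affinity`, `nystrom_factors`), the Gaussian similarity, and the toy data are all assumptions made for the example.

```python
import numpy as np


def rbf_affinity(X, Y, gamma=1.0):
    """Gaussian (RBF) similarities between the rows of X and the rows of Y."""
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def nystrom_factors(X, m, gamma=1.0, seed=0):
    """Sample m landmark points uniformly and return (C, W_pinv).

    The full n x n affinity matrix K is approximated by C @ W_pinv @ C.T,
    but only the n x m block C and the m x m block W are computed here.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(X.shape[0], size=m, replace=False)
    C = rbf_affinity(X, X[idx], gamma)  # n x m: all points vs. landmarks
    W = C[idx]                          # m x m: landmarks vs. landmarks
    return C, np.linalg.pinv(W, rcond=1e-10)


# Toy data (hypothetical): two well-separated 2-D clusters of 100 points each.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.3, (100, 2)), rng.normal(3.0, 0.3, (100, 2))])

C, W_pinv = nystrom_factors(X, m=20, gamma=0.5)

# For comparison only: build the exact matrix, which is what one avoids at scale.
K_exact = rbf_affinity(X, X, gamma=0.5)
K_approx = C @ W_pinv @ C.T
rel_err = np.linalg.norm(K_exact - K_approx) / np.linalg.norm(K_exact)
print(f"stored entries: {C.size + W_pinv.size} vs. {K_exact.size}")
print(f"relative Frobenius error: {rel_err:.2e}")
```

With n points and m landmarks, storage falls from n^2 entries to roughly n*m + m^2, which is the space-versus-quality tradeoff the abstract describes. The choice of m and of the sampling distribution controls the approximation quality.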
Project outcomes
Journal articles (3)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Convergence Rate of Stochastic k-means
- DOI:
- Year published: 2017
- Journal:
- Impact factor: 0
- Authors: Tang, Cheng; Monteleoni, Claire
- Corresponding author: Monteleoni, Claire
Other grants by Claire Monteleoni
EAGER: Collaborative Research: Learning Relations between Extreme Weather Events and Planet-Wide Environmental Trends
- Award number: 1451954
- Fiscal year: 2014
- Amount: $90,000
- Project type: Standard Grant
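Similar NSFC (National Natural Science Foundation of China) grants
Molecular mechanism by which novel-miR75 targets the OPR2, CA2, and STK genes to regulate the fungal stress response in ginseng
- Award number: 82304677
- Approval year: 2023
- Amount: CNY 300,000
- Project type: Young Scientists Fund
Molecular mechanism by which Novel17-GSO1 in Hainan patchouli responds to p-HBA to regulate continuous-cropping obstacles
- Award number: 82304658
- Approval year: 2023
- Amount: CNY 300,000
- Project type: Young Scientists Fund
Atractylodes macrocephala polysaccharide alleviates thymic necroptosis in immunosuppressed goslings through novel-mir2 dual targeting of TRADD/MLKL
- Award number:
- Approval year: 2021
- Amount: CNY 300,000
- Project type: Young Scientists Fund
Role and mechanism of novel-miR-59 targeting HMGAs in mediating cellular senescence in childhood progeria
- Award number: 32171163
- Approval year: 2021
- Amount: CNY 580,000
- Project type: General Program
Mechanism by which the novel_circ_001042/miR-298-5p/Capn1 axis regulates mitochondrial energy metabolism in the development of congenital anorectal malformations
- Award number:
- Approval year: 2021
- Amount: CNY 550,000
- Project type: General Program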
Similar overseas grants
Collaborative Research: EAGER--Novel Sampling and Isotopic Characterization of Upper Strato- to Mesospheric Photochemistry
- Award number: 2204474
- Fiscal year: 2022
- Amount: $90,000
- Project type: Standard Grant
Collaborative Research: EAGER--Novel Sampling and Isotopic Characterization of Upper Strato- to Mesospheric Photochemistry
- Award number: 2204475
- Fiscal year: 2022
- Amount: $90,000
- Project type: Standard Grant
EAGER: Design and Fabrication of a Novel Micro-Reactor for Molecular Sampling of Combustion
- Award number: 1834656
- Fiscal year: 2018
- Amount: $90,000
- Project type: Standard Grant