TensorLABE - Robust Characterization of Data Tensors and Synthetic Data Generation
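Basic Information
- Award number: 2223932
- Principal investigator:
- Amount: $156,500
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2022
- Funding country: United States
- Project period: 2022-09-01 to 2024-08-31
- Project status: Completed
- Source:
- Keywords:
Project Abstract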
Modern computational methods, such as Machine Learning (ML) based approaches, have produced impressive gains in efficiency and performance but are increasingly dependent on massive amounts of data. These data-driven approaches transition from the classical techniques of human-engineered source code to algorithms trained on a dataset to produce the desired solution, placing the data in the driver's seat. The proliferation of these data-driven technologies is being enabled and hastened by new hardware and software systems specifically designed to support the complex data-driven computation associated with these algorithms and the massive volumes of data accompanying them. But despite the impressive performance gains of these new hardware and software systems, understanding their design's data component has languished in favor of performance-driven advances in software and hardware-based solutions. The lack of data understanding has led to a number of undesirable outcomes such as unwanted bias in the data-driven solution, an inability to determine the actual suitability of a data set to solving a given problem ahead of time, an inability to determine if a data set has been manipulated or corrupted, and an inability to produce accurate synthetic data that can be used to train and test the performance of these software and hardware systems. This project aims to provide a robust framework for the characterization of large-scale tensor-based datasets to improve understanding of the data itself and enable the production of synthetic data that more accurately replicates real-world data for use in system design testing and validation.
Specifically, this project proposes to advance knowledge in the fields of multilinear algebra, large-scale data analytics, machine learning, and artificial intelligence by incorporating a variety of tensor methods for statistical, structural, and performative data analyses to achieve more robust data characterization.
A more holistic set of data characterizations will enable better assessment of data for bias and evaluation of datasets for suitability for a particular task. It will also allow the comparison of datasets to understand their differences and assess data for corruption or manipulation. A proof of concept will be established by incorporating the data characterization methods developed in the project into generating synthetic data with higher degrees of realism than conventional methods. The approach will be validated by testing the ability of the synthetic data to characterize software/hardware system performance more accurately.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Tim Andersen
High-Throughput Virtual Screening Molecular Docking Software for Students and Educators
- DOI:
- Year published: 2018
- Journal:
- Impact factor: 0
- Authors: Tim Andersen; Owen M. McDougal
- Corresponding author: Owen M. McDougal
Random Processes with High Variance Produce Scale Free Networks
- DOI:
- Year published: 2022
- Journal:
- Impact factor: 0
- Authors: Josh Johnston; Tim Andersen
- Corresponding author: Tim Andersen
Other grants by Tim Andersen
SemiSynBio: Nucleic Acid Memory
- Award number: 1807809
- Fiscal year: 2018
- Award amount: $156,500
- Project type: Continuing Grant
EAGER: Tensor500: A Streaming Analytics High Performance Computing Benchmark
- Award number: 1849463
- Fiscal year: 2018
- Award amount: $156,500
- Project type: Standard Grant
EAGER: Stream500: A New Benchmark and Infrastructure for Streaming Analytics
- Award number: 1641774
- Fiscal year: 2016
- Award amount: $156,500
- Project type: Standard Grant
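Similar grants from the National Natural Science Foundation of China
Molecular mechanisms by which symbiotic bacteria of Amphidinium carterae degrade phosphonates to produce algal growth-promoting effects
- Award number: 42306167
- Year approved: 2023
- Award amount: CNY 300,000
- Project type: Young Scientists Fund
Analysis and design of efficient and robust message authentication codes
- Award number: 61202422
- Year approved: 2012
- Award amount: CNY 230,000
- Project type: Young Scientists Fund
Semidefinite relaxation and nonconvex quadratically constrained quadratic programming
- Award number: 11271243
- Year approved: 2012
- Award amount: CNY 600,000
- Project type: General Program
New methods for covert underwater active detection based on composite coded pulse trains
- Award number: 61271414
- Year approved: 2012
- Award amount: CNY 600,000
- Project type: General Program
Research on several problems in network revenue management for civil aviation passenger transport
- Award number: 60776817
- Year approved: 2007
- Award amount: CNY 200,000
- Project type: Joint Fund Project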
Similar overseas grants
Identification and characterization of microbiome-derived biomarkers via novel and robust systems-based approaches.
- Award number: RGPIN-2022-05010
- Fiscal year: 2022
- Award amount: $156,500
- Project type: Discovery Grants Program - Individual
COLLABORATIVE RESEARCH: GCR: Characterization and Robust Multivariable Control of the Dynamics of Gas Exchange During Peritoneal Oxygenated Perfluorocarbon Perfusion
- Award number: 2227939
- Fiscal year: 2021
- Award amount: $156,500
- Project type: Continuing Grant
COLLABORATIVE RESEARCH: GCR: Characterization and Robust Multivariable Control of the Dynamics of Gas Exchange During Peritoneal Oxygenated Perfluorocarbon Perfusion
- Award number: 2121101
- Fiscal year: 2021
- Award amount: $156,500
- Project type: Continuing Grant
Robust Characterization of Brain-Heart Coupling Across Development and Modulations by Disordered Sleep
- Award number: 10293076
- Fiscal year: 2021
- Award amount: $156,500
Policy-Robust Processing Networks: Characterization and Design
- Award number: 2139566
- Fiscal year: 2021
- Award amount: $156,500
- Project type: Standard Grant