EAGER: Algorithms for Analyzing Faulty Data Using Domain Information

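Basic information

Award number:
2414736
Principal investigator:
Funda Ergun
Amount:
$300,000
Host institution:
Indiana University
Host institution country:
United States
Project category:
Standard Grant
Fiscal year:
2024
Funding country:
United States
Project period:
2024-03-01 to 2026-02-28
Project status:
Ongoing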

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2414736&HistoricalAwards=false
Keywords:
EAGER Algorithms Analyzing Faulty Data

Project abstract

The focus of this project is the building of a mathematical theory for analyzing large data that contains errors by taking advantage of domain knowledge regarding the processes that have created the data, as well as the error model. The project contains three thrusts, listed from the most well-defined to the most exploratory. The first thrust involves analyzing genomic data in order to investigate tumor evolution trees that lead to the development of cancer. The second involves analyzing faulty data generated by computer networks while utilizing information about the network such as its topology and delay pattern. The third is exploring other areas for which the techniques developed for the first two thrusts apply, making progress towards the goal of developing general techniques for analyzing faulty data in the absence of a known ground truth using domain information.

In the model that this project assumes, the input contains errors that have been probabilistically generated according to a known distribution in unknown locations. The goal that the investigator would like to explore is the creation of sampling techniques that do not blindly take random samples from the prohibitively large space for the ground truth; rather, it is to use the knowledge about restrictions that limit the possible space that could have led to the noisy input and analyze this much smaller space. In particular, the first focus of this project is to explore how such information can be used to generate efficient sampling techniques in order to infer properties of tumor progression trees, and, later on, more general phylogenetic trees. Later parts of this project involve applying this knowledge to routing graphs and other data with underlying well-structured graphs. Since such techniques rely on graph-theoretic assumptions underlying the inputs, the goal for all three thrusts is to develop widely applicable probabilistic techniques that will help one analyze noisy graph information in general, pushing existing theoretical knowledge forward, as well as bringing a better understanding to applied areas with strong theoretical underpinnings.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project outcomes

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Funda Ergun

Network design for tolerating multiple link failures using Fast Re-route (FRR)

DOI:
10.1109/drcn.2014.6816140
Published:
2014
Journal:
2014 10th International Conference on the Design of Reliable Communication Networks (DRCN)
Impact factor:
0
Authors:
R. Sinha; Funda Ergun; K. Oikonomou; K. Ramakrishnan
Corresponding author:
K. Ramakrishnan