III: Small: Fast Subset Scan for Anomalous Pattern Detection

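Basic information

  • Award number:
    0916345
  • Principal investigator:
  • Amount:
    $500,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2009
  • Funding country:
    United States
  • Period:
    2009-08-01 to 2013-07-31
  • Status:
    Completed

Project abstract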

This work will develop new methods for fast and scalable detection of anomalous patterns (subsets of the data that are interesting or unexpected) in massive, multivariate datasets. The focus is on real-world applications, such as detecting an emerging disease outbreak or a pattern of smuggling activity, where the patterns are complex, subtle, and probabilistic and are difficult to spot with existing techniques. The research is based on two key insights. First, the pattern detection problem can be framed as a search over all subsets of the data: one defines a measure of the "anomalousness" of a subset and then maximizes this measure over all potentially relevant subsets. Second, for many spatial detection methods (including Kulldorff's spatial scan statistic and many recently proposed variants), one can perform an exact search that efficiently maximizes the measure of anomalousness over all subsets of the data. The research team will explore this new combinatorial optimization method, investigate how it can be extended to constrained subset scans and to more general multivariate pattern detection problems, and examine how it can be incorporated into a subset scan framework, enabling the creation of a variety of fast, scalable, and useful methods for anomalous pattern detection.

Intellectual Merit: The research team will develop, implement, and evaluate a general probabilistic framework for efficient detection of anomalous patterns in both spatial and non-spatial datasets. The proposed work will address these challenging and important research questions:
1) How can one define a useful measure of the "anomalousness" of a subset of the data, and efficiently optimize this measure over all subsets to find the most anomalous patterns?
2) What are the necessary and sufficient conditions for a set function F(S) to satisfy the "linear-time subset scanning" (LTSS) property, enabling exact unconstrained optimization of F(S) over all 2^N subsets of N records while only requiring O(N) subsets to be evaluated?
3) How can one extend fast subset scanning methods to general multivariate datasets, and incorporate search constraints such as proximity, connectivity, and self-similarity?
4) How can one deal with uncertainty about the effects of an anomalous pattern by searching over subsets of "input" and "output" attributes as well as subsets of records?

Broader Impact: Development and testing will be prioritized in three areas: 1) early detection of disease outbreaks, 2) detecting illicit container shipments, and 3) identifying anomalous trends in social networks. These applications will demonstrate the value of these methods across a wide spectrum of domains. Through existing collaborations, the algorithms will be incorporated into deployed systems for health and crime surveillance that contribute directly to the public good. The Principal Investigator's lab has offered free machine learning software for over 5 years, and the software implementations of all algorithms developed through this grant will be made publicly available. The bulk of the funding will go to training graduate students who will become the next generation of researchers to explore new methods for anomalous pattern detection.

Key Words: anomalous patterns; pattern detection; fast subset scan; scan statistics; optimization.
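The LTSS idea in question 2 can be illustrated with a minimal sketch. The abstract does not name a specific score function, so everything concrete here is an assumption made for illustration: the expectation-based Poisson score in `score`, the priority `count / baseline`, and the helper names `ltss_scan` and `brute_force_scan`. Under those assumptions, sorting records by priority and scoring only the N "top-k" subsets should give the same maximum as scoring all 2^N subsets.

```python
import math
from itertools import combinations


def score(c, b):
    """Assumed score F(S): expectation-based Poisson log-likelihood ratio
    for a subset with total count c and total baseline (expected count) b."""
    if b <= 0 or c <= b:
        return 0.0
    return c * math.log(c / b) + b - c


def ltss_scan(counts, baselines):
    """Evaluate only the N subsets made of the k highest-priority records."""
    # Assumed priority function: observed count relative to baseline.
    order = sorted(range(len(counts)),
                   key=lambda i: counts[i] / baselines[i],
                   reverse=True)
    best_score, best_k = 0.0, 0
    c_sum = b_sum = 0.0
    for k, i in enumerate(order, start=1):
        c_sum += counts[i]
        b_sum += baselines[i]
        s = score(c_sum, b_sum)
        if s > best_score:
            best_score, best_k = s, k
    return best_score, sorted(order[:best_k])


def brute_force_scan(counts, baselines):
    """Reference check: score every non-empty subset (exponential in N)."""
    n = len(counts)
    best = 0.0
    for r in range(1, n + 1):
        for subset in combinations(range(n), r):
            c = sum(counts[i] for i in subset)
            b = sum(baselines[i] for i in subset)
            best = max(best, score(c, b))
    return best


if __name__ == "__main__":
    counts = [12, 3, 9, 5, 20, 4]                 # made-up observed counts
    baselines = [5.0, 4.0, 8.0, 5.0, 9.0, 6.0]    # made-up expected counts
    print(ltss_scan(counts, baselines))
    print(brute_force_scan(counts, baselines))
```

The brute-force function is included only so the two results can be compared on small inputs; it is not practical beyond a few dozen records, which is the gap the LTSS property is meant to close.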

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Daniel Neill

Identifying Significant Predictive Bias in Classifiers June 2017
  • DOI:
  • Published:
    2017
  • Journal:
  • Impact factor:
    0
  • Authors:
    Zhe Zhang; Daniel Neill
  • Corresponding author:
    Daniel Neill
Anticorps dirigé contre il-17br
Antibody directed against IL-17BR
  • DOI:
  • Published:
    2010
  • Journal:
  • Impact factor:
    0
  • Authors:
    A. N. McKenzie; Daniel Neill
  • Corresponding author:
    Daniel Neill
A novel MXI1-NUTM2B fusion detected in an undifferentiated ovarian cancer
  • DOI:
    10.1016/j.gore.2024.101653
  • Published:
    2025-02-01
  • Journal:
  • Impact factor:
  • Authors:
    Mohammed Elshafey; Malek Ghandour; Rebecca M. Adams; Daniel Neill; Radhika Gogoi
  • Corresponding author:
    Radhika Gogoi

Other grants held by Daniel Neill

The impact of thermally-regulated cell wall modifications on Streptococcus pneumoniae pathogenesis
  • Award number:
    MR/X009130/1
  • Fiscal year:
    2023
  • Funding amount:
    $500,000
  • Project type:
    Research Grant
FAI: End-To-End Fairness for Algorithm-in-the-Loop Decision Making in the Public Sector
  • Award number:
    2040898
  • Fiscal year:
    2021
  • Funding amount:
    $500,000
  • Project type:
    Standard Grant
CAREER: Machine Learning and Event Detection for the Public Good
  • Award number:
    0953330
  • Fiscal year:
    2010
  • Funding amount:
    $500,000
  • Project type:
    Standard Grant

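Similar NSFC (National Natural Science Foundation of China) grants

Screening of small-molecule inhibitors targeting Treg-FOXP3 and their role and mechanism in lung cancer immunotherapy
  • Award number:
    32370966
  • Approval year:
    2023
  • Funding amount:
    CNY 500,000
  • Project type:
    General Program
Epigenetic mechanisms by which small-molecule activation of YAP induces chromatin plasticity and promotes cardiac progenitor cell reprogramming
  • Award number:
    82304478
  • Approval year:
    2023
  • Funding amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund
Construction and mechanism of action of microglia-targeted biomimetic glycyrrhizic acid nanoparticles: a new therapeutic strategy for sepsis-associated encephalopathy
  • Award number:
    82302422
  • Approval year:
    2023
  • Funding amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund
Role and mechanism of microglial pyroptosis mediated by the HMGB1/TLR4/Cathepsin B pathway in hypoxic-ischemic encephalopathy in neonatal rats
  • Award number:
    82371712
  • Approval year:
    2023
  • Funding amount:
    CNY 490,000
  • Project type:
    General Program
Role and mechanism of small cysteine-free proteins in regulating the insecticidal activity of biocontrol fungi
  • Award number:
    32372613
  • Approval year:
    2023
  • Funding amount:
    CNY 500,000
  • Project type:
    General Program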

Similar overseas grants

EAGER: III: Small: Green Granular Neural Networks with Fast FPGA-based Incremental Transfer Learning
  • Award number:
    2234227
  • Fiscal year:
    2022
  • Funding amount:
    $500,000
  • Project type:
    Standard Grant
Small Intestine Targeted Fast Acting Oral Insulin Formulation
  • Award number:
    10385154
  • Fiscal year:
    2021
  • Funding amount:
    $500,000
  • Project type:
III: Small: Task-aware Materialization for Fast Data Analytics
  • Award number:
    1910014
  • Fiscal year:
    2019
  • Funding amount:
    $500,000
  • Project type:
    Continuing Grant
III: Small: Fast and Efficient Algorithms for Matrix Decompositions and Applications to Human Genetics
  • Award number:
    1661756
  • Fiscal year:
    2016
  • Funding amount:
    $500,000
  • Project type:
    Standard Grant
III: Small: Fast and Efficient Algorithms for Matrix Decompositions and Applications to Human Genetics
  • Award number:
    1319280
  • Fiscal year:
    2013
  • Funding amount:
    $500,000
  • Project type:
    Standard Grant