III: Medium: Collaborative Research: Algorithms and Cyberinfrastructure for High-Precision Automated Quality Control of Hydro-Meteo Sensor Networks
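Basic information
- Award number: 1514550
- Principal investigator:
- Amount: $635,500
- Host institution:
- Host institution country: United States
- Award type: Continuing Grant
- Fiscal year: 2015
- Funding country: United States
- Period: 2015-09-01 to 2019-08-31
- Status: Completed
- Source:
- Keywords:
Project abstract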
Advances in sensor technology are greatly expanding the range of quantities that can be measured while simultaneously reducing the cost. However, deployed sensors drift out of calibration and fail, so every sensor network requires quality control (QC) procedures to promptly detect these failures. Existing QC methods rely on human experts to carefully examine the data, which means that when the number of sensors in a network doubles, the number of experts must double too. This project will develop algorithms and software to increase the level of automation in sensor QC so that a smaller number of experts can manage a much larger network of sensors. The methods will be tested on weather data from Oklahoma (the Oklahoma Mesonet), Oregon (the Andrews Long-Term Ecological Network site), the US (the Earth Networks "WeatherBug" network), and sub-Saharan Africa (the TAHMO project), and if the methods are found to work well, they will be deployed in these networks at the CUAHSI Water Data Center. Accurate weather data could significantly increase the productivity of farms and improve food security, particularly in Africa.
The project will develop an open-source, standards-compliant system, SENSOR-DX, that implements automated data QC. Existing probabilistic QC methods assume that correct sensor readings are jointly Gaussian and that readings from broken sensors obey a uniform distribution. These assumptions lead to many QC mistakes. This project will develop a new approach in which novel nonparametric anomaly detection algorithms analyze the sensor data. Correct sensor readings have low anomaly scores, while broken sensor readings have high scores; both follow parametric distributions. Probabilistic methods can therefore model the distribution of the resulting anomaly scores instead of the joint distribution of the original sensor readings and infer (probabilistically) whether each sensor is working correctly.
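As a rough illustration of the score-based inference described above, the following Python sketch applies Bayes' rule to a single anomaly score. It is not the SENSOR-DX implementation: the abstract only says the scores follow parametric distributions, so the Gaussian form, the parameter values, the prior, and the names `gaussian_pdf` and `posterior_broken` are all assumptions made for illustration.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posterior_broken(score, prior_broken=0.05,
                     working=(0.2, 0.1), broken=(0.8, 0.15)):
    """Posterior probability that a sensor is broken given its anomaly score.

    ASSUMPTION: scores of working and broken sensors are each Gaussian with
    the (mean, std) pairs given by `working` and `broken`. These values are
    hypothetical and would have to be estimated from data in practice.
    """
    like_working = gaussian_pdf(score, *working)
    like_broken = gaussian_pdf(score, *broken)
    numerator = prior_broken * like_broken
    evidence = numerator + (1.0 - prior_broken) * like_working
    if evidence == 0.0:
        # Both likelihoods underflowed; the score is uninformative here,
        # so fall back to the prior.
        return prior_broken
    return numerator / evidence


if __name__ == "__main__":
    for s in (0.2, 0.5, 0.8):
        print(f"score={s:.1f} -> P(broken)={posterior_broken(s):.4f}")
```

The point of the sketch is that the model is placed on a one-dimensional score rather than on the joint distribution of raw readings, which is the shift the abstract describes.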
To enhance the fault-detection capability of the anomaly detection algorithms, the raw sensor data will be detrended and assembled into multiple views that highlight various correlations among sensor values. The project will develop a novel View-Anomaly-Diagnosis (VAD) framework in which anomaly detection algorithms are applied to the tuples in each view, and then the anomaly scores are combined via a probabilistic diagnostic model to infer which sensors are broken and which are functioning correctly. The project will study how good the detrending models need to be in order to enhance the accuracy of anomaly detection. The new anomaly detection algorithms are based on a new anomaly detection principle: "anomaly detection by overfitting". Existing methods fit a statistical model to "normal" behavior and then identify data points that do not fit well ("are underfit") and mark them as anomalies. The new principle measures how easy it is to "overfit" a statistical model that separates candidate anomalies from the rest of the data. The project will develop new algorithms based on this principle and study how they relate to existing methods of anomaly detection by underfitting. The VAD framework will be implemented in the SENSOR-DX system: a series of Kepler workflows that provide support for connecting a new sensor network, training the detrending and anomaly detection models, performing real-time anomaly detection, and repairing bad sensor readings using predictive models. SENSOR-DX will also support semantic matching of new sensor data streams by extending the EnvThs controlled vocabulary thesaurus. For further information, see the project website at http://tahmo.org/sensor-dx
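The view-then-diagnose idea in the abstract can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the project's VAD framework: it assumes each view is a tuple of sensor indices, that a view's anomaly score comes from a "broken" Gaussian if any sensor in the view is broken and from a "working" Gaussian otherwise, and that sensors fail independently with a fixed prior. The function name `diagnose` and all parameter values are hypothetical. Exact enumeration over sensor states is used, which is only practical for a handful of sensors.

```python
import itertools
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def diagnose(views, scores, n_sensors, prior_broken=0.05,
             working=(0.2, 0.1), broken=(0.8, 0.15)):
    """Return P(sensor i is broken | all view scores) for each sensor.

    ASSUMPTIONS (illustrative only): a view is anomalous if any of its
    sensors is broken; view scores are Gaussian given that status; sensors
    fail independently with probability `prior_broken`.
    """
    marginals = [0.0] * n_sensors
    total = 0.0
    # Enumerate every joint assignment of broken/working to the sensors.
    for state in itertools.product([False, True], repeat=n_sensors):
        weight = 1.0
        for is_broken in state:
            weight *= prior_broken if is_broken else 1.0 - prior_broken
        for view, score in zip(views, scores):
            params = broken if any(state[i] for i in view) else working
            weight *= gaussian_pdf(score, *params)
        total += weight
        for i, is_broken in enumerate(state):
            if is_broken:
                marginals[i] += weight
    if total == 0.0:
        raise ValueError("all joint states have zero likelihood")
    return [m / total for m in marginals]


if __name__ == "__main__":
    # Three sensors, three pairwise views; the two views that contain
    # sensor 1 look anomalous, the view without it looks normal.
    views = [(0, 1), (1, 2), (0, 2)]
    scores = [0.8, 0.8, 0.2]
    print(diagnose(views, scores, n_sensors=3))
```

In this example the posterior mass concentrates on sensor 1, because it is the only sensor shared by both high-scoring views and absent from the low-scoring one. A real network would need approximate inference rather than enumeration.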
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Thomas Dietterich
Collaborative Research: CompSustNet: Expanding the Horizons of Computational Sustainability
- Award number: 1521687
- Fiscal year: 2015
- Amount: $635,500
- Award type: Continuing Grant
CyberSEES: Type 2: Computing and Visualizing Optimal Policies for Ecosystem Management
- Award number: 1331932
- Fiscal year: 2013
- Amount: $635,500
- Award type: Standard Grant
Collaborative Research: AVATOL - Next Generation Phenomics for the Tree of Life
- Award number: 1208272
- Fiscal year: 2012
- Amount: $635,500
- Award type: Standard Grant
Collaborative Research: CDI-Type II: BirdCast: Novel Machine Learning Methods for Understanding Continent-Scale Bird Migration
- Award number: 1125228
- Fiscal year: 2011
- Amount: $635,500
- Award type: Standard Grant
II-EN: A compute cluster and software tools for Monte-Carlo methods in artificial intelligence
- Award number: 0958482
- Fiscal year: 2010
- Amount: $635,500
- Award type: Standard Grant
Collaborative Research: Computational Sustainability: Computational Methods for a Sustainable Environment, Economy, and Society
- Award number: 0832804
- Fiscal year: 2008
- Amount: $635,500
- Award type: Continuing Grant
RI: Machine Learning for Robust Recognition of Invertebrate Specimens in Ecological Science
- Award number: 0705765
- Fiscal year: 2007
- Amount: $635,500
- Award type: Standard Grant
Off-the-shelf Learning Algorithms for Structural Supervised Learning
- Award number: 0307592
- Fiscal year: 2003
- Amount: $635,500
- Award type: Continuing Grant
SGER: Exploiting Contextual Knowledge to Design Input Representations for Machine Learning
- Award number: 0335525
- Fiscal year: 2003
- Amount: $635,500
- Award type: Standard Grant
Student Participant Support for the International Conference on Machine Learning 2003
- Award number: 0331758
- Fiscal year: 2003
- Amount: $635,500
- Award type: Standard Grant
Similar grants from the National Natural Science Foundation of China
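Quantum surface plasmons of medium-sized metal nanoparticles studied with machine learning and classical electrodynamics
- Award number: 22373002
- Approval year: 2023
- Amount: CNY 500,000
- Award type: General Program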
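Developing a source apportionment method for atmospheric semi-/intermediate-volatility organic compounds based on volatility distribution and oxidation correction
- Award number: 42377095
- Approval year: 2023
- Amount: CNY 490,000
- Award type: General Program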
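Dark matter distribution near intermediate-mass black holes and detection of gravitational-wave echoes from their IMRI systems
- Award number: 12365008
- Approval year: 2023
- Amount: CNY 320,000
- Award type: Regional Science Fund Project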
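Plasmon-enhanced optical response in composite low-dimensional topological materials
- Award number: 12374288
- Approval year: 2023
- Amount: CNY 520,000
- Award type: General Program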
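Physical mechanisms of rapid intensification of asymmetric tropical cyclones under moderate vertical wind shear
- Award number: 42305004
- Approval year: 2023
- Amount: CNY 300,000
- Award type: Young Scientists Fund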
Similar overseas grants
III: Medium: Collaborative Research: From Open Data to Open Data Curation
- Award number: 2420691
- Fiscal year: 2024
- Amount: $635,500
- Award type: Standard Grant
Collaborative Research: III: Medium: New Machine Learning Empowered Nanoinformatics System for Advancing Nanomaterial Design
- Award number: 2402311
- Fiscal year: 2023
- Amount: $635,500
- Award type: Standard Grant
Collaborative Research: IIS: III: MEDIUM: Learning Protein-ish: Foundational Insight on Protein Language Models for Better Understanding, Democratized Access, and Discovery
- Award number: 2310114
- Fiscal year: 2023
- Amount: $635,500
- Award type: Standard Grant
Collaborative Research: III: Medium: Towards Effective Detection and Mitigation for Shortcut Learning: A Data Modeling Framework
- Award number: 2310262
- Fiscal year: 2023
- Amount: $635,500
- Award type: Standard Grant
Collaborative Research: III: Medium: Towards Effective Detection and Mitigation for Shortcut Learning: A Data Modeling Framework
- Award number: 2310260
- Fiscal year: 2023
- Amount: $635,500
- Award type: Standard Grant