CSR: Medium: Collaborative Research: Holistic, Cross-Site, Hybrid System Anomaly Debugging for Large Scale Hosting Infrastructures
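Basic Information
- Award number: 1514256
- Principal investigator:
- Amount: $282,000
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2015
- Funding country: United States
- Project period: 2015-08-01 to 2020-07-31
- Project status: Completed
- Source:
- Keywords:
Project Abstract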
Large-scale shared hosting infrastructures, such as multi-tenant cloud computing systems, have become increasingly popular because they let users lease resources on demand in a cost-effective way. Because multiple tenants may share computing resources, hosting infrastructures are complex systems that are prone to various system anomalies. Although software developers often perform rigorous offline testing, many subtle bugs manifest only during large-scale production runs. Many anomalies, such as those in which the system does not crash but fails to behave as expected, are hard to reproduce and diagnose using existing techniques. Existing system anomaly diagnosis work can be broadly classified into two categories: 1) black-box schemes, which do not require source code and are suitable for online, production-site diagnosis, and 2) white-box schemes, which require source code and expensive code instrumentation and are suitable for offline, development-site diagnosis. Although white-box schemes provide fine-grained diagnosis, operators of large-scale production hosting infrastructures are reluctant to adopt them because of their high overhead and intrusive system recording.
The overarching objective of this project is to explore an innovative cross-site system anomaly debugging approach that intelligently integrates production-site black-box diagnosis with development-site white-box debugging into a more powerful hosting infrastructure debugging framework. This project will develop techniques for offline, development-site white-box debugging that use the production-site fault inference results as guidance to find the exact causes of anomalies. The project will focus on diagnosing non-crashing system anomalies (e.g., performance degradation, service outage, software hang, unexpected halt) that are common in real-world hosting infrastructures but are difficult to debug using existing techniques.
Techniques developed in this project will have a significant impact on the robustness of real-world hosting infrastructures. The PIs will develop new course modules on hosting infrastructure debugging for the graduate and undergraduate classes they regularly teach. This project will also develop programming courseware based on its research prototypes. The PIs will serve as role models and use a set of outreach activities to recruit more female students to pursue systems research. The PIs will disseminate their results and collected data broadly through publication and technology transfer. Developed software artifacts and experimental datasets will be released for public use.
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Shan Lu
Dyadobacter chenhuakuii sp. nov., Dyadobacter chenwenxiniae sp. nov., and Dyadobacter fanqingshengii sp. nov., isolated from soil of the Qinghai-Tibetan Plateau.
- DOI: 10.1099/ijsem.0.005747
- Published: 2023-03-01
- Journal:
- Impact factor: 2.8
- Authors: Caiyun Ma; Gui Zhang; Yanpeng Cheng; Wenjing Lei; Caixin Yang; Yue Liu; Jing Yang; Shan Lu; D. Jin; Liyun Liu; Jianguo Xu
- Corresponding author: Jianguo Xu
Original Investigation Open Access
- DOI:
- Published:
- Journal:
- Impact factor: 0
- Authors: Chun; Qian Yu; Pei Yu; Tie; Qiumei Zhang; Shan Lu; De
- Corresponding author: De
Transplantation of Flk-1+ human bone marrow-derived mesenchymal stem cells promotes behavioral recovery and anti-inflammatory and angiogenesis effects in an intracerebral hemorrhage rat model.
- DOI: 10.3892/ijmm.2013.1290
- Published: 2013-05-01
- Journal:
- Impact factor: 5.4
- Authors: X. Bao; Fu; Shan Lu; Q. Han; M. Feng; Jun; Gui; R. Zhao; Renzhi Wang
- Corresponding author: Renzhi Wang
Fudania jinshanensis gen. nov., sp. nov., isolated from faeces of the Tibetan antelope (Pantholops hodgsonii) in China.
- DOI: 10.1099/ijsem.0.003586
- Published: 2019-09-01
- Journal:
- Impact factor: 2.8
- Authors: Wentao Zhu; Jing Yang; Shan Lu; X. Lai; D. Jin; Ji Pu; Xiaoxia Wang; Yuyuan Huang; Sihui Zhang; Ying Huang; Yuanmeihui Tao; Zhihong Ren; Xiaomin Wu; Xiaoyan Zhang; Jianqing Xu; Jianguo Xu
- Corresponding author: Jianguo Xu
Ontogeny of Synovial Macrophages and the Roles of Synovial Macrophages From Different Origins in Arthritis
- DOI: 10.3389/fimmu.2019.01146
- Published: 2019-05-24
- Journal:
- Impact factor: 7.3
- Authors: Jiajie Tu; Wenming Hong; Yawei Guo; Pengying Zhang; Yilong Fang; Xinming Wang; Xiaoyun Chen; Shan Lu; Wei Wei
- Corresponding author: Wei Wei
Other grants by Shan Lu
CSR: Medium: Improving the Interface between Machine Learning and Software Systems
- Award number: 2313190
- Fiscal year: 2023
- Funding amount: $282,000
- Project type: Standard Grant
CNS Core: Medium: Accurate Anytime Learning for Energy and Timeliness in Software Systems
- Award number: 1956180
- Fiscal year: 2020
- Funding amount: $282,000
- Project type: Continuing Grant
NSF Student Travel Grant for 2020 ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS)
- Award number: 1936025
- Fiscal year: 2020
- Funding amount: $282,000
- Project type: Standard Grant
Student Travel Support for 2016 USENIX Annual Technical Conference
- Award number: 1632170
- Fiscal year: 2016
- Funding amount: $282,000
- Project type: Standard Grant
BIGDATA: Collaborative Research: F: Holistic Optimization of Data-Driven Applications
- Award number: 1546543
- Fiscal year: 2015
- Funding amount: $282,000
- Project type: Standard Grant
CAREER: Combating Performance Bugs in Software Systems
- Award number: 1514189
- Fiscal year: 2014
- Funding amount: $282,000
- Project type: Continuing Grant
XPS: FULL: CCA: Production-Run Failure Recovery Based Approach to Reliable Parallel Software
- Award number: 1439091
- Fiscal year: 2014
- Funding amount: $282,000
- Project type: Standard Grant
CAREER: Combating Performance Bugs in Software Systems
- Award number: 1054616
- Fiscal year: 2011
- Funding amount: $282,000
- Project type: Continuing Grant
Fighting Concurrency Bugs through Effect-Oriented Approaches
- Award number: 1018180
- Fiscal year: 2010
- Funding amount: $282,000
- Project type: Standard Grant
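Similar NSFC grants
Development of a source apportionment method for atmospheric semi-/intermediate-volatility organic compounds based on volatility distribution and oxidation correction
- Award number: 42377095
- Year approved: 2023
- Funding amount: CNY 490,000
- Project type: General Program
Quantum surface plasmons of medium-sized metal nanoparticles studied with machine learning and classical electrodynamics
- Award number: 22373002
- Year approved: 2023
- Funding amount: CNY 500,000
- Project type: General Program
Dark matter distribution near intermediate-mass black holes and detection of gravitational-wave echoes from their IMRI systems
- Award number: 12365008
- Year approved: 2023
- Funding amount: CNY 320,000
- Project type: Regional Science Fund Project
Plasmon-enhanced optical response in composite low-dimensional topological materials
- Award number: 12374288
- Year approved: 2023
- Funding amount: CNY 520,000
- Project type: General Program
Physical mechanisms of rapid intensification of asymmetric tropical cyclones under moderate vertical wind shear
- Award number: 42305004
- Year approved: 2023
- Funding amount: CNY 300,000
- Project type: Young Scientists Fund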
Similar overseas grants
Collaborative Research: CSR: Medium: Scaling Secure Serverless Computing on Heterogeneous Datacenters
- Award number: 2312207
- Fiscal year: 2023
- Funding amount: $282,000
- Project type: Continuing Grant
Collaborative Research: CSR: Medium: MemDrive: Memory-Driven Full-Stack Collaboration for Autonomous Embedded Systems
- Award number: 2312397
- Fiscal year: 2023
- Funding amount: $282,000
- Project type: Continuing Grant
Collaborative Research: CSR: Medium: Adaptive Environmental Awareness for Collaborative Augmented Reality
- Award number: 2312762
- Fiscal year: 2023
- Funding amount: $282,000
- Project type: Continuing Grant
Collaborative Research: CSR: Core: Medium: Scaling Unix/Linux Shell Programs
- Award number: 2312346
- Fiscal year: 2023
- Funding amount: $282,000
- Project type: Continuing Grant
Collaborative Research: CSR: Medium: Towards A Unified Memory-centric Computing System with Cross-layer Support
- Award number: 2310423
- Fiscal year: 2023
- Funding amount: $282,000
- Project type: Continuing Grant