CSR: Medium: Combating Distributed Concurrency Bugs in Cloud Systems
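Basic information
- Award number: 1563956
- Amount: $800,000
- Host institution country: United States
- Award type: Continuing Grant
- Fiscal year: 2016
- Funding country: United States
- Project period: 2016-08-01 to 2021-07-31
- Project status: Completed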
Project abstract
DCBA: Distributed Concurrency Bugs Annihilation
Software systems are growing more complex, creating reliability issues that cause millions of dollars in economic loss. Beyond local software, distributed cloud software infrastructures (i.e., cloud systems) have emerged as a dominant backbone for many modern applications. Users expect high reliability from these systems, but guaranteeing it is challenging. Cloud systems run on hundreds or thousands of machines, execute complicated distributed protocols, and face a variety of hardware faults. This combination makes cloud systems prone to distributed concurrency bugs, which can cause catastrophic failures such as downtime, data loss, and data inconsistency. The Distributed Concurrency Bugs Annihilation (DCBA) project will address this issue and bring direct benefits to society: users in many areas (science, healthcare, business, education, military, and government) increasingly use cloud computing services and demand high availability and predictability. Combating distributed concurrency bugs is an important ingredient of that success.
Distributed concurrency bugs are caused by the non-deterministic order of distributed events such as message arrivals, faults, and reboots. DCBA will find, remove, and prevent buggy interleavings of concurrent distributed events by developing four approaches: (1) full, automated, and deep distributed system model checkers; (2) fast inference, detection, testing, and fixing of order violations; (3) runtime statistical debugging, prevention, and recovery; and (4) design advancements that reduce the likelihood of distributed concurrency bugs appearing.
This DCBA project will advance the state of cloud dependability research. Existing literature on distributed systems reliability focuses on monitoring, post-mortem debugging, deterministic record and replay, and verifiable programming language frameworks. The DCBA project will advance approaches to model checking, bug detection, bug fixing, runtime debugging, prevention, and recovery. As more organizations build distributed systems on farms of machines and services in the cloud era, it is time for the dependability community to address distributed concurrency bugs systematically and comprehensively. The DCBA initiative will have a profound impact on future distributed cloud systems.
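The abstract attributes distributed concurrency bugs to the non-deterministic order of events such as message arrivals. As a purely illustrative sketch (not the DCBA project's tooling), the toy example below enumerates every arrival order of three hypothetical messages at one node and reports the orders that violate an assumed invariant. The event names, the `run` handler, and the invariant "a commit must observe the value 2" are all assumptions made for this illustration.

```python
from itertools import permutations

# Hypothetical messages arriving at a single node (illustrative only).
EVENTS = (("write", 1), ("write", 2), ("commit", None))


def run(order):
    """Apply events in the given arrival order and return the committed value."""
    pending, committed = None, None
    for kind, value in order:
        if kind == "write":
            pending = value
        elif kind == "commit":
            # Toy bug: commits whatever has arrived so far,
            # without waiting for the outstanding writes.
            committed = pending
    return committed


def find_buggy_interleavings(events, expected):
    """Try every arrival order; return the orders that violate the invariant."""
    return [order for order in permutations(events) if run(order) != expected]


buggy = find_buggy_interleavings(EVENTS, expected=2)
for order in buggy:
    print("invariant violated by arrival order:", order)
```

Exhaustive enumeration is only feasible for tiny event sets like this one, since the number of orders grows factorially with the number of events. Practical checkers need ways to prune or prioritize the interleavings they explore.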
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Haryadi Gunawi
Collaborative Research: PPoSS: LARGE: ScaleStuds: Foundations for Correctness Checkability and Performance Predictability of Systems at Scale
- Award number: 2119184
- Fiscal year: 2021
- Amount: $800,000
- Award type: Continuing Grant
PPoSS: Planning: CP2: Towards Systems Correctness Checkability and Performance Predictability at Scale
- Award number: 2028427
- Fiscal year: 2020
- Amount: $800,000
- Award type: Standard Grant
USENIX FAST 2017 NSF Student Travel Support
- Award number: 1727380
- Fiscal year: 2017
- Amount: $800,000
- Award type: Standard Grant
CSR: Small: BreezeFS: File System Transformation for Cloud and Multistore Era
- Award number: 1526304
- Fiscal year: 2015
- Amount: $800,000
- Award type: Standard Grant
CAREER: DrCloud: Drill-Ready Cloud Computing
- Award number: 1350499
- Fiscal year: 2014
- Amount: $800,000
- Award type: Continuing Grant
XPS: CLCCA: LigHTS: Lagging-Hardware Tolerant Systems
- Award number: 1336580
- Fiscal year: 2013
- Amount: $800,000
- Award type: Standard Grant
DC: Small: Collaborative Research: DARE: Declarative and Scalable Recovery
- Award number: 1321958
- Fiscal year: 2012
- Amount: $800,000
- Award type: Standard Grant
DC: Small: Collaborative Research: DARE: Declarative and Scalable Recovery
- Award number: 1016924
- Fiscal year: 2010
- Amount: $800,000
- Award type: Standard Grant
Similar grants from the National Natural Science Foundation of China (NSFC)
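Plasmon-enhanced optical response in composite low-dimensional topological materials
- Award number: 12374288
- Approval year: 2023
- Amount: CNY 520,000
- Award type: General Program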
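Physical mechanisms of rapid intensification of asymmetric tropical cyclones under moderate vertical wind shear
- Award number: 42305004
- Approval year: 2023
- Amount: CNY 300,000
- Award type: Young Scientists Fund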
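Development of a source apportionment method for atmospheric semi-/intermediate-volatility organic compounds based on volatility distribution and oxidation correction
- Award number: 42377095
- Approval year: 2023
- Amount: CNY 490,000
- Award type: General Program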
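Quantum surface plasmons of medium-sized metal nanoparticles studied with machine learning and classical electrodynamics
- Award number: 22373002
- Approval year: 2023
- Amount: CNY 500,000
- Award type: General Program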
Multiscale algorithms and numerical simulation of plasma in tokamak divertors
- Award number: 12371432
- Approval year: 2023
- Amount: CNY 435,000
- Award type: General Program
Similar overseas grants
Collaborative Research: CyberTraining: Implementation: Medium: Training Users, Developers, and Instructors at the Chemistry/Physics/Materials Science Interface
- Award number: 2321102
- Fiscal year: 2024
- Amount: $800,000
- Award type: Standard Grant
RII Track-4:@NASA: Bluer and Hotter: From Ultraviolet to X-ray Diagnostics of the Circumgalactic Medium
- Award number: 2327438
- Fiscal year: 2024
- Amount: $800,000
- Award type: Standard Grant
Collaborative Research: Topological Defects and Dynamic Motion of Symmetry-breaking Tadpole Particles in Liquid Crystal Medium
- Award number: 2344489
- Fiscal year: 2024
- Amount: $800,000
- Award type: Standard Grant
Collaborative Research: AF: Medium: The Communication Cost of Distributed Computation
- Award number: 2402836
- Fiscal year: 2024
- Amount: $800,000
- Award type: Continuing Grant
Collaborative Research: AF: Medium: Foundations of Oblivious Reconfigurable Networks
- Award number: 2402851
- Fiscal year: 2024
- Amount: $800,000
- Award type: Continuing Grant