CSR: Medium: Combating Distributed Concurrency Bugs in Cloud Systems

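Basic Information

  • Award number:
    1563956
  • Principal investigator:
  • Amount:
    $800,000
  • Host institution:
  • Host institution country:
    United States
  • Award type:
    Continuing Grant
  • Fiscal year:
    2016
  • Funding country:
    United States
  • Project period:
    2016-08-01 to 2021-07-31
  • Project status:
    Completed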

Project Abstract

DCBA: Distributed Concurrency Bugs Annihilation

Software systems are growing more complex, creating reliability problems that cause millions of dollars in economic loss. Beyond local software, distributed cloud software infrastructures (i.e., cloud systems) have emerged as a dominant backbone for many modern applications. Users expect high reliability from these systems, but guaranteeing it has proved challenging. Cloud systems run on hundreds or thousands of machines, execute complicated distributed protocols, and face a variety of hardware faults. This combination makes cloud systems prone to distributed concurrency bugs, which can cause catastrophic failures such as downtime, data loss, and data inconsistency. The Distributed Concurrency Bugs Annihilation (DCBA) project will address this important issue and bring many direct benefits to society: users from many areas (science, healthcare, business, education, military, and government) increasingly use cloud computing services and demand high availability and predictability. Combating distributed concurrency bugs is an important ingredient of such success.

Distributed concurrency bugs are caused by the non-deterministic ordering of distributed events such as message arrivals, faults, and reboots. The DCBA project will find, remove, and prevent buggy interleavings of concurrent distributed events by developing four approaches: (1) full, automated, and deep distributed system model checkers, (2) fast inference, detection, testing, and fixing of order violations, (3) runtime statistical debugging, prevention, and recovery, and (4) design advancements that reduce the likelihood that distributed concurrency bugs appear.

The DCBA project will advance the state of cloud dependability research. Existing literature on distributed systems reliability focuses on monitoring, post-mortem debugging, deterministic record and replay, and verifiable programming language frameworks. The DCBA project will introduce advancements in approaches related to model checking, bug detection, bug fixing, runtime debugging, prevention, and recovery. As more organizations build distributed systems on farms of machines and services in the cloud era, it is time for the dependability community to address distributed concurrency bugs in a systematic and comprehensive manner. The DCBA initiative will have a profound impact on future distributed cloud systems.

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other Grants by Haryadi Gunawi

Collaborative Research: PPoSS: LARGE: ScaleStuds: Foundations for Correctness Checkability and Performance Predictability of Systems at Scale
  • Award number:
    2119184
  • Fiscal year:
    2021
  • Award amount:
    $800,000
  • Award type:
    Continuing Grant
PPoSS: Planning: CP2: Towards Systems Correctness Checkability and Performance Predictability at Scale
  • Award number:
    2028427
  • Fiscal year:
    2020
  • Award amount:
    $800,000
  • Award type:
    Standard Grant
USENIX FAST 2017 NSF Student Travel Support
  • Award number:
    1727380
  • Fiscal year:
    2017
  • Award amount:
    $800,000
  • Award type:
    Standard Grant
CSR: Small: BreezeFS: File System Transformation for Cloud and Multistore Era
  • Award number:
    1526304
  • Fiscal year:
    2015
  • Award amount:
    $800,000
  • Award type:
    Standard Grant
CAREER: DrCloud: Drill-Ready Cloud Computing
  • Award number:
    1350499
  • Fiscal year:
    2014
  • Award amount:
    $800,000
  • Award type:
    Continuing Grant
XPS: CLCCA: LigHTS: Lagging-Hardware Tolerant Systems
  • Award number:
    1336580
  • Fiscal year:
    2013
  • Award amount:
    $800,000
  • Award type:
    Standard Grant
DC: Small: Collaborative Research: DARE: Declarative and Scalable Recovery
  • Award number:
    1321958
  • Fiscal year:
    2012
  • Award amount:
    $800,000
  • Award type:
    Standard Grant
DC: Small: Collaborative Research: DARE: Declarative and Scalable Recovery
  • Award number:
    1016924
  • Fiscal year:
    2010
  • Award amount:
    $800,000
  • Award type:
    Standard Grant

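Similar NSFC Grants

Plasmon-Enhanced Optical Response in Composite Low-Dimensional Topological Materials
  • Award number:
    12374288
  • Approval year:
    2023
  • Award amount:
    CNY 520,000
  • Award type:
    General Program
The Missing Medium-Sized Enterprises from the Perspective of the Market for Management and Intervention in the Division of Labor: Stylized Facts, Underlying Mechanisms, and Optimization Paths
  • Award number:
    72374217
  • Approval year:
    2023
  • Award amount:
    CNY 410,000
  • Award type:
    General Program
Multiscale Algorithms and Numerical Simulation of Plasma in Tokamak Divertors
  • Award number:
    12371432
  • Approval year:
    2023
  • Award amount:
    CNY 435,000
  • Award type:
    General Program
Dark Matter Distribution near Intermediate-Mass Black Holes and Detection of Gravitational-Wave Echoes from IMRI Systems
  • Award number:
    12365008
  • Approval year:
    2023
  • Award amount:
    CNY 320,000
  • Award type:
    Regional Science Fund Project
Physical Mechanisms of Rapid Intensification of Asymmetric Tropical Cyclones under Moderate Vertical Wind Shear
  • Award number:
    42305004
  • Approval year:
    2023
  • Award amount:
    CNY 300,000
  • Award type:
    Young Scientists Fund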

Similar Overseas Grants

RII Track-4:@NASA: Bluer and Hotter: From Ultraviolet to X-ray Diagnostics of the Circumgalactic Medium
  • Award number:
    2327438
  • Fiscal year:
    2024
  • Award amount:
    $800,000
  • Award type:
    Standard Grant
Collaborative Research: Topological Defects and Dynamic Motion of Symmetry-breaking Tadpole Particles in Liquid Crystal Medium
  • Award number:
    2344489
  • Fiscal year:
    2024
  • Award amount:
    $800,000
  • Award type:
    Standard Grant
Collaborative Research: AF: Medium: The Communication Cost of Distributed Computation
  • Award number:
    2402836
  • Fiscal year:
    2024
  • Award amount:
    $800,000
  • Award type:
    Continuing Grant
Collaborative Research: AF: Medium: Foundations of Oblivious Reconfigurable Networks
  • Award number:
    2402851
  • Fiscal year:
    2024
  • Award amount:
    $800,000
  • Award type:
    Continuing Grant
Collaborative Research: CIF: Medium: Snapshot Computational Imaging with Metaoptics
  • Award number:
    2403122
  • Fiscal year:
    2024
  • Award amount:
    $800,000
  • Award type:
    Standard Grant