NeTS:Small:Understanding the Impact of Unreliable Hardware on the Resilience of Networked Systems
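Basic Information
- Award number: 1117049
- Principal investigator:
- Amount: $440,000
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2011
- Funding country: United States
- Project period: 2011-08-15 to 2015-07-31
- Project status: Completed
- Source:
- Keywords:
Project Abstract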
Networked systems have always been designed to operate even in the presence of failures, especially in communication links and storage. Until recently, other components of such systems had relatively low failure probabilities, and for most networked systems the desired level of resilience could be achieved using minimal redundancy added in an ad hoc manner. Two opposing trends are likely to make the task of achieving resilience significantly more difficult in the coming years: (a) increasing hardware failure probabilities: with the move towards finer nano-scale fabrication, chips are increasingly vulnerable to soft errors caused by external noise and are increasingly likely to fail early due to fatigue; (b) higher resilience requirements: as critical services continue to migrate to clouds, service providers are compelled into more stringent service-level agreements (SLAs), including higher reliability, higher availability, and tighter guarantees on service times. This combination can dramatically increase the overhead of existing approaches for achieving the desired level of resilience.
Intellectual merit: The first outcome of this project will be a holistic roadmap for the resilience of networked systems. This resilience roadmap will take the nano-scale CMOS roadmaps (trends in chip cost, functionality, performance, power, and resilience that can be attained at the chip level) and attempt to realistically project the future cost of currently used networking and systems techniques for achieving the desired level of resilience. The second outcome of this project will be the development of resilience methods that scale gracefully in the face of increasing hardware failures. Such techniques will use novel partitioned redundancy strategies that achieve reliability at different levels across hardware and software layers.
Broader impacts: The resilience roadmap will provide unprecedented understanding of the trends in resilience and a uniquely realistic assessment of challenges and opportunities. This will significantly influence research in the hardware as well as the networking communities. A systematic design of scalable resilience methods will lead to significantly higher levels of resilience, lower costs, both capital (equipment) and recurring (especially energy), and/or higher levels of performance. The utilitarian gains to society from the proposed project are likely to be substantial, since networked systems now constitute one of our most critical infrastructures and consume an increasingly large proportion of our resources. This project will draw upon two different disciplines, hardware architecture and networked systems, and will involve detailed case studies and the development of completely new theory and techniques. It will therefore provide unique educational and training opportunities for students and working professionals in these fields.
Budget Impact Statement: The item numbers in this paragraph refer to those in Figure 9 and Section 3.2 (entitled 'Proposed Research Tasks and Plan') of our original proposal. We will undertake all tasks and sub-tasks proposed in item-1 (and all its sub-items). In item-2, we will undertake the development of a general framework to consider all basic redundancy schemes and alternative ways of deploying them (sub-item-2.1). We will also characterize the associated tradeoffs (sub-item-2.2) and the consequences of realistic constraints (sub-item-2.3). However, we will pursue the development of prototype tools (as outlined in sub-item-2.4) only to the extent necessary to demonstrate the benefits of our approach and to conduct case studies (described in item-3). Finally, we will undertake the case studies as originally proposed in item-3.
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Sandeep Gupta
Case Studies on Biological Treatment of Tannery Effluents in India
- DOI: 10.1080/10473289.2003.10466250
- Year published: 2003
- Journal:
- Impact factor: 2.7
- Authors: V. Tare; Sandeep Gupta; P. Bose
- Corresponding author: P. Bose
Collaborative circuit designs using the CRAFT repository
- DOI: 10.1016/j.future.2018.01.018
- Year published: 2018
- Journal:
- Impact factor: 0
- Authors: Adam Brinckman; E. Deelman; Sandeep Gupta; J. Nabrzyski; Soowang Park; Rafael Ferreira da Silva; I. Taylor; K. Vahi
- Corresponding author: K. Vahi
Forebrain roof plate morphogenesis and hippocampus development in the chick embryo.
- DOI: 10.1387/ijdb.190143js
- Year published: 2020
- Journal:
- Impact factor: 0
- Authors: Sandeep Gupta; N. Udaykumar; J. Sen
- Corresponding author: J. Sen
Continental like crust beneath the Andaman Island through joint inversion of receiver function and surface wave from ambient seismic noise
- DOI:
- Year published: 2016
- Journal:
- Impact factor: 0
- Authors: Sandeep Gupta; Kajaljyoti Borah; G. Saha
- Corresponding author: G. Saha
The role of crystallized magma and crustal fluids in intraplate seismic activity in Talala region (Saurashtra), Western India: An insight from local earthquake tomography
- DOI:
- Year published: 2016
- Journal:
- Impact factor: 0
- Authors: P. Mahesh; Sandeep Gupta
- Corresponding author: Sandeep Gupta
Other grants held by Sandeep Gupta
SHF:Small:New models, design, and test methods for long-term aging of nanometer VLSI
- Award number: 1719047
- Fiscal year: 2017
- Award amount: $440,000
- Project type: Standard Grant
Theory, methods, and tools for cross-layered design of uniquely efficient failure-resistant systems
- Award number: 1255951
- Fiscal year: 2013
- Award amount: $440,000
- Project type: Continuing Grant
Verification of closed loop feedback/feed-forward control actions for safe medical devices
- Award number: 1231590
- Fiscal year: 2012
- Award amount: $440,000
- Project type: Standard Grant
CSR: Small: Understanding and Modeling the Trade-Offs in Data Centers for Next-Generation Sustainable Management
- Award number: 1218505
- Fiscal year: 2012
- Award amount: $440,000
- Project type: Standard Grant
SHB: Small: Toward Verifying Smart-Health Infrastructure Safety from their Impact on Human Physiology
- Award number: 1116385
- Fiscal year: 2011
- Award amount: $440,000
- Project type: Standard Grant
TC:Small:EDICT: Evaluation and Design of IC's for Trustworthiness
- Award number: 1018937
- Fiscal year: 2010
- Award amount: $440,000
- Project type: Standard Grant
II-EN: BlueTool: Infrastructure for Innovative Cyberphysical Data Center Management Research
- Award number: 0855277
- Fiscal year: 2009
- Award amount: $440,000
- Project type: Continuing Grant
CT-ISG: Physiological Value based Security for Body Area Networks
- Award number: 0831544
- Fiscal year: 2008
- Award amount: $440,000
- Project type: Standard Grant
EMT/MISC: Theory and methods for design and synthesis of approximate logic circuits and systems: a paradigm for emerging technologies
- Award number: 0829946
- Fiscal year: 2008
- Award amount: $440,000
- Project type: Standard Grant
CSR-DMSS, SM: Next-Generation Thermal-Aware, Energy-Efficient Resource Management for Data Centers
- Award number: 0834797
- Fiscal year: 2008
- Award amount: $440,000
- Project type: Standard Grant
Similar grants from the National Natural Science Foundation of China (NSFC)
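Role and mechanism of theranostic PS-Hc@MB combined with training in mediating rehabilitation from cerebral small vessel disease
- Award number: 82372561
- Approval year: 2023
- Award amount: CNY 490,000
- Project type: General Program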
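Mechanism by which the MECOM/HBB pathway mediates abnormal heme metabolism and inhibits ferroptosis of tumor-initiating cells in non-small cell lung cancer
- Award number: 82373082
- Approval year: 2023
- Award amount: CNY 490,000
- Project type: General Program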
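Role and mechanism of cerebral small vessel disease in gait disorders of Parkinson's disease, investigated through cholinergic cortical projection fibers
- Award number: 82301663
- Approval year: 2023
- Award amount: CNY 300,000
- Project type: Young Scientists Fund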
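Research on upper-bound estimates for small prime solutions of Diophantine equations
- Award number: 12301005
- Approval year: 2023
- Award amount: CNY 300,000
- Project type: Young Scientists Fund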
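Role and mechanism of olfactory bulb microglial P2X7 receptors in Parkinson's disease-like changes arising in allergic rhinitis
- Award number: 82371119
- Approval year: 2023
- Award amount: CNY 490,000
- Project type: General Program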
Similar overseas grants
NeTS: Small: RUI: Dynamic Mathematical Modeling Towards Understanding Information Diffusion in Online Social Networks
- Award number: 1218212
- Fiscal year: 2012
- Award amount: $440,000
- Project type: Standard Grant
NeTS: Small: Understanding Communication Strategies for Ad hoc Networks
- Award number: 1117039
- Fiscal year: 2011
- Award amount: $440,000
- Project type: Standard Grant
NeTS: Small: Understanding Network Failure
- Award number: 1116904
- Fiscal year: 2011
- Award amount: $440,000
- Project type: Standard Grant
NeTS: Small: Collaborative Research: Understanding Traffic Dynamics in Cellular Data Networks and Applications to Resource Management
- Award number: 1117597
- Fiscal year: 2011
- Award amount: $440,000
- Project type: Standard Grant
NeTS: Small: Understanding, Managing and Trouble-Shooting the Evolving Cellular Data Networks
- Award number: 1117536
- Fiscal year: 2011
- Award amount: $440,000
- Project type: Standard Grant