CSR: Medium: Pythia: An Application Analysis and Online Modeling Based Prediction Framework for Scalable Resource Management

Basic information

Project abstract

Computer applications that process large amounts of information are becoming common in a variety of science domains, such as High-Speed Physics, Economics, Genomics, Astronomy, and Meteorology. The overall goal of this project is to design software tools and technologies that support such applications efficiently on advanced computing systems. The hardware used to build such systems often combines different types of resources, e.g., a conventional computer processor running alongside specialized graphics processing units, and this heterogeneity presents a major challenge when running the applications at the required large scale. A better understanding of application behavior on emerging hardware is key to sustaining these systems. To this end, the project designs and develops Pythia, software that models and predicts how applications would behave on given hardware. This information is then used to better utilize the resources and to achieve scalable, high-performance computing systems.

The intellectual value of this research involves three intermediate research goals.
1) Design an accurate application classifier using compile-time program analysis that captures workflow behavior and application characteristics, and provides detailed insights into expected runtime application interactions.
2) Design and develop an accurate simulation model that incorporates workflow and application characteristics into a heuristics engine to predict how the application will perform under given conditions and resources.
3) Design a distributed, flexible, efficient, and easy-to-use online oracle framework that captures the infrastructure heterogeneity and integrates with live systems to predict application behavior, which in turn can help guide application-attuned resource scheduling and management.
Completion of the project will create tools and technologies for realizing more efficient and scalable computing systems.

This work impacts a broad range of disciplines that regularly employ high-performance large-scale computing systems, especially for data-driven discovery. Consequently, use of Pythia will reduce the time-to-solution for modern and emerging applications, and therefore directly affect our way of life. The educational activities, which include recruiting and mentoring women and minority students, will help produce graduates with highly marketable skill sets. The research discoveries and software tools, which will be open source and publicly available, will be integrated into the educational curriculum to help capture the interest of the next generation of computer scientists.

Project outcomes

Journal articles (30)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Toward Transparent Data Management in Multi-Layer Storage Hierarchy of HPC Systems
iez: Resource Contention Aware Load Balancing for Large-Scale Parallel File Systems
Improving I/O Performance of HPC Application Using Intra-Job Scheduling.
MOANA: Modeling and Analyzing I/O Variability in Parallel System Experimental Design
  • DOI:
    10.1109/tpds.2019.2892129
  • Publication year:
    2019
  • Journal:
  • Impact factor:
    5.3
  • Authors:
    Cameron, Kirk W.;Anwar, Ali;Cheng, Yue;Xu, Li;Li, Bo;Ananth, Uday;Bernard, Jon;Jearls, Chandler;Lux, Thomas;Hong, Yili
  • Corresponding author:
    Hong, Yili
Toward scalable monitoring on large-scale storage for software defined cyberinfrastructure

Other grants by Ali Butt

Collaborative Research: CNS Core: Medium: HardLambda: A new FaaS Abstraction for Cross-Stack Resource Management in Disaggregated Datacenters
  • Award number:
    2106634
  • Fiscal year:
    2021
  • Award amount:
    $750,000
  • Project type:
    Standard Grant
SPX: Collaborative Research: Cross-stack Memory Optimizations for Boosting I/O Performance of Deep Learning HPC Applications
  • Award number:
    1919113
  • Fiscal year:
    2019
  • Award amount:
    $750,000
  • Project type:
    Standard Grant
Workshop on Data Storage Research Vision
  • Award number:
    1829096
  • Fiscal year:
    2018
  • Award amount:
    $750,000
  • Project type:
    Standard Grant
CSR: Small: Collaborative Research: Scalable Fine-Grained Cloud Monitoring for Empowering IoT
  • Award number:
    1615411
  • Fiscal year:
    2016
  • Award amount:
    $750,000
  • Project type:
    Standard Grant
Student Travel Support for IEEE 23rd International Symposium on Modelling, Analysis and Simulation of Computer and Telecommunication Systems (MASCOTS 2015)
  • Award number:
    1541504
  • Fiscal year:
    2015
  • Award amount:
    $750,000
  • Project type:
    Standard Grant
DC: Small: Collaborative Research: Exploring Energy-Reliability Trade-offs in Data Storage Systems
  • Award number:
    1016408
  • Fiscal year:
    2010
  • Award amount:
    $750,000
  • Project type:
    Standard Grant
Increasing Student Participation in Cluster Computing through IEEE Cluster 2010 Attendance
  • Award number:
    1049858
  • Fiscal year:
    2010
  • Award amount:
    $750,000
  • Project type:
    Standard Grant
CSR: Small: Towards Realizing Cloud HPC: An Adaptive Programming Model for Accelerator-based Clusters
  • Award number:
    1016793
  • Fiscal year:
    2010
  • Award amount:
    $750,000
  • Project type:
    Standard Grant
U.S. - Pakistan International Planning Visit: Economical Computing Substrate for Developing Regions
  • Award number:
    0940048
  • Fiscal year:
    2009
  • Award amount:
    $750,000
  • Project type:
    Standard Grant
CAREER: A Scalable Hierarchical Framework for High-Performance Data Storage
  • Award number:
    0746832
  • Fiscal year:
    2008
  • Award amount:
    $750,000
  • Project type:
    Continuing Grant

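Similar NSFC (National Natural Science Foundation of China) grants

Study of plasmon-enhanced optical response in composite low-dimensional topological materials
  • Award number:
    12374288
  • Approval year:
    2023
  • Award amount:
    CNY 520,000
  • Project type:
    General Program
Physical mechanisms of rapid intensification of asymmetric tropical cyclones under moderate vertical wind shear
  • Award number:
    42305004
  • Approval year:
    2023
  • Award amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund
Development of a source apportionment method for atmospheric semi-/intermediate-volatility organic compounds based on volatility distribution and oxidation correction
  • Award number:
    42377095
  • Approval year:
    2023
  • Award amount:
    CNY 490,000
  • Project type:
    General Program
Quantum surface plasmons of medium-sized metal nanoparticles studied with machine learning and classical electrodynamics
  • Award number:
    22373002
  • Approval year:
    2023
  • Award amount:
    CNY 500,000
  • Project type:
    General Program
Multiscale algorithms and numerical simulation of plasma in tokamak divertors
  • Award number:
    12371432
  • Approval year:
    2023
  • Award amount:
    CNY 435,000
  • Project type:
    General Program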

Similar overseas grants

Collaborative Research: CyberTraining: Implementation: Medium: Training Users, Developers, and Instructors at the Chemistry/Physics/Materials Science Interface
  • Award number:
    2321102
  • Fiscal year:
    2024
  • Award amount:
    $750,000
  • Project type:
    Standard Grant
RII Track-4:@NASA: Bluer and Hotter: From Ultraviolet to X-ray Diagnostics of the Circumgalactic Medium
  • Award number:
    2327438
  • Fiscal year:
    2024
  • Award amount:
    $750,000
  • Project type:
    Standard Grant
Collaborative Research: Topological Defects and Dynamic Motion of Symmetry-breaking Tadpole Particles in Liquid Crystal Medium
  • Award number:
    2344489
  • Fiscal year:
    2024
  • Award amount:
    $750,000
  • Project type:
    Standard Grant
Collaborative Research: AF: Medium: The Communication Cost of Distributed Computation
  • Award number:
    2402836
  • Fiscal year:
    2024
  • Award amount:
    $750,000
  • Project type:
    Continuing Grant
Collaborative Research: AF: Medium: Foundations of Oblivious Reconfigurable Networks
  • Award number:
    2402851
  • Fiscal year:
    2024
  • Award amount:
    $750,000
  • Project type:
    Continuing Grant