BIGDATA: F: DKM: Collaborative Research: Scalable Middleware for Managing and Processing Big Data on Next Generation HPC Systems

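Basic information

  • Award number:
    1447861
  • Principal investigator:
  • Amount:
    $360,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2014
  • Funding country:
    United States
  • Period:
    2014-09-01 to 2017-08-31
  • Project status:
    Completed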

Project abstract

Managing and processing large volumes of data and gaining meaningful insights from them is a significant challenge facing the Big Data community. It is therefore critical that the data-intensive computing middleware used to process such data (such as Hadoop, HBase and Spark) be carefully designed for high performance and scalability, in order to meet the growing demands of Big Data applications. While Hadoop, Spark and HBase are gaining popularity for Big Data processing, this middleware and the associated Big Data applications are not able to take advantage of the advanced features of modern High Performance Computing (HPC) systems widely deployed all over the world, including many of the multi-petaflop systems in the XSEDE environment. Modern HPC systems and the associated middleware (such as MPI and parallel file systems) have been exploiting advances in HPC technologies (multi-/many-core architectures, RDMA-enabled networking, NVRAMs and SSDs) during the last decade. However, Big Data middleware (such as Hadoop, HBase and Spark) has not embraced such technologies. These disparities are taking HPC and Big Data processing onto "divergent trajectories." The proposed research, undertaken by a team of computer and application scientists from OSU and SDSC, aims to bring HPC and Big Data processing onto a "convergent trajectory." The investigators will specifically address the following challenges: 1) designing novel communication and I/O runtimes for Big Data processing while exploiting the features of modern multi-/many-core, networking and storage technologies; 2) redesigning Big Data middleware (such as Hadoop, HBase and Spark) to deliver performance and scalability on modern and next-generation HPC systems; and 3) demonstrating the benefits of the proposed approach for a set of driving Big Data applications on HPC systems.
The proposed work targets four major workloads and applications in the Big Data community (namely data analytics, query, interactive, and iterative) using the popular Big Data middleware (Hadoop, HBase and Spark). The proposed framework will be validated on a variety of Big Data benchmarks and applications. The proposed middleware and runtimes will be made publicly available to the community. The research enables curricular advancements via research in pedagogy for key courses in the new data analytics program at Ohio State and SDSC, among the first of its kind nationwide.

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Amitava Majumdar

Cyberinfrastructure Usage Modalities on the TeraGrid
A parallel Monte Carlo code for planar and SPECT imaging: implementation, verification and applications in ¹³¹I SPECT
Ground bounce considerations in DC parametric test generation using boundary scan
Creating intelligent cyberinfrastructure for democratizing AI
  • DOI:
    10.1002/aaai.12166
  • Publication year:
    2024
  • Journal:
  • Impact factor:
    0
  • Authors:
    Dhabaleswar K. Panda;Vipin Chaudhary;Eric Fosler-Lussier;R. Machiraju;Amitava Majumdar;Beth Plale;R. Ramnath;P. Sadayappan;Neelima Savardekar;Karen Tomko
  • Corresponding author:
    Karen Tomko
The MVAPICH Project: Evolution and Sustainability of an Open Source Production Quality MPI Library for HPC
  • DOI:
    10.6084/m9.figshare.791563.v5
  • Publication year:
    2013
  • Journal:
  • Impact factor:
    0
  • Authors:
    D. Panda;K. Tomko;Karl W. Schulz;Amitava Majumdar
  • Corresponding author:
    Amitava Majumdar

Other grants by Amitava Majumdar

Collaborative Research: Frameworks: hpcGPT: Enhancing Computing Center User Support with HPC-enriched Generative AI
  • Award number:
    2411297
  • Fiscal year:
    2024
  • Funding amount:
    $360,000
  • Project type:
    Standard Grant
Category II: Exploring Neural Network Processors for AI in Science and Engineering
  • Award number:
    2005369
  • Fiscal year:
    2020
  • Funding amount:
    $360,000
  • Project type:
    Cooperative Agreement
Collaborative Research: CIBR: Building Capacity for Data-driven Neuroscience Research
  • Award number:
    1935749
  • Fiscal year:
    2020
  • Funding amount:
    $360,000
  • Project type:
    Standard Grant
Collaborative Research: Frameworks: Designing Next-Generation MPI Libraries for Emerging Dense GPU Systems
  • Award number:
    1931450
  • Fiscal year:
    2019
  • Funding amount:
    $360,000
  • Project type:
    Standard Grant
Promoting International Collaboration on Developing Scalable, Portable & Efficient HPC Software for Modern HPC Platforms
  • Award number:
    1849519
  • Fiscal year:
    2018
  • Funding amount:
    $360,000
  • Project type:
    Standard Grant
SHF: Large: Collaborative Research: Next Generation Communication Mechanisms exploiting Heterogeneity, Hierarchy and Concurrency for Emerging HPC Systems
  • Award number:
    1565336
  • Fiscal year:
    2016
  • Funding amount:
    $360,000
  • Project type:
    Standard Grant
Bilateral BBSRC-NSF/BIO: Collaborative Research: ABI Development: Seamless Integration of Neuroscience Models and Tools with HPC - Easy Path to Supercomputing for Neuroscience
  • Award number:
    1458840
  • Fiscal year:
    2015
  • Funding amount:
    $360,000
  • Project type:
    Standard Grant
SHF: Large: Collaborative Research: Unified Runtime for Supporting Hybrid Programming Models on Heterogeneous Architecture.
  • Award number:
    1213056
  • Fiscal year:
    2012
  • Funding amount:
    $360,000
  • Project type:
    Standard Grant
Collaborative Research: SI2-SSI: A Comprehensive Performance Tuning Framework for the MPI Stack
  • Award number:
    1147926
  • Fiscal year:
    2012
  • Funding amount:
    $360,000
  • Project type:
    Standard Grant
Collaborative Research: ABI Development: Building A Community Resource for Neuroscientists
  • Award number:
    1146949
  • Fiscal year:
    2012
  • Funding amount:
    $360,000
  • Project type:
    Standard Grant

Similar overseas grants

BIGDATA: F: DKM: Collaborative Research: PXFS: ParalleX Based Transformative I/O System for Big Data
  • Award number:
    1447650
  • Fiscal year:
    2014
  • Funding amount:
    $360,000
  • Project type:
    Standard Grant
BIGDATA: F: DKM: Collaborative Research: PXFS: ParalleX Based Transformative I/O System for Big Data
  • Award number:
    1447771
  • Fiscal year:
    2014
  • Funding amount:
    $360,000
  • Project type:
    Standard Grant
BIGDATA: F: DKM: Collaborative Research: Making Big Data Active: From Petabytes to Megafolks in Milliseconds
  • Award number:
    1447720
  • Fiscal year:
    2014
  • Funding amount:
    $360,000
  • Project type:
    Standard Grant
BIGDATA: F: DKM: Collaborative Research: Making Big Data Active: From Petabytes to Megafolks in Milliseconds
  • Award number:
    1447826
  • Fiscal year:
    2014
  • Funding amount:
    $360,000
  • Project type:
    Standard Grant
BIGDATA: F: DKM: Collaborative Research: Scalable Middleware for Managing and Processing Big Data on Next Generation HPC Systems
  • Award number:
    1447804
  • Fiscal year:
    2014
  • Funding amount:
    $360,000
  • Project type:
    Standard Grant