CAREER: Towards Efficient In-storage Indexing
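Basic information
- Award number: 2338457
- Principal investigator:
- Amount: $615,500
- Host institution:
- Host institution country: United States
- Award type: Continuing Grant
- Fiscal year: 2024
- Funding country: United States
- Duration: 2024-07-01 to 2029-06-30
- Status: Active (not yet concluded)
- Source:
- Keywords:
Project abstract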
Data indexing plays a crucial role in numerous modern technologies, including search engines, big data analytics, file systems, and databases. In this context, in-storage indexing devices (ISIDs) have emerged to enhance the functionalities of storage devices, leading to improved performance, efficiency, and cost-effective data processing. By storing index information alongside the data it indexes within the same storage device, ISIDs offer several advantages over traditional indexing methods. These advantages include reducing data movement, improving access speed, minimizing network impact, enabling efficient data management, and freeing host computing for critical tasks. To design efficient ISIDs, several challenges need to be addressed. Firstly, there is a need for low-cost and open-source research platforms to facilitate the reproduction and comparison of research work, promoting quick adoption of ISID advancements. Secondly, integrating the fragmented advancements of individual ISID components is crucial to capture their holistic impacts and interactions effectively. Thirdly, addressing diverse workload requests, interference in multi-tenant environments, and data distribution considerations requires new research methods for overall operation optimization. This CAREER research project aims to overcome these research challenges and promote the adoption of ISIDs, contributing to the advancements of storage systems. This project will explore and develop innovative methods to unleash the full potential of ISIDs in modern data management systems. By addressing the core challenges, the project seeks to revolutionize data storage systems and make significant contributions to the field of storage technology. This project will share the findings with undergraduate and graduate students through computer science programs and open up career opportunities to female students, underrepresented minorities, and first-generation college students. This project will disseminate the proposed techniques into the industry and foster technology transfer through new industrial collaborations. The developed infrastructure will be available to the research community through a web-based portal.
This research makes significant empirical contributions to the ISID design and development space by addressing major challenges posed by in-storage indexing. Specifically, it advances the state of knowledge by investigating the following questions:
(1) How can we design and develop new ISID models that accurately capture the behavior of internal modules, such as the index manager, request handler, data access parallelism, index-induced wear leveling, and garbage collection? These insights will enable scientific design advancements and detailed tradeoff analysis for ISIDs.
(2) How can we develop efficient dynamic model calibration techniques using coarse measurements to parameterize queuing models that accurately capture burstiness and variability in ISIDs?
(3) How can we emulate index manager delays using different data structures and sizes and utilize black-box and gray-box calibration techniques to identify ground truth for ISIDs?
(4) How can we design a new re-configurable indexing architecture and index cache that ensures deterministic tail latency, low overhead prefetching and eviction, and improved membership checking through object signatures and ML-based feature learning in ISIDs?
(5) How can we design tenant-local eviction policies that consider the effect of allocating space for index and data, recognizing the dependencies between them for efficient data access in ISIDs?
(6) How can we minimize log-checking overhead and avoid in-storage hash computations while exploring the trade-off between consistency and performance by allowing read-only tenants to bypass the log and access their own consistent index in ISIDs?
(7) Does capacity variance, which gracefully reduces ISID capacity as flash pages become bad, provide a better alternative to wear-leveling for ISIDs?
Throughout the project, the PI will facilitate the connection of the proposed research with the contents and concepts of several courses on Systems at FIU.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Janki Bhimani
CSR: Small: Learning and Management in Tiered Memory Systems
- Award number: 2323100
- Fiscal year: 2023
- Amount: $615,500
- Award type: Standard Grant
Collaborative Research: CNS core: OAC core: Small: New Techniques for I/O Behavior Modeling and Persistent Storage Device Configuration
- Award number: 2008324
- Fiscal year: 2020
- Amount: $615,500
- Award type: Standard Grant
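Similar National Natural Science Foundation of China (NSFC) grants
Research on an efficient system for differentiating human pluripotent stem cells into skin organoids with sensory function
- Award number: 32300674
- Approval year: 2023
- Amount: CNY 300,000
- Award type: Young Scientists Fund
Research on efficient dense subgraph mining algorithms for large-scale directed graphs
- Award number: 62202412
- Approval year: 2022
- Amount: CNY 200,000
- Award type: Young Scientists Fund
Fundamental research on high-efficiency deep-cut form grinding of titanium-aluminum blade tenon teeth based on tangential ultrasonic vibration of the workpiece
- Award number: 52175415
- Approval year: 2021
- Amount: CNY 580,000
- Award type: General Program
Efficient deep-blue perovskite light-emitting diodes with controllable crystal-orientation growth
- Award number:
- Approval year: 2020
- Amount: CNY 600,000
- Award type: General Program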
Similar overseas grants
CAREER: Green Functions as a Service: Towards Sustainable and Efficient Distributed Computing Infrastructure
- Award number: 2340722
- Fiscal year: 2024
- Amount: $615,500
- Award type: Continuing Grant
CAREER: Towards 3D Omnidirectional and Efficient Wireless Power Transfer with Controlled 2D Near-Field Coil Array
- Award number: 2338697
- Fiscal year: 2024
- Amount: $615,500
- Award type: Continuing Grant
CAREER: Towards highly efficient UV emitters with lattice engineered substrates
- Award number: 2338683
- Fiscal year: 2024
- Amount: $615,500
- Award type: Continuing Grant
CAREER: Towards Efficient Cryptography for Next Generation Applications
- Award number: 2402031
- Fiscal year: 2023
- Amount: $615,500
- Award type: Continuing Grant
CAREER: Towards Efficient and Scalable Zero-Knowledge Proofs
- Award number: 2401481
- Fiscal year: 2023
- Amount: $615,500
- Award type: Continuing Grant