III: Small: Towards Cross-Model Query Optimizations for Multi-model Heterogeneous Data Analytics
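Basic information
- Award number: 1909875
- Principal investigator:
- Amount: $443,200
- Institution:
- Institution country: United States
- Award type: Standard Grant
- Fiscal year: 2019
- Funding country: United States
- Period: 2019-08-15 to 2023-07-31
- Status: Completed
- Source:
- Keywords:
Project abstract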
Large-scale analysis of complex, heterogeneous datasets is now an integral part of various social and natural sciences, digital journalism, law, enterprises, and numerous other application domains. Users in such fields are increasingly grappling with the need to perform holistic integrated analytics spanning a variety of data models beyond just structured or semi-structured data to include graph data, text data, etc. Such multi-model data repositories are also growing in volume due to the widespread availability of online data sources such as social media and news media, which have opened up new avenues for insight in various domains. To take advantage of these opportunities, it is necessary to develop joint understanding and processing of at least three data models (relations, graphs, and text), including their evolution over time. This project aims to enable faster and scalable cross-model data analytics.
An emerging information architecture for such heterogeneous data problems is the "polystore" approach, which uses multiple "uni-model" backend engines such as RDBMSs, graph DBMSs, Solr, etc., and provides a translation layer in the middle to farm out different parts of a cross-model query to different engines. This approach is gaining popularity because it allows us to exploit the full functionality and native performance of uni-model engines for the corresponding parts of the queries. Amongst polystores, there are loosely-coupled solutions that have a very thin processing layer whose task is to "stitch the parts" together, and that primarily provide support for data placement, movement, and transformation. This project will focus on the query architecture and optimization principles for a tighter-coupled polystore. A usable, efficient, and scalable data analytics platform will be designed for queries spanning three data models, viz., relations, graphs, and text (including temporal evolution), that arise from social media and other sources.
A cross-model dataflow optimizer will be created for this "tri-store" setting to study fundamental systems optimization principles and will be implemented within the AWESOME polystore system. Further, several novel cross-model query optimization techniques will be devised to exploit the semantics of these three data models. Special attention will be paid to the temporality of data, such that the optimizations treat temporal evolution of the data as a first-class primitive and support such queries efficiently on top of the existing engines even though they may lack native support for temporal queries.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
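The loosely-coupled polystore pattern described in the abstract (a thin layer that farms out query fragments to uni-model engines and stitches the parts together) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the `Polystore` and `SubQuery` classes, the `hash_join` helper, the stand-in engines, and the sample rows are hypothetical and are not the AWESOME system's API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical labels for two of the data models named in the abstract.
RELATION, GRAPH = "relation", "graph"

Rows = List[dict]


@dataclass
class SubQuery:
    model: str    # data model this fragment targets
    payload: str  # engine-native query text, opaque to the router


class Polystore:
    """Toy translation layer: one registered engine per data model."""

    def __init__(self) -> None:
        self.engines: Dict[str, Callable[[str], Rows]] = {}

    def register(self, model: str, engine: Callable[[str], Rows]) -> None:
        self.engines[model] = engine

    def execute(self, plan: List[SubQuery]) -> Dict[str, Rows]:
        # Farm each fragment out to the engine registered for its model.
        results: Dict[str, Rows] = {}
        for sub in plan:
            if sub.model not in self.engines:
                raise KeyError(f"no engine registered for model {sub.model!r}")
            results[sub.model] = self.engines[sub.model](sub.payload)
        return results


def hash_join(left: Rows, right: Rows, key: str) -> Rows:
    """Stitch two partial results together on a shared key."""
    index: Dict[object, Rows] = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


# Stand-in engines that return canned rows instead of calling real backends.
store = Polystore()
store.register(RELATION, lambda q: [{"user": "a", "city": "SD"},
                                    {"user": "b", "city": "LA"}])
store.register(GRAPH, lambda q: [{"user": "a", "followers": 3}])

plan = [SubQuery(RELATION, "SELECT user, city FROM users"),
        SubQuery(GRAPH, "MATCH (u)<-[:FOLLOWS]-(f) RETURN u, count(f)")]
results = store.execute(plan)
joined = hash_join(results[RELATION], results[GRAPH], "user")
print(joined)
```

In this sketch the middle layer does no optimization at all; it only routes and joins. A tighter-coupled design of the kind the project targets would presumably decide fragment boundaries, ordering, and data movement in the middle layer as well.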
Project outcomes
Journal articles (2)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
An Algebraic Approach for High-level Text Analytics
- DOI: 10.1145/3400903.3400926
- Published: 2020
- Journal:
- Impact factor: 0
- Authors: Zheng, Xiuwen; Gupta, Amarnath
- Corresponding author: Gupta, Amarnath
PK2G - Declarative Construction and Quality Evaluation of Knowledge Graphs from Polystores
- DOI:
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Zheng, X; Dasgupta, S; Gupta, A
- Corresponding author: Gupta, A
Other publications by Amarnath Gupta
MedSMan: a streaming data management system over live multimedia
- DOI: 10.1145/1101149.1101174
- Published: 2005
- Journal:
- Impact factor: 0
- Authors: B. Liu; Amarnath Gupta; R. Jain
- Corresponding author: R. Jain
HIV Risk on Twitter: the Ethical Dimension of Social Media Evidence-based Prevention for Vulnerable Populations
- DOI: 10.24251/hicss.2017.216
- Published: 2017
- Journal:
- Impact factor: 5.1
- Authors: Nadir Weibel; Purvi Desai; L. Saul; Amarnath Gupta; S. Little
- Corresponding author: S. Little
An Optimized Tri-store System for Multi-model Data Analytics
- DOI: 10.48550/arxiv.2305.14391
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Xiuwen Zheng; S. Dasgupta; Arun C. S. Kumar; Amarnath Gupta
- Corresponding author: Amarnath Gupta
Augmenting Serialized Bureaucratic Data: The Case of Chinese Courts
- DOI:
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Xiaohan Wu; Margaret E. Roberts; R. Stern; B. Liebman; Amarnath Gupta; Luke Sanford
- Corresponding author: Luke Sanford
Characterizing the Interaction Between Phthalocyanine Tetrasulfonates and Mammalian Prion Protein
- DOI: 10.1016/j.bpj.2010.12.3221
- Published: 2011-02-02
- Journal:
- Impact factor:
- Authors: Iveta Sosova; Abhilash Vincent; Amarnath Gupta; Max Anikovskiy; Angela Brigley; Michael T. Woodside
- Corresponding author: Michael T. Woodside
Other grants by Amarnath Gupta
Digital Government: Web Based Information Technologies: Advancing Federal Information Infrastructures
- Award number: 9906005
- Fiscal year: 1999
- Amount: $443,200
- Award type: Continuing Grant
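Similar NSFC (National Natural Science Foundation of China) grants
Role and mechanism of TIM-4 in regulating microglial conversion to a phagocytic phenotype to promote blood clearance after subarachnoid hemorrhage
- Award number: 82301485
- Approval year: 2023
- Amount: CNY 300,000
- Award type: Young Scientists Fund
Molecular mechanisms and intervention strategies for the transformation of EGFR-mutant lung adenocarcinoma into small cell lung cancer
- Award number: 82341002
- Approval year: 2023
- Amount: CNY 2,000,000
- Award type: Special Fund Project
Mechanism by which macrophage A20 regulates efferocytosis by tubular epithelial cells in the AKI-to-CKD transition
- Award number: 82270728
- Approval year: 2022
- Amount: CNY 500,000
- Award type: General Program
Mechanisms by which microglial exosomes regulate the conversion of astrocyte subtypes into neural stem cells after stroke
- Award number: 82271320
- Approval year: 2022
- Amount: CNY 520,000
- Award type: General Program
Spatiotemporal distribution of small-scale field-aligned currents and their relationship to precipitating particles
- Award number: 42174191
- Approval year: 2021
- Amount: CNY 590,000
- Award type: General Program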
Similar overseas grants
Collaborative Research: IIS-III: Small Towards Fair Outlier Detection
- Award number: 2310481
- Fiscal year: 2023
- Amount: $443,200
- Award type: Standard Grant
III: Small: A New Machine Learning Paradigm Towards Effective yet Efficient Foundation Graph Learning Models
- Award number: 2321504
- Fiscal year: 2023
- Amount: $443,200
- Award type: Standard Grant
III: Small: Towards Highly Accurate Map Services
- Award number: 2203553
- Fiscal year: 2022
- Amount: $443,200
- Award type: Standard Grant
III: Small: Towards Explainable Personalization
- Award number: 2007492
- Fiscal year: 2020
- Amount: $443,200
- Award type: Continuing Grant
III: Small: Towards the Foundations of Training Deep Neural Networks: New Theory and Algorithms
- Award number: 2008981
- Fiscal year: 2020
- Amount: $443,200
- Award type: Continuing Grant