III: Small: Collaborative Research: Keyphrase Extraction in Document Networks
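Basic Information
- Award number: 1813571
- Principal investigator:
- Amount: $55,000
- Host institution:
- Host institution country: United States
- Award type: Continuing Grant
- Fiscal year: 2017
- Funding country: United States
- Period: 2017-10-26 to 2019-08-31
- Status: Completed
- Source:
- Keywords:
Project Abstract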
Keyphrases for a document concisely describe the document using a small set of phrases (i.e., sequences of contiguous words in a document). For example, the keyphrases "social networks" and "interest targeting" quickly provide a high-level topic description (i.e., a summary) of a document focused on targeting interest for recommending services such as products and news to users in the context of social networks. Given today's very large collections of documents, these keyphrases are extremely important not only for summarizing a document, but also for the search and retrieval of relevant information. However, keyphrases are not always available directly. Instead, they need to be gleaned from the many details in documents. This project addresses the problem of automatic keyphrase extraction from research papers, which are enablers of the sharing and dissemination of scientific discoveries. The goal of the project is to explore accurate approaches that automatically discover and extract keyphrases in documents, using document networks, which will help users handle and digest more information in less time during these "big data" times. Educationally, this research will involve training both graduate and undergraduate students in the active research area of keyphrase extraction, which has high impact in many real-world applications such as online advertising, document categorization, recommendation, summarization, Web search and discovery, and topic tracking in newswire. Although much research to date has been done on automatic keyphrase extraction, no previous approaches have captured the impact of documents on one another via the citation relation that connects documents in a network. This project will investigate models that take into consideration the linkage between citing and cited documents in a document network and will explore various qualitative and quantitative aspects of the question: "What are the key phrases or concepts in a document?"
Scalable iterative algorithms will be designed and developed that capture different aspects of documents (e.g., topics or concepts), as well as the impact of one document on another (e.g., influence or topic evolution) in a document network. The results of this research will have a direct pipeline to the CiteSeerX digital library (http://citeseerx.ist.psu.edu). The software, tools, and benchmark datasets developed during the course of this project will be broadly disseminated via the project website (http://people.cs.ksu.edu/~ccaragea/keyphrases.html). All findings will be shared with the research community through publications in academic journals and presentations at Information Retrieval, Text Mining, and Natural Language Processing conferences.
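As a rough illustration of the kind of iterative, network-aware ranking the abstract describes, the sketch below runs a PageRank-style power iteration over a word co-occurrence graph built from a target document, with additional lower-weight edges contributed by the text of documents that cite it or are cited by it. This is not the project's method. The function names, the stopword list, the window size, the neighbor weight, and the toy texts are all assumptions made for illustration, and the sketch ranks single words rather than multi-word phrases.

```python
from collections import defaultdict

# Tiny illustrative stopword list (an assumption, not from the project).
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "we", "with"}


def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, drop stopwords."""
    words = (w.strip(".,;:!?\"'()").lower() for w in text.split())
    return [w for w in words if w and w not in STOPWORDS]


def add_edges(graph, tokens, weight, window=2):
    """Add symmetric co-occurrence edges between words at most `window` apart."""
    for i, w in enumerate(tokens):
        for j in range(i + 1, min(i + window + 1, len(tokens))):
            v = tokens[j]
            if v != w:
                graph[w][v] += weight
                graph[v][w] += weight


def rank_words(target_text, neighbor_texts, neighbor_weight=0.5,
               damping=0.85, iterations=50):
    """Score words by weighted PageRank over the combined word graph."""
    graph = defaultdict(lambda: defaultdict(float))
    add_edges(graph, tokenize(target_text), 1.0)
    # Text from citing/cited documents contributes weaker edges (hypothetical weight).
    for text in neighbor_texts:
        add_edges(graph, tokenize(text), neighbor_weight)

    nodes = list(graph)
    if not nodes:
        return {}
    scores = {w: 1.0 / len(nodes) for w in nodes}
    out_weight = {w: sum(graph[w].values()) for w in nodes}
    for _ in range(iterations):
        updated = {}
        for w in nodes:
            incoming = sum(scores[v] * graph[v][w] / out_weight[v] for v in graph[w])
            updated[w] = (1 - damping) / len(nodes) + damping * incoming
        scores = updated
    return scores


def top_keywords(target_text, neighbor_texts, k=5):
    """Return the k highest-scoring words that occur in the target document."""
    scores = rank_words(target_text, neighbor_texts)
    candidates = [w for w in set(tokenize(target_text)) if w in scores]
    return sorted(candidates, key=lambda w: (-scores[w], w))[:k]


if __name__ == "__main__":
    target = ("Keyphrase extraction in document networks. "
              "We rank keyphrase candidates using citation networks.")
    neighbors = ["Citation networks improve keyphrase extraction.",
                 "Keyphrase extraction for research papers."]
    print(top_keywords(target, neighbors, k=3))
```

Restricting the final candidates to words from the target document while letting neighboring documents shape the graph is one simple way to model the influence of citing and cited documents; a fuller treatment would score contiguous phrases and weight each neighbor by its relation to the target.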
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other Publications by Cornelia Caragea
Metadata Repository
- DOI: 10.1007/978-0-387-39940-9_3058
- Year: 2009
- Journal:
- Impact factor: 0
- Authors: Cornelia Caragea;Vasant G Honavar;P. Boncz;P. Larson;S. Dietrich;Gonzalo Navarro;B. Thuraisingham;Yan Luo;Ouri E. Wolfson;S. Beitzel;Eric C. Jensen;O. Frieder;C. Jensen;N. Tradisauskas;E. Munson;A. Wun;K. Goda;Stephen E. Fienberg;Jiashun Jin;Guimei Liu;Nick Craswell;T. Pedersen;Cesare Pautasso;M. Moro;S. Manegold;B. Carminati;Marina Blanton;S. Bouchenak;Noël de Palma;Wei Tang;C. Quix;M. Jeusfeld;R. K. Pon;David J. Buttler;W. Meng;P. Zezula;Michal Batko;Vlastislav Dohnal;J. Domingo;Denilson Barbosa;I. Manolescu;Jeffrey Xu Yu;E. Cecchet;Vivien Quéma;Xifeng Yan;G. Santucci;D. Zeinalipour;Panos K. Chrysanthis;A. Deshpande;Carlos Guestrin;S. Madden;C. Leung;R. H. Güting;Amarnath Gupta;Heng Tao Shen;G. Weikum;Ramesh Jain;J. Yu;P. Ciaccia;K. Candan;M. Sapino;C. Meghini;F. Sebastiani;U. Straccia;F. Nack;V. S. Subrahmanian;Maria Vanina Martinez;D. Reforgiato;T. Westerveld;M. Sebillo;G. Vitiello;M. De Marsico;K. Voruganti;C. Parent;S. Spaccapietra;C. Vangenot;E. Zimányi;Prasan Roy;S. Sudarshan;E. Puppo;Peer Kröger;M. Renz;H. Schuldt;Solmaz Kolahi;A. Unwin;W. Cellary
- Corresponding author: W. Cellary
Scientific Keyphrase Identification and Classification by Pre-Trained Language Models Intermediate Task Transfer Learning
- DOI:
- Year: 2020
- Journal:
- Impact factor: 0
- Authors: Seoyeon Park;Cornelia Caragea
- Corresponding author: Cornelia Caragea
Semantic Tokenizer for Enhanced Natural Language Processing
- DOI: 10.48550/arxiv.2304.12404
- Year: 2023
- Journal:
- Impact factor: 0
- Authors: Sandeep Mehta;Darpan Shah;Ravindra Kulkarni;Cornelia Caragea
- Corresponding author: Cornelia Caragea
A Group-Based Personalized Model for Image Privacy Classification and Labeling
- DOI: 10.24963/ijcai.2017/552
- Year: 2017
- Journal:
- Impact factor: 0
- Authors: Haoti Zhong;A. Squicciarini;David J. Miller;Cornelia Caragea
- Corresponding author: Cornelia Caragea
MEDLINE/PubMed
- DOI: 10.1007/978-0-387-39940-9_3039
- Year: 2004
- Journal:
- Impact factor: 3.8
- Authors: Cornelia Caragea;V. Honavar;P. Boncz;P. Larson;S. Dietrich;Gonzalo Navarro;Bhavani Thuraisingham;Yan Luo;Ouri E. Wolfson;S. Beitzel;Eric C. Jensen;Ophir Frieder;Christian S. Jensen;N. Tradisauskas;Ethan V. Munson;A. Wun;K. Goda;Stephen E. Fienberg;Jiashun Jin;Guimei Liu;Nick Craswell;T. Pedersen;Cesare Pautasso;M. Moro;S. Manegold;B. Carminati;Marina Blanton;Sara Bouchenak;Noël de Palma;Wei Tang;Christoph Quix;M. Jeusfeld;R. K. Pon;David J. Buttler;W. Meng;P. Zezula;Michal Batko;Vlastislav Dohnal;J. Domingo;Denilson Barbosa;Ioana Manolescu;Jeffrey Xu Yu;Emmanuel Cecchet;Vivien Quéma;Xifeng Yan;G. Santucci;D. Zeinalipour;Panos K. Chrysanthis;Amol Deshpande;Carlos Guestrin;Samuel Madden;Carson Kai;R. H. Güting;Amarnath Gupta;Heng Tao Shen;G. Weikum;Ramesh Jain;Jeffrey Xu Yu;Paolo Ciaccia;K. Candan;M. Sapino;C. Meghini;F. Sebastiani;U. Straccia;F. Nack;V. S. Subrahmanian;Maria Vanina Martinez;D. Reforgiato;T. Westerveld;M. Sebillo;G. Vitiello;Maria De Marsico;K. Voruganti;C. Parent;S. Spaccapietra;Christelle Vangenot;Esteban Zimányi;Prasan Roy;S. Sudarshan;E. Puppo;Peer Kröger;Matthias Renz;H. Schuldt;Solmaz Kolahi;A. Unwin;W. Cellary
- Corresponding author: W. Cellary
Other Grants by Cornelia Caragea
CHS: Small: Collaborative Research: Automating Relevance and Trust Detection in Social Media Data for Emergency Response
- Award number: 1903963
- Fiscal year: 2018
- Amount: $55,000
- Award type: Standard Grant
TWC: Small: Collaborative: Towards Privacy Preserving Online Image Sharing
- Award number: 1903714
- Fiscal year: 2018
- Amount: $55,000
- Award type: Standard Grant
CRI: CI-SUSTAIN: Collaborative Research: CiteSeerX: Toward Sustainable Support of Scholarly Big Data
- Award number: 1853919
- Fiscal year: 2018
- Amount: $55,000
- Award type: Standard Grant
CRI: CI-SUSTAIN: Collaborative Research: CiteSeerX: Toward Sustainable Support of Scholarly Big Data
- Award number: 1823292
- Fiscal year: 2018
- Amount: $55,000
- Award type: Standard Grant
BIGDATA: IA: Collaborative Research: Domain Adaptation Approaches for Classifying Crisis Related Data on Social Media
- Award number: 1741353
- Fiscal year: 2018
- Amount: $55,000
- Award type: Standard Grant
CAREER: From Data to Knowledge: Extracting and Utilizing Concept Graphs in Online Environments
- Award number: 1802358
- Fiscal year: 2017
- Amount: $55,000
- Award type: Continuing Grant
CAREER: From Data to Knowledge: Extracting and Utilizing Concept Graphs in Online Environments
- Award number: 1652674
- Fiscal year: 2017
- Amount: $55,000
- Award type: Continuing Grant
TWC: Small: Collaborative: Towards Privacy Preserving Online Image Sharing
- Award number: 1814255
- Fiscal year: 2017
- Amount: $55,000
- Award type: Standard Grant
BIGDATA: IA: Collaborative Research: Domain Adaptation Approaches for Classifying Crisis Related Data on Social Media
- Award number: 1802284
- Fiscal year: 2017
- Amount: $55,000
- Award type: Standard Grant
CHS: Small: Collaborative Research: Automating Relevance and Trust Detection in Social Media Data for Emergency Response
- Award number: 1814271
- Fiscal year: 2017
- Amount: $55,000
- Award type: Standard Grant
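Similar NSFC Grants
Research and Application of Key Technologies for Swarm Collaboration of Small and Micro Unmanned Systems Based on Ultra-Wideband Technology
- Award number:
- Approval year: 2020
- Amount: 570,000 CNY
- Award type: General Program
Research on Interference Coordination Based on Cooperative Precoding in Heterogeneous Cloud Small Cell Networks
- Award number: 61661005
- Approval year: 2016
- Amount: 300,000 CNY
- Award type: Regional Science Fund Program
Research on Novel Access Theory and Techniques for Dense Small Base Station Systems
- Award number: 61301143
- Approval year: 2013
- Amount: 240,000 CNY
- Award type: Young Scientists Fund
Experimental Study of ScFVCD3-9R Loaded with Bcl-6-Targeting Small Interfering RNA for the Treatment of EAMG
- Award number: 81072465
- Approval year: 2010
- Amount: 310,000 CNY
- Award type: General Program
Research on Sensor Networks Based on Small-World Networks
- Award number: 60472059
- Approval year: 2004
- Amount: 210,000 CNY
- Award type: General Program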
Similar Overseas Grants
Collaborative Research: III: Small: High-Performance Scheduling for Modern Database Systems
- Award number: 2322973
- Fiscal year: 2024
- Amount: $55,000
- Award type: Standard Grant
Collaborative Research: III: Small: High-Performance Scheduling for Modern Database Systems
- Award number: 2322974
- Fiscal year: 2024
- Amount: $55,000
- Award type: Standard Grant
Collaborative Research: III: Small: A DREAM Proactive Conversational System
- Award number: 2336769
- Fiscal year: 2024
- Amount: $55,000
- Award type: Standard Grant
Collaborative Research: III: Small: A DREAM Proactive Conversational System
- Award number: 2336768
- Fiscal year: 2024
- Amount: $55,000
- Award type: Standard Grant
III: Small: Multiple Device Collaborative Learning in Real Heterogeneous and Dynamic Environments
- Award number: 2311990
- Fiscal year: 2023
- Amount: $55,000
- Award type: Standard Grant