
WordPrep: Word-based Preposition Prediction Tool


Basic information

DOI: 10.1109/bigdata47090.2019.9005608
Publication year: 2019
Venue: 2019 IEEE International Conference on Big Data (Big Data)
Corresponding author: Anna Feldman
Authors: Pooja Bhagat; A. Varde; Anna Feldman

Abstract

As big data heads towards big knowledge, data management and machine learning techniques work together to address several interesting problems. In this paper, we address a problem in natural language processing that involves learning by mining from large text databases. More specifically, we deal with the problem of preposition prediction, especially for ESL (English as a second language) learners. Prepositions are function words that typically show a relationship between a noun or a pronoun and other elements of a sentence. They play a key role in determining the meaning of a sentence. Accurate prediction of correct prepositions in a sentence is a challenging job since preposition usage is one of the most subtle aspects of the English grammar, making it difficult for non-native speakers. This paper proposes an approach for preposition prediction called WordPrep based on which we build a tool. WordPrep relies on mining based on the words themselves rather than on their lexical or syntactic connotations. This addresses the challenges of prepositions appearing in idiomatic phrases or in different semantic contexts, due to which the actual words are better than their grammatical positions. Our proposed solution entails a direct data-driven approach to predict the missing preposition in a sentence by learning from matching tokens consisting of ngrams with words before and after the preposition. Using various searches and pattern-matching methods against a large number of database records from big text corpora, this approach predicts the missing preposition(s). We describe our pilot approach, tool implementation and experiments in this paper. This work is particularly helpful for pedagogical applications.
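The word-based matching idea in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation: the preposition list, the one-word context window, the back-off order (both neighbours, then left word only, then right word only), and the names `build_index` and `predict` are all assumptions made for illustration, and a short in-memory list of sentences stands in for the large text databases the paper mines.

```python
from collections import Counter, defaultdict

# Hypothetical preposition inventory; the paper's actual list is not given here.
PREPOSITIONS = {"in", "on", "at", "of", "for", "with", "to", "by", "from"}


def build_index(sentences):
    """Count each preposition by the literal words directly before and after it."""
    both = defaultdict(Counter)   # (left word, right word) -> preposition counts
    left = defaultdict(Counter)   # left word only -> preposition counts
    right = defaultdict(Counter)  # right word only -> preposition counts
    for sentence in sentences:
        tokens = sentence.lower().split()
        # Skip the first and last token: both neighbours must exist.
        for i in range(1, len(tokens) - 1):
            if tokens[i] in PREPOSITIONS:
                l, p, r = tokens[i - 1], tokens[i], tokens[i + 1]
                both[(l, r)][p] += 1
                left[l][p] += 1
                right[r][p] += 1
    return both, left, right


def predict(index, left_word, right_word):
    """Return the most frequent preposition for the gap, or None if nothing matches.

    Assumed back-off order: exact (left, right) match, then left word, then right word.
    """
    both, left, right = index
    l, r = left_word.lower(), right_word.lower()
    for counts in (both.get((l, r)), left.get(l), right.get(r)):
        if counts:
            return counts.most_common(1)[0][0]
    return None


# Toy corpus standing in for a large text database.
corpus = [
    "she is interested in music",
    "he is interested in art",
    "they arrived at noon",
    "we depend on them",
]
index = build_index(corpus)
print(predict(index, "interested", "music"))  # exact match on both neighbours
print(predict(index, "depend", "you"))        # falls back to the left word
```

Because the lookup keys are the words themselves rather than part-of-speech tags, a fixed pairing such as "interested ... in" is matched directly, which is the motivation the abstract gives for working with words instead of grammatical positions.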

