Doctoral Dissertation: Investigating the role of grammatical representation in language learnability

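Basic information

  • Award number:
    1420785
  • Principal investigator:
  • Amount:
    $11,700
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2014
  • Funding country:
    United States
  • Project period:
    2014-07-15 to 2015-12-31
  • Project status:
    Completed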

Project abstract

Technologies that process natural language have become ubiquitous in the last decade. Web search engines, for example, process billions of pages of text in order to determine which of those pages best match a user's search query. Many interfaces for interacting with computers (for example, Apple's Siri personal assistant) take voice-issued commands from their users and must process these commands in order to follow the users' instructions. Finally, machine translation technologies have become available for many of the world's most common languages, allowing users to automatically translate text that they find in foreign books or websites. These technologies mostly rely on simple models of language, known as n-gram models or context-free grammars, which were developed in the 1950s and 1960s and refined in later decades. These simple models have many advantages, most notably that they can be used to process large amounts of data very quickly. Because of their simplicity, however, they cannot capture many aspects of meaning in natural language. This has limited the technologies discussed above: virtual personal assistants can process only very simple types of instructions, and machine translation is still far from being as accurate as human translation. In the current project, Leon Bergen and Dr. Edward Gibson will investigate more sophisticated kinds of language models, with the goal of increasing the ability of computers to understand language.

Under the direction of Dr. Gibson, Mr. Bergen will study language models known as mildly context-sensitive grammars. These grammars can express certain types of linguistic knowledge that humans have but that cannot be expressed using simpler grammatical formalisms. For example, native speakers of English know that a declarative sentence like "Mary kicked the ball" is closely related in meaning to the question "What did Mary kick?" Although this fact seems obvious, it is difficult (or impossible) to express using simple types of grammars. Mildly context-sensitive grammars, however, can express this knowledge in a very natural way. Mr. Bergen and Dr. Gibson will study whether mildly context-sensitive grammars can be learned automatically from examples of grammatical sentences. To do this, they will use techniques from machine learning, a branch of computer science and statistics that develops algorithms that can automatically learn from data. The researchers will integrate these learning algorithms with their grammatical formalism and will test whether their method learns an accurate grammar. The accuracy of the grammar will be evaluated using a corpus (a collection of sentences) in which every sentence has been manually annotated with its correct grammatical structure. If accurate mildly context-sensitive grammars can be learned in this manner, this provides a potential method for improving the natural language processing technologies discussed above. In particular, because this method does not require an expert to write down the complete grammar for a language, it has the potential to be deployed without tremendous engineering effort, and may be deployed easily in other languages.

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Edward Gibson

Assessing the inferential strength of epistemic must
  • DOI:
  • Published:
    2023
  • Journal:
  • Impact factor:
    2.1
  • Authors:
    Giuseppe Ricciardi; Rachel Ryskin; Edward Gibson
  • Corresponding author:
    Edward Gibson
Variation in spatial concepts: Different frames of reference on different axes
Concepts Are Restructured During Language Contact: The Birth of Blue and Other Color Concepts in Tsimane’-Spanish Bilinguals
  • DOI:
  • Published:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Saima Malik; Kyle Mahowald; Bevil R. Conway; Edward Gibson
  • Corresponding author:
    Edward Gibson
Recent Advances in Imaging of Barrett’s Esophagus
  • DOI:
  • Published:
    2018
  • Journal:
  • Impact factor:
    0
  • Authors:
    Shekhar Sharma; Edward Gibson; N. Uedo; Rajvinder Singh
  • Corresponding author:
    Rajvinder Singh
Can Language Models Be Tricked by Language Illusions? Easier with Syntax, Harder with Semantics

Other grants by Edward Gibson

Evaluating meaning-based explanations of syntactic island effects cross-linguistically
  • Award number:
    2020840
  • Fiscal year:
    2020
  • Award amount:
    $11,700
  • Project type:
    Standard Grant
Expanding the reach, impact and sustainability of ToyBox Study Malaysia: a kindergarten-based healthy behaviour intervention
  • Award number:
    MR/V00607X/1
  • Fiscal year:
    2020
  • Award amount:
    $11,700
  • Project type:
    Research Grant
Improving healthy energy balance- and obesity-related behaviours among preschoolers in Malaysia: feasibility of adapting the ToyBox-Study
  • Award number:
    MR/P013805/1
  • Fiscal year:
    2017
  • Award amount:
    $11,700
  • Project type:
    Research Grant
Workshop on Language Processing and Language Evolution: Special Session at the 2017 CUNY Conference on Human Sentence Processing
  • Award number:
    1629983
  • Fiscal year:
    2016
  • Award amount:
    $11,700
  • Project type:
    Standard Grant
Doctoral Dissertation Research: A Communicative Perspective on Quantitative Syntax
  • Award number:
    1551543
  • Fiscal year:
    2016
  • Award amount:
    $11,700
  • Project type:
    Standard Grant
Doctoral Dissertation Research: Investigating cognitive and communicative pressures on natural language lexicons
  • Award number:
    1451173
  • Fiscal year:
    2015
  • Award amount:
    $11,700
  • Project type:
    Standard Grant
The role of noise in information-theoretic models of sentence comprehension and production
  • Award number:
    1534318
  • Fiscal year:
    2015
  • Award amount:
    $11,700
  • Project type:
    Standard Grant
Doctoral Dissertation Research: Causal Representations in Children's Transitive Sentences
  • Award number:
    1227892
  • Fiscal year:
    2012
  • Award amount:
    $11,700
  • Project type:
    Standard Grant
Origins of Numerical Competence: Assessment of Number Sense in Piraha
  • Award number:
    1022684
  • Fiscal year:
    2010
  • Award amount:
    $11,700
  • Project type:
    Standard Grant
Doctoral Dissertation Research: Discovering Semantic Primitives
  • Award number:
    1025309
  • Fiscal year:
    2010
  • Award amount:
    $11,700
  • Project type:
    Standard Grant

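Similar grants from the National Natural Science Foundation of China (NSFC)

Research on country-specific characteristics in "science of science" regularities for paper citation and research collaboration
  • Award number:
    72374173
  • Approval year:
    2023
  • Award amount:
    CNY 410,000
  • Project type:
    General Program
Research on constructing an evidence-traceable domain knowledge system based on the argumentation structure of scientific papers
  • Award number:
    72304137
  • Approval year:
    2023
  • Award amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund
Research on the dissemination patterns and nature of impact of scientific papers based on social media user profiles
  • Award number:
    72304274
  • Approval year:
    2023
  • Award amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund
Research on fine-grained and personalized evaluation methods for student argumentative essays
  • Award number:
    62306145
  • Approval year:
    2023
  • Award amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund
Research on clinical translation analysis of biomedical papers based on deep semantic understanding
  • Award number:
  • Approval year:
    2022
  • Award amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund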

Similar overseas grants

Doctoral Dissertation Research: Investigating the genomic underpinnings of the human hand and foot
  • Award number:
    2337516
  • Fiscal year:
    2024
  • Award amount:
    $11,700
  • Project type:
    Standard Grant
Doctoral Dissertation Research: Investigating Temporal Morphology and Verbal Order in an Endangered Language
  • Award number:
    2302393
  • Fiscal year:
    2023
  • Award amount:
    $11,700
  • Project type:
    Standard Grant
Doctoral Dissertation Research Award: Investigating the Role of Scale in the Development of Flexible Irrigation Structures.
  • Award number:
    2314519
  • Fiscal year:
    2023
  • Award amount:
    $11,700
  • Project type:
    Standard Grant
Doctoral Dissertation Research in Economics: Investigating the Impact of the 'Norm to Work' on Worker Power and Labor Market Outcomes
  • Award number:
    2314163
  • Fiscal year:
    2023
  • Award amount:
    $11,700
  • Project type:
    Standard Grant
Doctoral Dissertation Research: Investigating Sound Change in an Understudied Language: A Sociophonetic Study of Age and Locality Effects
  • Award number:
    2214689
  • Fiscal year:
    2022
  • Award amount:
    $11,700
  • Project type:
    Standard Grant