Doctoral Dissertation: Investigating the role of grammatical representation in language learnability
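Basic information
- Award number: 1420785
- Principal investigator:
- Amount: $11,700
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2014
- Funding country: United States
- Project period: 2014-07-15 to 2015-12-31
- Project status: Completed
- Source:
- Keywords:
Project abstract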
Technologies that process natural language have become ubiquitous in the last decade. Web search engines, for example, process billions of pages of text in order to determine which of those pages best match a user's search query. Many interfaces for interacting with computers -- for example, Apple's Siri personal assistant -- take voice-issued commands from their users, and must process these commands in order to follow the users' instructions. Finally, machine translation technologies have become available for many of the world's most common languages, allowing users to automatically translate text that they find in foreign books or websites. These technologies mostly rely on simple models of language, known as n-gram models or context-free grammars, which were developed in the 1950s and 1960s and refined in later decades. These simple models of language have many advantages, most notably that they can be used to process large amounts of data very quickly. Because of their simplicity, however, these models are not able to capture many aspects of meaning in natural language. This has resulted in limitations for the technologies discussed above: virtual personal assistants are only able to process very simple types of instructions, and machine translation is still far from being as accurate as human translation. In the current project, Leon Bergen and Dr. Edward Gibson will investigate more sophisticated kinds of language models, with the goal of increasing the ability of computers to understand language.
Under the direction of Dr. Gibson, Mr. Bergen will study language models known as mildly context-sensitive grammars. These grammars are able to express certain types of linguistic knowledge that humans have, but which cannot be expressed using simpler types of grammatical formalisms. For example, native speakers of English know that a declarative sentence like "Mary kicked the ball" is closely related in meaning to the question "What did Mary kick?" Although this fact seems obvious, it is difficult (or impossible) to express using simple types of grammars. Mildly context-sensitive grammars, however, can express this knowledge in a very natural way. Mr. Bergen and Dr. Gibson will study whether mildly context-sensitive grammars can be automatically learned from examples of grammatical sentences. To do this, they will use techniques from machine learning, a branch of computer science and statistics that develops algorithms that can automatically learn from data. The researchers will integrate these learning algorithms with their grammatical formalism, and will test whether their method learns an accurate grammar. The accuracy of the grammar will be evaluated using a corpus -- a collection of sentences -- in which every sentence has been manually annotated with its correct grammatical structure. If accurate mildly context-sensitive grammars can be learned in this manner, this would provide a potential method for improving the natural language processing technologies discussed above. In particular, because this method does not require an expert to write down the complete grammar for a language, it has the potential to be deployed without tremendous engineering effort, and may be deployed easily in other languages.
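The abstract does not say which accuracy metric the researchers will use. As a rough illustration only, one common way to compare a learned grammar's parse against a manually annotated parse is labeled-bracket precision, recall, and F1. The sketch below assumes each parse is represented as a set of (label, start, end) constituent spans; the function name `bracket_scores` and the example spans are hypothetical and are not taken from the project.

```python
from typing import Set, Tuple

# A constituent is (label, start token index, end token index, exclusive).
Span = Tuple[str, int, int]


def bracket_scores(gold: Set[Span], predicted: Set[Span]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) of predicted constituents against gold ones."""
    matched = len(gold & predicted)  # constituents with identical label and span
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotation for "Mary kicked the ball" (tokens 0..3).
gold = {("S", 0, 4), ("NP", 0, 1), ("VP", 1, 4), ("NP", 2, 4)}
# Hypothetical parser output that misplaces the object noun phrase.
predicted = {("S", 0, 4), ("NP", 0, 1), ("VP", 1, 4), ("NP", 3, 4)}

precision, recall, f1 = bracket_scores(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Here three of the four predicted constituents match the annotation, so all three scores come out to 0.75. Scores would normally be aggregated over every sentence in the annotated corpus rather than computed for a single sentence.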
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Edward Gibson
Assessing the inferential strength of epistemic must
- DOI:
- Year published: 2023
- Journal:
- Impact factor: 2.1
- Authors: Giuseppe Ricciardi; Rachel Ryskin; Edward Gibson
- Corresponding author: Edward Gibson
Variation in spatial concepts: Different frames of reference on different axes
- DOI:
- Year published: 2021
- Journal:
- Impact factor: 0
- Authors: Benjamin Pitt; Alexandra Carstensen; Edward Gibson; Steven T. Piantadosi
- Corresponding author: Steven T. Piantadosi
Concepts Are Restructured During Language Contact: The Birth of Blue and Other Color Concepts in Tsimane’-Spanish Bilinguals
- DOI:
- Year published: 2023
- Journal:
- Impact factor: 0
- Authors: Saima Malik; Kyle Mahowald; Bevil R. Conway; Edward Gibson
- Corresponding author: Edward Gibson
Recent Advances in Imaging of Barrett’s Esophagus
- DOI:
- Year published: 2018
- Journal:
- Impact factor: 0
- Authors: Shekhar Sharma; Edward Gibson; N. Uedo; Rajvinder Singh
- Corresponding author: Rajvinder Singh
Can Language Models Be Tricked by Language Illusions? Easier with Syntax, Harder with Semantics
- DOI:
- Year published: 2023
- Journal:
- Impact factor: 0
- Authors: Yuhan Zhang; Edward Gibson; Forrest Davis
- Corresponding author: Forrest Davis
Other grants by Edward Gibson
Evaluating meaning-based explanations of syntactic island effects cross-linguistically
- Award number: 2020840
- Fiscal year: 2020
- Amount: $11,700
- Project type: Standard Grant
Expanding the reach, impact and sustainability of ToyBox Study Malaysia: a kindergarten-based healthy behaviour intervention
- Award number: MR/V00607X/1
- Fiscal year: 2020
- Amount: $11,700
- Project type: Research Grant
Improving healthy energy balance- and obesity-related behaviours among preschoolers in Malaysia: feasibility of adapting the ToyBox-Study
- Award number: MR/P013805/1
- Fiscal year: 2017
- Amount: $11,700
- Project type: Research Grant
Workshop on Language Processing and Language Evolution: Special Session at the 2017 CUNY Conference on Human Sentence Processing
- Award number: 1629983
- Fiscal year: 2016
- Amount: $11,700
- Project type: Standard Grant
Doctoral Dissertation Research: A Communicative Perspective on Quantitative Syntax
- Award number: 1551543
- Fiscal year: 2016
- Amount: $11,700
- Project type: Standard Grant
Doctoral Dissertation Research: Investigating cognitive and communicative pressures on natural language lexicons
- Award number: 1451173
- Fiscal year: 2015
- Amount: $11,700
- Project type: Standard Grant
The role of noise in information-theoretic models of sentence comprehension and production
- Award number: 1534318
- Fiscal year: 2015
- Amount: $11,700
- Project type: Standard Grant
Doctoral Dissertation Research: Causal Representations in Children's Transitive Sentences
- Award number: 1227892
- Fiscal year: 2012
- Amount: $11,700
- Project type: Standard Grant
Origins of Numerical Competence: Assessment of Number Sense in Piraha
- Award number: 1022684
- Fiscal year: 2010
- Amount: $11,700
- Project type: Standard Grant
Doctoral Dissertation Research: Discovering Semantic Primitives
- Award number: 1025309
- Fiscal year: 2010
- Amount: $11,700
- Project type: Standard Grant
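Similar grants from the National Natural Science Foundation of China
Research on fine-grained and personalized evaluation methods for student argumentative essays
- Award number: 62306145
- Year approved: 2023
- Amount: 300,000 yuan
- Project type: Young Scientists Fund
Research on the dissemination patterns and nature of influence of scientific papers based on social media user profiles
- Award number: 72304274
- Year approved: 2023
- Amount: 300,000 yuan
- Project type: Young Scientists Fund
Research on constructing evidence-based domain knowledge systems from the argumentation structure of scientific papers
- Award number: 72304137
- Year approved: 2023
- Amount: 300,000 yuan
- Project type: Young Scientists Fund
Research on country-specific characteristics in "science of science" regularities for paper citation and research collaboration
- Award number: 72374173
- Year approved: 2023
- Amount: 410,000 yuan
- Project type: General Program
Research on analyzing the clinical translation of biomedical papers based on deep semantic understanding
- Award number: 72204090
- Year approved: 2022
- Amount: 300,000 yuan
- Project type: Young Scientists Fund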
Similar overseas grants
Doctoral Dissertation Research: Investigating the genomic underpinnings of the human hand and foot
- Award number: 2337516
- Fiscal year: 2024
- Amount: $11,700
- Project type: Standard Grant
Doctoral Dissertation Research: Investigating Temporal Morphology and Verbal Order in an Endangered Language
- Award number: 2302393
- Fiscal year: 2023
- Amount: $11,700
- Project type: Standard Grant
Doctoral Dissertation Research Award: Investigating the Role of Scale in the Development of Flexible Irrigation Structures.
- Award number: 2314519
- Fiscal year: 2023
- Amount: $11,700
- Project type: Standard Grant
Doctoral Dissertation Research in Economics: Investigating the Impact of the 'Norm to Work' on Worker Power and Labor Market Outcomes
- Award number: 2314163
- Fiscal year: 2023
- Amount: $11,700
- Project type: Standard Grant
Doctoral Dissertation Research: Investigating Sound Change in an Understudied Language: A Sociophonetic Study of Age and Locality Effects
- Award number: 2214689
- Fiscal year: 2022
- Amount: $11,700
- Project type: Standard Grant