Overview

Incremental processing of both syntax and semantics, in both parsing and generation, is of significant interest for modelling the human language capability and for building systems which interact with it. Formal linguistics has made significant contributions here; one example is the framework Dynamic Syntax, which provides an inherently word-by-word incremental grammatical framework. However, making this practical for computational models or systems requires building grammars with broad coverage of real data – a significant challenge. Here, we describe a method for inducing such a grammar from a corpus in which sentences are paired with semantic logical forms. By taking a probabilistic view, we hypothesise possible lexical entries – including entries for anaphoric elements – and learn a lexicon from their observed distributions, without requiring annotation at the word level. The resulting grammar provides a resource for incremental semantic processing with good coverage, while learning grammatical constraints similar to those of a hand-crafted version.
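The core distributional idea – hypothesising candidate lexical entries for each word and learning a lexicon from their observed co-occurrence with logical forms, without word-level annotation – can be illustrated with a minimal sketch. This is not the paper's actual algorithm (which induces Dynamic Syntax lexical actions); the corpus, predicate symbols, and counting scheme below are invented for illustration only.

```python
from collections import defaultdict

# Toy corpus: each sentence paired with a semantic logical form,
# here flattened to a bag of invented predicate symbols.
corpus = [
    (["john", "smiles"], ["john'", "smile'"]),
    (["mary", "smiles"], ["mary'", "smile'"]),
    (["john", "runs"],   ["john'", "run'"]),
]

# Hypothesise every (word, semantic element) pairing in each sentence
# and accumulate fractional counts -- no word-level annotation needed.
counts = defaultdict(lambda: defaultdict(float))
for words, sem in corpus:
    for w in words:
        for s in sem:
            counts[w][s] += 1.0 / len(sem)  # spread mass over hypotheses

# Normalise each word's counts into a distribution over candidate entries.
lexicon = {}
for w, cands in counts.items():
    total = sum(cands.values())
    lexicon[w] = {s: c / total for s, c in cands.items()}

# Words seen in varied contexts converge on the right entry:
best = {w: max(dist, key=dist.get) for w, dist in lexicon.items()}
print(best["john"], best["smiles"])
```

Because "john" co-occurs with `john'` in every sentence containing it but with `smile'` or `run'` only once each, the shared element accumulates the most mass; the same logic resolves "smiles" to `smile'` across its two contexts.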