RI: Medium: Learning Disentangled Representations for Text to Aid Interpretability and Transfer

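Basic information

Award number:
1901117
Principal investigator:
Byron Wallace
Amount:
$1 million
Institution:
Northeastern University
Institution country:
United States
Award type:
Standard Grant
Fiscal year:
2019
Funding country:
United States
Start and end dates:
2019-07-01 to 2024-06-30
Project status:
Completed

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=1901117&HistoricalAwards=false
Keywords:
RI Medium Learning Disentangled Representations

Project abstract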

Machine learning methods for natural language processing power many technologies that we use on a day-to-day basis, such as spam filters and translation software. The models underlying these techniques have become increasingly sophisticated, yielding improved performance but also increasing complexity. In particular, "neural network" based approaches have re-emerged as the dominant class of machine learning models for language processing. These approaches often perform better than their non-neural counterparts, but also have key downsides. First, training these models requires human effort and time to generate a sufficiently large set of training data in the form of manually annotated text. Second, it is often not obvious whether a model trained on one dataset will generalize to another. Finally, it is hard to discern why such models make the specific predictions that they do, largely because predictions are made on the basis of learned representations of texts which do not naturally afford transparency. This project proposes technical innovations to address these interrelated issues using "disentanglement". The idea is to design models such that the learned representations used to make predictions have known meaning. This approach has the potential to enable re-use of models (increasing efficiency and reducing human costs), and aid interpretability, so that one can have a better idea of why a model made a given prediction.

To realize the above goals of improved interpretability and transferability of models, this work will develop and evaluate new models that learn representations in which certain dimensions are imbued with explicit semantics. This is a departure from current approaches, which indiscriminately code all attributes into a single (entangled) representation. To achieve disentanglement, this project will explore deep generative models and sparse, gated neural encoders. These will use inductive biases and light supervision strategies that guide models toward disentangled representations. For example, models will be penalized if distances in learned embedding spaces do not reflect human judgments concerning the relative similarities of instances with respect to specific aspects of interest. In other cases, "weak" supervision (e.g., rules) may provide adequate guidance for disentanglement. Finally, "probing" tasks constitute a third supervision strategy to be explored: This will involve the use of auxiliary tasks to provide "supervision" that guides individual aspect-wise embeddings of input. The project will develop and evaluate such models for representative problems in natural language processing, specifically: classification, sequence tagging, and summarization. Models will be evaluated both for predictive performance (including their generalizability to new domains and the efficiency with which they do so), and the degree to which learned representations are disentangled and capture the intended aspects.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
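The distance-based penalty mentioned in the abstract can be illustrated with a small sketch. The abstract does not name a specific loss, so this example assumes a triplet margin penalty applied to one slice of the embedding per aspect; the function names, the `ASPECT_DIMS` layout, the aspect labels, and the toy vectors are all hypothetical and are not taken from the project.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def aspect_slice(embedding, aspect, aspect_dims):
    """Return the dimensions of an embedding reserved for one aspect."""
    start, end = aspect_dims[aspect]
    return embedding[start:end]

def triplet_penalty(anchor, closer, farther, aspect, aspect_dims, margin=1.0):
    """Hinge penalty for one human judgment (assumed loss form).

    The judgment says: with respect to `aspect`, `anchor` is more similar
    to `closer` than to `farther`. The penalty is zero once the aspect
    slice of `closer` is nearer to the anchor than that of `farther`
    by at least `margin`.
    """
    a = aspect_slice(anchor, aspect, aspect_dims)
    c = aspect_slice(closer, aspect, aspect_dims)
    f = aspect_slice(farther, aspect, aspect_dims)
    return max(0.0, euclidean(a, c) - euclidean(a, f) + margin)

# Hypothetical layout: dimensions 0-1 encode topic, 2-3 encode sentiment.
ASPECT_DIMS = {"topic": (0, 2), "sentiment": (2, 4)}

anchor = [0.0, 0.0, 1.0, 1.0]
closer = [0.1, 0.0, 5.0, 5.0]   # near the anchor in topic, far in sentiment
farther = [3.0, 4.0, 1.0, 1.0]  # far from the anchor in topic, equal in sentiment

# The same triplet is scored separately under each aspect.
topic_penalty = triplet_penalty(anchor, closer, farther, "topic", ASPECT_DIMS)
sentiment_penalty = triplet_penalty(anchor, closer, farther, "sentiment", ASPECT_DIMS)
```

Here the triplet satisfies the judgment on the topic slice, so that penalty is zero, while the same triplet violates it on the sentiment slice and is penalized. Scoring each aspect on its own slice is one way such supervision could push different dimensions toward different meanings.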

Project outcomes

Journal articles (9)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Interpretability Analysis for Named Entity Recognition to Understand System Predictions and How They Can Improve

DOI:
10.1162/coli_a_00397
Published:
2021-03-01
Journal:
COMPUTATIONAL LINGUISTICS
Impact factor:
9.3
Authors:
Agarwal, Oshin;Yang, Yinfei;Nenkova, Ani
Corresponding author:
Nenkova, Ani

Rate-Regularization and Generalization in Variational Autoencoders

DOI:
Published:
2021
Journal:
Proceedings of The 24th International Conference on Artificial Intelligence and Statistics
Impact factor:
0
Authors:
Bozkurt, A;Esmaeili, B.;Tristan, J.-B.;Brooks, D.;Dy, J.;van de Meent, J.-W.
Corresponding author:
van de Meent, J.-W.

That’s the Wrong Lung! Evaluating and Improving the Interpretability of Unsupervised Multimodal Encoders for Medical Data

DOI:
10.48550/arxiv.2210.06565
Published:
2022-10
Journal:
Proceedings of the Conference on Empirical Methods in Natural Language Processing
Impact factor:
0
Authors:
Denis Jered McInerney;Geoffrey S. Young;Jan-Willem van de Meent;Byron Wallace
Corresponding author:
Denis Jered McInerney;Geoffrey S. Young;Jan-Willem van de Meent;Byron Wallace

Biomedical Interpretable Entity Representations

DOI:
Published:
2021
Journal:
Proceedings of the Association for Computational Linguistics (ACL)
Impact factor:
0
Authors:
Garcia-Olano, Diego;Onoe, Yasumasa;Baldini, Ioana;Ghosh, Joydeep;Wallace, Byron C.;Varshney, Kush
Corresponding author:
Varshney, Kush

Query-Focused EHR Summarization to Aid Imaging Diagnosis

DOI:
Published:
2020-04
Journal:
ArXiv
Impact factor:
0
Authors:
Denis Jered McInerney;B. Dabiri;Anne-Sophie Touret;Geoffrey Young;Jan-Willem van de Meent;Byron C. Wallace
Corresponding author:
Denis Jered McInerney;B. Dabiri;Anne-Sophie Touret;Geoffrey Young;Jan-Willem van de Meent;Byron C. Wallace

Other publications by Byron Wallace

Edinburgh Research Explorer Living systematic reviews

DOI:
Published:
Journal:
Impact factor:
0
Authors:
James Thomas;Anna Noel;Iain J Marshall;Byron Wallace;Steven McDonald;Chris Mavergames;Paul Glasziou;I. Shemilt;Anneliese J Synnot;Tari Turner;Julian H. Elliott
Corresponding author:
Julian H. Elliott