RI: Small: DaRE: Detection and Recognition of Euphemisms
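Basic Information
- Award number: 2226006
- Principal investigator:
- Amount: $564,100
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2023
- Funding country: United States
- Start and end dates: 2023-01-01 to 2025-12-31
- Project status: Ongoing
- Source:
- Keywords:
Project Abstract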
To fully understand human language, machines need to be able to recognize and interpret expressions that contain hidden meanings. This project concentrates on euphemisms, mild or indirect phrases used in place of harsher or more offensive ones. Euphemisms are often used to mask profanity or to refer politely to sensitive topics such as death, sex, religion, disability, or personal relationships. People use euphemisms all the time, e.g., 'negative patient outcome', 'between jobs', 'financially fortunate', 'correctional facility', 'friendly fire', or 'sunshine unit'. Different cultures and languages use different euphemisms, and euphemisms change over time. Machines that process human language do not yet understand euphemisms. This project is devoted to making machines understand euphemisms in different languages, thereby contributing to improving the capabilities of artificial intelligence. Additional benefits include interesting new generalizations about the nature of euphemisms and the training of a diverse cadre of undergraduate and graduate students in highly practical work on a difficult interdisciplinary problem. Montclair State University, a Hispanic Serving Institution, is known for its diverse student population and a large proportion of first-generation college students. Montclair State University puts great emphasis on justice and inclusivity in academia. This project is not an exception.
Detecting and interpreting figurative language is a rapidly growing area in Natural Language Processing (NLP). Unfortunately, the processing of euphemisms is lacking in NLP thus far. The project addresses the following: 1) algorithm design for detecting and interpreting euphemisms, and 2) interpretability of black-box neural models by creating a series of new datasets and tasks that explore the embedding space of transformer language models for euphemism recognition. The key insights are 1) euphemistic expressions and their paraphrased counterparts differ in the strength of the sentiment they convey; 2) euphemistic and non-euphemistic interpretation is context-sensitive; 3) euphemisms are vaguer than the taboo expressions they substitute. The experiments test what linguistic properties of euphemisms the deep learning approaches capture and why. The algorithm developed can detect new euphemisms, not previously recorded in dictionaries, without human intervention. The computational work on euphemisms is important to further the understanding of how strategic use of language can bias people's perceptions of important and highly contentious actions, and perhaps to find ways to de-bias language models. This work on euphemisms helps understand what topics are controversial or sensitive in a specific culture. Applying the algorithm to diachronic data and detecting the change in euphemism usage leads to a better understanding of cultural change. The corpora produced are useful for answering questions at the intersection of AI, NLP, linguistics, cultural anthropology, and social psychology. The range of languages provides a natural way of making interesting linguistic observations about euphemisms. Since euphemisms are a form of verbal behavior, finding a way to detect and interpret euphemisms automatically may lead to a better understanding of human behavior in general.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
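The first key insight in the abstract, that a euphemism and its literal paraphrase differ in sentiment strength, can be illustrated with a minimal sketch. This is not the project's algorithm: the valence scores in `TOY_VALENCE`, the literal paraphrases ('unemployed', 'prison'), and the function names are all hypothetical and invented purely for illustration.

```python
# Toy illustration only: the lexicon values below are invented, not taken
# from any real sentiment resource or from the funded project.
TOY_VALENCE = {
    "unemployed": -0.7,
    "between": 0.0,
    "jobs": 0.1,
    "prison": -0.8,
    "correctional": -0.2,
    "facility": 0.0,
}


def sentiment_strength(phrase):
    """Mean absolute valence of the words in a phrase (unknown words count as 0)."""
    words = phrase.lower().split()
    if not words:
        return 0.0
    return sum(abs(TOY_VALENCE.get(w, 0.0)) for w in words) / len(words)


def strength_gap(euphemism, literal):
    """Positive when the literal paraphrase carries stronger sentiment."""
    return sentiment_strength(literal) - sentiment_strength(euphemism)


# Euphemisms come from the abstract; the literal paraphrases are assumptions.
pairs = [("between jobs", "unemployed"), ("correctional facility", "prison")]
gaps = {euph: strength_gap(euph, lit) for euph, lit in pairs}

for euph, gap in gaps.items():
    print(f"{euph!r}: literal paraphrase is stronger by {gap:.2f}")
```

With this toy lexicon both gaps are positive, which is the pattern the abstract describes. A real system would presumably replace the hand-written scores with a learned sentiment model and would also need to handle context sensitivity, the second insight listed.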
Project Outputs
Journal articles (6)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
A Report on the Euphemisms Detection Shared Task
- DOI: 10.48550/arxiv.2211.13327
- Published: 2022-11
- Journal:
- Impact factor: 0
- Authors: Patrick Lee; Anna Feldman; J. Peng
- Corresponding authors: Patrick Lee; Anna Feldman; J. Peng
Proceedings of the 3rd Workshop on Figurative Language Processing (FLP)
- DOI:
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Ghosh, Debanjan
- Corresponding authors: Ghosh, Debanjan
NollySenti: Leveraging Transfer Learning and Machine Translation for Nigerian Movie Sentiment Classification
- DOI: 10.48550/arxiv.2305.10971
- Published: 2023-05
- Journal:
- Impact factor: 0
- Authors: Iyanuoluwa Shode; David Ifeoluwa Adelani; J. Peng; Anna Feldman
- Corresponding authors: Iyanuoluwa Shode; David Ifeoluwa Adelani; J. Peng; Anna Feldman
Other publications by Anna Feldman
WordPrep: Word-based Preposition Prediction Tool
- DOI: 10.1109/bigdata47090.2019.9005608
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: Pooja Bhagat; A. Varde; Anna Feldman
- Corresponding authors: Anna Feldman
Experiments in Cross-Language Morphological Annotation Transfer
- DOI: 10.1007/11671299_4
- Published: 2006
- Journal:
- Impact factor: 0
- Authors: Anna Feldman; Jirka Hana; Chris Brew
- Corresponding authors: Chris Brew
Evaluating and automating the annotation of a learner corpus
- DOI: 10.1007/s10579-013-9226-3
- Published: 2013
- Journal:
- Impact factor: 2.7
- Authors: Alexandr Rosen; Jirka Hana; Barbora Stindlová; Anna Feldman
- Corresponding authors: Anna Feldman
Legend at ArAIEval Shared Task: Persuasion Technique Detection using a Language-Agnostic Text Representation Model
- DOI: 10.48550/arxiv.2310.09661
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: O. E. Ojo; O. O. Adebanji; Hiram Calvo; Damian O. Dieke; Olumuyiwa E. Ojo; S.E. Akinsanya; Tolulope O. Abiola; Anna Feldman
- Corresponding authors: Anna Feldman
Linguistic Fingerprints of Internet Censorship: the Case of SinaWeibo
- DOI: 10.1609/aaai.v34i01.5381
- Published: 2020
- Journal:
- Impact factor: 0
- Authors: Kei Yin Ng; Anna Feldman; Jing Peng
- Corresponding authors: Jing Peng
Other grants awarded to Anna Feldman
Workshop on Natural Language Processing for Internet Freedom
- Award number: 1828199
- Fiscal year: 2018
- Funding amount: $564,100
- Project type: Standard Grant
Student Support at the North American Association for Computational Linguistics Workshop on Computational Methods for Analysis of Narrative
- Award number: 1523285
- Fiscal year: 2015
- Funding amount: $564,100
- Project type: Standard Grant
RI: Small: RUI: AIR: Automatic Idiom Recognition
- Award number: 1319846
- Fiscal year: 2013
- Funding amount: $564,100
- Project type: Standard Grant
Undergraduate Research: Cross-Lingual Approaches to Morphosyntactic Tagging
- Award number: 1033275
- Fiscal year: 2010
- Funding amount: $564,100
- Project type: Continuing Grant
RI:EAGER: A Montclair Group in Cognitive and Computational Aspects of Language and Speech Processing: An Exploration
- Award number: 1048406
- Fiscal year: 2010
- Funding amount: $564,100
- Project type: Standard Grant
RI: Small: RUI: Resource-light Morphosyntactic Tagging of Morphologically Complex Languages
- Award number: 0916280
- Fiscal year: 2009
- Funding amount: $564,100
- Project type: Standard Grant
Workshop on Computational Approaches to Linguistic Creativity - Element 7495
- Award number: 0906244
- Fiscal year: 2009
- Funding amount: $564,100
- Project type: Standard Grant
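Similar National Natural Science Foundation of China (NSFC) grants
Screening of small-molecule inhibitors targeting Treg FOXP3 and their role and mechanism in lung cancer immunotherapy
- Award number: 32370966
- Approval year: 2023
- Funding amount: CNY 500,000
- Project type: General Program
Epigenetic mechanisms by which small-molecule activation of YAP induces chromatin plasticity and promotes cardiac progenitor cell reprogramming
- Award number: 82304478
- Approval year: 2023
- Funding amount: CNY 300,000
- Project type: Young Scientists Fund
Construction and mechanism of action of microglia-targeting biomimetic glycyrrhizic acid nanoparticles: a new therapeutic strategy for sepsis-associated encephalopathy
- Award number: 82302422
- Approval year: 2023
- Funding amount: CNY 300,000
- Project type: Young Scientists Fund
Role and mechanism of microglial pyroptosis mediated by the HMGB1/TLR4/Cathepsin B pathway in hypoxic-ischemic encephalopathy in neonatal rats
- Award number: 82371712
- Approval year: 2023
- Funding amount: CNY 490,000
- Project type: General Program
Role and mechanism of small cysteine-free proteins in regulating the insecticidal activity of biocontrol fungi
- Award number: 32372613
- Approval year: 2023
- Funding amount: CNY 500,000
- Project type: General Program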
Similar overseas grants
Powering Small Craft with a Novel Ammonia Engine
- Award number: 10099896
- Fiscal year: 2024
- Funding amount: $564,100
- Project type: Collaborative R&D
"Small performances": investigating the typographic punches of John Baskerville (1707-75) through heritage science and practice-based research
- Award number: AH/X011747/1
- Fiscal year: 2024
- Funding amount: $564,100
- Project type: Research Grant
Artificial intelligence-based analysis of nonlinear, high-dimensional, small-sample data and its social applications
- Award number: 24K14847
- Fiscal year: 2024
- Funding amount: $564,100
- Project type: Grant-in-Aid for Scientific Research (C)
Fragment to small molecule hit discovery targeting Mycobacterium tuberculosis FtsZ
- Award number: MR/Z503757/1
- Fiscal year: 2024
- Funding amount: $564,100
- Project type: Research Grant
Bacteriophage control of host cell DNA transactions by small ORF proteins
- Award number: BB/Y004426/1
- Fiscal year: 2024
- Funding amount: $564,100
- Project type: Research Grant