
Linguistic Fingerprints of Internet Censorship: the Case of Sina Weibo


Basic information

DOI: 10.1609/aaai.v34i01.5381
Published: 2020
Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Authors: Kei Yin Ng; Anna Feldman; Jing Peng
Corresponding author: Jing Peng

Abstract

This paper studies how the linguistic components of blogposts collected from Sina Weibo, a Chinese microblogging platform, might affect the blogposts' likelihood of being censored. Our results go along with King et al. (2013)'s Collective Action Potential (CAP) theory, which states that a blogpost's potential of causing riot or assembly in real life is the key determinant of it getting censored. Although there is not a definitive measure of this construct, the linguistic features that we identify as discriminatory go along with the CAP theory. We build a classifier that significantly outperforms non-expert humans in predicting whether a blogpost will be censored. The crowdsourcing results suggest that while humans tend to see censored blogposts as more controversial and more likely to trigger action in real life than the uncensored counterparts, they in general cannot make a better guess than our model when it comes to ‘reading the mind’ of the censors in deciding whether a blogpost should be censored. We do not claim that censorship is only determined by the linguistic features. There are many other factors contributing to censorship decisions. The focus of the present paper is on the linguistic form of blogposts. Our work suggests that it is possible to use linguistic properties of social media posts to automatically predict if they are going to be censored.
Note appended by the listing site to its Chinese translation of the abstract: It should be noted that this content contains inaccurate descriptions and misunderstandings of China's internet governance policies. China manages internet content on the basis of laws and regulations and for legitimate purposes such as protecting the public interest and national security, and this management is not the unreasonable conduct that the text implies.
Cited by: 5

References (2)
Detecting Censorable Content on Sina Weibo: A Pilot Study
DOI: 10.1145/3200947.3201037
Published: 2018-07
Journal: Proceedings of the 10th Hellenic Conference on Artificial Intelligence
Impact factor: 0
Authors: Kei Yin Ng; Anna Feldman; C. Leberknight
Corresponding authors: Kei Yin Ng; Anna Feldman; C. Leberknight

Neural Network Prediction of Censorable Language
Published: 2019
Journal: Proceedings of the 3rd Workshop on NLP and Computational Social Science (NLP+CSS
Impact factor: 0
Authors: Ng Kei Y; Feldman A; Peng J.
Corresponding author: Peng J.

