Collaborative Research: Updating the Militarized Dispute Data Through Crowdsourcing: MID5
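Basic information
- Award number: 1528624
- Principal investigator:
- Amount: $367,400
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2015
- Funding country: United States
- Period: 2015-09-15 to 2019-12-31
- Status: Completed
- Source:
- Keywords:
Project abstract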
General Summary
The Correlates of War Project's Militarized Interstate Dispute (MID) data are the most prominent and heavily used data collection in the study of international conflict. The most recent version (MID4) was released in 2014 and extends coverage to 1816-2010. The MID4 project used automated text classification to make the identification of relevant news stories more efficient. Over the course of that project, the PIs determined that the primary bottleneck in the workflow was the coding of those news documents. To address this inefficiency, the PIs completed a pilot project to determine whether crowdsourcing techniques could be used to code these documents. In the pilot, non-expert workers were paid small sums to read documents and answer sets of questions; the answers were used to identify features of possible militarized incidents (the events that comprise MIDs). A systematic comparison of the crowdsourced responses with those of the MID4 Project's trained coders revealed that the crowdsourced codings were completely accurate for 68 percent of the news reports coded; more importantly, high agreement among crowd responses on a specific report was strongly associated with correct coding. This enables the PIs to detect which documents require further expert involvement. As a result, the PIs can produce a majority of the MID data in near real time and at limited financial cost. These procedures are applied in the MID5 Project, which will update the MID data for the period 2011-2017.
Technical Summary
The MID5 project workflow begins with document retrieval from LexisNexis and document classification using the software and methods implemented in MID4. The negatively classified documents are discarded, and metadata are extracted from the positively classified documents, including the document title, the news agency that published the report, the date, and any actors mentioned in the text. Crowd workers are recruited through Amazon's Mechanical Turk and paid a wage to read one of these documents and answer a series of simple, objective questions about it. The questionnaire is predefined, but some extracted metadata are automatically inserted into it to improve the quality of responses. Several workers complete a questionnaire for each document, leaving the PIs with a problem of aggregation: how to combine multiple worker responses, possibly regarding multiple related questions, into the usable data necessary to code the militarized incident. In the pilot study, the PIs show that Bayesian networks are the most effective way to achieve this aggregation. Recently, the PIs have made advances in semi-supervised text classification with hybrid deep restricted Boltzmann machines, which outperform previous methods in this task.
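The aggregation step and the use of crowd agreement to route documents to experts can be illustrated with a minimal sketch. It is not the PIs' method: the abstract reports that Bayesian networks performed best, whereas this example substitutes a plain majority vote with an agreement threshold. The function name `aggregate_responses`, the `min_agreement` parameter, and the sample answers are all hypothetical.

```python
from collections import Counter


def aggregate_responses(responses, min_agreement=0.8):
    """Combine several workers' answers to one question about one document.

    Simple majority vote (an assumption made for illustration; the abstract
    reports that Bayesian networks worked best). Returns the winning answer,
    the share of workers who gave it, and a flag marking low-agreement cases
    that would be passed on to expert coders.
    """
    if not responses:
        raise ValueError("at least one worker response is required")
    counts = Counter(responses)
    answer, votes = counts.most_common(1)[0]
    agreement = votes / len(responses)
    needs_expert = agreement < min_agreement
    return answer, agreement, needs_expert


# Hypothetical answers from five workers to "Was military force used?"
unanimous = ["yes", "yes", "yes", "yes", "yes"]
split = ["yes", "no", "no", "yes", "unclear"]

print(aggregate_responses(unanimous))  # high agreement, accepted as coded
print(aggregate_responses(split))      # low agreement, flagged for experts
```

The threshold trades expert workload against accuracy: raising `min_agreement` sends more documents to trained coders, while lowering it accepts more crowd codings without review.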
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Vito D'Orazio
Advancing Measurement of Foreign Policy Similarity
- DOI: 10.31235/osf.io/fuet4
- Published: 2012
- Journal:
- Impact factor: 0
- Authors: Vito D'Orazio
- Corresponding author: Vito D'Orazio
The MID5 Dataset, 2011–2014: Procedures, coding rules, and description
- DOI: 10.1177/0738894221995743
- Published: 2021
- Journal:
- Impact factor: 2.1
- Authors: Glenn Palmer; Roseanne W. McManus; Vito D'Orazio; Michael R. Kenwick; Mikaela Karstens; C. Bloch; Nick Dietrich; Kayla Kahn; Kellan H. Ritter; Michael J. Soules
- Corresponding author: Michael J. Soules
An Online Structured Political Event Dataset based on CAMEO Ontology
- DOI:
- Published: 2020
- Journal:
- Impact factor: 0
- Authors: S. Salam; Patrick T. Brandt; Vito D'Orazio; J. Holmes; Javiar Osorio; L. Khan
- Corresponding author: L. Khan
Error-Correction and Aggregation in Crowd-Sourcing of Geopolitical Incident Information
- DOI: 10.1007/978-3-319-16268-3_47
- Published: 2015
- Journal:
- Impact factor: 3.2
- Authors: Alexander Ororbia; Yang Xu; Vito D'Orazio; D. Reitter
- Corresponding author: D. Reitter
Updating the Militarized Interstate Dispute Data: A Response to Gibler, Miller, and Little
- DOI: 10.1093/isq/sqz045
- Published: 2020
- Journal:
- Impact factor: 2.6
- Authors: Glenn Palmer; Vito D'Orazio; Michael R. Kenwick; Roseanne W. McManus
- Corresponding author: Roseanne W. McManus
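Similar National Natural Science Foundation of China (NSFC) grants
Tail posture deformation correction and recognition-model updating strategies for online sex identification of silkworm pupae
- Award number: 32301700
- Award year: 2023
- Amount: CNY 300,000
- Project type: Young Scientists Fund
Incentive methods and quota calculation for the provision of public goods in urban renewal
- Award number: 52378070
- Award year: 2023
- Amount: CNY 500,000
- Project type: General Program
Thermoluminescence chronology of "Middle Pleistocene" rock art on the Qinghai-Tibet Plateau
- Award number: 42371161
- Award year: 2023
- Amount: CNY 480,000
- Project type: General Program
Mechanism by which intratumoral Sphingomonas regulates c-Maf nuclear translocation, in cooperation with Sox2 transcriptional activation, to maintain TAM self-renewal
- Award number: 82372597
- Award year: 2023
- Amount: CNY 490,000
- Project type: General Program
Function of CDK14 in regulating mammary stem cell self-renewal and breast cancer development
- Award number:
- Award year: 2022
- Amount: CNY 540,000
- Project type: General Program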
Similar overseas grants
Collaborative Research: Updating iVirus - the CyVerse-powered analytical toolkit for viruses of microbes
- Award number: 2149505
- Fiscal year: 2022
- Amount: $367,400
- Project type: Continuing Grant
Collaborative Research: Updating iVirus - the CyVerse-powered analytical toolkit for viruses of microbes
- Award number: 2149506
- Fiscal year: 2022
- Amount: $367,400
- Project type: Continuing Grant
Collaborative Research: A New Nonlinear Modal Updating Framework for Soft, Hydrated Materials
- Award number: 1728186
- Fiscal year: 2017
- Amount: $367,400
- Project type: Standard Grant
Collaborative Research: A New Nonlinear Modal Updating Framework for Soft, Hydrated Materials
- Award number: 1727761
- Fiscal year: 2017
- Amount: $367,400
- Project type: Standard Grant
Collaborative Research: Updating the Militarized Dispute Data Through Crowdsourcing: MID5
- Award number: 1528409
- Fiscal year: 2015
- Amount: $367,400
- Project type: Continuing Grant