Dependence Modelling with Vine Copulas for the Integration of Unstructured and Structured Data
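Basic information
- Grant number: EP/W021986/1
- Principal investigator:
- Amount: $101,300
- Host institution:
- Host institution country: United Kingdom
- Project category: Research Grant
- Fiscal year: 2022
- Funding country: United Kingdom
- Period: 2022 to (no data)
- Project status: Completed
- Source:
- Keywords: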
Project abstract
The project will develop a statistical data integration methodology, not previously considered, that uses multiple sources of information to provide more accurate predictions than those currently available. We are living in the Big Data era, in which companies and organizations produce masses of data in traditional formats and social media generate large quantities of mostly unstructured information every second. However, are we effectively and efficiently exploiting all the information available to us from official and social media sources? The answer is definitely no. Most of the statistical approaches used to solve real-world problems are based on a single source of information and, although preliminary work attempting to leverage social media data exists, there are currently no comprehensive and functional methodologies able to fully capitalize on unstructured information and its associations with other available structured data. As a consequence, precious information contained in unstructured online data continues to be neglected and lost. While advances in technology and digitalization are shaping the world, statistics is struggling to keep pace and is in critical and urgent need of revolutionizing its methods and practices. This proposal aims to fill this gap by creating a pioneering and transformative statistical data integration methodology that fully leverages the power of different sources of information, such as traditional and online-generated data. The project will support early-stage research on integrating unstructured and structured data using a new methodology based on vine copulas. This methodology will form the basis of future analyses, leading to a radical transformation of current data approaches and propelling statistics towards the future era.
For this research, which is early-stage yet will bring immediately usable results, the methodology will be applied to data on crimes committed in the South West region of the UK, integrating official police information, provided by our project partner Devon and Cornwall Police (DCP), with crime data discussed on different social media platforms. Our approach will provide a more thorough and realistic appraisal of the volume and severity of crimes in specific locations of the South West, since it will also account for hidden crimes that are unreported to the police but emerge from social media. The results of this project will be used by DCP to plan and organize their interventions more effectively and to allocate resources efficiently in targeted areas. By providing deeper and more accurate knowledge of the geographical locations of criminal offences, including unreported crimes, this project will help the police to better support communities in high criminal risk areas with timely interventions, making people feel safer and more protected. This will promote social inclusion and more equitable communities, especially in disadvantaged areas that are most affected by high criminality levels, including crimes that are not reported via traditional channels. This project, initially targeting the South West of the UK, will lay the foundation for future grant applications extending the geographical area under assessment to the national level. In addition, given the very large number of possible applications of our methodology, this project will be a milestone that generates further breakthroughs in any other area of science where multiple data sources are available and accurate predictions are needed. This project is timely since it addresses the urgent need to fully leverage social media information that is currently available but not exploited.
This research will provide a key opportunity for the UK to secure a leading international position at the forefront of advances in knowledge extraction, leading to huge social and economic benefits.
Project outputs
Journal articles (10)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Data Integration and Graphical Models for Cryptocurrencies
- DOI:
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Dalla Valle L
- Corresponding author: Dalla Valle L
Bayesian Nonparametric Modeling of Conditional Multidimensional Dependence Structures
- DOI: 10.1080/10618600.2023.2173604
- Published: 2023
- Journal:
- Impact factor: 2.4
- Authors: Barone R
- Corresponding author: Barone R
Social Media Integration of Flood Data: A Vine Copula-Based Approach
- DOI: 10.3808/jei.202200471
- Published: 2022-01-05
- Journal:
- Impact factor: 7
- Authors: Ansell, L.; Dalla Valle, L.
- Corresponding author: Dalla Valle, L.
A new data integration framework for Covid-19 social media information.
- DOI: 10.1038/s41598-023-33141-y
- Published: 2023-04-15
- Journal:
- Impact factor: 4.6
- Authors: Ansell, Lauren; Dalla Valle, Luciana
- Corresponding author: Dalla Valle, Luciana
Approximate Bayesian conditional copulas
- DOI: 10.1016/j.csda.2021.107417
- Published: 2022
- Journal:
- Impact factor: 1.8
- Authors: Grazian C
- Corresponding author: Grazian C
Similar grants from the National Natural Science Foundation of China (NSFC)
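Lubrication enhancement mechanism and applications of composite groove-array surfaces coupling customized oleophilic/oleophobic patterns with bionic micro-textures
- Grant number:
- Year approved: 2022
- Funding amount: CNY 300,000
- Project category: Young Scientists Fund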
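Research on image data fitting problems combining geometric modelling and machine learning
- Grant number:
- Year approved: 2022
- Funding amount: CNY 540,000
- Project category: General Program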
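Operational decision-making of manufacturing enterprises under capacity sharing: an information-sharing and data-quality perspective
- Grant number: 72271252
- Year approved: 2022
- Funding amount: CNY 440,000
- Project category: General Program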
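Mechanisms of the full process of incubation and occurrence of tectonic-type dynamic disasters in deep rock masses
- Grant number:
- Year approved: 2022
- Funding amount: CNY 540,000
- Project category: General Program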
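Anti-fatigue mechanism and proactive design of shield-machine main bearings using laser micro-texturing combined with transformation hardening
- Grant number:
- Year approved: 2021
- Funding amount: CNY 300,000
- Project category: Young Scientists Fund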
Similar overseas grants
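非構造型交渉における効率性と平等性の葛藤に関する実験的検討
An experimental study of the conflict between efficiency and equality in unstructured bargaining
- Grant number: 24KJ2078
- Fiscal year: 2024
- Funding amount: $101,300
- Project category: Grant-in-Aid for JSPS Fellows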
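教員養成における問題解決型学習の開発-対立や葛藤を乗り越えるArtの手法の活用
Developing problem-based learning in teacher training: using art-based methods to overcome confrontation and conflict
- Grant number: 23K02428
- Fiscal year: 2023
- Funding amount: $101,300
- Project category: Grant-in-Aid for Scientific Research (C)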
単分子化学による構造化ナノグラフェンの表面合成
On-surface synthesis of structured nanographene by single-molecule chemistry
- Grant number: 21F21058
- Fiscal year: 2021
- Funding amount: $101,300
- Project category: Grant-in-Aid for JSPS Fellows
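動的境界条件付き楕円型方程式の解構造と漸近解析
Solution structure and asymptotic analysis of elliptic equations with dynamic boundary conditions
- Grant number: 19J12579
- Fiscal year: 2019
- Funding amount: $101,300
- Project category: Grant-in-Aid for JSPS Fellows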
The Instructional Materials Development of the Multi-cultural Oriented Issues to Advance "the Global Partnership" Through the Japan-US Collaborative Action Research
- Grant number: 18K02688
- Fiscal year: 2018
- Funding amount: $101,300
- Project category: Grant-in-Aid for Scientific Research (C)