III: Medium: Bias Tracking and Reduction Methods for High-Dimensional Exploratory Visual Analysis and Selection
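Basic information
- Award number: 1704018
- Principal investigator:
- Amount: $1.0816 million
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2017
- Funding country: United States
- Period: 2017-07-01 to 2022-11-30
- Project status: Completed
- Source:
- Keywords: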
Project abstract
Exploratory visualization and analysis of large and complex datasets is growing increasingly common across a range of domains. For example, online companies track users to learn about their products, computer security logs capture detailed traces of network activity, and health care systems capture detailed longitudinal records for their patients. In all of these fields, large and complex data repositories are being created with the goal of supporting data-driven, evidence-based decision making. However, today's visualization tools -- a critical part of an analyst's toolbox -- are often overwhelmed when applied to high-dimensional datasets (i.e., datasets with large numbers of variables). Real-world datasets can often have many thousands of variables, a stark contrast to the much smaller number of dimensions supported by most visualizations. This gap in dimensionality puts the validity of any analysis at great risk of bias, potentially leading to serious, hidden errors. This research project will develop a new approach to high-dimensional exploratory visualization that will help detect and reduce selection bias and other problems with data interpretation during exploratory high-dimensional data visualization. The project's results, including open-source software, will be broadly applicable across domains. In addition, the project will be evaluated with users in a health outcomes research setting. This offers significant potential to improve health care around the world. This project develops a set of Contextual Visualization Methods for exploratory data analysis which are designed to support the discovery of more robust and generalizable insights from high-dimensional data. These methods are built upon a recognition that the very summarization that makes many visual methods effective also inherently obscures aspects of a high-dimensional dataset that may be critical to accurate interpretation of a user's visual findings.
More specifically, the subset of data (comprising both dimensions and records) that is actively accounted for within a visualization -- the data focus -- must be interpreted within the context of the many dimensions and data records that have been omitted or are not clearly represented within a visualization -- the data context. The methods that this project develops, therefore, are designed to (1) explicitly model and analyze the data context, and (2) convey the relationship between the data focus and the context in order to better inform users about hidden problems such as confounding variables and selection bias. The primary technical contributions of the project include: (1) inline replication for visual validation; (2) baselined selection methods for high-dimensional visualization; (3) interactive rebalancing for representative visualization. In addition, open-source software will be developed and evaluated with real-world data and practitioners. The products of this research project -- including new methods, software products, and evaluation results -- will be disseminated through a project website (https://vaclab.web.unc.edu/contextual-visualization/).
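The relationship between the data focus and the data context can be illustrated with a small sketch. The code below is not the project's software or method; it is a minimal illustration that assumes a standardized mean difference is an acceptable drift measure. The function name `selection_drift`, the toy records, and the 0.5 threshold are all hypothetical.

```python
from statistics import mean, pstdev

def selection_drift(baseline, focus, dims):
    """Standardized mean difference per dimension between a focus subset and its baseline.

    Assumption: this simple statistic stands in for whatever measure a real tool would use.
    """
    drift = {}
    for d in dims:
        base_vals = [r[d] for r in baseline]
        focus_vals = [r[d] for r in focus]
        spread = pstdev(base_vals)
        # A constant dimension cannot drift; report 0.0 instead of dividing by zero.
        drift[d] = 0.0 if spread == 0 else (mean(focus_vals) - mean(base_vals)) / spread
    return drift

# Hypothetical toy records: "age" and "cost" are never plotted (data context).
records = [
    {"age": 30, "visits": 2, "cost": 100},
    {"age": 40, "visits": 3, "cost": 150},
    {"age": 60, "visits": 8, "cost": 400},
    {"age": 70, "visits": 9, "cost": 450},
]

# Data focus: the analyst selects high-visit patients in a chart showing visits only.
focus = [r for r in records if r["visits"] >= 8]

# Check the unplotted dimensions for drift caused by that selection.
drift = selection_drift(records, focus, dims=["age", "cost"])
flagged = sorted(d for d, v in drift.items() if abs(v) > 0.5)  # hypothetical threshold
print(flagged)
```

In this toy case the selection on visits also shifts age and cost, which is the kind of hidden change in unplotted dimensions that the abstract describes as selection bias.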
Project outcomes
Journal articles (11)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Selection Bias Tracking and Detailed Subset Comparison for High-Dimensional Data
- DOI: 10.1109/tvcg.2019.2934209
- Published: 2019-06
- Journal:
- Impact factor: 5.2
- Authors: D. Borland; Wenyuan Wang; Jonathan Zhang; Joshua Shrestha; D. Gotz
- Corresponding author: D. Borland; Wenyuan Wang; Jonathan Zhang; Joshua Shrestha; D. Gotz
Enabling Longitudinal Exploratory Analysis of Clinical COVID Data
- DOI: 10.1109/vahc53616.2021.00008
- Published: 2021
- Journal:
- Impact factor: 0
- Authors: Borland, David; Brain, Irena; Fecho, Karamarie; Pfaff, Emily; Xu, Hao; Champion, James; Bizon, Chris; Gotz, David
- Corresponding author: Gotz, David
Visual Analysis of High-Dimensional Event Sequence Data via Dynamic Hierarchical Aggregation
- DOI: 10.1109/tvcg.2019.2934661
- Published: 2019-06
- Journal:
- Impact factor: 5.2
- Authors: D. Gotz; Jonathan Zhang; Wenyuan Wang; Joshua Shrestha; D. Borland
- Corresponding author: D. Gotz; Jonathan Zhang; Wenyuan Wang; Joshua Shrestha; D. Borland
Adaptive Contextualization Methods for Combating Selection Bias during High-Dimensional Visualization
- DOI: 10.1145/3009973
- Published: 2017
- Journal:
- Impact factor: 3.4
- Authors: Gotz, David; Sun, Shun; Cao, Nan; Kundu, Rita; Meyer, Anne-Marie
- Corresponding author: Meyer, Anne-Marie
Selection-Bias-Corrected Visualization via Dynamic Reweighting
- DOI: 10.1109/tvcg.2020.3030455
- Published: 2021-02-01
- Journal:
- Impact factor: 5.2
- Authors: Borland, David; Zhang, Jonathan; Gotz, David
- Corresponding author: Gotz, David
Other publications by David Gotz
Scalable and adaptive streaming for non-linear media
- DOI: 10.1145/1180639.1180717
- Published: 2006
- Journal:
- Impact factor: 0
- Authors: David Gotz
- Corresponding author: David Gotz
RCLens: Interactive Rare Category Exploration and Identification
- DOI: 10.1109/tvcg.2017.2711030
- Published: 2018-07
- Journal:
- Impact factor: 5.2
- Authors: Hanfei Lin; Siyuan Gao; David Gotz; Fan Du; Jingrui He; Nan Cao
- Corresponding author: Nan Cao
Institute for Research on Poverty Discussion Paper no. 1040-94 Taxes and the Poor: A Microsimulation Study of Implicit and Explicit Taxes
- DOI:
- Published: 1994
- Journal:
- Impact factor: 0
- Authors: Manish Kumar; David Gotz; T. Nutley; Jason Smith
- Corresponding author: Jason Smith
A Survey on Visual Analytics of Social Media Data
- DOI: 10.1109/tmm.2016.2614220
- Published: 2016-11
- Journal:
- Impact factor: 7.3
- Authors: Yingcai Wu; Nan Cao; David Gotz; Yap-Peng Tan; Daniel A. Keim
- Corresponding author: Daniel A. Keim
Z-Glyph: Visualizing outliers in multivariate data
- DOI: 10.1177/1473871616686635
- Published: 2018
- Journal:
- Impact factor: 2.3
- Authors: Nan Cao; Yu-Ru Lin; David Gotz; Fan Du
- Corresponding author: Fan Du
Other grants of David Gotz
III: Medium: Counterfactual-Based Supports For Visual Causal Inference
- Award number: 2211845
- Fiscal year: 2022
- Funding amount: $1.0816 million
- Project type: Standard Grant
NSF Student Travel Support for the 2019 IEEE Visualization Doctoral Colloquium (IEEE VIS DC)
- Award number: 1925878
- Fiscal year: 2019
- Funding amount: $1.0816 million
- Project type: Standard Grant
QuBBD: Collaborative Research: Interactive Ensemble clustering for mixed data with application to mood disorders
- Award number: 1557593
- Fiscal year: 2015
- Funding amount: $1.0816 million
- Project type: Standard Grant
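Similar NSFC (National Natural Science Foundation of China) grants
Study of plasmon-enhanced optical response in composite low-dimensional topological materials
- Award number: 12374288
- Year approved: 2023
- Funding amount: 520,000 CNY
- Project type: General Program
The missing medium-sized enterprises from the perspective of the managerial market and intervention in the division of labor: stylized facts, underlying mechanisms, and optimization paths
- Award number: 72374217
- Year approved: 2023
- Funding amount: 410,000 CNY
- Project type: General Program
Multiscale algorithms and numerical simulation of plasma in tokamak divertors
- Award number: 12371432
- Year approved: 2023
- Funding amount: 435,000 CNY
- Project type: General Program
Dark matter distribution near intermediate-mass black holes and gravitational-wave echo detection in their IMRI systems
- Award number: 12365008
- Year approved: 2023
- Funding amount: 320,000 CNY
- Project type: Regional Science Fund Program
Physical mechanisms of rapid intensification of asymmetric tropical cyclones under moderate vertical wind shear
- Award number: 42305004
- Year approved: 2023
- Funding amount: 300,000 CNY
- Project type: Young Scientists Fund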
Similar overseas grants
RII Track-4:@NASA: Bluer and Hotter: From Ultraviolet to X-ray Diagnostics of the Circumgalactic Medium
- Award number: 2327438
- Fiscal year: 2024
- Funding amount: $1.0816 million
- Project type: Standard Grant
Collaborative Research: Topological Defects and Dynamic Motion of Symmetry-breaking Tadpole Particles in Liquid Crystal Medium
- Award number: 2344489
- Fiscal year: 2024
- Funding amount: $1.0816 million
- Project type: Standard Grant
Collaborative Research: AF: Medium: The Communication Cost of Distributed Computation
- Award number: 2402836
- Fiscal year: 2024
- Funding amount: $1.0816 million
- Project type: Continuing Grant
Collaborative Research: AF: Medium: Foundations of Oblivious Reconfigurable Networks
- Award number: 2402851
- Fiscal year: 2024
- Funding amount: $1.0816 million
- Project type: Continuing Grant
Collaborative Research: CIF: Medium: Snapshot Computational Imaging with Metaoptics
- Award number: 2403122
- Fiscal year: 2024
- Funding amount: $1.0816 million
- Project type: Standard Grant