From Human-Powered to Automated Video Description for Blind and Low Vision Users
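Basic Information
- Grant number: 10568469
- Principal investigator:
- Amount: $649,900
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2023
- Funding country: United States
- Project period: 2023-07-01 to 2028-06-30
- Project status: Ongoing
- Source: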
- Keywords: Address; Artificial Intelligence; Breeding; Canis familiaris; Collaborations; Communication; Communities; Computer Vision Systems; Data Set; Decision Making; Diagnosis; Education; Emotional; Employment; Ensure; Environment; Ethics; Face; Feeling; Future; German population; Goals; Health; Human; Human Volunteers; Image; Individual; Internet; Intervention; Knowledge; Life; Machine Learning; Mission; Modernization; Natural Language Processing; Outcome; Output; Pathway interactions; Personal Satisfaction; Persons; Positioning Attribute; Public Health; Research; Safety; Self-Help Devices; System; Time; Training; United States; United States National Institutes of Health; Vision; Visual; Visual impairment; Work; assistive robot; blind; community engagement; computer human interaction; design; digital; diverse data; global health; human-in-the-loop; improved; innovation; large scale data; novel; open source; response; satisfaction; social; social exclusion; tool; virtual; vision rehabilitation; visual information; volunteer
Project Summary
Approximately 12 million people in the United States have been diagnosed with a visual impairment. These
individuals face unique challenges in our modern environment, where much critical information related to
education, employment, entertainment, and community is presented in the form of digital videos. Inaccessible
information can result in social exclusion or become life threatening if individuals require access to it in order to
make decisions related to their health and safety. For example, in a personal or global health crisis, individuals
may need to access the large amounts of information conveyed via videos or dynamic infographics in order to
make informed decisions. To address this need, the online platform YouDescribe allows blind and low vision
(BLV) users to request amateur volunteers to create video descriptions, also referred to as audio descriptions
(AD), of YouTube videos. However, the platform has been unable to keep up with the overwhelming demand,
and 92.5% of videos on the YouDescribe user wish list remain undescribed. The overall objective of this proposal
is to build an AI-driven system, suitable for use on a wide scale, to automatically generate descriptions of online
videos, as well as answer questions asked by BLV users about the content of videos. The rationale for this
project is that AI-based tools are necessary to facilitate timely access to the deluge of new videos appearing on
the Internet every day. The proposed work encompasses three specific aims: 1) develop an AI-based tool in
collaboration with sighted describers that more efficiently produces video descriptions and increases the
availability of accessible videos, with the goal of creating an AI-driven NarrationBot that will decrease the time
required for novice volunteers to produce video descriptions by 80%; 2) develop an AI-based tool in collaboration
with BLV individuals that offers user-driven access to visual information in online videos, with the goal of developing
an AI-driven QABot that allows users to pause a video, ask questions about content, and receive immediate
answers (e.g., “What breed is the dog?”, “German shepherd”) that are accurate 80% of the time; and 3) develop
and publicly release large-scale datasets to improve machine learning for video accessibility. These novel
datasets will be used to increase the quality and accuracy of NarrationBot and QABot until AI-generated
descriptions and answers need minimal intervention from human volunteers and can serve BLV users directly.
The proposed research is innovative because it focuses on videos, whereas existing AI-driven efforts to address
this problem have focused primarily on static photos or images. It is also one of only a few efforts to directly
partner with BLV individuals to develop AI-driven systems that produce visual descriptions or answer visual
questions. The proposed research is significant because it will result in open-source, AI-driven tools that will give
BLV individuals unprecedented control over their ability to independently navigate the information-rich world of
online videos, thus improving their health and wellbeing.
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Pooyan Fazli
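Similar NSFC (National Natural Science Foundation of China) Grants
Research on AI-Driven Marketing Models and Consumer Behavior
- Grant number: 72332006
- Approval year: 2023
- Funding amount: 1.65 million yuan
- Project category: Key Program
Identification and Analysis of Cotton Phenotypic Information Based on "AI Algorithms + High-Precision Remote Sensing Data"
- Grant number: 32360436
- Approval year: 2023
- Funding amount: 320,000 yuan
- Project category: Regional Science Fund Project
Development of AI Models for Risk Prediction and Source Tracing of Staphylococcus aureus and Enterotoxin A in Pasteurized Milk
- Grant number: 32302241
- Approval year: 2023
- Funding amount: 300,000 yuan
- Project category: Young Scientists Fund
Mechanisms Influencing Employee AI Identification and the Underlying Mechanisms of Employee Proactive Behavior in AI-Enabled Workplaces at Manufacturing Firms
- Grant number: 72362025
- Approval year: 2023
- Funding amount: 270,000 yuan
- Project category: Regional Science Fund Project
Molecular Design of Solvents for Extractive Distillation Based on Atomic Contributions and Artificial Intelligence
- Grant number: 22308037
- Approval year: 2023
- Funding amount: 300,000 yuan
- Project category: Young Scientists Fund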
Similar Overseas Grants
GENOMICE (Game Exploring Nuances in Offspring to Master Interactions of Chromosome Expression)
- Grant number: 10760456
- Fiscal year: 2023
- Funding amount: $649,900
- Project category:
Computer Vision for Malaria Microscopy: Automated Detection and Classification of Plasmodium for Basic Science and Pre-Clinical Applications
- Grant number: 10576701
- Fiscal year: 2023
- Funding amount: $649,900
- Project category:
IEEE International Symposium on Biomedical Imaging (ISBI) 2020
- Grant number: 9914410
- Fiscal year: 2020
- Funding amount: $649,900
- Project category:
Efficient software and algorithms for analyzing markers data on general pedigree
- Grant number: 8115481
- Fiscal year: 2010
- Funding amount: $649,900
- Project category:
Efficient software and algorithms for analyzing markers data on general pedigree
- Grant number: 7495734
- Fiscal year: 2007
- Funding amount: $649,900
- Project category: