Collaborative Research: CIBR: Leaping the Specimen Digitization Gap: Connecting Novel Tools, Machine Learning and Public Participation to Label Digitization Efforts
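Basic information
- Award number: 2027234
- Principal investigator:
- Amount: $292,400
- Host institution:
- Host institution country: United States
- Award type: Standard Grant
- Fiscal year: 2021
- Funding country: United States
- Period: 2021-01-15 to 2024-12-31
- Status: Completed
- Source:
- Keywords:
Project abstract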
National efforts to digitize natural history collections have transformed previously siloed, unstandardized resources into a networked, openly available information nexus that can be used to meet grand scientific and societal challenges. Despite these enormous strides, major bottlenecks in the digitization process still exist, especially in areas where automation has been most challenging. In particular, capturing analog specimen data in digital format and converting text descriptions of collecting locations into mappable geocoordinates have remained boutique efforts. Because of these bottlenecks, as many as 91% of digitized specimens are missing key data elements, which hampers the ability to use these specimen records effectively. This project will develop key workflows to dramatically increase the speed at which specimen data can be captured and made broadly available to data providers and consumers. These workflows include novel approaches that use both computer and human intelligence to advance our ability to capture specimen information. One key workflow focuses on the challenge of automatically converting imaged specimen labels into properly formatted and usable digital text. Critical to the success of this workflow are human validation checkpoints that will be implemented using a popular citizen science platform, Notes from Nature. A second workflow focuses on new tools that take advantage of previous efforts to assign mappable coordinates based on specimen collection location, so that such mapping information can be added automatically for specimens missing those data. Finally, this effort will create tools for easy movement of these new data in and out of commonly used databases, making the data immediately available to museum providers and researchers alike. This effort will connect public participation in science to these novel tools and technologies. Further, it will train diverse graduate and undergraduate students in bioinformatics and museum science.

This effort has three design goals that together will dramatically reduce the digitization gap in museum specimen data. The first design goal will combine machine learning methods with public participation in scientific research (PPSR) via the successful Notes from Nature (NfN) project to speed up label digitization and facilitate obtaining locality data. A key part of the first design goal utilizes supervised machine learning approaches and optical character recognition (OCR) when possible, but also includes "humans in the loop" using the NfN platform to gather fast quality feedback from human volunteers at key points. This approach also provides a means to create the high-quality training datasets needed for improving automation steps, ultimately further reducing human effort. The second design goal will integrate locality data interpretation through GEOLocate with a Biodiversity Enhanced Locality Service (BELS), which will make it possible to look up pre-existing localities that have been georeferenced using best practices. The third design goal is to connect these workflows and services to Symbiota, a community digitization hub, to allow easy inflow and outflow of content back to digitization networks. Providers will be able to easily access new data along with associated metadata about processing steps, all returned using established standards and best practices. The key to this effort will be engagement with the community, including researchers, collections staff, and Zooniverse volunteers. Engagement will focus on virtual training and working with an advisory committee in order to grow capacity and community involvement.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
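The second design goal in the abstract, looking up pre-existing localities that have already been georeferenced, can be illustrated with a minimal sketch. This is not the BELS or GEOLocate implementation, which the abstract does not describe. It assumes a simple exact match on a normalized locality string. The function names, the record keys (`locality`, `lat`, `lon`, `georeference_source`), and the sample coordinates are all hypothetical and chosen only for illustration.

```python
import re
import unicodedata


def normalize_locality(text: str) -> str:
    """Reduce a free-text locality to a comparison key (hypothetical scheme)."""
    # Decompose accented characters and drop the combining marks.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # Lowercase, then turn every run of non-alphanumerics into one space.
    text = re.sub(r"[^a-z0-9]+", " ", text.lower())
    return text.strip()


def build_index(georeferenced):
    """Map normalized locality strings to (lat, lon) from georeferenced records."""
    index = {}
    for rec in georeferenced:
        key = normalize_locality(rec["locality"])
        # Keep the first coordinates seen for a key; a real service would
        # need a rule for choosing among competing georeferences.
        index.setdefault(key, (rec["lat"], rec["lon"]))
    return index


def fill_missing(records, index):
    """Return copies of records, adding coordinates where a match exists."""
    filled = []
    for rec in records:
        rec = dict(rec)  # do not mutate the caller's data
        if rec.get("lat") is None or rec.get("lon") is None:
            match = index.get(normalize_locality(rec["locality"]))
            if match is not None:
                rec["lat"], rec["lon"] = match
                rec["georeference_source"] = "matched existing locality"
        filled.append(rec)
    return filled


# Made-up sample data, for illustration only.
georeferenced = [
    {"locality": "5 mi. N of Gainesville, Alachua Co.", "lat": 29.72, "lon": -82.33},
]
pending = [
    {"locality": "5 mi N of Gainesville; Alachua Co", "lat": None, "lon": None},
    {"locality": "Near the old bridge", "lat": None, "lon": None},
]
result = fill_missing(pending, build_index(georeferenced))
```

In this sketch the first pending record differs from the georeferenced one only in punctuation, so it receives the stored coordinates, while the second record has no match and stays without coordinates. Exact matching on a normalized key is deliberately conservative: it avoids assigning coordinates to localities that merely look similar, at the cost of missing near matches that a fuzzy comparison might catch.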
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Robert Guralnick
The automorphism groups of a family of maximal curves
- DOI: 10.1016/j.jalgebra.2012.03.036
- Published: 2012-07-01
- Journal:
- Impact factor:
- Authors: Robert Guralnick; Beth Malmskog; Rachel Pries
- Corresponding author: Rachel Pries
On rational and concise words
- DOI: 10.1016/j.jalgebra.2015.02.003
- Published: 2015-05-01
- Journal:
- Impact factor:
- Authors: Robert Guralnick; Pavel Shumyatsky
- Corresponding author: Pavel Shumyatsky
Primitive monodromy groups of genus at most two
- DOI: 10.1016/j.jalgebra.2014.06.020
- Published: 2014-11-01
- Journal:
- Impact factor:
- Authors: Daniel Frohardt; Robert Guralnick; Kay Magaard
- Corresponding author: Kay Magaard
Other grants by Robert Guralnick
IntBIO Collaborative Research: Assessing drivers of the nitrogen-fixing symbiosis at continental scales
- Award number: 2316267
- Fiscal year: 2023
- Amount: $292,400
- Award type: Standard Grant
Collaborative Research: Ranges: Building Capacity to Extend Mammal Specimens from Western North America
- Award number: 2228392
- Fiscal year: 2023
- Amount: $292,400
- Award type: Continuing Grant
Collaborative Research: Phenobase: Community, infrastructure, and data for global-scale analyses of plant phenology
- Award number: 2223512
- Fiscal year: 2022
- Amount: $292,400
- Award type: Continuing Grant
Collaborative Research: LightningBug, An Integrated Pipeline to Overcome The Biodiversity Digitization Gap
- Award number: 2104152
- Fiscal year: 2021
- Amount: $292,400
- Award type: Continuing Grant
Collaborative Research: Origins and drivers of extinction of Caribbean Avifauna
- Award number: 2033905
- Fiscal year: 2021
- Amount: $292,400
- Award type: Continuing Grant
Collaborative Research: Genealogy of Odonata (GEODE): Dispersal and color as drivers of 300 million years of global dragonfly evolution
- Award number: 2002457
- Fiscal year: 2020
- Amount: $292,400
- Award type: Continuing Grant
IIBR RoL: Collaborative Research: A Rules Of Life Engine (RoLE) Model to Uncover Fundamental Processes Governing Biodiversity
- Award number: 1927286
- Fiscal year: 2019
- Amount: $292,400
- Award type: Standard Grant
Cohomology and Representations of Finite and Algebraic Groups with Applications
- Award number: 1901595
- Fiscal year: 2019
- Amount: $292,400
- Award type: Continuing Grant
Collaborative Research: ABI Innovation: FuTRES, an Ontology-Based Functional Trait Resource for Paleo- and Neo-biologists
- Award number: 1759898
- Fiscal year: 2018
- Amount: $292,400
- Award type: Standard Grant
Cohomology, Representations, and Coverings of Curves
- Award number: 1600056
- Fiscal year: 2016
- Amount: $292,400
- Award type: Continuing Grant
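Similar grants from the National Natural Science Foundation of China (NSFC)
Electrodeposited magnesium-hydrogen coatings on titanium-based bone implant surfaces and their osteogenesis-promoting performance
- Award number: 52371195
- Approval year: 2023
- Amount: CNY 500,000
- Award type: General Program
Pathogenic mechanism of the CLMP-mediated Connexin45-β-catenin complex in congenital short bowel syndrome
- Award number: 82370525
- Approval year: 2023
- Amount: CNY 490,000
- Award type: General Program
Key technologies for highly sensitive sensing with spoof localized surface plasmons and for miniaturizing the sensing system
- Award number: 62371132
- Approval year: 2023
- Amount: CNY 490,000
- Award type: General Program
Mechanisms by which preferential flow affects the hydrothermal stability of permafrost along the China-Russia crude oil pipeline
- Award number: 42301138
- Approval year: 2023
- Amount: CNY 300,000
- Award type: Young Scientists Fund
Bidirectional regulation of the interfacial layer and electrolyte for stabilizing zinc anodes
- Award number: 52302289
- Approval year: 2023
- Amount: CNY 300,000
- Award type: Young Scientists Fund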
Similar overseas grants
Collaborative Research: CIBR: Leaping the Specimen Digitization Gap: Connecting Novel Tools, Machine Learning and Public Participation to Label Digitization Efforts
- Award number: 2027241
- Fiscal year: 2021
- Amount: $292,400
- Award type: Standard Grant
Collaborative Research: CIBR: Incorporating Crystallography and Cryo-EM Tools in Foldit
- Award number: 2051305
- Fiscal year: 2021
- Amount: $292,400
- Award type: Standard Grant
Collaborative Research: CIBR: Incorporating Crystallography and Cryo-EM tools into Foldit
- Award number: 2051282
- Fiscal year: 2021
- Amount: $292,400
- Award type: Standard Grant
Collaborative Research: CIBR: The OpenBehavior Project
- Award number: 1948181
- Fiscal year: 2021
- Amount: $292,400
- Award type: Continuing Grant
Collaborative Research: CIBR: Leaping the Specimen Digitization Gap: Connecting Novel Tools, Machine Learning and Public Participation to Label Digitization Efforts
- Award number: 2027228
- Fiscal year: 2021
- Amount: $292,400
- Award type: Standard Grant