Embracing new technologies to streamline, improve and sustain InterPro and its contributing databases
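Basic information
- Grant number: BB/F010435/1
- Principal investigator:
- Amount: $391,600
- Host institution:
- Host institution country: United Kingdom
- Project type: Research Grant
- Fiscal year: 2008
- Funding country: United Kingdom
- Period: 2008 to (no data)
- Project status: Completed
- Source:
- Keywords:
Project abstract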
New DNA sequencing technologies have led to a flood of new data in sequence databases being submitted by individual scientists, genome sequencing projects and metagenomics projects. These sequences enter the databases with little or no annotation, limiting their usefulness to the scientific community. This has inspired the development of new tools for automatic annotation of the encoded protein sequences. One of the most successful developments in this area has been in the production of so-called protein 'signatures', diagnostic methods that are able to characterise newly-determined sequences in terms of the protein families to which they belong and/or the structural or functional domains they contain. Protein signature approaches have been adopted by a number of databases, and ten of the top such resources are integrated into the InterPro database. InterPro, and its accompanying protein analysis software tool, InterProScan, is now one of the leading protein functional classification resources in the world. However, despite its success, InterPro and its partners are currently suffering from a lack of financial support. The level of funding required to maintain and improve a database of this size is often underestimated. The amount of incoming data is increasing exponentially, and databases now struggle to provide their data to the public in a timely way, while at the same time maintaining the necessary high standards of data quality. Moreover, as they become more popular, and user demands increase, these core databases endure mounting pressure not only to keep up with the expanding volume of data and growing community requirements, but also to be early adopters of newly emerging technologies. This proposal aims to resolve these issues by embracing new technologies to enhance and further develop InterPro and its source databases. It aims to streamline production processes both to provide more regular data releases and to better cope with increased volumes of data. 
With more formalised Consortium activities and coordination thereof, we will make more efficient use of resources and share tasks to ensure long-term sustainability of the databases. Specifically, we aim to:
- Streamline data production procedures to enable a faster turn-around time for releasing the data;
- Develop and integrate new annotation tools and standards to make the rate-limiting annotation step quicker and easier, and share tasks, such as annotation, to remove redundancy in effort;
- Work closely together to improve quality-assurance procedures for protein matches;
- Coordinate the upgrade of InterProScan and other HMM-based databases to the latest HMMER version;
- Improve the InterProScan protein domain-finding software;
- Exploit new technologies for database linking and data exchange; and
- Extend the functionality of the Web interface to better meet the needs of the user community.
The planned improvements to InterProScan and the protein match procedures will improve the quality, as well as the speed, of protein functional classification; streamlining the production processes will enable the databases to get new protein domains and families out to the public as soon as they become available. New technologies will facilitate easier linking between different databases, and will provide the public with access to data from different sources. They will also open the door to more complex analyses, by providing improved programmatic access to the data. In addition, these new processes and technologies will allow InterPro and its member databases to cope with the ever-increasing flood of new data and make it accessible to the public in more regular releases. Ultimately, these improvements will make InterPro and its partners easier and more efficient to maintain, paving the way to a more sustainable future and increasing their benefit and usefulness to the scientific community.
Project outputs
Journal papers (7)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
InterPro: the integrative protein signature database.
- DOI: 10.1093/nar/gkn785
- Published: 2009-01
- Journal: Nucleic Acids Research
- Impact factor: 14.9
- Authors: Hunter S; Apweiler R; Attwood TK; Bairoch A; Bateman A; Binns D; Bork P; Das U; Daugherty L; Duquenne L; Finn RD; Gough J; Haft D; Hulo N; Kahn D; Kelly E; Laugraud A; Letunic I; Lonsdale D; Lopez R; Madera M; Maslen J; McAnulla C; McDowall J; Mistry J; Mitchell A; Mulder N; Natale D; Orengo C; Quinn AF; Selengut JD; Sigrist CJ; Thimma M; Thomas PD; Valentin F; Wilson D; Wu CH; Yeats C
- Corresponding author: Yeats C
Challenges in homology search: HMMER3 and convergent evolution of coiled-coil regions.
- DOI: 10.1093/nar/gkt263
- Published: 2013-07
- Journal: Nucleic Acids Research
- Impact factor: 14.9
- Authors: Mistry J; Finn RD; Eddy SR; Bateman A; Punta M
- Corresponding author: Punta M
InterPro in 2011: new developments in the family and domain prediction database.
- DOI: 10.1093/nar/gkr948
- Published: 2012-01
- Journal: Nucleic Acids Research
- Impact factor: 14.9
- Authors: Hunter S; Jones P; Mitchell A; Apweiler R; Attwood TK; Bateman A; Bernard T; Binns D; Bork P; Burge S; de Castro E; Coggill P; Corbett M; Das U; Daugherty L; Duquenne L; Finn RD; Fraser M; Gough J; Haft D; Hulo N; Kahn D; Kelly E; Letunic I; Lonsdale D; Lopez R; Madera M; Maslen J; McAnulla C; McDowall J; McMenamin C; Mi H; Mutowo-Muellenet P; Mulder N; Natale D; Orengo C; Pesseat S; Punta M; Quinn AF; Rivoire C; Sangrador-Vegas A; Selengut JD; Sigrist CJ; Scheremetjew M; Tate J; Thimmajanarthanan M; Thomas PD; Wu CH; Yeats C; Yong SY
- Corresponding author: Yong SY
Other publications by Alex Bateman
Bioinformatics Advance Access published May 31, 2007
- DOI: 10.1007/s10015-009-0735-5
- Published: 2007
- Journal:
- Impact factor: 0.9
- Authors: Alex Bateman
- Corresponding author: Alex Bateman
Bioinformatics Applications Note Databases and Ontologies Codex: Exploration of Semantic Changes between Ontology Versions
- DOI:
- Published:
- Journal:
- Impact factor: 0
- Authors: Michael Hartung; Anika Groß; E. Rahm; Alex Bateman
- Corresponding author: Alex Bateman
Other grants held by Alex Bateman
Improving accuracy, coverage, and sustainability of functional protein annotation in InterPro, Pfam and FunFam using Deep Learning methods
- Grant number: BB/X018660/1
- Fiscal year: 2024
- Funding amount: $391,600
- Project type: Research Grant
UKRI/BBSRC-NSF/BIO: Unifying Pfam protein sequence and ECOD structural classifications with structure models
- Grant number: BB/X012492/1
- Fiscal year: 2023
- Funding amount: $391,600
- Project type: Research Grant
Exploiting data driven computational approaches for understanding protein structure and function in InterPro and Pfam
- Grant number: BB/S020381/1
- Fiscal year: 2019
- Funding amount: $391,600
- Project type: Research Grant
Rfam: The community resource for RNA families
- Grant number: BB/S020462/1
- Fiscal year: 2019
- Funding amount: $391,600
- Project type: Research Grant
RNAcentral, the RNA sequence database
- Grant number: BB/N019199/1
- Fiscal year: 2017
- Funding amount: $391,600
- Project type: Research Grant
Rfam: Towards a sustainable resource for understanding the genomic functional ncRNA repertoire
- Grant number: BB/M011690/1
- Fiscal year: 2015
- Funding amount: $391,600
- Project type: Research Grant
Keeping pace with protein sequence annotation; consolidating and enhancing Pfam and InterPro's methodologies for functional prediction
- Grant number: BB/L024136/1
- Fiscal year: 2014
- Funding amount: $391,600
- Project type: Research Grant
The RNAcentral database of non-coding RNAs
- Grant number: BB/J019232/1
- Fiscal year: 2012
- Funding amount: $391,600
- Project type: Research Grant
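Similar grants from the National Natural Science Foundation of China (NSFC)
Molecular mechanisms of inter-individual differences in the response to inactivated COVID-19 vaccines, investigated with single-cell multi-omics
- Grant number: 32371000
- Approval year: 2023
- Funding amount: CNY 500,000
- Project type: General Program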
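Role and mechanism of SAV1, a novel driver gene of intrahepatic cholangiocarcinoma identified by high-throughput sequencing, in regulating immune escape and in tumour growth and metastasis
- Grant number: 82372985
- Approval year: 2023
- Funding amount: CNY 490,000
- Project type: General Program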
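Development and application of an in situ vaccine technology for shared tumour neoantigens based on RNA base editing
- Grant number: 82373455
- Approval year: 2023
- Funding amount: CNY 490,000
- Project type: General Program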
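New global algorithms for nonconvex quadratically constrained quadratic programming based on the SCO method and SDP/SOCP relaxation techniques, with applications
- Grant number: 12271485
- Approval year: 2022
- Funding amount: CNY 460,000
- Project type: General Program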
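Theory and key technologies for cooperative autonomous ship navigation and intelligent supervision under the new business models of smart shipping
- Grant number: 52231014
- Approval year: 2022
- Funding amount: CNY 2,690,000
- Project type: Key Program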
Similar overseas grants
New, easy to use, low-cost technologies based on DNA origami biosensing to achieve distributed screening for AMR and improved antibiotic prescribing
- Grant number: MR/Y034481/1
- Fiscal year: 2024
- Funding amount: $391,600
- Project type: Research Grant
Piecing together the Neutrino Mass Puzzle in Search of New Particles with Precision Oscillation Experiments and Quantum Technologies
- Grant number: ST/W003880/2
- Fiscal year: 2024
- Funding amount: $391,600
- Project type: Fellowship
NEW TECHNOLOGIES FOR AFRICAN SWINE FEVER VACCINES
- Grant number: 10091291
- Fiscal year: 2024
- Funding amount: $391,600
- Project type: EU-Funded
PHOtolysis Reaction Mechanisms by Emerging and New Technologies - PhoRMENT
- Grant number: 2885177
- Fiscal year: 2023
- Funding amount: $391,600
- Project type: Studentship
A novel peptide assay for hepcidin clinical monitoring
- Grant number: 10698746
- Fiscal year: 2023
- Funding amount: $391,600
- Project type: