Structural Biology Information Resources

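Basic information

  • Grant number:
    8149595
  • Principal investigator:
  • Amount:
    $6.6596 million
  • Host institution:
  • Host institution country:
    United States
  • Project type:
  • Fiscal year:
  • Funding country:
    United States
  • Project period:
  • Project status:
    Ongoing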

Project abstract

Protein three-dimensional structures are drawn from the Protein Data Bank (PDB), an international database collaboration supported in part by the NIH. PDB records are processed at NCBI to provide Molecular Modeling Database (MMDB) records with precise definitions of the component biological macromolecules and chemicals, and of their interactions as indicated by atomic contacts in the three-dimensional structure. Protein structure records are compared to NCBI protein sequence records using the Basic Local Alignment Search Tool algorithm (BLAST) and compared to one another by the Vector Alignment Search Tool structure-comparison algorithm (VAST). Protein sequences in the NCBI collection are also compared to protein family records in the Conserved Domain Database (CDD) using the Reverse Position-Specific BLAST algorithm (RPSB). These automated comparison methods provide the cross-references needed to link protein and gene sequences in NCBI's extensive collection to the biological function annotation provided by protein structures.
Informatics projects were needed this year to address ongoing "remediation" undertaken by PDB, which has more than once modified 100% of the over 65,000 files in its collection. "Remediation" of the bonded-atom connectivity (chemical graphs) of component molecules has fortunately become less frequent. In addition, potentially useful information is now provided, such as "SPLIT" records indicating when a given crystal structure has been divided into two or more PDB files, and "Biological Assembly" information indicating a subset of biologically important interactions. Unfortunately, the additional information is encoded in "REMARK" text without documented formats, requiring considerable effort to develop reliable text-parsing algorithms. Nevertheless, correction of PDB "SPLIT" files and display of "Biological Assembly" information on MMDB "Document Summary" and "Structure Summary" pages have progressed, and public release is expected before the end of this year.
Research to identify molecular interactions not included in the PDB "Biological Assembly" has continued. This includes identification of contact thresholds for biologically relevant protein-ligand complexes, such as heme in hemoglobin, and of interactions observed among related protein structures but not mentioned in the PDB files of those structures. The Inferred Biomolecular Interactions Server (IBIS) http://www.ncbi.nlm.nih.gov/Structure/ibis/ibis.cgi has been greatly improved, providing a research tool for structural biologists to explore the variety of intermolecular interactions observed among related protein three-dimensional structures.
The NCBI Conserved Domain Database (CDD) is in part derived from comprehensive protein sequence alignment collections prepared automatically by others. These include, for example, the Pfam collection prepared at the Wellcome Trust Sanger Institute and the Protein Clusters collection prepared at NCBI/IEB. More important contributions to CDD are the expert-curated protein family alignments prepared by staff of the CDD project. Very accurate protein family alignments, consistent with known three-dimensional structures and structural superpositions, are prepared using algorithms within the "See in Three Dimensions" program (Cn3D). Conserved subfamilies consistent with evolutionary evidence are derived using phylogenetic tree algorithms and graphics within the "Conserved Domain Tree" program (CDTree). Curators save into CDD records the phylogenetic trees identifying ancient conserved subfamilies, together with the biological function annotation derived from interactions observed in three-dimensional structures within each subfamily and/or from other observations, such as subfamily-specific experimental studies reported in the literature.
CDD informatics projects have continued this year. An algorithm that automates refinement of CDD subfamily alignments by rapidly performing multiple realignments of member sequences has continued to prove useful in reducing curator effort. A batch CD-search service supporting easy retrieval of CDD alignments and functional annotations for large groups of sequences has now been released and is in wide use. A research project on automated identification of ancient conserved multi-domain architectures has also continued and appears successful. The goal is to support efficient construction of ancient conserved multi-domain CDD records based on previously curated alignments of component domains. This will allow curators to provide multi-domain-specific functional annotation without having to edit already-accurate alignments. Another research project continued this year aims to automate identification of ancient conserved subfamilies where accurate biological function annotation is likely, given known three-dimensional structures and/or literature citations for subfamily member sequences. If successful, this procedure will reduce the curator time required to browse the often-large phylogenetic trees produced by CDTree, supporting efficient identification of subfamilies where functional annotation is most feasible and worthwhile.
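The abstract refers to interactions "indicated by atomic contacts" and to contact thresholds for protein-ligand complexes. As a rough illustration only, the idea of a distance-threshold contact test can be sketched in Python. This is not the MMDB or IBIS method: the `Atom` type, the `contacts` function, the 4.0 angstrom default cutoff, and the coordinates are all assumptions made up for the example.

```python
import math
from typing import NamedTuple


class Atom(NamedTuple):
    molecule: str  # hypothetical label for the parent molecule, e.g. "protein" or "heme"
    name: str      # atom name
    x: float
    y: float
    z: float


def contacts(atoms, cutoff=4.0):
    """Return (name, name, distance) for atom pairs from different molecules
    whose distance is at most `cutoff` angstroms.

    The default cutoff is an illustrative assumption, not a value from the source.
    """
    pairs = []
    for i, a in enumerate(atoms):
        for b in atoms[i + 1:]:
            if a.molecule == b.molecule:
                continue  # only intermolecular contacts are of interest here
            d = math.dist((a.x, a.y, a.z), (b.x, b.y, b.z))
            if d <= cutoff:
                pairs.append((a.name, b.name, round(d, 2)))
    return pairs


# Made-up coordinates, purely for demonstration.
atoms = [
    Atom("protein", "NE2", 0.0, 0.0, 0.0),
    Atom("heme", "FE", 2.1, 0.0, 0.0),
    Atom("heme", "C1", 6.0, 0.0, 0.0),
]
print(contacts(atoms))
```

Raising or lowering `cutoff` changes which pairs count as contacts, which is why choosing a threshold for biologically relevant complexes is a research question in its own right.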

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by Stephen H. Bryant

Chemical Biology Information Resources
  • Grant number:
    8344963
  • Fiscal year:
  • Amount:
    $6.6596 million
  • Project type:
Chemical Biology Information Resources
  • Grant number:
    7969241
  • Fiscal year:
  • Amount:
    $6.6596 million
  • Project type:
Bioinformatics Methods for Mass Spectra Analysis
  • Grant number:
    6988545
  • Fiscal year:
  • Amount:
    $6.6596 million
  • Project type:
Chemical Biology Information Resources
  • Grant number:
    8558120
  • Fiscal year:
  • Amount:
    $6.6596 million
  • Project type:
Bioinformatics Methods for Mass Spectra Analysis
  • Grant number:
    7148179
  • Fiscal year:
  • Amount:
    $6.6596 million
  • Project type:
Structural Biology Information Resources
  • Grant number:
    8943216
  • Fiscal year:
  • Amount:
    $6.6596 million
  • Project type:
Structural Biology Information Resources
  • Grant number:
    7969205
  • Fiscal year:
  • Amount:
    $6.6596 million
  • Project type:
Chemical Biology Information Resources
  • Grant number:
    8943242
  • Fiscal year:
  • Amount:
    $6.6596 million
  • Project type:
Structural Biology Information Resources
  • Grant number:
    8344940
  • Fiscal year:
  • Amount:
    $6.6596 million
  • Project type:
PubChem: An Information Resource for Chemical Structure
  • Grant number:
    7316287
  • Fiscal year:
  • Amount:
    $6.6596 million
  • Project type:

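Similar National Natural Science Foundation of China grants

Degradation mechanisms of bound antibiotics during aquatic product processing and toxicity analysis of the resulting products
  • Grant number:
    32302247
  • Year approved:
    2023
  • Amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund
Identification of the binding site between ABHD6 and AMPA receptors and its role in regulating AMPA receptor trafficking and function
  • Grant number:
    32300794
  • Year approved:
    2023
  • Amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund
Mechanism by which α-synuclein maintains its oligomeric form through interaction with the fatty acid-binding protein FABP3
  • Grant number:
    82301632
  • Year approved:
    2023
  • Amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund
Construction of peptide fluorescent probes based on fluorescence resonance energy transfer to visualize how Zn2+ binding to SQSTM1/p62 regulates autophagy in castration resistance of prostate cancer
  • Grant number:
    82303568
  • Year approved:
    2023
  • Amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund
Stereoselective cationic polymerization of vinyl ethers via anion-binding catalysis with chiral hydrogen-bond donors
  • Grant number:
    22301279
  • Year approved:
    2023
  • Amount:
    CNY 300,000
  • Project type:
    Young Scientists Fund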

Similar overseas grants

Develop new bioinformatics infrastructures and computational tools for epitranscriptomics data
  • Grant number:
    10633591
  • Fiscal year:
    2023
  • Amount:
    $6.6596 million
  • Project type:
PAGE-G: Precision Approach combining Genes and Environment in Glaucoma
  • Grant number:
    10797646
  • Fiscal year:
    2023
  • Amount:
    $6.6596 million
  • Project type:
AI-powered cross-level cross-species omics data integration to elucidate mechanisms of EL
  • Grant number:
    10729946
  • Fiscal year:
    2023
  • Amount:
    $6.6596 million
  • Project type:
iAGREE: A Multi-Center, Networked Patient Consent Study
  • Grant number:
    10748211
  • Fiscal year:
    2023
  • Amount:
    $6.6596 million
  • Project type:
Immunogenomics and Systems Biology Core
  • Grant number:
    10452138
  • Fiscal year:
    2022
  • Amount:
    $6.6596 million
  • Project type: