Informatics, Machine Learning & Biomedical Data Science

Basic Information

Project Abstract

Over the past year, we have been active in: (1) developing computationally efficient methods and algorithms to solve known problems in the analysis of biomedical and clinical data and to study complex interactions in biological systems; (2) developing knowledge-based data management systems for the discovery and curation of biomedical knowledge, including distributed annotation systems and clinical information management systems; (3) applying predictive-analytic models to scientific and administrative domains; and (4) consulting with NIH leadership to provide evidence-based solutions to improve the grant application and review process. Specifically, in 2017, collaborative efforts in support of these goals included the following:

- In partnership with Dr. John Tsang of the NIAID Laboratory of Systems Biology, HPCIO is conducting a multifaceted project to profile the immune system using the latest high-throughput, multiplexed technologies and systems approaches. One goal of this collaboration is to develop novel computational methodologies that exploit inter-subject heterogeneity and measurements at various scales to assess the roles of the immune system in health and disease. We have collected samples from a large cohort of patients with immune-mediated monogenic diseases and are in the process of deeply phenotyping blood samples from these patients. By studying the immune system across multiple monogenic, immune-mediated diseases, we will have the opportunity to infer cellular and molecular networks of the human immune system. HPCIO is actively involved in developing a database to record clinical information from patient visits and in the bioinformatics analyses of data generated by the project.
- HPCIO is working with the NCI Occupational & Environmental Epidemiology Branch to develop methodologies for incorporating occupational risk factors into epidemiological models. We are enlarging the training data to improve our novel classifiers for coding free-text job descriptions into the 840 codes of the 2010 U.S. Standard Occupational Classification (SOC) System. Agreement between our classification system and expert coders is measured using SOC code agreement and exposure prediction from CANJEM, a job-exposure matrix of over 250 exposure agents developed by Jerome Lavoue at the University of Montreal. We are also working with NCI to develop a two-stage mixed generalized linear model to predict lifetime occupational exposure to lead.
- In collaboration with the Membrane Transport Biophysics Section of NINDS, HPCIO is (1) developing a computational tool to accurately identify the boundaries of lysosomes in fluorescence microscopy and (2) using the fluorescence ratio to measure lysosomal pH within each organelle, for a better understanding of lysosomal pH regulation.
- A freely available plasmid database that is interoperable with popular freeware is being developed for the NIDA Optogenetics and Transgenic Technology Core. The Plasmid Manager offers a versatile yet simple platform for scientists to store and analyze their plasmid data. Motivated by the need for a more comprehensive approach to archiving plasmid data, the platform includes numerous components beyond the repository itself and serves as an informatics platform designed to enhance the efficiency and analytic capabilities of scientists.
- High-throughput next-generation sequencing (NGS) technology plays an important role in systematically identifying novel cancer driver mutations in genome-wide surveys, and NGS data generation is increasing rapidly: the Lymphoid Malignancies Section of NCI currently accumulates several terabytes of data every month. In collaboration with the Louis Staudt Laboratory, a bioinformatics website is being developed that contains tools for analyzing the laboratory's Diffuse Large B-Cell Lymphoma data. This website enables users with very little computer expertise to run their own analyses rather than having a specialist run the analyses for them. Parallelization and text-searching methodologies have also been incorporated so that analysis results are returned much more quickly and efficiently than before. In 2017, a new dimension of this collaboration was the development of machine learning methods to distinguish somatic from germline mutations in NGS data. Machine learning models have also been tested for identifying subtypes of diffuse large B-cell lymphoma based on features of their gene aberrations.
- In collaboration with NIA and NCI, we are applying machine learning and visualization techniques to large biological datasets to discover novel patterns of functional gene or protein interactions related to aging. In this collaboration, we are developing a machine learning method that models the temporal nature of longitudinal clinical data to predict the progression of amyotrophic lateral sclerosis. Such a method may also work well for prediction on high-dimensional time-series genomic data.
- The Human Salivary Proteome Wiki is a community-driven Web portal developed by HPCIO, in collaboration with NIDCR, to enable scientists to add their own research data, share results, and discover new knowledge. Many features and external contents have been incorporated over the last few years to make it easier for users to extract different kinds of information from the wiki. One of the latest enhancements is the integration of RNA-seq transcriptional data and protein immunohistochemistry data from the Human Protein Atlas. This lets users weigh evidence generated by different, independent modalities, in addition to the original mass-spectrometry-based data, when assessing the status of a protein.
- In collaboration with CSR, HPCIO is applying text analytics to provide CSR leadership with evidence-based decision support in evaluating the grant review process. A Web-based automated referral tool, called ART, was developed and deployed to help PIs and SROs identify the most relevant study section(s) or special emphasis panel(s) based on the scientific content of an application. In addition, HPCIO is analyzing text from quick feedback surveys on peer review. HPCIO has developed a system that captures the sentiment of reviewer comments in these surveys and classifies the comments, with their sentiment scores, into broad categories. Progress has been made in identifying the needs and suggestions offered by reviewers and in assigning topic labels to them. In 2017, HPCIO began to explore appropriate topological network mapping diagrams of CSR study sections, superimposed with measures of scientific productivity for those study sections.
- In collaboration with the Office of Data Analysis Tools and Systems, NIH Office of Extramural Research, HPCIO has been developing a standard database update pipeline for NIH Topic Maps, originally developed by Dr. Ned Talley of NINDS. This effort was concluded in 2017.
- In collaboration with NIAID, HPCIO has supported its release of HT JoinSolver(R), a new application capable of analyzing V(D)J recombination in thousands of immunoglobulin gene sequences produced by high-throughput sequencing.
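
As an illustration of how SOC code agreement between an automated classifier and expert coders might be quantified, the following minimal Python sketch computes the fraction of jobs on which two codings match at each level of the SOC hierarchy. It is not HPCIO's implementation: the function names, the level labels, and the sample codes are assumptions made for this example. It relies only on the fact that SOC 2010 codes have the form XX-XXXX, with shorter digit prefixes denoting broader groups.

```python
from typing import Dict, Sequence

# Number of leading digits that identify each level of a SOC 2010 code.
# The level labels are illustrative names chosen for this sketch.
SOC_LEVELS = {
    "major_group": 2,
    "minor_group": 3,
    "broad_occupation": 5,
    "detailed_occupation": 6,
}


def soc_digits(code: str) -> str:
    """Return the six digits of a SOC code such as '15-1131'."""
    digits = code.replace("-", "").strip()
    if len(digits) != 6 or not digits.isdigit():
        raise ValueError(f"not a SOC code: {code!r}")
    return digits


def soc_agreement(predicted: Sequence[str], expert: Sequence[str]) -> Dict[str, float]:
    """Fraction of jobs on which two codings share the same prefix, per level."""
    if len(predicted) != len(expert) or not predicted:
        raise ValueError("need two non-empty codings of equal length")
    pairs = [(soc_digits(p), soc_digits(e)) for p, e in zip(predicted, expert)]
    return {
        level: sum(p[:n] == e[:n] for p, e in pairs) / len(pairs)
        for level, n in SOC_LEVELS.items()
    }


if __name__ == "__main__":
    # Hypothetical classifier output and expert codes for four job descriptions.
    classifier_codes = ["15-1131", "29-1141", "47-2031", "11-9021"]
    expert_codes = ["15-1132", "29-1141", "47-2111", "13-1051"]
    for level, score in soc_agreement(classifier_codes, expert_codes).items():
        print(f"{level}: {score:.2f}")
```

With the four hypothetical job pairs shown, agreement is 0.75 at the major-group level and 0.25 at the detailed level. Reporting several levels in this way shows whether disagreements are near misses within a group or assignments to unrelated occupations.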

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by Calvin A Johnson

Text Analytics, Knowledge Engineering, & High Performance Computing
  • Grant number:
    8565613
  • Fiscal year:
  • Funding amount:
    $2.6198 million
  • Project category:
High Performance Biomedical Computing And Informatics
  • Grant number:
    6988044
  • Fiscal year:
  • Funding amount:
    $2.6198 million
  • Project category:
Collective Intelligence, Knowledge Infrastructure, & High Performance Computing
  • Grant number:
    7970400
  • Fiscal year:
  • Funding amount:
    $2.6198 million
  • Project category:
High Performance Biomedical Computing And Informatics
  • Grant number:
    7593224
  • Fiscal year:
  • Funding amount:
    $2.6198 million
  • Project category:
Informatics, Machine Learning & Biomedical Data Science
  • Grant number:
    9146134
  • Fiscal year:
  • Funding amount:
    $2.6198 million
  • Project category:
High Performance Biomedical Computing And Informatics
  • Grant number:
    7145121
  • Fiscal year:
  • Funding amount:
    $2.6198 million
  • Project category:
High Performance Biomedical Computing And Informatics
  • Grant number:
    7296863
  • Fiscal year:
  • Funding amount:
    $2.6198 million
  • Project category:
Text Analytics, Machine Learning & High Performance Computing
  • Grant number:
    8746900
  • Fiscal year:
  • Funding amount:
    $2.6198 million
  • Project category:
High Performance Biomedical Computing And Informatics
  • Grant number:
    7733758
  • Fiscal year:
  • Funding amount:
    $2.6198 million
  • Project category:

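Similar National Natural Science Foundation of China (NSFC) grants

Research on routing protocols and resource scheduling algorithms for underwater multimodal wireless sensor networks
  • Grant number:
  • Approval year:
    2021
  • Funding amount:
    CNY 300,000
  • Project category:
    Young Scientists Fund
Network protocol optimization algorithms for distributed machine learning
  • Grant number:
  • Approval year:
    2021
  • Funding amount:
    CNY 590,000
  • Project category:
    General Program
Research on path optimization algorithms for multi-protocol, multi-domain heterogeneous networks based on distributed pushdown automata theory
  • Grant number:
  • Approval year:
    2021
  • Funding amount:
    CNY 580,000
  • Project category:
    General Program
Research on dynamic coverage optimization algorithms and routing protocols for wireless sensor networks in the mountainous terrain of southern Jiangxi
  • Grant number:
    62062037
  • Approval year:
    2020
  • Funding amount:
    CNY 350,000
  • Project category:
    Regional Science Fund Project
Research on neural-network-based multivariate quantum cryptographic algorithms and protocols
  • Grant number:
  • Approval year:
    2019
  • Funding amount:
    CNY 600,000
  • Project category:
    General Program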

Similar overseas grants

Administrative Core
  • Grant number:
    10555724
  • Fiscal year:
    2023
  • Funding amount:
    $2.6198 million
  • Project category:
Mendelian imputation for family-based GWAS and association-by-proxy in diverse ancestries
  • Grant number:
    10717993
  • Fiscal year:
    2023
  • Funding amount:
    $2.6198 million
  • Project category:
Early diagnosis of light chain amyloidosis
  • Grant number:
    10562721
  • Fiscal year:
    2023
  • Funding amount:
    $2.6198 million
  • Project category:
Multi-modal Tracking of In Vivo Skeletal Structures and Implants
  • Grant number:
    10367144
  • Fiscal year:
    2022
  • Funding amount:
    $2.6198 million
  • Project category:
Enhancing the Efficiency of Pragmatic Clinical Trials Using Administrative Data: Analysis of the STRIDE Study
  • Grant number:
    10365009
  • Fiscal year:
    2022
  • Funding amount:
    $2.6198 million
  • Project category: