Informatics, Machine Learning & Biomedical Data Science

Project Abstract

The Informatics, Machine Learning, and Biomedical Data Science program, which operates within the High Performance Computing and Informatics Office (HPCIO), Division of Computational Bioscience of CIT, is collaborating with NIH investigators to build a critical mass in text and numerical analytics. This effort is envisioned to encompass a number of related disciplines in biomedical research, including semantic interoperability, computational linguistics, text and data mining, natural language processing, machine learning, longitudinal analysis, and visualization. The program is intended to foster advances in critical domains at NIH, including biomedical and clinical informatics, translational research, genomics, proteomics, systems biology, "big data" analysis, and portfolio analysis. In 2015, collaborative efforts in support of these goals included the following:

- In collaboration with NIA, we are applying machine learning and visualization techniques to large biological datasets to discover novel patterns of functional gene or protein interactions related to aging. In this collaboration, we are also developing a machine learning method that models the temporal nature of longitudinal clinical data to predict the progression of amyotrophic lateral sclerosis. Such a method may also work well for prediction on high-dimensional time-series genomic data.
- In collaboration with NIAID, HPCIO has released HT JoinSolver(R), a new application capable of analyzing V(D)J recombination in thousands of immunoglobulin gene sequences produced by high-throughput sequencing.
- HPCIO is working with NCI to develop methodologies for incorporating occupational risk factors into epidemiological models. Novel classifiers are being developed to classify free-text job descriptions into the 840 codes of the 2010 U.S. Standard Occupational Classification (SOC) System. Agreement between our classification system and expert coders is measured using SOC code agreement and exposure agreement after applying CANJEM, a job-exposure matrix of over 250 exposure agents developed by Jérôme Lavoué at the University of Montreal.
- In collaboration with the Membrane Transport Biophysics Section of NINDS, HPCIO is (1) developing a tool to accurately identify the boundaries of lysosomes in fluorescence microscopy and (2) using the fluorescence ratio to measure lysosomal pH within each organelle, to better understand lysosomal pH regulation.
- HPCIO is collaborating with NIAID to study immune cell infiltration in various tissue samples from patients with metabolic diseases. Using systems-based approaches, we examine gene expression and genotyping data to understand the roles and interactions of different immune cells in response to metabolic disease signals, and their associations with intervention outcomes and other phenotypes.
- A freely available plasmid database that is interoperable with popular freeware is being developed for the NIDA Optogenetics and Transgenic Technology Core. The Plasmid Manager offers a versatile yet simple platform for scientists to store and analyze their plasmid data. Motivated by the need for a more comprehensive approach to archiving plasmid data, the platform includes numerous components beyond the repository itself and serves as an informatics platform designed to enhance the efficiency and analytic capabilities of scientists.
- In collaboration with CSR, HPCIO is applying text analytics to provide CSR leadership with evidence-based decision support in evaluating the grant review process. A Web-based automated referral tool, called ART, is being developed to help PIs and SROs identify the most relevant study section(s) or special emphasis panel(s) based on the scientific content of an application. In addition, HPCIO is analyzing text from quick feedback surveys on peer review. This effort includes a pilot study evaluating the feasibility of analyzing free text from peer reviewers on their perception of study section quality. If successful, the pilot results will be used as initial input for a full-scale implementation.
- The Human Salivary Protein wiki has been made available online on a community-based Web portal developed by HPCIO, in collaboration with NIDCR, to enable scientists to add their own research data, share results, and discover new knowledge. This is a major step towards the discovery and use of saliva biomarkers to diagnose oral and systemic diseases.
- In collaboration with the Office of Data Analysis Tools and Systems, NIH Office of the Director, HPCIO has been developing a standard database update pipeline for NIH Topic Maps, originally developed by Dr. Ned Talley of NINDS. We are evaluating whether this pipeline can be incorporated into a stable hosted instance.
- High-throughput next-generation sequencing (NGS) technology plays an important role in systematically identifying novel cancer driver mutations in genome-wide surveys, and NGS data are accumulating rapidly, currently at a rate of several terabytes per month at the Lymphoid Malignancies Section of NCI. Database platforms need to be enhanced in anticipation of even more growth in the near future. The recent emergence of Hadoop/NoSQL systems (e.g., HBase) has provided an alternative platform for querying large-scale genomic data. In addition, relational database providers have been enhancing their offerings to include products for explicitly distributing data across multiple nodes (e.g., Postgres XL). We have sought to integrate these technologies with current relational database systems (e.g., Postgres) to improve performance in a parallel or distributed manner. The goal of our effort has been to investigate the potential of these distributed platforms for storing and querying the large volumes of data that NCI accumulates, thereby augmenting its current analytical capabilities.
- Based on its experience in building novel models for classifying research grants and projects, HPCIO has collaborated with DPCPSI/OD and other ICs to develop the Portfolio Learning Tool, a comprehensive classification workflow system that will allow users to select from multiple classification algorithms, feature spaces, and training regimes to build and run their own classifiers. HPCIO has developed an augmented support vector machine (SVM) approach that expands the training set by sampling from a corpus of unknowns and runs a large ensemble over various samples of this augmented space. The results obtained from this classifier suggest that, when coupled with an effective annotation strategy, such a classifier can be quite effective at categorizing a research portfolio.
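
The abstract does not give implementation details for the augmented classifier, so the following is only a minimal sketch of one possible reading: each ensemble member treats a random sample of the unknowns as provisional negatives, and the members then vote. Everything here is an assumption made for illustration, including the function names, the toy two-dimensional data, and the ensemble size. To keep the sketch dependency-free, a nearest-centroid linear scorer stands in for the SVM base learner; it is not the method HPCIO used.

```python
import random


def centroid(points):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def fit_member(positives, negatives):
    """Nearest-centroid stand-in for one SVM; returns (w, b) of a linear score."""
    mu_p, mu_n = centroid(positives), centroid(negatives)
    w = [p - n for p, n in zip(mu_p, mu_n)]
    # score(x) = w.x + b is positive exactly when x is closer to mu_p than to mu_n.
    b = (sum(v * v for v in mu_n) - sum(v * v for v in mu_p)) / 2.0
    return w, b


def fit_ensemble(positives, unknowns, n_members=25, sample_size=5, seed=0):
    """Train each member on the positives plus a fresh random sample of unknowns."""
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        provisional_negatives = rng.sample(unknowns, sample_size)
        members.append(fit_member(positives, provisional_negatives))
    return members


def vote_fraction(members, x):
    """Fraction of ensemble members that score x as positive."""
    votes = sum(
        1 for w, b in members
        if sum(wi * xi for wi, xi in zip(w, x)) + b > 0
    )
    return votes / len(members)


# Hypothetical toy data: labeled in-category items, and an unlabeled corpus
# that is mostly out-of-category but hides one in-category item (the last row).
positives = [[2.0, 2.0], [2.5, 1.5], [1.5, 2.5], [3.0, 2.0], [2.0, 3.0]]
unknowns = [
    [-2.0, -2.0], [-2.5, -1.5], [-1.5, -2.5], [-3.0, -2.0],
    [-2.0, -3.0], [-2.5, -2.5], [-1.8, -2.2], [2.2, 2.1],
]

members = fit_ensemble(positives, unknowns)
print(vote_fraction(members, [2.5, 2.5]))
print(vote_fraction(members, [-2.5, -2.5]))
```

Because every member sees a different sample of the unknowns, a hidden in-category item that is mislabeled as negative in one sample affects only some members, and the vote smooths over it. In practice the feature vectors would come from grant text rather than toy coordinates.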

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by Calvin A. Johnson

Text Analytics, Knowledge Engineering, & High Performance Computing
  • Grant number:
    8565613
  • Fiscal year:
  • Funding amount:
    $1.9416 million
  • Project category:
High Performance Biomedical Computing And Informatics
  • Grant number:
    7593224
  • Fiscal year:
  • Funding amount:
    $1.9416 million
  • Project category:
Text Analytics, Machine Learning & High Performance Computing
  • Grant number:
    8746900
  • Fiscal year:
  • Funding amount:
    $1.9416 million
  • Project category:
Text Analytics, Machine Learning & Biomedical Data Science
  • Grant number:
    8941588
  • Fiscal year:
  • Funding amount:
    $1.9416 million
  • Project category:
High Performance Biomedical Computing And Informatics
  • Grant number:
    7145121
  • Fiscal year:
  • Funding amount:
    $1.9416 million
  • Project category:
High Performance Biomedical Computing And Informatics
  • Grant number:
    6988044
  • Fiscal year:
  • Funding amount:
    $1.9416 million
  • Project category:
High Performance Biomedical Computing And Informatics
  • Grant number:
    7733758
  • Fiscal year:
  • Funding amount:
    $1.9416 million
  • Project category:
High Performance Biomedical Computing And Informatics
  • Grant number:
    6988075
  • Fiscal year:
  • Funding amount:
    $1.9416 million
  • Project category:
Collective Intelligence, Knowledge Infrastructure, & High Performance Computing
  • Grant number:
    7970400
  • Fiscal year:
  • Funding amount:
    $1.9416 million
  • Project category:
High Performance Biomedical Computing And Informatics
  • Grant number:
    7296863
  • Fiscal year:
  • Funding amount:
    $1.9416 million
  • Project category:

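Similar National Natural Science Foundation of China (NSFC) grants

Multiscale study of the nonlinear aging creep behavior of CA mortar under temperature effects
  • Grant number:
    12302265
  • Approval year:
    2023
  • Funding amount:
    CNY 300,000
  • Project category:
    Young Scientists Fund
In-situ detection and accurate assessment of aging damage in laminated rubber isolation bearings based on the wave method
  • Grant number:
    52308322
  • Approval year:
    2023
  • Funding amount:
    CNY 300,000
  • Project category:
    Young Scientists Fund
Construction of micro/nano core-shell filler systems and their mechanisms of action on the flame retardancy, aging resistance, degradation, and recycling of polylactic acid
  • Grant number:
    52373051
  • Approval year:
    2023
  • Funding amount:
    CNY 500,000
  • Project category:
    General Program
Freeze-thaw aging characteristics and toxic effects of agricultural-film-derived microplastics in the black soils of Northeast China
  • Grant number:
    42377282
  • Approval year:
    2023
  • Funding amount:
    CNY 490,000
  • Project category:
    General Program
Mechanism and modeling of firebrand ignition of high-rise building exterior wall insulation materials after natural aging under environmental exposure
  • Grant number:
    52376132
  • Approval year:
    2023
  • Funding amount:
    CNY 500,000
  • Project category:
    General Program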

Similar overseas grants

The Proactive and Reactive Neuromechanics of Instability in Aging and Dementia with Lewy Bodies
  • Grant number:
    10749539
  • Fiscal year:
    2024
  • Funding amount:
    $1.9416 million
  • Project category:
Administrative Core
  • Grant number:
    10555682
  • Fiscal year:
    2023
  • Funding amount:
    $1.9416 million
  • Project category:
Brain Mechanisms of Chronic Low-Back Pain: Specificity and Effects of Aging and Sex
  • Grant number:
    10657958
  • Fiscal year:
    2023
  • Funding amount:
    $1.9416 million
  • Project category:
Label-free, live-cell classification of neural stem cell activation state and dynamics
  • Grant number:
    10863309
  • Fiscal year:
    2023
  • Funding amount:
    $1.9416 million
  • Project category:
Feasibility of a Hearing Program in Primary Care for Underserved Older Adults
  • Grant number:
    10727976
  • Fiscal year:
    2023
  • Funding amount:
    $1.9416 million
  • Project category: