Informatics, Machine Learning & Biomedical Data Science

Project Abstract

The Informatics, Machine Learning, and Biomedical Data Science program, which operates within the High Performance Computing and Informatics Office (HPCIO), Division of Computational Bioscience of CIT, is collaborating with NIH investigators to build a critical mass in text and numerical analytics. This effort is envisioned to encompass a number of related disciplines in biomedical research, including semantic interoperability, computational linguistics, text and data mining, natural language processing, machine learning, longitudinal analysis, and visualization. The program is intended to foster advances in critical domains at NIH, including biomedical and clinical informatics, translational research, genomics, proteomics, systems biology, "big data" analysis, and portfolio analysis. In 2015, collaborative efforts in support of these goals included the following:

- In collaboration with NIA, we are applying machine learning and visualization techniques to large biological datasets to discover novel patterns of functional gene or protein interactions related to aging. In this collaboration, we are developing a machine learning method that models the temporal nature of longitudinal clinical data to predict the progression of amyotrophic lateral sclerosis. Such a method may also work well for predicting high-dimensional time-series genomic data.
- In collaboration with NIAID, HPCIO has released HT JoinSolver(R), a new application capable of analyzing V(D)J recombination in thousands of immunoglobulin gene sequences produced by high-throughput sequencing.
- HPCIO is working with NCI to develop methodologies for incorporating occupational risk factors into epidemiological models. Novel classifiers are being developed to classify free-text job descriptions into the 840 codes of the 2010 U.S. Standard Occupational Classification (SOC) system. Agreement between our classification system and expert coders is measured using SOC code agreement and exposure agreement after applying CANJEM, a job-exposure matrix of over 250 exposure agents developed by Jérôme Lavoué at the University of Montreal.
- In collaboration with the Membrane Transport Biophysics Section of NINDS, HPCIO is 1) developing a tool to accurately identify the boundaries of lysosomes in fluorescence microscopy and 2) using the fluorescence ratio to measure lysosomal pH within each organelle, to better understand lysosomal pH regulation.
- HPCIO is collaborating with NIAID to study immune cell infiltration in various tissue samples from patients with metabolic diseases. Using systems-based approaches, we examine gene expression and genotyping data to understand the roles and interactions of different immune cells in response to metabolic disease signals, and their associations with intervention outcomes and other phenotypes.
- A freely available plasmid database that is interoperable with popular freeware is being developed for the NIDA Optogenetics and Transgenic Technology Core. The Plasmid Manager offers a versatile yet simple platform for scientists to store and analyze their plasmid data. Motivated by the need for a more comprehensive approach to archiving plasmid data, the database platform includes numerous components beyond the repository and serves as an informatics platform designed to enhance the efficiency and analytic capabilities of scientists.
- In collaboration with CSR, HPCIO is applying text analytics to provide CSR leadership with evidence-based decision support in evaluating the grant review process. A Web-based automated referral tool, called ART, is being developed to help PIs and SROs identify the most relevant study section(s) or special emphasis panel(s) based on the scientific content of an application. In addition, HPCIO is analyzing text from quick feedback surveys on peer review. This effort includes a pilot study evaluating the feasibility of analyzing free text from peer reviewers on their perception of study section quality. If successful, the pilot results will be used as initial input for a full-scale implementation.
- The Human Salivary Protein wiki has been made available online on a community-based Web portal developed by HPCIO, in collaboration with NIDCR, to enable scientists to add their own research data, share results, and discover new knowledge. This is a major step towards the discovery and use of saliva biomarkers to diagnose oral and systemic diseases.
- In collaboration with the Office of Data Analysis Tools and Systems, NIH Office of the Director, HPCIO has been developing a standard database update pipeline for NIH Topic Maps, originally developed by Dr. Ned Talley of NINDS. We are evaluating whether this pipeline can be incorporated into a stable hosted instance.
- High-throughput next-generation sequencing (NGS) technology plays an important role in systematically identifying novel cancer driver mutations in genome-wide surveys, and NGS data generation is rapidly increasing, currently accumulating at a rate of several terabytes every month at the Lymphoid Malignancies Section of NCI. Database platforms need to be enhanced in anticipation of even more growth in the near future. The recent emergence of Hadoop/NoSQL systems (e.g., HBase) has provided an alternative platform for querying large-scale genomic data. In addition, relational database providers have been enhancing their offerings to include products that explicitly distribute data across multiple nodes (e.g., Postgres-XL). We have sought to integrate these technologies with current relational database systems (e.g., Postgres) to improve performance in a parallel or distributed manner. The goal of our effort has been to investigate the potential of these distributed platforms for storing and querying the large volumes of data that NCI accumulates, thereby augmenting its current analytical capabilities.
- Based on its experience in building novel models for classifying research grants and projects, HPCIO has collaborated with DPCPSI/OD and other ICs to develop the Portfolio Learning Tool, a comprehensive classification workflow system that will allow users to select from multiple classification algorithms, feature spaces, and training regimes to build and run their own classifiers. HPCIO has developed an augmented support vector machine (SVM) that enlarges a training set by sampling from a corpus of unknowns and runs a large ensemble over various samples of this augmented space. The results obtained from this classifier suggest that, when coupled with an effective annotation strategy, such a classifier can be quite effective at categorizing a research portfolio.
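
As an illustration of the augmented SVM idea in the last item, the following is a minimal Python sketch, not HPCIO's implementation. It assumes that sampled unknowns are treated as provisional negatives (the abstract does not say how they are labeled), uses a small hinge-loss linear SVM trained by stochastic gradient descent in place of a production SVM library, and combines ensemble members by majority vote. All function names, parameters, and the toy data are hypothetical.

```python
import random


def train_linear_svm(X, y, epochs=50, lr=0.1, lam=0.01, seed=0):
    """Fit a linear soft-margin SVM (hinge loss + L2 penalty) by plain SGD."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            # L2 regularization: shrink the weights a little on every step.
            w = [wj * (1.0 - lr * lam) for wj in w]
            # Hinge loss: only examples inside the margin move the boundary.
            if y[i] * score < 1.0:
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b


def train_augmented_ensemble(pos, neg, unknown, n_members=31,
                             sample_size=2, seed=0):
    """Train one SVM per random draw of unknowns added to the labeled set.

    Assumption: drawn unknowns are labeled -1 (provisional negatives).
    """
    rng = random.Random(seed)
    members = []
    for m in range(n_members):
        sampled = rng.sample(unknown, sample_size)
        X = pos + neg + sampled
        y = [1] * len(pos) + [-1] * (len(neg) + len(sampled))
        members.append(train_linear_svm(X, y, seed=m))
    return members


def positive_vote_fraction(members, x):
    """Fraction of ensemble members that score x on the positive side."""
    votes = sum(
        1 for w, b in members
        if sum(wj * xj for wj, xj in zip(w, x)) + b > 0
    )
    return votes / len(members)


# Toy 2-D "documents": a few labeled positives and negatives, plus a
# larger unlabeled pool that hides one positive, (2.0, 3.0).
POS = [(2.0, 2.0), (2.5, 1.5), (1.5, 2.5), (3.0, 2.0)]
NEG = [(-2.0, -2.0), (-1.5, -2.5)]
UNKNOWN = [
    (-2.5, -1.5), (-3.0, -2.0), (-2.0, -3.0), (-1.0, -2.0), (-2.0, -1.0),
    (-2.5, -2.5), (-1.5, -1.5), (-3.0, -1.0), (-1.0, -3.0), (2.0, 3.0),
]

members = train_augmented_ensemble(POS, NEG, UNKNOWN)
print(positive_vote_fraction(members, (3.0, 3.0)))
print(positive_vote_fraction(members, (-3.0, -3.0)))
```

Each member sees the same labeled examples but a different draw of unknowns, so an unlabeled positive that is mistakenly drawn as a negative affects only some members, and the vote dampens its influence.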

Project Outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other Grants by Calvin A. Johnson

Text Analytics, Knowledge Engineering, & High Performance Computing
  • Grant number:
    8565613
  • Funding amount:
    $1.9416 million
High Performance Biomedical Computing And Informatics
  • Grant number:
    6988044
  • Funding amount:
    $1.9416 million
Collective Intelligence, Knowledge Infrastructure, & High Performance Computing
  • Grant number:
    7970400
  • Funding amount:
    $1.9416 million
Informatics, Machine Learning & Biomedical Data Science
  • Grant number:
    9550738
  • Funding amount:
    $1.9416 million
High Performance Biomedical Computing And Informatics
  • Grant number:
    7593224
  • Funding amount:
    $1.9416 million
High Performance Biomedical Computing And Informatics
  • Grant number:
    7145121
  • Funding amount:
    $1.9416 million
High Performance Biomedical Computing And Informatics
  • Grant number:
    7296863
  • Funding amount:
    $1.9416 million
Text Analytics, Machine Learning & High Performance Computing
  • Grant number:
    8746900
  • Funding amount:
    $1.9416 million
High Performance Biomedical Computing And Informatics
  • Grant number:
    7733758
  • Funding amount:
    $1.9416 million

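Similar NSFC Grants

Mechanism by which keratinocyte-derived exosomes carrying miR-31 regulate the fibroblast ERK pathway to counter skin aging
  • Grant number:
    82373460
  • Approval year:
    2023
  • Funding amount:
    490,000 CNY
  • Project category:
    General Program
Disruption of male reproductive endocrine function by micro(nano)plastic formation and photolysis product release mediated by plastic photoaging
  • Grant number:
    22376195
  • Approval year:
    2023
  • Funding amount:
    500,000 CNY
  • Project category:
    General Program
Freeze-thaw aging characteristics and toxic effects of agricultural-film-derived microplastics in the black soils of Northeast China
  • Grant number:
    42377282
  • Approval year:
    2023
  • Funding amount:
    490,000 CNY
  • Project category:
    General Program
Multiscale study of the nonlinear aging creep behavior of CA mortar under temperature effects
  • Grant number:
    12302265
  • Approval year:
    2023
  • Funding amount:
    300,000 CNY
  • Project category:
    Young Scientists Fund
Synergistic repair mechanism combining radical trapping and photocycloaddition for mechanochemical aging of styrene-butadiene copolymers
  • Grant number:
    22303065
  • Approval year:
    2023
  • Funding amount:
    300,000 CNY
  • Project category:
    Young Scientists Fund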

Similar Overseas Grants

The Proactive and Reactive Neuromechanics of Instability in Aging and Dementia with Lewy Bodies
  • Grant number:
    10749539
  • Fiscal year:
    2024
  • Funding amount:
    $1.9416 million
Feasibility of a Hearing Program in Primary Care for Underserved Older Adults
  • Grant number:
    10727976
  • Fiscal year:
    2023
  • Funding amount:
    $1.9416 million
Advanced Comprehensive Magnetic Resonance Solution for the Noninvasive Characterization of High Resolution Metabolic Biomarkers of Risk in Patients with Alzheimer's Disease and Dementia
  • Grant number:
    10820517
  • Fiscal year:
    2023
  • Funding amount:
    $1.9416 million
Brain Mechanisms of Chronic Low-Back Pain: Specificity and Effects of Aging and Sex
  • Grant number:
    10657958
  • Fiscal year:
    2023
  • Funding amount:
    $1.9416 million
Providers and Older Pain Patients with Prescription Opioid Dependence: A Qualitative Study to Understand Barriers to Opioid Taper, Cessation, and Transition to Buprenorphine.
  • Grant number:
    10671358
  • Fiscal year:
    2023
  • Funding amount:
    $1.9416 million