Big Data Methods for Comprehensive Similarity based Risk Prediction
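Basic information
- Grant number: 10087958
- Principal investigator:
- Amount: $455,700
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2019
- Funding country: United States
- Project period: 2019-02-12 to 2024-01-31
- Project status: Completed
- Source: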
- Keywords: Address; Automation; Big Data; Big Data Methods; Biological Markers; Biological Process; Biometry; Case Study; Characteristics; Chronic; Chronic Disease; Chronic Kidney Failure; Classification; Clinical; Clinical Data; Clinical Medicine; Complex; Data; Data Reporting; Data Science; Derivation procedure; Diagnosis; Disease; Disease Progression; Electronic Health Record; End stage renal failure; Environment; Etiology; Exhibits; Exposure to; Genetic; Genomics; Goals; Health; Health Professional; Healthcare; Heterogeneity; Human; Individual; Informatics; Interdisciplinary Study; Intervention; Knowledge; Length; Life; Literature; Machine Learning; Medical; Medical Genetics; Medical Records; Methods; Modeling; Natural Language Processing; Outcome; Patients; Pharmaceutical Preparations; Population; Preparation; Reporting; Reproducibility; Research; Risk; Social Environment; Source; Surveys; Techniques; base; biomedical informatics; clinical decision support; clinical decision-making; clinical phenotype; clinical risk; data analysis pipeline; data modeling; data standards; design; disease diagnosis; feature selection; health data; improved; interoperability; mortality risk; novel; open data; open source; outcome prediction; patient health information; patient population; precision medicine; predict clinical outcome; risk prediction; socioeconomics; support tools; vector
Project Summary
Electronic health records (EHRs) provide a rich source of data about representative populations but have yet to be
fully utilized to enhance clinical decision-making. Conventional approaches to clinical decision-making start
with the identification of relevant biomarkers based on subject-matter knowledge, followed by detailed but
limited analysis using these biomarkers exclusively. As the current scientific literature indicates, many human
disorders share a complex etiological basis and exhibit correlated disease progression. Therefore, it is
desirable to use comprehensive patient data to assess patient similarity. This proposal focuses on deriving a
comprehensive and integrated score of patient similarity from the complete patient characteristics currently
available, including but not limited to 1) demographic similarity; 2) genetic similarity; 3) clinical phenotype
similarity; 4) treatment similarity; and 5) exposome similarity (with the exposome defined here as all available
attributes of the living environment an individual is exposed to), where some of these aspects may overlap and
interact. We will optimize information fusion and task-dependent feature selection for assessing patient
similarity for clinical risk prediction. Because no pipeline currently exists that can extract executable, complete
patient determinant data, we propose first to deliver an open-source data preparation pipeline based on a
widely used clinical data standard, the OMOP (Observational Medical Outcomes Partnership) Common Data
Model (CDM) version 5.2. Moreover, to mitigate the common missingness and sparsity challenges in clinical
data, we describe the first attempt to represent patients' sparse clinical information with missingness, including
diagnosis information, medication data, and treatment interventions, as a fixed-length feature vector (i.e.,
Patient2Vec). This project has four specific aims. Aim 1 is to develop a clinical data processing pipeline for
harmonizing patient information from multiple sources into a standards-based uniform data representation
and to evaluate its efficiency, interoperability, and accuracy. Aim 2 is to leverage a powerful machine learning
technique, Document2Vec, from the natural language processing literature, to create an open-source
Patient2Vec framework for the derivation of informative numerical representations of patients. Aim 3 is to
develop a unified machine learning clinical-outcome-prediction framework for Optimized Patient Similarity
Fusion (OptPSF) that integrates traditional medical covariates with the numerical patient representations
derived from Patient2Vec (Aim 2) for improved clinical risk prediction. Aim 4 is to evaluate our similarity
framework for predicting 1) the risk of end-stage kidney disease (ESKD) in the general EHR patient population
and 2) the risk of death among patients with chronic kidney disease (CKD).
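The idea of fusing several similarity domains into one score for risk prediction can be illustrated with a minimal sketch. This is not the project's OptPSF method: the function names, the use of cosine similarity, the fixed domain weights, and the nearest-neighbor risk estimate are all assumptions chosen for illustration, whereas the abstract describes the fusion as optimized and task-dependent.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def fused_similarity(p, q, weights):
    """Weighted average of per-domain cosine similarities.

    `p` and `q` map a domain name (hypothetical, e.g. "demographic") to a
    feature vector; `weights` maps the same names to non-negative weights.
    """
    total = sum(weights.values())
    return sum(w * cosine(p[d], q[d]) for d, w in weights.items()) / total


def knn_risk(query, cohort, weights, k=2):
    """Mean binary outcome among the k cohort patients most similar to `query`."""
    ranked = sorted(
        cohort,
        key=lambda rec: fused_similarity(query, rec["features"], weights),
        reverse=True,
    )
    top = ranked[:k]
    return sum(rec["outcome"] for rec in top) / len(top)


# Toy data, invented purely for illustration.
WEIGHTS = {"demographic": 1.0, "phenotype": 2.0}
COHORT = [
    {"features": {"demographic": [1, 0], "phenotype": [1, 0]}, "outcome": 1},
    {"features": {"demographic": [1, 0], "phenotype": [1, 0.1]}, "outcome": 1},
    {"features": {"demographic": [0, 1], "phenotype": [0, 1]}, "outcome": 0},
]
QUERY = {"demographic": [1, 0], "phenotype": [1, 0]}

if __name__ == "__main__":
    print(knn_risk(QUERY, COHORT, WEIGHTS, k=2))
```

In a setting like the one the abstract describes, the weights would presumably be learned per prediction task rather than fixed by hand, and one of the domains could hold a fixed-length embedding such as the Patient2Vec vector.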
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by KRZYSZTOF KIRYLUK
Other grants by KRZYSZTOF KIRYLUK
Non-APOL1 genetic factors and kidney transplant outcomes
- Grant number: 10717171
- Fiscal year: 2023
- Funding amount: $455,700
- Project category:
MHC and KIR Sequencing and Association Analyses in the iGeneTRAiN Studies
- Grant number: 10438855
- Fiscal year: 2020
- Funding amount: $455,700
- Project category:
MHC and KIR Sequencing and Association Analyses in the iGeneTRAiN Studies
- Grant number: 10251946
- Fiscal year: 2020
- Funding amount: $455,700
- Project category:
MHC and KIR Sequencing and Association Analyses in the iGeneTRAiN Studies
- Grant number: 10020606
- Fiscal year: 2020
- Funding amount: $455,700
- Project category:
Big Data Methods for Comprehensive Similarity based Risk Prediction
- Grant number: 10551349
- Fiscal year: 2019
- Funding amount: $455,700
- Project category:
Big Data Methods for Comprehensive Similarity based Risk Prediction
- Grant number: 10323033
- Fiscal year: 2019
- Funding amount: $455,700
- Project category:
Genetics of IgA nephropathy by integrative network-based association studies
- Grant number: 9258422
- Fiscal year: 2015
- Funding amount: $455,700
- Project category:
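Similar grants from the National Natural Science Foundation of China (NSFC)
Research on big-data-driven knowledge automation theory and methods for injection molding
- Grant number: 61673126
- Approval year: 2016
- Funding amount: CNY 630,000
- Project category: General Program
Design methods and application validation of a knowledge-automation decision system for aluminum electrolysis production based on big data and cloud computing
- Grant number: 61533020
- Approval year: 2015
- Funding amount: CNY 2,900,000
- Project category: Key Program
Collaborative optimization of energy-consumption networks in complex discrete manufacturing based on big-data knowledge automation
- Grant number: 61572238
- Approval year: 2015
- Funding amount: CNY 640,000
- Project category: General Program
Research on key technologies for regional healthcare big data analysis based on cloud computing and MapReduce
- Grant number: 61572268
- Approval year: 2015
- Funding amount: CNY 650,000
- Project category: General Program
Research on wide-area synchronization and real-time stream computing for large data sets in railway power supply dispatch centers
- Grant number: 51567008
- Approval year: 2015
- Funding amount: CNY 390,000
- Project category: Regional Science Fund Project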
Similar overseas grants
An automated system to differentiate Kawasaki disease from febrile illness with real life clinical datasets in New York City
- Grant number: 10477176
- Fiscal year: 2022
- Funding amount: $455,700
- Project category:
An automated system to interpret echocardiography to predict adverse outcomes in patients with right ventricular dysfunction in daily hospital practice
- Grant number: 10326000
- Fiscal year: 2021
- Funding amount: $455,700
- Project category:
An ethical framework-guided metric tool for assessing bias in EHR-based Big Data studies
- Grant number: 10599459
- Fiscal year: 2021
- Funding amount: $455,700
- Project category:
Cytoscape: A Modeling Platform for Biomolecular Networks
- Grant number: 10166303
- Fiscal year: 2020
- Funding amount: $455,700
- Project category:
MethylScan for Early Detection of Liver and Lung Cancer
- Grant number: 10603422
- Fiscal year: 2020
- Funding amount: $455,700
- Project category: