Early detection and risk assessment of complex chronic diseases from longitudinal clinical data help doctors make early diagnoses and monitor disease progression. Computer-aided disease diagnosis has been studied extensively. However, early detection and contemporaneous risk assessment from partially labeled, irregular longitudinal measurements remain relatively unexplored. In this paper, we propose a flexible mixed-kernel framework for training a contemporaneous disease risk detector that predicts disease onset and monitors disease progression. Moreover, we address the label insufficiency problem by identifying the pattern of disease-induced progression over time in longitudinal data. Our method is based on a Structured Output Support Vector Machine (SOSVM), extended to longitudinal data analysis. We conduct extensive experiments on several datasets of varying complexity: contemporaneous risk assessment with simulated irregular longitudinal data; identification of the onset of Type 1 Diabetes (T1D) from an irregularly sampled longitudinal RNA-Seq gene expression dataset; and monitoring of long-term drug effects on patients using a longitudinal RNA-Seq dataset with missing time points. The results demonstrate that our method improves the accuracy of both early diagnosis and risk estimation with partially labeled irregular longitudinal clinical data.