Osteogenesis imperfecta (OI) is a genetic disease with an estimated prevalence of between 1 in 13,500 and 1 in 9,700. Classifying OI into subtypes is important for prognosis and management. In this study, we established a clinical severity prediction model based on multiple features of variants in the COL1A1 and COL1A2 genes.
Ninety percent of OI cases are caused by pathogenic variants in the COL1A1 or COL1A2 gene. The Sillence classification describes four OI types whose clinical features range from mild symptoms to lethal and progressively deforming disease.
We built a random forest model to predict the clinical severity of OI, using a training set of 790 COL1A1/COL1A2 records obtained from the Human Gene Mutation Database. Two feature sets were evaluated: variant-type features only, and an optimized feature set.
On the training set, the area under the receiver operating characteristic curve (AUC) for distinguishing lethal-to-severe OI from mild/moderate OI was 0.767 with variant-type features only and 0.902 with optimized features for COL1A1 defects; the corresponding values for COL1A2 defects were 0.545 and 0.731. For the 17 patients from our hospital, prediction accuracy for patients with COL1A1 and COL1A2 defects was 76.5% (95% CI: 50.1–93.2%) and 88.2% (95% CI: 63.6–98.5%), respectively.
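As an illustration of how confidence intervals of this kind can be obtained for a small sample, the sketch below computes an exact (Clopper-Pearson) binomial interval using only the Python standard library. The abstract does not state which interval method was used, so the choice of Clopper-Pearson is an assumption. The counts 13/17 and 15/17 are also inferred here from the reported percentages rather than given in the text, and the function names are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_decreasing(f, target, iterations=100):
    """Find p in [0, 1] with f(p) == target, for f decreasing in p (bisection)."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # f is still too large, so the root lies at a larger p
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided (1 - alpha) interval for a binomial proportion k/n."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else solve_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2
    )
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else solve_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2
    )
    return lower, upper


# Assumed counts: 13 and 15 correct predictions out of 17 patients.
for correct in (13, 15):
    low, high = clopper_pearson(correct, 17)
    print(f"{correct}/17 = {correct / 17:.1%}, 95% CI: {low:.1%} to {high:.1%}")
```

Under these assumptions the computed bounds should land close to the intervals reported above. Their width shows how little precision 17 patients can provide.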
We established an OI severity prediction model based on multiple features of specific variants in the COL1A1 and COL1A2 genes, with a prediction accuracy of 76–88%. This algorithm is a promising tool that could prove valuable in clinical practice.
The online version contains supplementary material available at 10.1007/s00198-021-06263-0.