We introduce CASED, a novel curriculum sampling algorithm that facilitates the optimization of deep learning segmentation or detection models on data sets with extreme class imbalance. We evaluate the CASED learning framework on the task of lung nodule detection in chest CT. In contrast to two-stage solutions, in which nodule candidates are first proposed by a segmentation model and then refined by a second detection stage, CASED improves the training of deep nodule segmentation models (e.g., UNet) to the point where state-of-the-art results are achieved using only a trivial detection stage. CASED does so by letting the model first learn to distinguish nodules from their immediate surroundings, then continuously adding a greater proportion of difficult-to-classify global context until it samples uniformly from the empirical data distribution. Using CASED during training yields a minimalist solution to the lung nodule detection problem that tops the LUNA16 nodule detection benchmark with an average sensitivity score of 88.35%. Furthermore, we find that models trained using CASED are robust to nodule annotation quality: comparable results can be achieved when only a point and radius for each ground truth nodule are provided during training. Finally, the CASED learning framework makes no assumptions with regard to imaging modality or segmentation target and should generalize to other medical imaging problems where class imbalance is persistent.
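The curriculum described above can be illustrated with a minimal sketch. It is not the authors' implementation: the linear decay schedule, the use of fixed-size 3D patches, and the names `forced_nodule_probability`, `sample_patch_origin`, and `decay_steps` are all assumptions made for illustration. The idea shown is that early in training every sampled patch is forced to contain a nodule voxel, and that this forcing fades until patches are drawn uniformly from the volume.

```python
import numpy as np


def forced_nodule_probability(step, decay_steps):
    """Probability of forcing a nodule-containing patch at a given step.

    Assumed schedule: linear decay from 1.0 to 0.0 over `decay_steps`.
    At 0.0 every patch is drawn uniformly from the volume.
    """
    return max(0.0, 1.0 - step / decay_steps)


def sample_patch_origin(mask, patch_size, p_nodule, rng):
    """Return the corner index of a patch inside a volume.

    mask       : boolean array, True where a voxel belongs to a nodule
    patch_size : patch extent along each axis (must fit inside the volume)
    p_nodule   : probability that the patch is forced to contain a nodule voxel
    rng        : numpy.random.Generator
    """
    shape = np.array(mask.shape)
    size = np.array(patch_size)
    max_origin = shape - size  # largest valid corner along each axis

    if rng.random() < p_nodule and mask.any():
        # Pick a random nodule voxel, then a random patch that covers it.
        voxels = np.argwhere(mask)
        voxel = voxels[rng.integers(len(voxels))]
        low = np.maximum(voxel - size + 1, 0)
        high = np.minimum(voxel, max_origin)
        return tuple(int(v) for v in rng.integers(low, high + 1))

    # Otherwise sample uniformly over all valid patch positions.
    return tuple(int(v) for v in rng.integers(0, max_origin + 1))
```

In a training loop, one would call `forced_nodule_probability` with the current step and pass the result as `p_nodule` when drawing each patch, so that the share of nodule-free global context grows as training proceeds.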