In clinical visits, writing clinical notes is a time-consuming and costly manual task for clinicians. Although virtual medical scribes have been proposed to generate clinical notes (semi-)automatically, data sparsity remains a challenge in practice. Identifying the topic of each clinical utterance in a doctor-patient conversation is one of the key strategies for automation. In this paper, we propose an utterance-level note section classification method for settings where only a limited amount of in-house data is available. We leverage an external, unsupervised corpus of medical conversations to transfer knowledge using the framework of Unsupervised Meta-learning with Task Augmentation (UMTA). We run experiments on both manual transcripts and machine transcripts generated by automatic speech recognition (ASR). The results show that our strategies achieve substantial gains in prediction accuracy over several baseline approaches and are robust to ASR errors.