This paper addresses the problem of acquiring relevant biological examples that are described in natural language. Building on a vector representation model of natural-language text, it proposes a text-mining-based method for acquiring biological examples. By constructing a text vector space over the corpus and mining knowledge from it, the paper studies feature selection, similarity measurement, and example retrieval for biological texts, providing technical support for the acquisition of biological examples driven by design requirements. Example analysis shows that, on the one hand, the text mining method based on the vector space model holds a considerable advantage over the baseline method in both precision and recall; on the other hand, the vector-space-based text retrieval mechanism is adaptable and extensible and can meet the needs of semantic retrieval in different environments.
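The pipeline summarized in the abstract (vector space construction, feature selection, similarity measurement, example retrieval, and evaluation by precision and recall) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the method of the paper: the weighting is plain TF-IDF, feature selection is reduced to removing a small stop-word list, similarity is cosine similarity, and the corpus, the names `STOP_WORDS`, `CORPUS`, `build_index`, `retrieve`, and `precision_recall` are hypothetical.

```python
import math
from collections import Counter

# Hypothetical stop-word list standing in for a real feature-selection step.
STOP_WORDS = {"a", "the", "of", "in", "to", "for", "with", "through", "using", "are"}

# Hypothetical toy corpus of biological examples described in natural language.
CORPUS = [
    "lotus leaf surface repels water through micro scale wax structures",
    "gecko feet adhere to walls using microscopic hairs",
    "shark skin reduces drag in water with ribbed scales",
    "bird bones are hollow to reduce weight for flight",
]


def tokenize(text):
    """Lowercase, split on whitespace, and drop stop words."""
    return [t for t in text.lower().split() if t not in STOP_WORDS]


def tfidf_vector(tokens, idf):
    """Sparse TF-IDF vector (term -> weight); terms unknown to the corpus are ignored."""
    counts = Counter(tokens)
    return {t: (c / len(tokens)) * idf[t] for t, c in counts.items() if t in idf}


def build_index(docs):
    """Build smoothed IDF weights and one TF-IDF vector per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return idf, [tfidf_vector(toks, idf) for toks in tokenized]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, top_k=3):
    """Return up to top_k (doc_index, score) pairs with non-zero similarity, best first."""
    idf, vectors = build_index(docs)
    q_vec = tfidf_vector(tokenize(query), idf)
    scored = [(i, cosine(q_vec, v)) for i, v in enumerate(vectors)]
    scored = [(i, s) for i, s in scored if s > 0.0]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved index list against a set of relevant indices."""
    hits = len(set(retrieved) & set(relevant))
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


if __name__ == "__main__":
    ranked = retrieve("reduce drag in water", CORPUS, top_k=2)
    print(ranked)
    print(precision_recall([i for i, _ in ranked], {2}))
```

In this sketch the query and the documents share one vector space, so changing the weighting scheme or the feature-selection rule only touches `tokenize` and `build_index`, which is one way to read the adaptability and extensibility claimed for vector-space retrieval.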