To mitigate the effects of a changing climate, agriculture requires more effective evaluation, selection, and production of crop cultivars to accelerate genotype-to-phenotype connections and the selection of beneficial traits. Critically, plant growth and development are highly dependent on sunlight, with light providing both the energy required for photosynthesis and a means for plants to directly interact with their environment as they develop. In plant analyses, machine learning and deep learning techniques have a proven ability to learn plant growth patterns, including detection of disease, plant stress, and growth, using a variety of image data. To date, however, studies have not assessed machine learning and deep learning algorithms for their ability to differentiate a large cohort of genotypes grown under several growth conditions using time-series data automatically acquired across multiple scales (daily and developmental). Here, we extensively evaluate a wide range of machine learning and deep learning algorithms for their ability to differentiate 17 well-characterized photoreceptor-deficient genotypes, which differ in their light detection capabilities, grown under several different light conditions. Using precision, recall, F1-score, and accuracy as performance measures, we find that the Support Vector Machine (SVM) maintains the greatest classification accuracy, while a combined ConvLSTM2D deep learning model produces the best genotype classification results across the different growth conditions. Our successful integration of time-series growth data across multiple scales, genotypes, and growth conditions sets a new foundational baseline from which more complex plant science traits can be assessed for genotype-to-phenotype connections.