Pinus sylvestris (Scots pine) is the most widespread coniferous tree in the boreal forests of Eurasia and has major economic and ecological importance. However, its large and repetitive genome makes it difficult to conduct genome‐wide analyses such as association studies, genetic mapping and genomic selection. We present a new 50K single‐nucleotide polymorphism (SNP) genotyping array for Scots pine research, breeding and other applications. To select the SNP set, we first genotyped 480 Scots pine samples on a 407 540 SNP screening array and identified 47 712 high‐quality SNPs for the final array (called ‘PiSy50k’). Here, we provide details of the design and testing, as well as allele frequency estimates from the discovery panel, functional annotation, tissue‐specific expression patterns and expression level information for the SNPs or corresponding genes, when available. We validated the performance of the PiSy50k array using samples from Finland and Scotland. Overall, 39 678 (83.2%) SNPs showed low error rates (mean = 0.9%). Relatedness estimates based on array genotypes were consistent with the expected pedigrees, and the level of Mendelian error was negligible. In addition, array genotypes successfully discriminated between Scots pine populations of Finnish and Scottish origin. The PiSy50k SNP array will be a valuable tool for a wide variety of future genetic studies and forestry applications.
In the genomic era, the gigantic size of conifer genomes still hampers advances in many fields of applied and fundamental science. To facilitate Scots pine genetic research and modern breeding methods, we developed a 50K SNP genotyping array and provide information that links the array markers to genetic diversity and gene expression levels across tissues.
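As an illustration of the Mendelian error check mentioned in the abstract, the following is a minimal sketch of how such a rate could be computed for one SNP across parent-offspring trios. It is not the authors' pipeline: the abstract does not describe the method used, and the 0/1/2 genotype coding, the function names and the example calls are all assumptions made for this example.

```python
from itertools import product

# Assumed coding: each diploid genotype is the count of the alternate allele (0, 1 or 2).
# Alleles a parent with a given genotype can transmit (0 = reference, 1 = alternate).
GAMETES = {0: {0}, 1: {0, 1}, 2: {1}}


def possible_child_genotypes(mother: int, father: int) -> set[int]:
    """Return every offspring genotype compatible with the two parental genotypes."""
    return {m + f for m, f in product(GAMETES[mother], GAMETES[father])}


def mendelian_error_rate(trios) -> float:
    """Fraction of fully called trios whose child genotype is impossible.

    `trios` is an iterable of (mother, father, child) tuples; None marks a
    missing call, and trios with any missing call are skipped.
    """
    checked = 0
    errors = 0
    for mother, father, child in trios:
        if None in (mother, father, child):
            continue
        checked += 1
        if child not in possible_child_genotypes(mother, father):
            errors += 1
    return errors / checked if checked else float("nan")


# Hypothetical calls for a single SNP (not data from the study).
calls = [(0, 0, 0), (0, 2, 1), (1, 1, 2), (2, 2, 1), (0, 1, None)]
rate = mendelian_error_rate(calls)
print(f"Mendelian error rate: {rate:.2f}")
```

In this toy input, only the trio with two alternate-homozygous parents and a heterozygous child is inconsistent, and the trio with a missing child call is excluded from the denominator.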