Over several years, we have developed a system for assuring the quality of whole-genome sequence (WGS) data in the Long Life Family Study (LLFS) families. Our focus is on providing data for identifying germline genetic variants, with the aim of releasing as many variants on as many individuals as possible while assuring the quality of the individual calls. The availability of family data has enabled us to use and validate some filters not commonly used in population-based studies. We developed slightly different procedures for the autosomal, X, Y, and mitochondrial (MT) chromosomes. Some of these filters are specific to family data, but others can be applied to any WGS data set. We also describe the procedure we use to construct linkage markers from the SNP sequence data and how we compute identity-by-descent (IBD) values for use in linkage analysis.