An efficient rule-based algorithm is presented for haplotype inference from general pedigree genotype data under the assumption of no recombination. The algorithm generalizes previous algorithms to handle cases in which some pedigree founders are not genotyped, provided that each nuclear family has at least one genotyped parent and each non-genotyped founder appears in exactly one nuclear family. This generalization matters because such cases occur frequently in real data: some founders may have passed away, so their genotype data can no longer be collected. The algorithm runs in \(O(m^{3}n^{3})\) time, where \(m\) is the number of single nucleotide polymorphism (SNP) loci under consideration and \(n\) is the number of genotyped members in the pedigree. This zero-recombination haplotyping algorithm is then extended to a maximally parsimonious haplotyping algorithm that, in a single whole-genome scan, minimizes the total number of breakpoint sites or, equivalently, the number of maximal zero-recombination chromosomal regions. We show that this whole-genome scan haplotyping algorithm can be implemented in \(O(m^{3}n^{3})\) time in a novel incremental fashion, where \(m\) here denotes the total number of SNP loci along the chromosome.
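The equivalence between minimizing breakpoint sites and minimizing the number of maximal zero-recombination regions can be illustrated with a minimal sketch. It is not the incremental algorithm of the paper: it assumes a hypothetical oracle `feasible(i, j)` that reports whether loci `i..j` admit a zero-recombination haplotype solution, and it assumes that every sub-interval of a feasible interval is also feasible and that every single locus is feasible on its own. Under those assumptions, extending each region greedily to the right yields the fewest regions. The names `greedy_regions` and `make_toy_oracle` and the toy conflict pairs are invented for illustration only.

```python
from typing import Callable, List, Tuple


def greedy_regions(m: int, feasible: Callable[[int, int], bool]) -> List[Tuple[int, int]]:
    """Split loci 0..m-1 into consecutive regions, each extended as far right as possible.

    `feasible(i, j)` is a hypothetical oracle: True if loci i..j (inclusive)
    admit a zero-recombination solution. A single locus is assumed feasible.
    """
    regions: List[Tuple[int, int]] = []
    start = 0
    while start < m:
        end = start
        # Extend the current region while adding the next locus keeps it feasible.
        while end + 1 < m and feasible(start, end + 1):
            end += 1
        regions.append((start, end))
        start = end + 1  # a breakpoint lies between `end` and `end + 1`
    return regions


def make_toy_oracle(conflicts: List[Tuple[int, int]]) -> Callable[[int, int], bool]:
    """Toy stand-in for the real consistency check (an assumption, not from the paper).

    A region is infeasible if it contains both loci of any conflicting pair (a, b).
    """
    def feasible(i: int, j: int) -> bool:
        return not any(i <= a and b <= j for a, b in conflicts)
    return feasible


# Eight loci with two invented conflicting locus pairs.
regions = greedy_regions(8, make_toy_oracle([(1, 3), (2, 6)]))
breakpoints = len(regions) - 1
print(regions, breakpoints)  # [(0, 2), (3, 7)] 1
```

The number of breakpoints is always one less than the number of regions, which is why the two objectives coincide. In a real implementation, the oracle would be replaced by the zero-recombination consistency check on the pedigree.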