Abstract:The problems of haplotyping and haplotype frequency estimation on trio genotype data under the Mendelian law of inheritance and the assumption of Hardy-Weinberg equilibrium are studied in this paper. Since most past efforts only focused on haplotyping on genotype data of unrelated individuals and data with general pedigrees, but gave insufficient efforts to the special case of trio genotype data, there is coming an increasing demand in analyzing them in particular, especially when taking into account that part of HAPMAP database is exactly trio data. This paper presents a two-staged method to estimate haplotype frequencies in trios: i) haplotyping stage, find haplotype configurations without recombinant for each trio; ii) frequency estimation stage, use the expectation-maximization (EM) algorithm to estimate haplotype frequencies based on these inferred haplotype configurations. Both the haplotyping algorithm and the EM algorithm are implemented in software package TRIOHAP using C language. Its effectiveness and efficiency and tested on simulated and real data sets as well. The experimental results show that, TRIOHAP runs much faster than a popular frequency estimation software which discards trio information. Moreover, because TRIOHAP utilizes such information, its estimation is more reliable.