Engelhardt, Alexander; Rieger, Anna; Tresch, Achim; Mansmann, Ulrich (2016): Efficient Maximum Likelihood Estimation for Pedigree Data with the Sum-Product Algorithm. In: Human Heredity, Vol. 82: pp. 1-15

Abstract

OBJECTIVE We analyze data sets consisting of pedigrees with age at onset of colorectal cancer (CRC) as the phenotype. The occurrence of familial clusters of CRC suggests the existence of a latent, heritable risk factor. We aimed to compute the probability that a family possesses this risk factor, as well as the increase in hazard rate for carriers of the risk factor. Because the risk factor is heritable, the estimation requires a costly marginalization of the likelihood.

METHODS We propose an improved EM algorithm that applies factor graphs and the sum-product algorithm in the E-step. This reduces the computational complexity from exponential to linear in the number of family members.

RESULTS In a simulation study and in a real family study on CRC risk, our algorithm is as precise as direct likelihood maximization. For 250 simulated families of size 19 and 21, our algorithm is faster by a factor of 4 and 29, respectively. On the largest family (23 members) in the real data, our algorithm is 6 times faster.

CONCLUSION We introduce a flexible and runtime-efficient tool for statistical inference in biomedical event data with latent variables, which opens the door to advanced analyses of pedigree data.
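The reduction from exponential to linear complexity can be illustrated with a minimal sketch. It is not the authors' implementation: the toy pedigree, the single-parent tree structure (real pedigrees have two parents per child), the constants `P_FOUNDER` and `P_TRANSMIT`, and the per-person likelihood values in `lik` are all assumptions made up for illustration. The sketch marginalizes a binary latent carrier status twice, once by enumerating every configuration and once by passing sum-product messages from the leaves to the founder.

```python
from itertools import product

# Hypothetical toy pedigree: parent[i] is the index of person i's parent,
# or None for the founder. A single-parent tree is a simplification.
parent = [None, 0, 0, 1, 1, 2]

P_FOUNDER = 0.2    # assumed carrier probability of the founder
P_TRANSMIT = 0.5   # assumed chance that a carrier parent passes the factor on

# lik[i][z]: assumed likelihood of person i's observed data given status z
lik = [(0.9, 0.4), (0.3, 0.8), (0.7, 0.5), (0.2, 0.9), (0.6, 0.6), (0.8, 0.1)]


def transition(z_parent, z_child):
    """P(child status | parent status) under the toy inheritance model."""
    p_carrier = P_TRANSMIT if z_parent == 1 else 0.0
    return p_carrier if z_child == 1 else 1.0 - p_carrier


def brute_force_likelihood():
    """Sum the joint over all 2^n latent configurations (exponential in n)."""
    n = len(parent)
    total = 0.0
    for z in product((0, 1), repeat=n):
        p = 1.0
        for i in range(n):
            if parent[i] is None:
                p *= P_FOUNDER if z[i] == 1 else 1.0 - P_FOUNDER
            else:
                p *= transition(z[parent[i]], z[i])
            p *= lik[i][z[i]]
        total += p
    return total


def sum_product_likelihood():
    """Pass messages from the leaves up to the founder (linear in n)."""
    n = len(parent)
    # belief[i][z]: likelihood of all data in i's subtree given z_i = z
    belief = [list(lik[i]) for i in range(n)]
    # Children have larger indices than their parents in this toy pedigree,
    # so a reverse sweep visits every child before its parent.
    for i in range(n - 1, 0, -1):
        pa = parent[i]
        for zp in (0, 1):
            msg = sum(transition(zp, zc) * belief[i][zc] for zc in (0, 1))
            belief[pa][zp] *= msg
    return (1.0 - P_FOUNDER) * belief[0][0] + P_FOUNDER * belief[0][1]
```

Both functions return the same marginal likelihood, but the enumeration visits 2^n configurations while the message pass does a constant amount of work per family member. In an EM setting, quantities of this kind would be recomputed in every E-step, which is where the saving would accumulate.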