SMURFLite: combining simplified Markov random fields with simulated evolution improves remote homolo

Lenore Cowen - Tufts University

April 17, 2013, 1 p.m. - 2 p.m.

MC103


One of the most successful methods to date for recognizing evolutionarily related protein sequences has been profile Hidden Markov Models (HMMs). However, these models do not capture the pairwise statistical preferences of residues that are hydrogen bonded in beta sheets. These dependencies have been partially captured in the HMM setting by simulated evolution in the training phase, and can be fully captured by Markov Random Fields (MRFs). However, MRFs can be computationally prohibitive when beta strands are interleaved in complex topologies. We introduce SMURFLite, a method that combines simplified Markov Random Fields with simulated evolution to substantially improve remote homology detection for beta structures. Unlike previous MRF-based methods, SMURFLite is computationally feasible on any beta-structural motif. We test SMURFLite on all propeller and barrel folds in the mainly-beta class of the SCOP hierarchy in stringent cross-validation experiments. We show a mean 26% (median 16%) improvement in AUC for beta-structural motif recognition as compared to HMMER (a well-known HMM method), a mean 33% (median 19%) improvement as compared to RAPTOR (a well-known threading method), and even a mean 18% (median 10%) improvement over HHpred (a profile-profile HMM method), despite HHpred’s use of extensive additional training data. We demonstrate SMURFLite’s ability to scale to whole genomes by running a SMURFLite library of 207 beta-structural SCOP superfamilies against the entire genome of Thermotoga maritima and make over a hundred new fold predictions.

This is joint work with Noah Daniels, Raghavendra Hosur, and Bonnie Berger.

Biography

Dr. Lenore J. Cowen is a Professor in the Computer Science Department at Tufts University. She also has a courtesy appointment in the Tufts Mathematics Department. She received a BA in Mathematics from Yale and a Ph.D. in Mathematics from MIT. After finishing her Ph.D. in 1993, she was an NSF Postdoctoral Fellow and then joined the faculty of the Mathematical Sciences Department (now renamed the Applied Mathematics and Statistics Department) at Johns Hopkins University, where she was promoted to the rank of Associate Professor in 2000. Lured by the Boston area and the prospect of making an impact in a growing young department, she joined Tufts in September 2001. Dr. Cowen has been named an ONR Young Investigator and a fellow of the Radcliffe Institute for Advanced Study. Her research interests span three areas: Discrete Mathematics (since high school), Algorithms (since 1991 in graduate school), and Computational Molecular Biology (since 2000). She is on the editorial board of SIAM Review.