
6 Discussion

This paper has studied seven log-odds scoring methods, all of which compare the probability that a sequence was generated by an HMM with the probability that it was generated by a null model. Among the three categories of methods considered, model-specific null models seem better than sequence-specific and global null models. Among the model-specific null models, the geometric average of the match state probabilities performs well; it is appealing because it requires no information external to the model (such as the background distribution of amino acids or characteristics of the training set) and has an intuitive justification. The complex null model, which corrects for the length distribution of the HMM, can improve discrimination.
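As a rough illustration of the geometric-average null model and the log-odds score discussed above, the following Python sketch may help. It is a minimal sketch under stated assumptions, not SAM's implementation: the function names `geometric_null` and `log_odds` are hypothetical, the match-state emission tables are assumed to be strictly positive, the geometric means are assumed to be renormalized to sum to one, and the null model is assumed to emit each residue independently.

```python
import math

def geometric_null(match_emissions):
    """Build a null distribution from an HMM's match states (hypothetical helper).

    match_emissions: list of dicts, one per match state, mapping each
    residue to its emission probability (assumed strictly positive).
    Returns the per-residue geometric mean across match states,
    renormalized so the values sum to 1 (an assumption of this sketch).
    """
    n = len(match_emissions)
    alphabet = list(match_emissions[0].keys())
    # Geometric mean computed in log space for numerical stability.
    raw = {
        a: math.exp(sum(math.log(state[a]) for state in match_emissions) / n)
        for a in alphabet
    }
    total = sum(raw.values())
    return {a: value / total for a, value in raw.items()}

def log_odds(log_p_model, sequence, null):
    """Log-odds score: log P(seq | HMM) - log P(seq | null).

    log_p_model is assumed to be supplied by the HMM scoring code;
    the null model here emits each residue independently.
    """
    log_p_null = sum(math.log(null[a]) for a in sequence)
    return log_p_model - log_p_null

# Toy two-letter alphabet with two match states (made-up numbers).
states = [{"A": 0.8, "B": 0.2}, {"A": 0.2, "B": 0.8}]
null = geometric_null(states)
score = log_odds(math.log(0.8 * 0.8), "AB", null)
```

Nothing here depends on a background amino acid distribution or on the training set; the null model is derived from the match states alone, which is the property the discussion highlights.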

The study has also shown that SAM's sensitivity to fragments has room for improvement, and that scoring methods require further refinement before the theoretical significance setting can be effectively used.

The experiment on finding remote homologs of 1hurA can be repeated in other parts of the FSSP tree. We plan a comprehensive series of evaluations of the ability of HMMs to find remote homologs.



SAM
sam-info@cse.ucsc.edu
UCSC Computational Biology Group