Abstract
One of the major goals of computational sequence analysis is to find sequence similarities, which could serve as
evidence of structural and functional conservation, as well as of evolutionary relations among the sequences. Since
the degree of similarity is usually assessed by the sequence alignment score, it is necessary to know if a score is high
enough to indicate a biologically interesting alignment. A powerful approach to defining score cutoffs is based on the
evaluation of the statistical significance of alignments. The statistical significance of an alignment score is frequently
assessed by its P-value, which is the probability that this score or a higher one can occur simply by chance, given the
probabilistic models for the sequences. In this review we discuss the general role of P-value estimation in sequence
analysis, and give a description of theoretical methods and computational approaches to the estimation of statistical
significance for important classes of sequence analysis problems. In particular, we concentrate on the P-value estimation
techniques for single sequence studies (both score-based and score-free), global and local pairwise sequence
alignments, multiple alignments, sequence-to-profile alignments and alignments built with hidden Markov models.
We anticipate that the review will be useful both to researchers working professionally in bioinformatics and
to biomedical scientists interested in using contemporary methods of DNA and protein sequence analysis.
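
As a concrete illustration of the P-value definition given above, the following minimal sketch estimates the P-value of a local pairwise alignment score by simulation. It is not a method taken from the review. The scoring parameters (match, mismatch, gap), the function names, and the null model (random shuffling of the second sequence, which preserves its letter composition) are all assumptions chosen for illustration.

```python
import random


def local_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best local alignment score (Smith-Waterman, linear gap penalty).

    The scoring parameters are illustrative assumptions.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ch_a == ch_b else mismatch)
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def empirical_p_value(a, b, n_shuffles=1000, seed=0):
    """Estimate P(score >= observed) under a shuffling null model.

    Assumption: the null model is a random permutation of b.
    """
    rng = random.Random(seed)
    observed = local_alignment_score(a, b)
    letters = list(b)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(letters)
        if local_alignment_score(a, "".join(letters)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_shuffles + 1)


if __name__ == "__main__":
    score, p = empirical_p_value("ACGTTGCAAC", "ACGTTGCAAC", n_shuffles=200)
    print(score, p)
```

An estimate of this kind cannot resolve P-values much smaller than 1/n_shuffles, so very small P-values need far more shuffles or an analytical approximation.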