The MCL algorithm is short for the Markov Cluster Algorithm, a fast and scalable unsupervised cluster algorithm for graphs based on simulation of (stochastic) flow in graphs.
Consensus clustering has emerged as an important elaboration of the classical clustering problem. Consensus clustering, also called aggregation of clustering (or partitions), refers to the situation in which a number of different (input) clusterings have been obtained for a particular dataset and it is desired to find a single (consensus) clustering which is a better fit in some sense than the existing clusterings. Consensus clustering is thus the problem of reconciling clustering information about the same data set coming from different sources or from different runs of the same algorithm. When cast as an optimization problem, consensus clustering is known as median partition, and has been shown to be NP-complete.
EM has been shown to have favorable convergence properties, automatical satisfaction of constraints, and fast convergence. The next section explains the traditional approach to deriving the EM algorithm and proving its convergence property. Section 3.3 covers the interpretion the EM algorithm as the maximization of two quantities: the entropy and the expectation of complete-data likelihood. Then, the K-means algorithm and the EM algorithm are compared. The conditions under which the EM algorithm is reduced to the K-means are also explained. The discussion in Section 3.4 generalizes the EM algorithm described in Sections 3.2 and 3.3 to problems with partial-data and hidden-state. We refer to this new type of EM as the doubly stochastic EM. Finally, the chapter is concluded in Section 3.5.
D. Arthur, und S. Vassilvitskii. SODA '07: Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms, Seite 1027--1035. Philadelphia, PA, USA, Society for Industrial and Applied Mathematics, (2007)
M. Balcan, A. Blum, und A. Gupta. SODA '09: Proceedings of the Nineteenth Annual ACM -SIAM Symposium on Discrete Algorithms, Seite 1068--1077. Philadelphia, PA, USA, Society for Industrial and Applied Mathematics, (2009)
K. Avrachenkov, V. Dobrynin, D. Nemirovsky, S. Pham, und E. Smirnova. SIGIR '08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, Seite 873--874. New York, NY, USA, ACM, (2008)
T. Hu, H. Xiong, W. Zhou, S. Sung, und H. Luo. SIGIR '08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, Seite 871--872. New York, NY, USA, ACM, (2008)
C. Ding, T. Li, D. Luo, und W. Peng. SIGIR '08: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval, Seite 831--832. New York, NY, USA, ACM, (2008)
D. Ramage, P. Heymann, C. Manning, und H. Garcia-Molina. WSDM '09: Proceedings of the Second ACM International Conference on Web Search and Data Mining, Seite 54--63. New York, NY, USA, ACM, (2009)
M. B\=adoiu, S. Har-Peled, und P. Indyk. STOC '02: Proceedings of the thiry-fourth annual ACM symposium on Theory of computing, Seite 250--257. New York, NY, USA, ACM, (2002)
N. Slonim, und N. Tishby. SIGIR '00: Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval, Seite 208--215. New York, NY, USA, ACM, (2000)
M. Surdeanu, J. Turmo, und A. Ageno. KDD '05: Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining, Seite 685--690. New York, NY, USA, ACM, (2005)
X. Liu, und W. Croft. ECIR'08: Proceedings of the IR research, 30th European conference on Advances in information retrieval, Seite 454--462. Berlin, Heidelberg, Springer-Verlag, (2008)
A. Leuski. CIKM '01: Proceedings of the tenth international conference on Information and knowledge management, Seite 33--40. New York, NY, USA, ACM, (2001)
W. Xu, X. Liu, und Y. Gong. SIGIR '03: Proceedings of the 26th annual international ACM SIGIR conference on Research and development in informaion retrieval, Seite 267--273. New York, NY, USA, ACM, (2003)
D. Arthur, und S. Vassilvitskii. SCG '06: Proceedings of the twenty-second annual symposium on Computational geometry, Seite 144--153. New York, NY, USA, ACM, (2006)
X. Liu, und W. Croft. SIGIR '04: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval, Seite 186--193. New York, NY, USA, ACM, (2004)
G. Erkan. Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics, Seite 479--486. Morristown, NJ, USA, Association for Computational Linguistics, (2006)