Unsupervised Text Learning Based on Context Mixture Model with Dirichlet Prior
D. Chen, D. Wang, and G. Yu. Advanced Web and Network Technologies, and Applications, (2008)
Abstract
In this paper, we propose a Bayesian mixture model that introduces a context variable with a Dirichlet prior into a Bayesian framework to model the multiple topics of a text and then cluster the texts. It is a novel unsupervised text learning algorithm for clustering large-scale web data. For parameter estimation, we adopt Maximum Likelihood (ML) and the EM algorithm, and we employ the BIC principle to determine the number of clusters. Experimental results show that the proposed method distinctly outperforms the baseline algorithms.
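The pipeline the abstract describes — fit a mixture model by EM, use a Dirichlet prior for smoothing, and pick the number of clusters with BIC — can be sketched in plain Python. This is a hedged illustration, not the authors' implementation: the toy corpus, the multinomial mixture, and the symmetric Dirichlet pseudo-count `alpha` (standing in for the paper's Dirichlet prior via MAP smoothing) are all assumptions made for the example.

```python
import math

# Toy bag-of-words corpus (assumed data): each document is a word-count
# vector over a 6-word vocabulary, with two obvious latent topics.
docs = [
    [3, 2, 1, 0, 0, 0], [2, 3, 2, 0, 0, 0], [1, 2, 3, 0, 0, 1],
    [0, 0, 0, 3, 2, 2], [0, 0, 1, 2, 3, 2], [0, 0, 0, 1, 2, 3],
]

def em_multinomial_mixture(docs, k, iters=50, alpha=1.0):
    """EM for a mixture of multinomials; alpha is a symmetric Dirichlet
    pseudo-count (MAP smoothing), a stand-in for the paper's Dirichlet prior.
    Returns the log-likelihood of the data under the fitted model."""
    n, v = len(docs), len(docs[0])
    # Deterministic initialisation: seed each component from a document
    # spread across the corpus, smoothed so no probability is zero.
    theta = [[docs[(i * n) // k][w] + alpha for w in range(v)] for i in range(k)]
    theta = [[c / sum(row) for c in row] for row in theta]
    pi = [1.0 / k] * k
    loglik = 0.0
    for _ in range(iters):
        # E-step: responsibility of component j for doc d,
        # r[d][j] ∝ pi[j] * prod_w theta[j][w]^count, in log space.
        resp, loglik = [], 0.0
        for d in docs:
            logp = [math.log(pi[j]) +
                    sum(c * math.log(theta[j][w]) for w, c in enumerate(d))
                    for j in range(k)]
            m = max(logp)                      # log-sum-exp for stability
            s = sum(math.exp(lp - m) for lp in logp)
            loglik += m + math.log(s)
            resp.append([math.exp(lp - m) / s for lp in logp])
        # M-step: re-estimate mixture weights and word distributions,
        # adding the Dirichlet pseudo-counts to the expected counts.
        pi = [sum(r[j] for r in resp) / n for j in range(k)]
        for j in range(k):
            counts = [alpha + sum(resp[i][j] * docs[i][w] for i in range(n))
                      for w in range(v)]
            tot = sum(counts)
            theta[j] = [c / tot for c in counts]
    return loglik

def bic(loglik, k, docs):
    """BIC = -2*loglik + (free parameters) * ln(number of documents)."""
    v, n = len(docs[0]), len(docs)
    n_params = (k - 1) + k * (v - 1)  # mixture weights + word distributions
    return -2 * loglik + n_params * math.log(n)

# Model selection as in the abstract: fit for each candidate k, keep the
# k with the lowest BIC.
best_k = min(range(1, 5),
             key=lambda k: bic(em_multinomial_mixture(docs, k), k, docs))
```

On this toy corpus the BIC criterion trades off fit against the parameter count, so the two-topic structure is preferred over both a single component and an over-segmented model.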
%0 Journal Article
%1 dongling2008unsupervised
%A Chen, Dongling
%A Wang, Daling
%A Yu, Ge
%D 2008
%J Advanced Web and Network Technologies, and Applications
%K AWM2010 Text_classification Unsupervised_Learning weblogGoalExtraction
%P 172--181
%T Unsupervised Text Learning Based on Context Mixture Model with Dirichlet Prior
%U http://dx.doi.org/10.1007/978-3-540-89376-9_17
%X In this paper, we propose a Bayesian mixture model that introduces a context variable with a Dirichlet prior into a Bayesian framework to model the multiple topics of a text and then cluster the texts. It is a novel unsupervised text learning algorithm for clustering large-scale web data. For parameter estimation, we adopt Maximum Likelihood (ML) and the EM algorithm, and we employ the BIC principle to determine the number of clusters. Experimental results show that the proposed method distinctly outperforms the baseline algorithms.
@article{dongling2008unsupervised,
abstract = {In this paper, we propose a Bayesian mixture model that introduces a context variable with a Dirichlet prior into a Bayesian framework to model the multiple topics of a text and then cluster the texts. It is a novel unsupervised text learning algorithm for clustering large-scale web data. For parameter estimation, we adopt Maximum Likelihood (ML) and the EM algorithm, and we employ the BIC principle to determine the number of clusters. Experimental results show that the proposed method distinctly outperforms the baseline algorithms.},
added-at = {2010-07-27T20:14:54.000+0200},
author = {Chen, Dongling and Wang, Daling and Yu, Ge},
biburl = {https://www.bibsonomy.org/bibtex/2962c3e06de4fe1cd737c5cce1c58a400/chris_o},
interhash = {7b9ac36c02d607d34c81bf9f11315554},
intrahash = {962c3e06de4fe1cd737c5cce1c58a400},
journal = {Advanced Web and Network Technologies, and Applications},
keywords = {AWM2010 Text_classification Unsupervised_Learning weblogGoalExtraction},
pages = {172--181},
timestamp = {2010-07-27T20:17:10.000+0200},
title = {Unsupervised Text Learning Based on Context Mixture Model with Dirichlet Prior},
url = {http://dx.doi.org/10.1007/978-3-540-89376-9_17},
year = 2008
}