We address the problem of learning topic hierarchies from data. The
model selection problem in this domain is daunting—which of the large
collection of possible trees to use? We take a Bayesian approach, generating
an appropriate prior via a distribution on partitions that we refer
to as the nested Chinese restaurant process. This nonparametric prior allows
arbitrarily large branching factors and readily accommodates growing
data collections. We build a hierarchical topic model by combining
this prior with a likelihood that is based on a hierarchical variant of latent
Dirichlet allocation. We illustrate our approach on simulated data and
with an application to the modeling of NIPS abstracts.
Learns a topic taxonomy together with document assignments. The authors use Gibbs sampling and a nested Chinese restaurant process prior. Conceptually, they learn LDA plus the hierarchy (the restaurants), but both are combined in a single posterior formula in which all variables other than w_m,n, c_m,l, z_m,n and the hyperparameter eta are integrated out.
The Chinese restaurant process gives the probability that a new customer sits at a free (or an occupied) table in a Chinese restaurant, given the number of customers already seated.
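The seating rule can be sketched in a few lines of Python. This is only an illustration under the standard CRP definition, not code from the paper: the function name `crp_seating_probs` and the parameter name `gamma` (the concentration parameter) are assumptions made for this sketch.

```python
def crp_seating_probs(table_counts, gamma):
    """Seating probabilities for the next customer under a CRP.

    table_counts: number of customers at each occupied table.
    gamma: concentration parameter (hypothetical name), must be > 0.
    Returns one probability per occupied table, followed by the
    probability of opening a new (free) table as the last entry.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    n = sum(table_counts)      # customers already seated
    denom = n + gamma
    # An occupied table is chosen in proportion to its customer count.
    probs = [count / denom for count in table_counts]
    # A free table is chosen in proportion to gamma.
    probs.append(gamma / denom)
    return probs
```

For example, with two occupied tables holding 3 and 1 customers and gamma = 1, the next customer joins them with probability 0.6 and 0.2 and opens a new table with probability 0.2.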
The nested CRP uses the intuition of a culinary journey: each table in a restaurant carries a signpost to another restaurant, which the traveller follows. All travellers start at the same restaurant (the root node of the taxonomy) and follow the signpost at each table they sit at during the journey.
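The journey can be sketched as sampling a root-to-leaf path through a tree in which every node is a restaurant and every child is the restaurant a table's signpost points to. This is a minimal illustration, assuming a fixed path depth and a single concentration parameter; the names `Node`, `sample_path`, `depth` and `gamma` are hypothetical and not taken from the paper.

```python
import random


class Node:
    """One restaurant in the tree (hypothetical helper for this sketch)."""

    def __init__(self):
        self.count = 0       # travellers who have visited this restaurant
        self.children = []   # restaurants its tables point to


def sample_path(root, depth, gamma, rng):
    """Draw one path of `depth` restaurants, starting at the root.

    At each restaurant the traveller picks a table by the CRP rule:
    an existing table in proportion to its visitor count, or a new
    table in proportion to gamma. Visit counts are updated in place.
    """
    root.count += 1
    path = [root]
    node = root
    for _ in range(depth - 1):
        # Weights for the existing tables, then for a new table.
        weights = [child.count for child in node.children] + [gamma]
        idx = rng.choices(range(len(weights)), weights=weights)[0]
        if idx == len(node.children):
            # New table: its signpost leads to a fresh restaurant.
            node.children.append(Node())
        node = node.children[idx]
        node.count += 1
        path.append(node)
    return path


# Usage: send 20 travellers (documents) on journeys of depth 3.
rng = random.Random(0)
root = Node()
paths = [sample_path(root, 3, 1.0, rng) for _ in range(20)]
```

Travellers who share a prefix of their path share the corresponding restaurants, which is how documents end up grouped under common nodes of the taxonomy.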
Criticism: the experiments were run on very limited data (100 documents with 100 words and a vocabulary size of 100).
%0 Conference Paper
%1 citeulike:432492
%A Blei, David M.
%A Griffiths, Thomas L.
%A Jordan, Michael I.
%A Tenenbaum, Joshua B.
%B Advances in Neural Information Processing Systems
%D 2004
%K hierarchy
%T Hierarchical Topic Models and the Nested Chinese Restaurant Process
%U http://www-psych.stanford.edu/~gruffydd/papers/ncrp.pdf
%X We address the problem of learning topic hierarchies from data. The
model selection problem in this domain is daunting—which of the large
collection of possible trees to use? We take a Bayesian approach, generating
an appropriate prior via a distribution on partitions that we refer
to as the nested Chinese restaurant process. This nonparametric prior allows
arbitrarily large branching factors and readily accommodates growing
data collections. We build a hierarchical topic model by combining
this prior with a likelihood that is based on a hierarchical variant of latent
Dirichlet allocation. We illustrate our approach on simulated data and
with an application to the modeling of NIPS abstracts.
@inproceedings{citeulike:432492,
abstract = {We address the problem of learning topic hierarchies from data. The
model selection problem in this domain is daunting—which of the large
collection of possible trees to use? We take a Bayesian approach, generating
an appropriate prior via a distribution on partitions that we refer
to as the nested Chinese restaurant process. This nonparametric prior allows
arbitrarily large branching factors and readily accommodates growing
data collections. We build a hierarchical topic model by combining
this prior with a likelihood that is based on a hierarchical variant of latent
Dirichlet allocation. We illustrate our approach on simulated data and
with an application to the modeling of NIPS abstracts.},
added-at = {2007-03-14T13:30:57.000+0100},
author = {Blei, David M. and Griffiths, Thomas L. and Jordan, Michael I. and Tenenbaum, Joshua B.},
biburl = {https://www.bibsonomy.org/bibtex/213eceeb70da1669e302605e47d449a97/davids},
booktitle = {Advances in Neural Information Processing Systems},
citeulike-article-id = {432492},
comment = {Learns a topic taxonomy together with document assignments. The authors use Gibbs sampling and a nested Chinese restaurant process prior. Conceptually, they learn LDA plus the hierarchy (the restaurants), but both are combined in a single posterior formula in which all variables other than w_m,n, c_m,l, z_m,n and the hyperparameter eta are integrated out.
The Chinese restaurant process gives the probability that a new customer sits at a free (or an occupied) table in a Chinese restaurant, given the number of customers already seated.
The nested CRP uses the intuition of a culinary journey: each table in a restaurant carries a signpost to another restaurant, which the traveller follows. All travellers start at the same restaurant (the root node of the taxonomy) and follow the signpost at each table they sit at during the journey.
Criticism: the experiments were run on very limited data (100 documents with 100 words and a vocabulary size of 100).},
interhash = {f185b4657e25c733ee613bece516b3c5},
intrahash = {13eceeb70da1669e302605e47d449a97},
keywords = {hierarchy},
priority = {0},
timestamp = {2007-03-14T13:30:57.000+0100},
title = {Hierarchical Topic Models and the Nested Chinese Restaurant Process},
url = {http://www-psych.stanford.edu/~gruffydd/papers/ncrp.pdf},
year = 2004
}