<rdf:RDF xmlns:burst="http://xmlns.com/burst/0.1/" xmlns:admin="http://webns.net/mvcb/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:syn="http://purl.org/rss/1.0/modules/syndication/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" xmlns:owl="http://www.w3.org/2002/07/owl#" xmlns:cc="http://web.resource.org/cc/" xmlns:xsd="http://www.w3.org/2001/XMLSchema#" xmlns:swrc="http://swrc.ontoware.org/ontology#" xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#" xmlns="http://purl.org/rss/1.0/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"><channel rdf:about="http://www.bibsonomy.org/burst/user/jil/thesis+likelihood"><title>BibSonomy publications for /user/jil/thesis+likelihood</title><link>http://www.bibsonomy.org/burst/user/jil/thesis+likelihood</link><description>BibSonomy BuRST Feed for /user/jil/thesis+likelihood</description><dc:date>2008-07-26T21:19:43+02:00</dc:date><items><rdf:Seq><rdf:li rdf:resource="http://www.bibsonomy.org/bibtex/22896eb9538a6ee34f8e6c6757bdcf99e/jil"/></rdf:Seq></items></channel><item rdf:about="http://www.bibsonomy.org/bibtex/22896eb9538a6ee34f8e6c6757bdcf99e/jil"><title>Improving Multi-class Text Classification with Naive Bayes</title><link>http://www.bibsonomy.org/bibtex/22896eb9538a6ee34f8e6c6757bdcf99e/jil</link><dc:creator>jil</dc:creator><dc:date>2008-05-05T19:34:57+02:00</dc:date><dc:subject>bayes naive multinomial prior herleitung map thesis deduction likelihood estimation exhaustive komplett mle maximum </dc:subject><content:encoded>&lt;span style=&#034;color:#555555;&#034;&gt;Jason D. M. 
&lt;a href=&#034;http://www.bibsonomy.org/author/Rennie&#034;&gt;Rennie&lt;/a&gt;  &lt;/span&gt;(&lt;em&gt;2001&lt;/em&gt;)</content:encoded><taxo:topics><rdf:Bag><rdf:li rdf:resource="http://www.bibsonomy.org/tag/bayes"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/naive"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/multinomial"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/prior"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/herleitung"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/map"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/thesis"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/deduction"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/likelihood"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/estimation"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/exhaustive"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/komplett"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/mle"/><rdf:li rdf:resource="http://www.bibsonomy.org/tag/maximum"/></rdf:Bag></taxo:topics><burst:publication><rdf:Description rdf:about="http://www.bibsonomy.org/bibtex/22896eb9538a6ee34f8e6c6757bdcf99e/jil"><owl:sameAs rdf:resource="http://www.bibsonomy.org/uri/bibtex/22896eb9538a6ee34f8e6c6757bdcf99e/jil"/><rdf:type rdf:resource="http://swrc.ontoware.org/ontology#Misc"/><owl:sameAs rdf:resource="http://people.csail.mit.edu/~jrennie/papers/sm-thesis.pdf"/><swrc:date>Mon May 05 19:34:57 CEST 2008</swrc:date><swrc:school><swrc:University swrc:name="Massachusetts Institute of Technology"/></swrc:school><swrc:title>Improving Multi-class Text Classification with Naive Bayes</swrc:title><swrc:year>2001</swrc:year><swrc:keywords>bayes naive multinomial prior herleitung map thesis deduction likelihood estimation exhaustive komplett mle maximum </swrc:keywords><swrc:abstract>There are numerous text documents available in electronic form. More and more
are becoming available every day. Such documents represent a massive amount of
information that is easily accessible. Seeking value in this huge collection requires
organization; much of the work of organizing documents can be automated through
text classification. The accuracy and our understanding of such systems greatly
influence their usefulness. In this paper, we seek 1) to advance the understanding
of commonly used text classification techniques, and 2) through that understanding,
improve the tools that are available for text classification. We begin by clarifying
the assumptions made in the derivation of Naive Bayes, noting basic properties and
proposing ways for its extension and improvement. Next, we investigate the quality
of Naive Bayes parameter estimates and their impact on classification. Our analysis
leads to a theorem which gives an explanation for the improvements that can be
found in multiclass classification with Naive Bayes using Error-Correcting Output
Codes. We use experimental evidence on two commonly used data sets to exhibit an
application of the theorem. Finally, we show fundamental flaws in a commonly used
feature selection algorithm and develop a statistics-based framework for text feature
selection. Greater understanding of Naive Bayes and the properties of text allows us
to make better use of it in text classification.</swrc:abstract><swrc:author><rdf:Seq><rdf:_1><swrc:Person swrc:name="Jason D. M. Rennie"/></rdf:_1></rdf:Seq></swrc:author></rdf:Description></burst:publication></item></rdf:RDF>