Machine learning in automated text categorization

F. Sebastiani. ACM Computing Surveys (2002)

Abstract

The automated categorisation (or classification) of texts into topical categories has a long history, dating back at least to the early ’60s. Until the late ’80s, the most effective approach to the problem seemed to be that of manually building automatic classifiers by means of knowledge-engineering techniques, i.e. manually defining a set of rules encoding expert knowledge on how to classify documents under a given set of categories. In the ’90s, with the booming production and availability of on-line documents, automated text categorisation has witnessed an increased and renewed interest, prompted by which the machine learning paradigm to automatic classifier construction has emerged and definitely superseded the knowledge-engineering approach. Within the machine learning paradigm, a general inductive process (called the learner) automatically builds a classifier (also called the rule, or the hypothesis) by “learning”, from a set of previously classified documents, the characteristics of one or more categories. The advantages of this approach are a very good effectiveness, a considerable savings in terms of expert manpower, and domain independence. In this survey we look at the main approaches that have been taken towards automatic text categorisation within the general machine learning paradigm. Issues pertaining to document indexing, classifier construction, and classifier evaluation, will be discussed in detail. A final section will be devoted to the techniques that have specifically been devised for an emerging application such as the automatic classification of Web pages into “Yahoo!-like” hierarchically structured sets of categories.
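
To make the learner/classifier distinction in the abstract concrete, the following is a minimal sketch, not taken from the survey itself. It assumes a multinomial naive Bayes learner with Laplace smoothing, which is only one of many possible learners; the names `tokenize` and `learn` and the toy training documents are hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (an illustrative indexing choice)."""
    return text.lower().split()


def learn(training_docs):
    """The 'learner': build a classifier from (text, category) pairs."""
    doc_counts = Counter()              # number of documents per category
    term_counts = defaultdict(Counter)  # term frequencies per category
    vocabulary = set()
    for text, category in training_docs:
        doc_counts[category] += 1
        for term in tokenize(text):
            term_counts[category][term] += 1
            vocabulary.add(term)
    total_docs = sum(doc_counts.values())

    def classify(text):
        """The learned 'classifier': return the most probable category."""
        best_category, best_score = None, -math.inf
        for category, n_docs in doc_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            total_terms = sum(term_counts[category].values())
            for term in tokenize(text):
                if term not in vocabulary:
                    continue  # ignore terms never seen in training
                # Laplace-smoothed log likelihood of the term
                count = term_counts[category][term] + 1
                score += math.log(count / (total_terms + len(vocabulary)))
            if score > best_score:
                best_category, best_score = category, score
        return best_category

    return classify


# Hypothetical, hand-made training set of previously classified documents.
training_docs = [
    ("the team won the match", "sports"),
    ("a late goal decided the match", "sports"),
    ("the bank raised interest rates", "finance"),
    ("markets fell as rates rose", "finance"),
]
classifier = learn(training_docs)
print(classifier("interest rates and markets"))
```

Here `learn` plays the role of the general inductive process, and the function it returns is the classifier induced from the labelled examples. No classification rules are written by hand, which is the contrast with the knowledge-engineering approach.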

Links and resources

BibTeX key: citeulike:478973
Entry type: article
Year: 2002
Journal: ACM Computing Surveys
comment: survey recommended by claudia niederee
priority: 2
citeulike-article-id: 478973
URL: http://portal.acm.org/ft_gateway.cfm?id=505283\&type=pdf\&dl=ACM\&dl=ACM\&CFID=11111111\&CFTOKEN=2222222

Cite this publication

@article{citeulike:478973,
  abstract = {The automated categorisation (or classification) of texts into topical categories has a long history, dating back at least to the early ’60s. Until the late ’80s, the most effective approach to the problem seemed to be that of manually building automatic classifiers by means of knowledge-engineering techniques, i.e. manually defining a set of rules encoding expert knowledge on how to classify documents under a given set of categories. In the ’90s, with the booming production and availability of on-line documents, automated text categorisation has witnessed an increased and renewed interest, prompted by which the machine learning paradigm to automatic classifier construction has emerged and definitely superseded the knowledge-engineering approach. Within the machine learning paradigm, a general inductive process (called the learner) automatically builds a classifier (also called the rule, or the hypothesis) by “learning”, from a set of previously classified documents, the characteristics of one or more categories. The advantages of this approach are a very good effectiveness, a considerable savings in terms of expert manpower, and domain independence. In this survey we look at the main approaches that have been taken towards automatic text categorisation within the general machine learning paradigm. Issues pertaining to document indexing, classifier construction, and classifier evaluation, will be discussed in detail. A final section will be devoted to the techniques that have specifically been devised for an emerging application such as the automatic classification of Web pages into “Yahoo!-like” hierarchically structured sets of categories.},
  added-at = {2007-02-22T18:27:17.000+0100},
  author = {Sebastiani, Fabrizio},
  biburl = {https://www.bibsonomy.org/bibtex/2be7ac6440d1b65334811201b70c376eb/apo},
  citeulike-article-id = {478973},
  comment = {survey recommended by claudia niederee},
  interhash = {d945d9218673dad37dc2a06cbf9e554c},
  intrahash = {be7ac6440d1b65334811201b70c376eb},
  journal = {ACM Computing Surveys},
  keywords = {learning machine da},
  priority = {2},
  timestamp = {2007-02-22T18:27:18.000+0100},
  title = {Machine learning in automated text categorization},
  url = {http://portal.acm.org/ft_gateway.cfm?id=505283\&type=pdf\&dl=ACM\&dl=ACM\&CFID=11111111\&CFTOKEN=2222222},
  year = 2002
}

BibSonomy
