Feature selection and feature extraction for text categorization

D. Lewis. HLT '91: Proceedings of the Workshop on Speech and Natural Language, pages 212-217. Morristown, NJ, USA, Association for Computational Linguistics (1992).
DOI: http://dx.doi.org/10.3115/1075527.1075574

Abstract

The effect of selecting varying numbers and kinds of features for use in predicting category membership was investigated on the Reuters and MUC-3 text categorization data sets. Good categorization performance was achieved using a statistical classifier and a proportional assignment strategy. The optimal feature set size for word-based indexing was found to be surprisingly low (10 to 15 features) despite the large training sets. The extraction of new text features by syntactic analysis and feature clustering was investigated on the Reuters data set. Syntactic indexing phrases, clusters of these phrases, and clusters of words were all found to provide less effective representations than individual words.

Links and resources

BibTeX key: lewis1992featureselection
entry type: inproceedings
address: Morristown, NJ, USA
booktitle: HLT '91: Proceedings of the workshop on Speech and Natural Language
year: 1992
pages: 212--217
publisher: Association for Computational Linguistics
DOI: http://dx.doi.org/10.3115/1075527.1075574
isbn: 1-55860-272-0
location: Harriman, New York
url: http://portal.acm.org/citation.cfm?id=1075574

