The Cataloging Lab is a place for catalogers and anyone who cares about library metadata to experiment with creating better controlled vocabularies. Suggesting additions and changes to the Library of Congress Subject Headings vocabulary can be an isolating endeavor: it can be difficult to determine whether your heading has already been proposed or whether someone else is working on a proposal at the same time. The Cataloging Lab is designed as a wiki where folks can collaborate on headings to create stronger proposals.
There are many different folk tales in the world, but many of them are variations on a limited number of themes. The classification system originally designed by Aarne, and revised first by Thompson and then by Uther, is intended to bring out the similarities between tales by grouping variants of the same tale under the same ATU category. Like HRAF.
In this post, I want to show how I use NLTK for preprocessing and tokenization, and then apply machine learning techniques (e.g. building a linear SVM using stochastic gradient descent) with scikit-learn.
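A minimal sketch of that workflow, not the post's own code: NLTK handles tokenization and stemming, and scikit-learn's `SGDClassifier` with hinge loss gives a linear SVM trained by stochastic gradient descent. The `tokenize` and `build_model` helpers, the choice of Porter stemming and TF-IDF features, and the toy corpus are all assumptions made for illustration.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

_stemmer = PorterStemmer()


def tokenize(text):
    """Hypothetical helper: lowercase, split with NLTK's regex-based
    tokenizer (needs no downloaded data), drop non-alphabetic tokens,
    and stem what is left."""
    tokens = wordpunct_tokenize(text.lower())
    return [_stemmer.stem(tok) for tok in tokens if tok.isalpha()]


def build_model():
    """Hypothetical helper: TF-IDF features over NLTK tokens, followed by
    a linear SVM (hinge loss) fitted with stochastic gradient descent."""
    vectorizer = TfidfVectorizer(
        tokenizer=tokenize,   # hand tokenization over to NLTK
        lowercase=False,      # tokenize() already lowercases
        token_pattern=None,   # unused when a custom tokenizer is given
    )
    classifier = SGDClassifier(loss="hinge", random_state=0)
    return make_pipeline(vectorizer, classifier)


# Toy corpus, invented for this sketch.
train_docs = [
    "The striker scored a late goal in the match",
    "The team won the championship game",
    "The goalkeeper blocked every shot",
    "Simmer the sauce with garlic and onions",
    "Bake the bread until the crust is golden",
    "Grill the fish and season with lemon",
]
train_labels = ["sports", "sports", "sports", "cooking", "cooking", "cooking"]

model = build_model().fit(train_docs, train_labels)
predictions = model.predict(
    ["Roast the garlic before baking", "A goal in the final game"]
)
print(list(predictions))
```

Keeping the NLTK step inside the vectorizer means the same preprocessing is applied at training and prediction time, which avoids a common source of mismatched features.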
In this post you will see five recipes for supervised classification algorithms, each applied to one of the small standard datasets that ship with the scikit-learn library.
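The description does not name the five algorithms or the datasets, so the following is only a sketch of what one such recipe might look like. Logistic regression, the iris dataset, and the 70/30 split are assumptions chosen for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load one of the small datasets bundled with scikit-learn.
X, y = load_iris(return_X_y=True)

# Hold out part of the data so the model is scored on unseen samples.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Fit the classifier; max_iter is raised so the solver converges.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Predict on the held-out samples and summarize with accuracy.
predictions = model.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"held-out accuracy: {accuracy:.2f}")
```

Swapping in a different scikit-learn classifier only changes the two lines that construct and fit `model`, since estimators share the same `fit` and `predict` interface.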
I. Androutsopoulos, J. Koutsias, K. Chandrinos, and C. Spyropoulos. In Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval, page 160--167. (2000)
G. Cormack, José, and E. Sánz. CIKM '07: Proceedings of the sixteenth ACM conference on information and knowledge management, page 313--320. New York, NY, USA, ACM, (2007)
G. Cormack, José, and E. Sánz. SIGIR '07: Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval, page 871--872. New York, NY, USA, ACM, (2007)
M. Kelly, D. Hand, and N. Adams. KDD '99: Proceedings of the fifth ACM SIGKDD international conference on Knowledge discovery and data mining, page 367--371. New York, NY, USA, ACM, (1999)
Y. Song, L. Zhang, and C. Giles. CIKM '08: Proceedings of the 17th ACM conference on Information and knowledge management, page 93--102. New York, NY, USA, ACM, (2008)