A very common workflow is to index some data by its embeddings and then, given a new query embedding, retrieve the most similar examples with k-Nearest Neighbor search. For example, you can imagine embedding a large collection of papers by their abstracts and then, given a new paper of interest, retrieving the papers most similar to it.
TL;DR: in my experience it ~always works better to use an SVM instead of kNN, if you can afford the slight computational hit.
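To make the comparison concrete, here is a minimal sketch of both retrieval styles on synthetic data. It assumes one common way to set up the SVM variant: the query is the single positive example and every indexed item is a negative, after which items are ranked by the classifier's decision value. The function names, the random embeddings, and the hyperparameters (`C=0.1`, balanced class weights) are illustrative assumptions, not values taken from the text above.

```python
import numpy as np
from sklearn.svm import LinearSVC


def unit(v):
    """Scale vectors to unit length so dot products are cosine similarities."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)


def knn_rank(query, embeddings):
    """Rank indexed rows by cosine similarity to the query (plain kNN)."""
    sims = embeddings @ query  # assumes rows and query are unit-normalized
    return np.argsort(-sims)


def svm_rank(query, embeddings, C=0.1):
    """Rank indexed rows with a linear SVM trained on a single positive."""
    # Row 0 is the query (label 1); every indexed row is a negative (label 0).
    x = np.vstack([query[None, :], embeddings])
    y = np.zeros(len(x), dtype=int)
    y[0] = 1
    clf = LinearSVC(class_weight="balanced", C=C, max_iter=10000, tol=1e-6)
    clf.fit(x, y)
    # Score only the indexed rows; a higher value means "more like the query".
    scores = clf.decision_function(embeddings)
    return np.argsort(-scores)


# Synthetic stand-in for real embeddings (an assumption for illustration).
rng = np.random.default_rng(0)
embeddings = unit(rng.normal(size=(200, 32)))
# The query is a slightly perturbed copy of row 7, so row 7 should rank highly.
query = unit(embeddings[7] + 0.05 * rng.normal(size=32))

print("kNN top 5:", knn_rank(query, embeddings)[:5])
print("SVM top 5:", svm_rank(query, embeddings)[:5])
```

The extra cost is that the SVM has to be fit once per query, which is the "slight computational hit" mentioned above.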
MIT 6.034 Artificial Intelligence, Fall 2010. View the complete course: http://ocw.mit.edu/6-034F10 Instructor: Patrick Winston. In this lecture, we explore su...
In this post, I want to show how I use NLTK for preprocessing and tokenization, but then apply machine learning techniques (e.g. building a linear SVM using stochastic gradient descent) using Scikit-Learn.
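A minimal sketch of that kind of pipeline might look like the following. It is not the post's actual code: the toy documents, the labels, the `tokenize` helper, and the choice of `wordpunct_tokenize` plus Porter stemming are all assumptions made for illustration. A linear SVM trained with stochastic gradient descent corresponds to `SGDClassifier` with hinge loss in Scikit-Learn.

```python
from nltk.stem import PorterStemmer
from nltk.tokenize import wordpunct_tokenize
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import make_pipeline

stemmer = PorterStemmer()


def tokenize(text):
    """Lowercase, split with NLTK, drop non-alphabetic tokens, then stem."""
    tokens = wordpunct_tokenize(text.lower())
    return [stemmer.stem(t) for t in tokens if t.isalpha()]


# Tiny made-up corpus, only to show the moving parts.
docs = [
    "A wonderful, moving film with great acting",
    "I enjoyed every minute of this movie",
    "Great story and wonderful characters",
    "A boring, awful waste of time",
    "Terrible acting and a dull plot",
    "I hated this awful movie",
]
labels = ["pos", "pos", "pos", "neg", "neg", "neg"]

model = make_pipeline(
    # NLTK handles tokenization; token_pattern=None avoids an unused-pattern warning.
    TfidfVectorizer(tokenizer=tokenize, lowercase=False, token_pattern=None),
    # Hinge loss + SGD gives a linear SVM.
    SGDClassifier(loss="hinge", max_iter=1000, tol=1e-3, random_state=0),
)
model.fit(docs, labels)

predictions = model.predict(["a wonderful, enjoyable film", "boring and awful"])
print(predictions)
```

Keeping the NLTK step inside the vectorizer's `tokenizer` argument means the same preprocessing is applied at training and prediction time.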
In the previous post on Support Vector Machines (SVM), we looked at the mathematical details of the algorithm. In this post, I will discuss practical implementations of SVM for classification as well as regression. I will use the iris dataset as an example for the classification problem, and randomly generated data as an example for the regression problem.
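As a rough sketch of those two use cases with Scikit-Learn (an assumed library choice; the post's own code, kernels, and parameters are not shown here), classification on iris can use `SVC` and regression on synthetic data can use `SVR`. The noisy sine curve below stands in for the "randomly generated data".

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC, SVR

# Classification: iris, with a held-out test split.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)
clf = SVC(kernel="rbf", C=1.0, gamma="scale")
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print(f"iris test accuracy: {accuracy:.2f}")

# Regression: a noisy sine curve as illustrative random data.
rng = np.random.default_rng(0)
X_reg = np.sort(rng.uniform(0, 5, size=(200, 1)), axis=0)
y_reg = np.sin(X_reg).ravel() + rng.normal(scale=0.1, size=200)
reg = SVR(kernel="rbf", C=10.0, epsilon=0.1)
reg.fit(X_reg, y_reg)
r2 = reg.score(X_reg, y_reg)  # R^2 on the training data, not a held-out estimate
print(f"SVR training R^2: {r2:.2f}")
```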
S. Kiritchenko, X. Zhu, C. Cherry, and S. Mohammad. Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), page 437--442. Dublin, Ireland, Association for Computational Linguistics, (August 2014)
P. Molchanov, S. Gupta, K. Kim, and J. Kautz. The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, page 1-7. IEEE, (September 2015)
S. Lahoti, S. Kayal, S. Kumbhare, I. Suradkar, and V. Pawar. 2018 9th International Conference on Computing, Communication and Networking Technologies (ICCCNT), page 1-6. Aurangabad, Maharashtra, India, IEEE, (July 2018)
T. Rezende, C. Castro, S. Almeida, and F. Guimarães. Anais do XIII Simpósio Brasileiro de Automação Inteligente, page 465-470. Universidade Federal do Rio Grande do Sul (UFRGS), (2017)
N. Gunasekara, S. Pang, and N. Kasabov. Neural Information Processing. Models and Applications, volume 6444 of Lecture Notes in Computer Science, Springer Berlin Heidelberg, (2010)
S. Sinha, S. Kushwaha, P. Burje, S. Jain, and A. Singh. International Journal on Recent and Innovation Trends in Computing and Communication, 3 (3): 934--939 (March 2015)
H. Kim, J. Choi, D. Choi, H. Choi, and P. Kim. Proceedings of the 2012 ACM Research in Applied Computation Symposium, page 310--315. New York, NY, USA, ACM, (2012)
H. Yu, J. Han, and K. Chang. Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, page 239--248. New York, NY, USA, ACM, (2002)
K. Chen, T. Chen, G. Zheng, O. Jin, E. Yao, and Y. Yu. Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval, page 661--670. ACM, (2012)