A very common workflow is to index some data by its embeddings and then, given a new query embedding, retrieve the most similar examples with k-Nearest Neighbor (kNN) search. For example, you can imagine embedding a large collection of papers by their abstracts and then, given a new paper of interest, retrieving the papers most similar to it.
TL;DR: in my experience it almost always works better to use an SVM instead of kNN, if you can afford the slight computational hit.
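A minimal sketch of both retrieval strategies on toy data (the variable names, toy embeddings, and SVM hyperparameters here are illustrative assumptions, not taken from the original text). The SVM variant treats the query as the lone positive example and every indexed point as a negative, then ranks the index by decision-function score:

```python
import numpy as np
from sklearn import svm

# hypothetical toy index: 100 documents embedded in 64 dimensions,
# plus one query embedding placed very close to document 5
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(100, 64))
query = embeddings[5] + 0.01 * rng.normal(size=64)

def knn_retrieve(query, embeddings, k=5):
    # rank by cosine similarity between the query and every indexed point
    q = query / np.linalg.norm(query)
    e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = e @ q
    return np.argsort(-sims)[:k]

def svm_retrieve(query, embeddings, k=5):
    # query = the single positive (label 1), all indexed points = negatives
    x = np.concatenate([query[None, :], embeddings])
    y = np.zeros(len(x))
    y[0] = 1
    clf = svm.LinearSVC(class_weight="balanced", C=0.1, max_iter=10000)
    clf.fit(x, y)
    # rank the index by signed distance to the learned hyperplane
    scores = clf.decision_function(embeddings)
    return np.argsort(-scores)[:k]
```

Both functions return the indices of the top-k neighbors; on this toy data both should place document 5 at or near the top. The SVM's advantage in practice is that the hyperplane is fit against the whole negative set, so the ranking adapts to the local geometry of the data rather than using a fixed metric.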
In this post, I want to show how I use NLTK for preprocessing and tokenization, and then apply machine learning techniques (e.g., building a linear SVM with stochastic gradient descent) using scikit-learn.
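A sketch of that kind of NLTK-plus-scikit-learn pipeline (the toy corpus and labels below are made up for illustration, and `WordPunctTokenizer` is one assumed choice of NLTK tokenizer, picked here because it is purely regex-based and needs no corpus download; the post's exact preprocessing may differ):

```python
from nltk.tokenize import WordPunctTokenizer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import SGDClassifier
from sklearn.pipeline import Pipeline

# hypothetical toy corpus; in practice you would use a real labeled dataset
texts = [
    "the movie was great",
    "a wonderful film",
    "terrible plot, bad acting",
    "the worst movie ever",
    "great acting and a great story",
    "bad film, awful",
]
labels = [1, 1, 0, 0, 1, 0]

# NLTK handles tokenization; scikit-learn handles features and the model
tokenizer = WordPunctTokenizer()
clf = Pipeline([
    ("tfidf", TfidfVectorizer(tokenizer=tokenizer.tokenize, token_pattern=None)),
    # hinge loss makes SGDClassifier a linear SVM trained by SGD
    ("svm", SGDClassifier(loss="hinge", random_state=0)),
])
clf.fit(texts, labels)
```

The pipeline object then exposes the usual `predict` interface, so the NLTK tokenizer is applied transparently to any new text passed in.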
In the previous post on Support Vector Machines (SVM), we looked at the mathematical details of the algorithm. In this post, I will discuss practical implementations of SVM for classification as well as regression. I will be using the iris dataset as an example for the classification problem, and randomly generated data as an example for the regression problem.
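The two workflows described above can be sketched as follows (a minimal illustration, assuming an RBF kernel and hand-picked hyperparameters; the post's own settings may differ):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC, SVR

# Classification: SVC on the iris dataset
X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = SVC(kernel="rbf", C=1.0)
clf.fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)  # held-out accuracy

# Regression: SVR on randomly generated 1-d data (noisy sine curve)
rng = np.random.default_rng(0)
X_r = np.sort(rng.uniform(0, 5, size=(80, 1)), axis=0)
y_r = np.sin(X_r).ravel() + 0.1 * rng.normal(size=80)
reg = SVR(kernel="rbf", C=10.0, epsilon=0.1)
reg.fit(X_r, y_r)
```

`SVC` and `SVR` share the same kernel machinery; the difference is the loss — hinge loss for classification versus the epsilon-insensitive tube for regression, controlled by the `epsilon` parameter.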