randomfun/knn_vs_svm.ipynb at master · karpathy/randomfun · GitHub

Description

A very common workflow is to index some data based on its embeddings and then given a new query embedding retrieve the most similar examples with k-Nearest Neighbor search. For example, you can imagine embedding a large collection of papers by their abstracts and then given a new paper of interest retrieve the most similar papers to it.

TLDR in my experience it ~always works better to use an SVM instead of kNN, if you can afford the slight computational hit

Preview

Users

Comments and Reviewsshow / hide

Please log in to take part in the discussion (add own reviews or comments).

BibSonomy