randomfun/knn_vs_svm.ipynb at master · karpathy/randomfun · GitHub


Description

A very common workflow is to index some data based on its embeddings and then, given a new query embedding, retrieve the most similar examples with k-Nearest Neighbor search. For example, you can imagine embedding a large collection of papers by their abstracts and then, given a new paper of interest, retrieving the papers most similar to it.
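As a rough illustration of the kNN retrieval step described above, here is a minimal sketch using NumPy and cosine similarity. The function name `knn_retrieve`, the random stand-in embeddings, and the choice of cosine similarity as the metric are assumptions for the example, not details taken from the notebook.

```python
import numpy as np

def knn_retrieve(embeddings, query, k=5):
    """Return indices and cosine similarities of the k rows closest to query."""
    # Normalize rows and the query so a dot product equals cosine similarity.
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    sims = emb @ q
    # Sort by descending similarity and keep the top k.
    top = np.argsort(-sims)[:k]
    return top, sims[top]

# Stand-in data (assumption): 1000 random 64-dimensional "abstract embeddings".
rng = np.random.default_rng(0)
embeddings = rng.standard_normal((1000, 64))
# A query that is a slightly perturbed copy of item 42.
query = embeddings[42] + 0.01 * rng.standard_normal(64)

knn_idx, knn_scores = knn_retrieve(embeddings, query, k=5)
```

For large collections an approximate nearest-neighbor index would usually replace the brute-force dot product, but the ranking idea is the same.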

TLDR: in my experience it ~always works better to use an SVM instead of kNN, if you can afford the slight computational hit.
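One way to read the SVM suggestion is to train a linear classifier per query, with the query as the only positive example and every indexed item as a negative, then rank items by their decision score. The sketch below assumes scikit-learn's `LinearSVC`; the helper name `svm_retrieve`, the hyperparameters (`C=0.1`, `class_weight="balanced"`, `max_iter=10000`), and the random stand-in data are illustrative assumptions rather than settings confirmed by the text above.

```python
import numpy as np
from sklearn.svm import LinearSVC

def svm_retrieve(embeddings, query, k=5):
    """Rank corpus rows with a linear SVM trained on query (+) vs. corpus (-)."""
    # Normalize so all vectors have unit length.
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    q = query / np.linalg.norm(query)
    # Training set: the query is the single positive, every corpus row is a negative.
    x = np.vstack([q[None, :], emb])
    y = np.zeros(len(x), dtype=int)
    y[0] = 1
    # Hyperparameters are assumptions; balanced weights offset the 1-vs-many split.
    clf = LinearSVC(class_weight="balanced", C=0.1, max_iter=10000)
    clf.fit(x, y)
    # Score only the corpus rows; a higher score means "more like the query".
    scores = clf.decision_function(emb)
    top = np.argsort(-scores)[:k]
    return top, scores[top]

# Stand-in data (assumption): 1000 random 64-dimensional embeddings.
rng = np.random.default_rng(0)
embeddings = rng.standard_normal((1000, 64))
query = embeddings[42] + 0.01 * rng.standard_normal(64)

svm_idx, svm_scores = svm_retrieve(embeddings, query, k=5)
```

The extra cost relative to kNN is that a classifier is fit for each query, which is the "slight computational hit" the description mentions.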

Users

  • @jil
