"a good database for people who want to try learning techniques and pattern recognition methods on real-world data while spending minimal efforts on preprocessing and formatting"
A very common workflow is to index some data by its embeddings and then, given a new query embedding, retrieve the most similar examples with k-Nearest Neighbor (kNN) search. For example, imagine embedding a large collection of papers by their abstracts; given a new paper of interest, you can then retrieve the papers most similar to it.
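A minimal sketch of that workflow, using random placeholder vectors in place of real abstract embeddings (the array names and sizes here are illustrative, not from the source):

```python
import numpy as np

# Hypothetical index: each row is the embedding of one paper abstract,
# unit-normalized so that a dot product equals cosine similarity.
rng = np.random.default_rng(0)
index_embeddings = rng.standard_normal((1000, 64))
index_embeddings /= np.linalg.norm(index_embeddings, axis=1, keepdims=True)

# Embedding of the new paper of interest (also unit-normalized).
query = rng.standard_normal(64)
query /= np.linalg.norm(query)

# k-Nearest Neighbor retrieval: score every indexed item against the
# query and keep the indices of the k highest-similarity matches.
k = 5
similarities = index_embeddings @ query
top_k = np.argsort(-similarities)[:k]
```

Brute-force scoring like this is fine up to millions of vectors; beyond that, approximate nearest-neighbor indexes are the usual next step.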
TL;DR: in my experience it ~always works better to use an SVM instead of kNN, if you can afford the slight computational hit.
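One way to read this trick: treat the query embedding as the lone positive example and every indexed item as a negative, fit a linear SVM, and rank the index by the SVM's decision function instead of by raw similarity. A hedged sketch with scikit-learn and random placeholder data (the hyperparameters `C=0.1` and `class_weight="balanced"` are illustrative choices, not prescribed by the source):

```python
import numpy as np
from sklearn import svm

rng = np.random.default_rng(1)
data = rng.standard_normal((500, 32))   # hypothetical indexed embeddings
query = rng.standard_normal(32)         # hypothetical query embedding

# Training set: the query is the single positive (label 1), all indexed
# items are negatives (label 0). class_weight="balanced" keeps the lone
# positive from being drowned out by the 500 negatives.
x = np.concatenate([query[None, :], data])
y = np.zeros(len(x))
y[0] = 1

clf = svm.LinearSVC(class_weight="balanced", C=0.1, max_iter=10000)
clf.fit(x, y)

# Rank the index by the learned decision function: most similar first.
scores = clf.decision_function(data)
ranked = np.argsort(-scores)
```

Compared with kNN's fixed similarity metric, the SVM learns a direction in embedding space tailored to the query, which is where the quality gain tends to come from.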
In computer science, a kd-tree (short for k-dimensional tree) is a space-partitioning data structure for organizing points in a k-dimensional space. kd-trees are a useful data structure for several applications, such as searches involving a multidimensional search key (e.g. range searches and nearest neighbour searches).
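Both query types mentioned above are available off the shelf; a small sketch using SciPy's kd-tree over random 2-D points (the points and query location are placeholders):

```python
import numpy as np
from scipy.spatial import KDTree

# Build a kd-tree over 100 random points in the unit square.
rng = np.random.default_rng(2)
points = rng.random((100, 2))
tree = KDTree(points)

center = np.array([0.5, 0.5])

# Nearest-neighbour search: distance to and index of the closest point.
dist, idx = tree.query(center, k=1)

# Range search: indices of all points within radius 0.1 of the query.
within = tree.query_ball_point(center, r=0.1)
```

Both queries run in roughly O(log n) time on average for low-dimensional data, which is the main reason to build the tree instead of scanning all points.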
The limitations of backpropagation learning can now be overcome by using multilayer neural networks that contain top-down connections and training them to /generate/ sensory data rather than to classify it. (...) much better than previous approaches
N. Lathia, S. Hailes, and L. Capra. RecSys '08: Proceedings of the 2008 ACM Conference on Recommender Systems, pages 227–234. New York, NY, USA, ACM, 2008.
P. Pantel and D. Lin. Proceedings of the Eighth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 613–619. New York, NY, USA, ACM, 2002.
F. Suchanek, G. Ifrim, and G. Weikum. 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD 2006), pages 712–717. New York, NY, USA, ACM, 2006.