A very common workflow is to index some data based on its embeddings and then, given a new query embedding, retrieve the most similar examples with k-Nearest Neighbor (kNN) search. For example, you can imagine embedding a large collection of papers by their abstracts and then, given a new paper of interest, retrieving the papers most similar to it.
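As a rough sketch of that retrieval step, the snippet below ranks a tiny in-memory index by cosine similarity to a query. The paper ids, the 3-dimensional embeddings, and the `knn_search` helper are all made up for illustration. Real embeddings have hundreds or thousands of dimensions and would normally live in a vector index rather than a dict.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def knn_search(query, index, k=3):
    """Return the ids of the k indexed embeddings most similar to the query."""
    ranked = sorted(index.items(),
                    key=lambda item: cosine(query, item[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]

# Hypothetical toy index: paper id -> embedding of its abstract.
papers = {
    "paper_a": [0.9, 0.1, 0.0],
    "paper_b": [0.8, 0.3, 0.1],
    "paper_c": [0.0, 1.0, 0.2],
    "paper_d": [-0.7, 0.1, 0.6],
}
query_embedding = [1.0, 0.2, 0.0]  # embedding of the new paper of interest

nearest = knn_search(query_embedding, papers, k=2)
print(nearest)
```

Sorting the whole index is fine at this scale. For large collections an approximate nearest-neighbor index is the usual choice.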
TLDR: in my experience it ~always works better to use an SVM instead of kNN, if you can afford the slight computational hit.
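The text does not spell out how the SVM is set up, so the following is only one plausible formulation, assumed here: treat the query as the single positive example and every indexed embedding as a negative, fit a linear SVM, and rank the index by the resulting decision scores. To stay dependency-free, the sketch trains the SVM with plain subgradient descent on the hinge loss. In practice a library implementation (for example scikit-learn's `LinearSVC`) would likely be used instead. The function `svm_rank`, its hyperparameters, and the toy data are illustrative assumptions.

```python
import math

def unit(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def svm_rank(query, database, c=0.1, lr=0.1, steps=200):
    """Rank database rows with a linear SVM whose only positive is the query.

    Minimizes 0.5*||w||^2 + c * sum_i weight_i * hinge(y_i * (w.x_i + b))
    by full-batch subgradient descent (an assumed, simplified trainer).
    """
    xs = [unit(query)] + [unit(row) for row in database]
    ys = [1.0] + [-1.0] * len(database)
    n = len(xs)
    # "Balanced" class weights: n_samples / (n_classes * class_count).
    weights = [n / 2.0] + [n / (2.0 * len(database))] * len(database)

    w = [0.0] * len(query)
    b = 0.0
    for _ in range(steps):
        grad_w = list(w)  # gradient of the 0.5*||w||^2 regularizer
        grad_b = 0.0
        for x, y, weight in zip(xs, ys, weights):
            if y * (dot(w, x) + b) < 1.0:  # margin violated: hinge term active
                for j, xj in enumerate(x):
                    grad_w[j] -= c * weight * y * xj
                grad_b -= c * weight * y
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b

    # Higher decision score = more query-like; skip xs[0], the query itself.
    scores = [dot(w, x) + b for x in xs[1:]]
    order = sorted(range(len(database)), key=lambda i: scores[i], reverse=True)
    return order, scores

# Hypothetical toy data: four indexed embeddings and one query.
database = [
    [0.9, 0.1, 0.0],
    [0.8, 0.3, 0.1],
    [0.0, 1.0, 0.2],
    [-0.7, 0.1, 0.6],
]
query_embedding = [1.0, 0.2, 0.0]

order, scores = svm_rank(query_embedding, database)
print(order)
```

Only the ordering of the scores matters here, not their sign or magnitude. On data this small the ranking coincides with a plain cosine-similarity ranking, so the sketch illustrates the mechanics rather than any quality difference. The extra cost is that a new classifier is fit for every query.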
We observed that the embedding representation is generally very rich and information-dense. For example, reducing the dimensionality of the inputs using SVD or PCA, even by 10%, generally results in worse downstream performance on specific tasks.