Measuring the Difficulty of Distance-Based Indexing
M. Skala. String Processing and Information Retrieval, pages 103--114. Springer Berlin Heidelberg, Berlin, Heidelberg, 2005.
Abstract
Data structures for similarity search are commonly evaluated on data in vector spaces, but distance-based data structures are also applicable to non-vector spaces with no natural concept of dimensionality. The intrinsic dimensionality statistic of Chávez and Navarro provides a way to compare the performance of similarity indexing and search algorithms across different spaces, and predict the performance of index data structures on non-vector spaces by relating them to equivalent vector spaces. We characterise its asymptotic behaviour, and give experimental results to calibrate these comparisons.
%0 Conference Paper
%1 Skala2005DistanceIndexingDifficulty
%A Skala, Matthew
%B String Processing and Information Retrieval
%C Berlin, Heidelberg
%D 2005
%E Consens, Mariano
%E Navarro, Gonzalo
%I Springer Berlin Heidelberg
%K distance-based-indexing distance-metrics high-dimensional-data high-dimensional-indexing
%P 103--114
%T Measuring the Difficulty of Distance-Based Indexing
%X Data structures for similarity search are commonly evaluated on data in vector spaces, but distance-based data structures are also applicable to non-vector spaces with no natural concept of dimensionality. The intrinsic dimensionality statistic of Chávez and Navarro provides a way to compare the performance of similarity indexing and search algorithms across different spaces, and predict the performance of index data structures on non-vector spaces by relating them to equivalent vector spaces. We characterise its asymptotic behaviour, and give experimental results to calibrate these comparisons.
%@ 978-3-540-32241-2
@inproceedings{Skala2005DistanceIndexingDifficulty,
abstract = {Data structures for similarity search are commonly evaluated on data in vector spaces, but distance-based data structures are also applicable to non-vector spaces with no natural concept of dimensionality. The intrinsic dimensionality statistic of Ch{\'a}vez and Navarro provides a way to compare the performance of similarity indexing and search algorithms across different spaces, and predict the performance of index data structures on non-vector spaces by relating them to equivalent vector spaces. We characterise its asymptotic behaviour, and give experimental results to calibrate these comparisons.},
added-at = {2019-06-26T23:45:55.000+0200},
address = {Berlin, Heidelberg},
author = {Skala, Matthew},
biburl = {https://www.bibsonomy.org/bibtex/29cce62a0867de7eea83ba5a8d3434e07/salotz},
booktitle = {String Processing and Information Retrieval},
description = {Measuring the Difficulty of Distance-Based Indexing | SpringerLink},
editor = {Consens, Mariano and Navarro, Gonzalo},
interhash = {216f8bb0d990dd18a11bbc0d17cbafd3},
intrahash = {9cce62a0867de7eea83ba5a8d3434e07},
isbn = {978-3-540-32241-2},
keywords = {distance-based-indexing distance-metrics high-dimensional-data high-dimensional-indexing},
pages = {103--114},
publisher = {Springer Berlin Heidelberg},
timestamp = {2019-06-26T23:45:55.000+0200},
title = {Measuring the Difficulty of Distance-Based Indexing},
year = 2005
}