D. Spinellis. Information Retrieval, 8 (1): 5- (2005)
Abstract
The infrastructure of a typical search engine can be used to calculate and resolve persistent document identifiers: a string that can uniquely identify and locate a document on the Internet without reference to its original location (URL). Bookmarking a document using such an identifier allows its retrieval even if the document's URL, and, in many cases, its contents change. Web client applications can offer facilities for users to bookmark a page by reference to a search engine and the persistent identifier instead of the original URL. The identifiers are calculated using a global Internet term index; a document's unique identifier consists of a word or word combination that occurs uniquely in the specific document. We use a genetic algorithm to locate a minimal unique document identifier: the shortest word or word combination that will locate the document. We tested our approach by implementing tools for indexing a document collection, calculating the persistent identifiers, performing queries, and distributing the computation and storage load among many computers.
%0 Journal Article
%1 ident3
%A Spinellis, Diomidis
%D 2005
%J Information Retrieval
%K identifier
%N 1
%P 5-
%T Index-Based Persistent Document Identifiers
%U http://www.dmst.aueb.gr/dds/pubs/jrnl/2005-IR-PDI/html/Spi04b.pdf
%V 8
%X The infrastructure of a typical search engine can be used to calculate and resolve persistent document identifiers: a string that can uniquely identify and locate a document on the Internet without reference to its original location (URL). Bookmarking a document using such an identifier allows its retrieval even if the document's URL, and, in many cases, its contents change. Web client applications can offer facilities for users to bookmark a page by reference to a search engine and the persistent identifier instead of the original URL. The identifiers are calculated using a global Internet term index; a document's unique identifier consists of a word or word combination that occurs uniquely in the specific document. We use a genetic algorithm to locate a minimal unique document identifier: the shortest word or word combination that will locate the document. We tested our approach by implementing tools for indexing a document collection, calculating the persistent identifiers, performing queries, and distributing the computation and storage load among many computers.
@article{ident3,
abstract = {The infrastructure of a typical search engine can be used to calculate and resolve persistent document identifiers: a string that can uniquely identify and locate a document on the Internet without reference to its original location (URL). Bookmarking a document using such an identifier allows its retrieval even if the document's URL, and, in many cases, its contents change. Web client applications can offer facilities for users to bookmark a page by reference to a search engine and the persistent identifier instead of the original URL. The identifiers are calculated using a global Internet term index; a document's unique identifier consists of a word or word combination that occurs uniquely in the specific document. We use a genetic algorithm to locate a minimal unique document identifier: the shortest word or word combination that will locate the document. We tested our approach by implementing tools for indexing a document collection, calculating the persistent identifiers, performing queries, and distributing the computation and storage load among many computers.},
added-at = {2006-11-01T16:14:41.000+0100},
author = {Spinellis, Diomidis},
biburl = {https://www.bibsonomy.org/bibtex/28bc72e653cd017032ec94df68caabc8f/nichtich},
interhash = {fec953b2a09662fc41929618bbf86937},
intrahash = {8bc72e653cd017032ec94df68caabc8f},
journal = {Information Retrieval},
keywords = {identifier},
number = 1,
pages = {5-},
timestamp = {2017-04-02T20:21:46.000+0200},
title = {Index-Based Persistent Document Identifiers},
url = {http://www.dmst.aueb.gr/dds/pubs/jrnl/2005-IR-PDI/html/Spi04b.pdf},
volume = 8,
year = 2005
}