Information redundancy across metadata collections

Information Processing & Management, 43 (3): 740-751 (2007). Special Issue on Heterogeneous and Distributed IR.
DOI: 10.1016/j.ipm.2006.06.004

Abstract

Metadata records made available by content providers often lack the implicit information of their original use environment. Metadata aggregators therefore tend to emphasize completeness as a primary quality for shareable metadata. However, when adding implicit information to item-level records, data providers increase the redundancy of information contained in records from the same collection. The present paper reports on an effort to assess the extent and potential impact of information redundancy in metadata collections aggregated using the Open Archives Protocol for Metadata Harvesting. The first experiment quantifies the resemblance of metadata records on a collection-by-collection basis across 176 metadata collections aggregated for the CIC metadata portal. A second experiment measures the tendency of items from the same collection to appear together in results lists generated for a set of user queries. Results of the analyses correlate and suggest that within some collections item-level metadata records are not sufficiently differentiated to support certain digital library functions well. Metadata collections have a distinct role when included in larger aggregations, and in that role a minimum level of descriptive granularity is required to support digital library functions implemented by service providers. The experiments suggest possible ways to deal simultaneously with metadata record completeness, consistency, and redundancy.
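The abstract does not say how record resemblance was computed. As an illustration only, the following minimal sketch assumes a Jaccard resemblance over two-word shingles, averaged over all record pairs in a collection. The function names (`shingles`, `resemblance`, `collection_redundancy`) and the shingle size are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def shingles(text, k=2):
    """Return the set of k-word shingles in a record's text (assumed tokenization: whitespace)."""
    words = text.lower().split()
    if len(words) < k:
        # Short records contribute a single shingle, or none if empty.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def resemblance(a, b, k=2):
    """Jaccard resemblance of two records: shared shingles divided by all shingles."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa and not sb:
        return 1.0  # two empty records are treated as identical
    return len(sa & sb) / len(sa | sb)


def collection_redundancy(records, k=2):
    """Mean pairwise resemblance over all record pairs in one collection."""
    pairs = list(combinations(records, 2))
    if not pairs:
        return 0.0  # a single record has nothing to be redundant with
    return sum(resemblance(a, b, k) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical records from one collection that share boilerplate wording.
    collection = [
        "photograph of campus building",
        "photograph of campus library",
    ]
    print(collection_redundancy(collection))
```

Under these assumptions, a score near 1 would indicate that records in a collection are barely differentiated, while a score near 0 would indicate largely distinct descriptions.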

Description

ScienceDirect - Information Processing & Management: Information redundancy across metadata collections
