BibSonomy :: bibtex  ::

msn's BibTeX entry:  

Query relaxation by structure and semantics for retrieval of logical Web documents

IEEE Transactions on Knowledge and Data Engineering, 14(4): 768--791, 2002.
Authors: Wen-Syan Li and K. S. Candan and Quoc Vu and D. Agrawal
URL: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1019213
Tags: research.db.consolidation research.ir.partitioning tech.www
Abstract: Since the Web encourages hypertext and hypermedia document authoring (e.g., HTML or XML), Web authors tend to create documents that are composed of multiple pages connected with hyperlinks. A Web document may be authored in multiple ways, such as: (1) all information in one physical page, or (2) a main page and the related information in separate linked pages. Existing Web search engines, however, return only physical pages containing keywords. We introduce the concept of information unit, which can be viewed as a logical Web document consisting of multiple physical pages as one atomic retrieval unit. We present an algorithm to efficiently retrieve information units. Our algorithm can perform progressive query processing. These functionalities are essential for information retrieval on the Web and large XML databases. We also present experimental results on synthetic graphs and real Web data.
@article{citeulike:593448,
title = {Query relaxation by structure and semantics for retrieval of logical Web documents},
author = {Wen-Syan Li and K. S. Candan and Quoc Vu and D. Agrawal},
journal = {IEEE Transactions on Knowledge and Data Engineering},
number = {4},
pages = {768--791},
url = {http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1019213},
volume = {14},
year = {2002},
abstract = {Since the Web encourages hypertext and hypermedia document authoring (e.g., HTML or XML), Web authors tend to create documents that are composed of multiple pages connected with hyperlinks. A Web document may be authored in multiple ways, such as: (1) all information in one physical page, or (2) a main page and the related information in separate linked pages. Existing Web search engines, however, return only physical pages containing keywords. We introduce the concept of information unit, which can be viewed as a logical Web document consisting of multiple physical pages as one atomic retrieval unit. We present an algorithm to efficiently retrieve information units. Our algorithm can perform progressive query processing. These functionalities are essential for information retrieval on the Web and large XML databases. We also present experimental results on synthetic graphs and real Web data.},
citeulike-article-id = {593448},
priority = {2},
keywords = {research.db.consolidation research.ir.partitioning tech.www}
}