Search Technique Using Wildcards or Truncation: A Tolerance Rough Set Clustering Approach

International Journal of Advanced Computer Science and Applications (IJACSA) (2010)


Search engine technology plays an important role in Web information retrieval. However, with the explosion of information on the Internet, traditional search techniques cannot provide satisfactory results because of problems such as the huge number of result pages and unintuitive ranking. The reorganization and post-processing of Web search results have therefore been studied extensively to help users obtain useful information effectively. This paper has three parts. The first part reviews how a keyword is expanded through truncation or wildcards (a little-known but very powerful feature) using symbols such as * or !. The primary design goal is to let the user specify only the keyword with truncation or wildcard symbols, rather than expanding the keyword into sentential form. The second part gives a brief overview of the tolerance rough set approach to clustering search results. In this approach, a tolerance factor is used to cluster the information-rich search results and discard the rest. The discarded results, however, may still contain some information relevant to the query even though they fall below the tolerance level. The third part presents a proposed algorithm that combines the two techniques and thereby addresses this problem, which commonly arises in the tolerance rough set approach. The main goal of this paper is to develop a search technique that makes information retrieval fast and reduces the extra labor needed to expand the query.
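
To make the two building blocks concrete, the following is a minimal Python sketch, not the paper's algorithm. It rests on several assumptions that the abstract does not state: `*` is taken to match any run of characters and `!` exactly one character; tolerance classes follow one common formulation, in which two terms are tolerant of each other when they co-occur in at least `theta` documents; and the function names `expand_wildcard` and `tolerance_classes`, like the toy documents, are hypothetical.

```python
import re
from collections import defaultdict


def expand_wildcard(pattern, vocabulary):
    """Expand a truncated keyword against a known vocabulary.

    Assumed semantics (not taken from the paper): '*' matches any run of
    characters, including none; '!' matches exactly one character.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "!":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    regex = re.compile("".join(parts))
    return sorted(term for term in vocabulary if regex.fullmatch(term))


def tolerance_classes(docs, theta):
    """Map each term to the set of terms it tolerates.

    docs is a list of term sets. Term u is in the class of term t when
    u == t or when t and u co-occur in at least theta documents
    (theta >= 1 is assumed).
    """
    cooccurrence = defaultdict(int)
    vocabulary = set()
    for terms in docs:
        vocabulary |= terms
        for a in terms:
            for b in terms:
                if a != b:
                    cooccurrence[(a, b)] += 1
    return {
        t: {t} | {u for u in vocabulary if cooccurrence.get((t, u), 0) >= theta}
        for t in vocabulary
    }


# Toy data, invented purely for illustration.
docs = [
    {"cluster", "clustering", "rough"},
    {"clustering", "rough", "set"},
    {"cluster", "search"},
]
vocabulary = set().union(*docs)

expanded = expand_wildcard("cluster*", vocabulary)  # ['cluster', 'clustering']
classes = tolerance_classes(docs, theta=2)          # classes['rough'] == {'rough', 'clustering'}
```

In this sketch the wildcard step widens the query before retrieval, while the tolerance threshold `theta` controls how aggressively terms are grouped afterwards. Raising `theta` yields smaller classes and therefore discards more results, which is the trade-off the abstract describes.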

