Inproceedings

Browsing semi-structured web texts using formal concept analysis

In Proceedings of the 9th International Conference on Conceptual Structures, pages 319-332 (2001)

Abstract

Query-directed browsing of unstructured Web-texts using Formal Concept Analysis (FCA) confronts two problems. First, online Web-data is sometimes unstructured, so any FCA-system must include additional mechanisms to structure its input sources. Second, many online collections are large and dynamic, so a Web-robot must be used to extract data automatically. This paper addresses both issues. We report on the construction of a Web-based FCA system for browsing classified advertisements for real-estate properties. Real-estate advertisements were chosen because they are typical of semi-structured textual information sources accessible on the Web. Furthermore, the analysis of real-estate data using FCA is a classic example used in introductory courses on FCA. However, unlike the classic FCA real-estate example, whose input is a structured relational database, we automatically mine Web-based texts for their structure.
