
Learning object models from semistructured Web documents

S. Ye and T.-S. Chua (2006)
DOI: 10.1109/TKDE.2006.47

Abstract

This paper presents an automated approach to learning object models from useful object data extracted from data-intensive semistructured Web documents such as product descriptions. Modeling intensive data on the Web involves the following three phases. First, we identify the object region covering the descriptions of object data, excluding irrelevant contents of the Web documents. Second, we partition the contents of the different object data appearing in the object region and construct object data as hierarchical XML outputs. Third, we induce the abstract object model from the analogous object data. This model would match the corresponding object data from a Web site more precisely and comprehensively than the existing handcrafted ontologies. The main contribution of this study is a fully automated approach to extracting object data and object models from semistructured Web documents using kernel-based matching and view syntax interpretation. Our system, OnModer, can automatically construct object data and induce object models from complicated Web documents, such as the technical descriptions of personal computers and digital cameras downloaded from manufacturers' and vendors' sites. A comparison with the available handcrafted ontologies and tests on an open corpus demonstrate that our framework is effective in extracting meaningful and comprehensive models.
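To make the third phase concrete, the following is a minimal sketch of inducing a shared model from several analogous XML object records. It is not OnModer's algorithm: the abstract does not spell out how kernel-based matching or view syntax interpretation work, so a simple path-frequency heuristic is swapped in instead. The function names `element_paths` and `induce_model`, the `min_support` threshold, and the camera records are all hypothetical and exist only for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def element_paths(xml_text):
    """Return the set of slash-separated element paths in one XML object record."""
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(node, prefix):
        path = f"{prefix}/{node.tag}" if prefix else node.tag
        paths.add(path)
        for child in node:
            walk(child, path)

    walk(root, "")
    return paths


def induce_model(records, min_support=0.5):
    """Toy model induction (an assumption, not the paper's method).

    Keeps every element path that occurs in at least `min_support`
    of the records and maps it to the fraction of records containing it.
    """
    counts = Counter()
    for record in records:
        counts.update(element_paths(record))
    total = len(records)
    return {
        path: count / total
        for path, count in sorted(counts.items())
        if count / total >= min_support
    }


# Hypothetical object data, standing in for the hierarchical XML outputs of phase two.
records = [
    "<camera><sensor><resolution>10MP</resolution></sensor><weight>300g</weight></camera>",
    "<camera><sensor><resolution>12MP</resolution><type>CCD</type></sensor></camera>",
    "<camera><sensor><resolution>8MP</resolution></sensor><weight>250g</weight></camera>",
]

model = induce_model(records)
for path, support in model.items():
    print(f"{path}: {support:.2f}")
```

In this sketch an attribute seen in only one of the three records (`camera/sensor/type`) falls below the support threshold and is left out of the induced model, while attributes shared by most records are kept.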

Description

Welcome to IEEE Xplore 2.0: Learning object models from semistructured Web documents

Links and resources

BibTeX key: Ye:2006
entry type: article
booktitle: Transactions on Knowledge and Data Engineering
year: 2006
pages: 334-349
volume: 18
issn: 1041-4347
DOI: 10.1109/TKDE.2006.47
url: http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1583583


Cite this publication

@article{Ye:2006,
  abstract = {This paper presents an automated approach to learning object models by means of useful object data extracted from data-intensive semistructured Web documents such as product descriptions. Modeling intensive data on the Web involves the following three phrases: first, we identify the object region covering the descriptions of object data when irrelevant contents from the Web documents are excluded. Second, we partition the contents of different object data appearing in the object region and construct object data using hierarchical XML outputs. Third, we induce the abstract object model from the analogous object data. This model would match the corresponding object data from a Web site more precisely and comprehensively than the existing handcrafted ontologies. The main contribution of this study is in developing a fully automated approach to extract object data and object model from semistructured Web documents using kernel-based matching and view syntax interpretation. Our system, OnModer, can automatically construct object data and induce object models from complicated Web documents, such as the technical descriptions of personal computers and digital cameras downloaded from manufacturers' and vendors' sites. A comparison with the available hand-crafted ontologies and tests on an open corpus demonstrate that our framework is effective in extracting meaningful and comprehensive models.},
  added-at = {2007-11-03T17:31:58.000+0100},
  author = {Ye, S. and Chua, T.-S.},
  biburl = {https://www.bibsonomy.org/bibtex/2b5a66554b18c9aa5ac27306fb81917e8/wnpxrz},
  booktitle = {Transactions on Knowledge and Data Engineering},
  description = {Welcome to IEEE Xplore 2.0: Learning object models from semistructured Web documents},
  doi = {10.1109/TKDE.2006.47},
  interhash = {5d3babb46ed9607f5cf7a8c90c33d4a8},
  intrahash = {b5a66554b18c9aa5ac27306fb81917e8},
  issn = {1041-4347},
  keywords = {document imported machinelearning semistructured web},
  pages = {334--349},
  timestamp = {2007-11-03T17:31:58.000+0100},
  title = {Learning object models from semistructured Web documents},
  url = {http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1583583},
  volume = 18,
  year = 2006
}

BibSonomy
