group :: lwa | BibSonomy

Bookmarks (36)

Extracting Key Terms From Noisy and Multi-theme Documents - WWW2009 EPrints
We present a novel method for key term extraction from text documents. In our method, document is modeled as a graph of semantic relationships between terms of that document. We exploit the following remarkable feature of the graph: the terms related to the main topics of the document tend to bunch up into densely interconnected subgraphs or communities, while non-important terms fall into weakly interconnected communities, or even become isolated vertices. We apply graph community detection techniques to partition the graph into thematically cohesive groups of terms. We introduce a criterion function to select groups that contain key terms discarding groups with unimportant terms. To weight terms and determine semantic relatedness between them we exploit information extracted from Wikipedia. Using such an approach gives us the following two advantages. First, it allows effectively processing multi-theme documents. Second, it is good at filtering out noise information in the document, such as, for example, navigational bars or headers in web pages. Evaluations of the method show that it outperforms existing methods producing key terms with higher precision and recall. Additional experiments on web pages prove that our method appears to be substantially more effective on noisy and multi-theme documents than existing methods.
15 years ago, @dbenz
attended
www2009
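The pipeline described in this abstract (term graph, grouping, group selection) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration: the `RELATEDNESS` scores are toy values (the paper derives relatedness from Wikipedia), connected components stand in for a real community detection algorithm, and edge density with a `min_density` cutoff is a hypothetical substitute for the paper's criterion function.

```python
from itertools import combinations

# Toy relatedness scores in [0, 1] (hypothetical; the paper uses Wikipedia).
RELATEDNESS = {
    frozenset({"graph", "vertex"}): 0.9,
    frozenset({"graph", "community"}): 0.8,
    frozenset({"vertex", "community"}): 0.7,
    frozenset({"login", "sitemap"}): 0.2,
}


def build_graph(terms, threshold=0.5):
    """Connect two terms when their relatedness reaches the threshold."""
    graph = {t: set() for t in terms}
    for a, b in combinations(terms, 2):
        if RELATEDNESS.get(frozenset({a, b}), 0.0) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def components(graph):
    """Connected components, used here as a simple stand-in for communities."""
    seen, groups = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(graph[node] - group)
        seen |= group
        groups.append(group)
    return groups


def density(group, graph):
    """Fraction of possible edges inside the group that are present."""
    n = len(group)
    if n < 2:
        return 0.0  # isolated vertices never qualify
    edges = sum(len(graph[t] & group) for t in group) / 2
    return edges / (n * (n - 1) / 2)


def key_terms(terms, min_density=0.5):
    """Keep terms from dense groups; drop weakly connected or isolated ones."""
    graph = build_graph(terms)
    kept = [g for g in components(graph) if density(g, graph) >= min_density]
    return sorted(t for g in kept for t in g)
```

Calling `key_terms(["graph", "vertex", "community", "login", "sitemap"])` keeps the densely connected topical group and drops the two navigation-like terms, which end up as isolated vertices.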
Measuring the Similarity between Implicit Semantic Relations from the Web - WWW2009 EPrints
Measuring the similarity between semantic relations that hold among entities is an important and necessary step in various Web related tasks such as relation extraction, information retrieval and analogy detection. For example, consider the case in which a person knows a pair of entities (e.g. Google, YouTube), between which a particular relation holds (e.g. acquisition). The person is interested in retrieving other such pairs with similar relations (e.g. Microsoft, Powerset). Existing keyword-based search engines cannot be applied directly in this case because, in keyword-based search, the goal is to retrieve documents that are relevant to the words used in a query – not necessarily to the relations implied by a pair of words. We propose a relational similarity measure, using a Web search engine, to compute the similarity between semantic relations implied by two pairs of words. Our method has three components: representing the various semantic relations that exist between a pair of words using automatically extracted lexical patterns, clustering the extracted lexical patterns to identify the different patterns that express a particular semantic relation, and measuring the similarity between semantic relations using a metric learning approach. We evaluate the proposed method in two tasks: classifying semantic relations between named entities, and solving word-analogy questions. The proposed method outperforms all baselines in a relation classification task with a statistically significant average precision score of 0.74. Moreover, it reduces the time taken by Latent Relational Analysis to process 374 word-analogy questions from 9 days to less than 6 hours, with an SAT score of 51%.
15 years ago, @dbenz
attended
www2009
Triplify - Light-Weight Linked Data Publication from Relational Databases - WWW2009 EPrints
In this paper we present Triplify – a simplistic but effective approach to publish Linked Data from relational databases. Triplify is based on mapping HTTP-URI requests onto relational database queries. Triplify transforms the resulting relations into RDF statements and publishes the data on the Web in various RDF serializations, in particular as Linked Data. The rationale for developing Triplify is that the largest part of information on the Web is already stored in structured form, often as data contained in relational databases, but usually published by Web applications only as HTML mixing structure, layout and content. In order to reveal the pure structured information behind the current Web, we have implemented Triplify as a light-weight software component, which can be easily integrated into and deployed by the numerous, widely installed Web applications. Our approach includes a method for publishing update logs to enable incremental crawling of linked data sources. Triplify is complemented by a library of configurations for common relational schemata and a REST-enabled data source registry. Triplify configurations containing mappings are provided for many popular Web applications, including osCommerce, WordPress, Drupal, Gallery, and phpBB. We will show that despite its light-weight architecture Triplify is usable to publish very large datasets, such as 160GB of geo data from the OpenStreetMap project.
15 years ago, @dbenz
attended
www2009
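The core idea in this abstract, mapping a requested URI path onto a SQL query and turning the resulting rows into RDF statements, can be sketched as follows. This is not Triplify's actual configuration format or code: the `BASE` and `VOCAB` URIs, the `MAPPINGS` dictionary, the `posts` table, and the convention that the first column is the row identifier are all assumptions made for illustration. The output is N-Triples with every value emitted as a plain literal.

```python
import sqlite3

BASE = "http://example.org/triplify/"  # hypothetical base URI
VOCAB = "http://example.org/vocab/"    # hypothetical vocabulary namespace

# Hypothetical mapping: URL path segment -> SQL query.
# Assumed convention: the first selected column identifies the row.
MAPPINGS = {
    "post": "SELECT id, title, author FROM posts",
}


def literal(value):
    """Render a value as an escaped N-Triples plain literal."""
    escaped = (
        str(value)
        .replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )
    return f'"{escaped}"'


def triplify(conn, path):
    """Run the query mapped to `path` and return one triple per non-null cell."""
    cursor = conn.execute(MAPPINGS[path])
    columns = [d[0] for d in cursor.description]
    lines = []
    for row in cursor:
        subject = f"<{BASE}{path}/{row[0]}>"
        for column, value in zip(columns[1:], row[1:]):
            if value is not None:
                lines.append(f"{subject} <{VOCAB}{column}> {literal(value)} .")
    return lines


def demo():
    """Build a tiny in-memory database and publish its single row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT, author TEXT)"
    )
    conn.execute("INSERT INTO posts VALUES (1, 'Hello', 'alice')")
    return triplify(conn, "post")
```

Each column name becomes a predicate in the assumed vocabulary namespace, so adding a column to the query adds a property to every published resource without further code changes.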
SOFIE: A Self-Organizing Framework for Information Extraction - WWW2009 EPrints
This paper presents SOFIE, a system for automated ontology extension. SOFIE can parse natural language documents, extract ontological facts from them and link the facts into an ontology. SOFIE uses logical reasoning on the existing knowledge and on the new knowledge in order to disambiguate words to their most probable meaning, to reason on the meaning of text patterns and to take into account world knowledge axioms. This allows SOFIE to check the plausibility of hypotheses and to avoid inconsistencies with the ontology. The framework of SOFIE unites the paradigms of pattern matching, word sense disambiguation and ontological reasoning in one unified model. Our experiments show that SOFIE delivers high-quality output, even from unstructured Internet documents.
15 years ago, @dbenz
attended
www2009
Evaluating Similarity Measures for Emergent Semantics of Social Tagging - WWW2009 EPrints
Social bookmarking systems and their emergent information structures, known as folksonomies, are increasingly important data sources for Semantic Web applications. A key question for harvesting semantics from these systems is how to extend and adapt traditional notions of similarity to folksonomies, and which measures are best suited for applications such as navigation support, semantic search, and ontology learning. Here we build an evaluation framework to compare various general folksonomy-based similarity measures derived from established information-theoretic, statistical, and practical measures. Our framework deals generally and symmetrically with users, tags, and resources. For evaluation purposes we focus on similarity among tags and resources, considering different ways to aggregate annotations across users. After comparing how tag similarity measures predict user-created tag relations, we provide an external grounding by user-validated semantic proxies based on WordNet and the Open Directory. We also investigate the issue of scalability. We find that mutual information with distributional micro-aggregation across users yields the highest accuracy, but is not scalable; per-user projection with collaborative aggregation provides the best scalable approach via incremental computations. The results are consistent across resource and tag similarity.
15 years ago, @dbenz
attended
myown
www2009
itegpub


Publications (1)

Evaluating Similarity Measures for Emergent Semantics of Social Tagging
B. Markines, C. Cattuto, F. Menczer, D. Benz, A. Hotho, and G. Stumme. 18th International World Wide Web Conference, pp. 641--641. (April 2009)
13 years ago, @dbenz
2009
myown
methods_concepts
www2009
taggingsurvey
ol_web2.0
semantic_relatedness
tagorapub
social_similarity
itegpub


LWA - Lernen, Wissen, Adaptivität
