Visual characterization of biomedical texts with word entropy

Network Tools and Applications in Biology (NETTAB 2010), Biological Wikis, pages 139-142. Napoli, Italy, (2010)

Abstract

Recent work has shown a relation between the entropy of words (a new measure from Information Theory introduced by Montemurro in 2001) and the role of words in literary texts, as well as the capacity of entropy for clustering words. Our final goal is to investigate whether and how the entropy-ranked list of words can be useful in other, more practical contexts, such as information retrieval tasks or the automatic classification of biomedical textual data. In this work, we analyze how effectively the keywords selected by Montemurro's approach capture the semantics of biomedical text collections, and, using the spectrum of words, we offer a visual representation of the text's content. In addition, we compare the resulting keyword lists with those obtained with the TF-IDF measure, and discuss some of the most interesting findings from this comparison.
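
To make the comparison described in the abstract concrete, the following is a minimal sketch of an entropy-based word score next to a TF-IDF score. It is not the authors' implementation. It assumes the text is already tokenized and split into equal-sized parts, and it uses a simplified normalized entropy of each word's relative frequency across those parts. The exact measure used in the paper may differ, for example in how it is normalized or compared against a shuffled text. The function names `word_entropy` and `tf_idf` and the toy sentences are hypothetical.

```python
import math
from collections import Counter


def word_entropy(parts):
    """Normalized entropy of each word's distribution over the parts.

    Assumes at least two non-empty parts. Values near 1 mean the word is
    spread evenly; values near 0 mean it is concentrated in few parts.
    """
    num_parts = len(parts)
    counts = [Counter(part) for part in parts]
    sizes = [len(part) for part in parts]
    vocabulary = set().union(*counts)
    entropy = {}
    for word in vocabulary:
        # Relative frequency of the word inside each part.
        freqs = [c[word] / n for c, n in zip(counts, sizes)]
        total = sum(freqs)
        probs = [f / total for f in freqs if f > 0]
        entropy[word] = -sum(p * math.log(p) for p in probs) / math.log(num_parts)
    return entropy


def tf_idf(parts):
    """Highest TF-IDF score of each word over all parts (one common variant)."""
    num_parts = len(parts)
    counts = [Counter(part) for part in parts]
    # Number of parts in which each word occurs at least once.
    doc_freq = Counter(word for c in counts for word in c)
    scores = {}
    for c, part in zip(counts, parts):
        for word, n in c.items():
            score = (n / len(part)) * math.log(num_parts / doc_freq[word])
            scores[word] = max(scores.get(word, 0.0), score)
    return scores


# Toy example (invented sentences, not data from the paper).
parts = [
    "the gene regulates the protein".split(),
    "the cell expresses the receptor".split(),
    "the gene encodes the enzyme".split(),
]
entropy = word_entropy(parts)
tfidf = tf_idf(parts)

# Words sorted from most concentrated (lowest entropy) to most evenly spread.
ranked_by_entropy = sorted(entropy, key=entropy.get)
```

In this sketch a function word such as "the", which occurs evenly in every part, gets the maximum entropy and a TF-IDF score of zero, while a word confined to a single part gets entropy zero. Ranking by low entropy is only one possible reading of how keywords could be selected from such scores.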
