Abstract

Text mining is the use of automated methods for exploiting the enormous amount of knowledge available in the biomedical literature. There are at least as many motivations for doing text mining work as there are types of bioscientists. Model organism database curators have been heavy participants in the development of the field because they need to process large numbers of publications in order to populate the many data fields for every gene in their species of interest. Bench scientists have built biomedical text mining applications to aid in the development of tools for interpreting the output of high-throughput assays and to improve searches of sequence databases (see [1] for a review). Bioscientists of every stripe have built applications to deal with two problems: the double-exponential growth in the scientific literature over the past few years, and the particular difficulties of searching PubMed/MEDLINE for genomics-related publications. A surprising phenomenon can be noted in the recent history of biomedical text mining: although several systems have been built and deployed in the past few years (for example, Chilibot, Textpresso, and PreBIND; see Text S1 for these and most other citations), the ones that are seeing high usage rates and are making productive contributions to the working lives of bioscientists have been built not by text mining specialists, but by bioscientists. We speculate below on why this might be so.
