Bioconductor provides tools for the analysis and comprehension of high-throughput genomic data. Bioconductor uses the R statistical programming language, and is open source and open development. It has two releases each year, 671 software packages, and an active user community. Bioconductor is also available as an Amazon Machine Image (AMI).
Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well-suited for developing new machine learning schemes.
The BioScope corpus consists of medical and biological texts annotated for negation, speculation, and their linguistic scope. The annotation is intended to support the development and comparison of systems for negation/hedge detection and scope resolution. The corpus is publicly available for research purposes.
The Model Organism Databases (MODs) are working with the InterMine group to enable faster comparative studies and develop tools that make analysis accessible to the wider scientific community.
R. Calavia, F. Annanouch, X. Correig, and O. Yanes. Journal of Proteomics, 75(16): 5061-5068 (2012). Special Issue: Imaging Mass Spectrometry: A User’s Guide to a New Technique for Biological and Biomedical Research.