At the highest level of description, this book is about data mining. However,
it focuses on data mining of very large amounts of data, that is, data so large
it does not fit in main memory. Because of the emphasis on size, many of our
examples are about the Web or data derived from the Web. Further, the book
takes an algorithmic point of view: data mining is about applying algorithms
to data, rather than using data to “train” a machine-learning engine of some
sort.
A Java-based framework for knowledge discovery and data mining algorithms supported by index structures. Its fundamental design principle is to separate data management (file parsers, database connections, data types) from algorithms (distances, distance functions, and data mining algorithms).
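The separation described above can be illustrated with a minimal sketch. It is written in Python for brevity and is not the framework's actual API: the names `parse_csv_lines`, `euclidean`, `manhattan`, and `nearest_neighbor` are hypothetical and chosen only for illustration. The parser knows nothing about algorithms, the algorithm knows nothing about file formats, and the distance function is passed in as a parameter.

```python
import math
from typing import Callable, Iterable, List, Sequence, Tuple

Vector = Tuple[float, ...]
Distance = Callable[[Vector, Vector], float]


# --- Data management: understands the input format, not the algorithms ---
def parse_csv_lines(lines: Iterable[str]) -> List[Vector]:
    """Turn comma-separated text lines into numeric vectors, skipping blank lines."""
    return [
        tuple(float(field) for field in line.split(","))
        for line in lines
        if line.strip()
    ]


# --- Distance functions: interchangeable, independent of data source ---
def euclidean(a: Vector, b: Vector) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a: Vector, b: Vector) -> float:
    return sum(abs(x - y) for x, y in zip(a, b))


# --- Algorithm: works on vectors and any distance, not on files ---
def nearest_neighbor(query: Vector, data: Sequence[Vector], distance: Distance) -> Vector:
    """Return the element of `data` closest to `query` under `distance`."""
    return min(data, key=lambda point: distance(query, point))


points = parse_csv_lines(["3,3", "0,5"])
origin = (0.0, 0.0)
closest_euclidean = nearest_neighbor(origin, points, euclidean)
closest_manhattan = nearest_neighbor(origin, points, manhattan)
```

Swapping the distance function changes the answer without touching the parser or the search routine, which is the practical benefit of keeping the three concerns apart.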
Mloss is a community effort to produce reproducible research through open-source software, open access to data and results, and open standards for interchange.
B. Navarro Bullock, H. Lerch, A. Roßnagel, A. Hotho, and G. Stumme. Proceedings of the 11th International Conference on Knowledge Management and Knowledge Technologies, pages 15:1-15:8. New York, NY, USA, ACM, (2011)
E. Hensinger, I. Flaounas, and N. Cristianini. Artificial Intelligence Applications and Innovations, volume 339 of IFIP Advances in Information and Communication Technology, chapter 25, Springer Boston, Berlin, Heidelberg, (2010)
W. van der Aalst and C. Günther. Seventh International Conference on Application of Concurrency to System Design (ACSD 2007), 10-13 July 2007, Bratislava, Slovak Republic, pages 3-12. Washington, DC, USA, IEEE Computer Society, (2007)
N. Archak, A. Ghose, and P. Ipeirotis. KDD '07: Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 56-65. New York, NY, USA, ACM, (2007)