Elefant (Efficient Learning, Large-scale Inference, and Optimisation Toolkit) is an open source library for machine learning licensed under the Mozilla Public License (MPL). We develop an open source machine learning toolkit which provides
algorithms for machine learning utilising the power of multi-core/multi-threaded processors and operating systems (Linux, Windows, Mac OS X),
a graphical user interface for users who want to quickly prototype machine learning experiments,
tutorials to support learning about Statistical Machine Learning (Statistical Machine Learning at The Australian National University), and
detailed and precise documentation for each of the above.
The workshop aims to discuss key issues and practices in semantic mining. Thanks to the Linked Open Data initiative and robust techniques for semantic annotation of Web, social, and sensor data, ever more semantic data is available. Many research efforts have been directed toward applying semantic techniques to analyze and mine this growing resource. The workshop will provide a cross-disciplinary forum for researchers to showcase their innovations and efforts, to strengthen existing bonds, and to create new connections among different communities. We solicit contributions on the research and practice of mining data semantics, including theory, algorithms, and applications from computer science, the life sciences, healthcare, and other domains.
A great deal of research has focused on algorithms for learning features from unlabeled data. Indeed, much progress has been made on benchmark datasets like NORB and CIFAR by employing increasingly complex unsupervised learning algorithms and deep models. In this paper, however, we show that several very simple factors, such as the number of hidden nodes in the model, may be as important to achieving high performance as the choice of learning algorithm or the depth of the model. Specifically, we will apply several off-the-shelf feature learning algorithms (sparse auto-encoders, sparse RBMs, K-means clustering, and Gaussian mixtures) to the NORB and CIFAR datasets using only single-layer networks. We then present a detailed analysis of the effect of changes in the model setup: the receptive field size, the number of hidden nodes (features), the step-size ("stride") between extracted features, and the effect of whitening. Our results show that large numbers of hidden nodes and dense feature extraction are as critical to achieving high performance as the choice of algorithm itself—so critical, in fact, that when these parameters are pushed to their limits, we are able to achieve state-of-the-art performance on both CIFAR and NORB using only a single layer of features. More surprisingly, our best performance is based on K-means clustering, which is extremely fast, has no hyper-parameters to tune beyond the model structure itself, and is very easy to implement. Despite the simplicity of our system, we achieve performance beyond all previously published results on the CIFAR-10 and NORB datasets (79.6% and 97.0% accuracy respectively).
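The pipeline this abstract describes—random receptive-field extraction, whitening, K-means dictionary learning, and distance-based feature encoding—can be sketched in plain NumPy. This is a minimal illustration on toy random data, not the authors' implementation; the patch size, dictionary size, and whitening regulariser below are arbitrary choices for the sketch, and the "triangle" encoding is one of the activation schemes the K-means approach is commonly paired with.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for CIFAR-style data: 20 grayscale 32x32 images.
images = rng.random((20, 32, 32))

def extract_patches(imgs, rf=6, n_patches=2000):
    """Sample random rf x rf receptive fields and flatten each one."""
    n, h, w = imgs.shape
    out = np.empty((n_patches, rf * rf))
    for i in range(n_patches):
        k = rng.integers(n)
        y = rng.integers(h - rf + 1)
        x = rng.integers(w - rf + 1)
        out[i] = imgs[k, y:y + rf, x:x + rf].ravel()
    return out

def zca_whiten(X, eps=0.1):
    """Per-patch contrast normalisation followed by ZCA whitening."""
    X = X - X.mean(axis=1, keepdims=True)
    X = X / (X.std(axis=1, keepdims=True) + 1e-8)
    cov = np.cov(X, rowvar=False)
    U, S, _ = np.linalg.svd(cov)
    W = U @ np.diag(1.0 / np.sqrt(S + eps)) @ U.T
    return X @ W

def kmeans(X, k=16, iters=10):
    """Plain Lloyd's algorithm; the centroids act as the learned dictionary."""
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        dist = ((X[:, None, :] - centroids[None]) ** 2).sum(-1)
        labels = dist.argmin(1)
        for j in range(k):
            if (labels == j).any():
                centroids[j] = X[labels == j].mean(0)
    return centroids

patches = zca_whiten(extract_patches(images))
D = kmeans(patches)

# "Triangle" encoding: activation max(0, mu - z_j), where z_j is the
# distance from a patch to centroid j and mu is its mean distance.
z = np.sqrt(((patches[:, None, :] - D[None]) ** 2).sum(-1))
features = np.maximum(0.0, z.mean(1, keepdims=True) - z)
print(features.shape)  # (2000, 16)
```

In a full system, the encoding would be applied densely over each image (at the given stride) and pooled before training a linear classifier; the sketch stops at the per-patch feature map.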
S. Baluja, D. Ravichandran, and D. Sivakumar. Proceedings of the International Conference on Knowledge Discovery and Information Retrieval (KDIR 2009), INSTICC, (October 6-8, 2009)
M. Banko and E. Brill. ACL '01: Proceedings of the 39th Annual Meeting on Association for Computational Linguistics, pages 26-33. Morristown, NJ, USA, Association for Computational Linguistics, (2001)
B. Berendt, A. Hotho, and G. Stumme. International Semantic Web Conference, volume 2342 of Lecture Notes in Computer Science, pages 264-278. Springer Verlag, (2002)
S. Bloehdorn and A. Hotho. Proceedings of the 4th IEEE International Conference on Data Mining (ICDM 2004), 1-4 November 2004, Brighton, UK, pages 331-334. IEEE Computer Society, (November 2004)