Elefant (Efficient Learning, Large-scale Inference, and Optimisation Toolkit) is an open source library for machine learning licensed under the Mozilla Public License (MPL). We develop an open source machine learning toolkit which provides
The FOSS in Research and Student Innovation Miniconf brings together researchers and students with an active interest in Free and Open Source Software with the broader Linux.conf.au community to highlight exciting work taking place within the often esoteric world of academia and educational institutions.
The Miniconf is part of Linux.conf.au 2011, being held at the QUT Gardens Point Campus in Brisbane, Queensland in January.
Topics are split into two streams: FOSS in Research, which invites presentations on research relating to Free and Open Source Software; and Student Innovation, which explores new and exciting work in the FOSS world conducted by students. Presentations may be proposed in a 25-minute talk format (20 minutes talk + 5 minutes discussion).
This website provides tutorials and sample course content so CS students and educators can learn more about current computing technologies and paradigms. In particular, this content is Creative Commons licensed which makes it easy for CS educators to use in their own classes.
The Courses section contains tutorials, lecture slides, and problem sets for a variety of topic areas:
AJAX Programming
Algorithms
Distributed Systems
Web Security
Languages
In the Tools 101 section, you will find a set of introductions to some common tools used in Computer Science such as version control systems and databases.
The CS Curriculum Search will help you find teaching materials that have been published to the web by faculty from CS departments around the world. You can refine your search to display just lectures, assignments or reference materials for a set of courses.
This is a "tree of all knowledge" category, a top-level place to start when browsing Wikipedia categories for articles. This is the top level in terms of encyclopedia article function and content. It is intended to contain all and only the few most fundamental ontological categories which can reasonably be expected to contain every possible Wikipedia article under their category trees. These categories are: physical entities; biological entities; social entities; and intellectual entities.
An alternative root category, based on a somewhat more detailed initial classification, is Category:Main topic classifications.
This is a list of Wikipedia's major topic classifications. These are used throughout Wikipedia to organize the presentation of links to articles on its various reference systems, including Wikipedia's lists, portals, and categories.
Wikipedia is a terrific knowledge resource, and many recent studies in artificial intelligence, information retrieval and related fields have used Wikipedia to endow computers with (some) human knowledge. Wikipedia dumps are publicly available in XML format, but they have a few shortcomings. First, they contain a lot of information that is often not needed when Wikipedia texts are used as knowledge (e.g., ids of users who changed each article, timestamps of article modifications). Second, the XML dumps omit a lot of useful information that could be inferred from the dump, such as link tables, the category hierarchy, and the resolution of redirection links.
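As an illustration of the kind of derived information mentioned above, the following is a minimal sketch of redirect resolution. It assumes the redirects have already been extracted from the dump into a plain title-to-title mapping; the function name, the `max_hops` limit, and the sample titles are hypothetical and not taken from any particular tool.

```python
def resolve_redirect(title, redirects, max_hops=10):
    """Follow redirect links until a non-redirect title is reached.

    `redirects` maps a redirecting title to its target title.
    The `seen` set guards against redirect cycles.
    """
    seen = set()
    while title in redirects and title not in seen and len(seen) < max_hops:
        seen.add(title)
        title = redirects[title]
    return title


# Hypothetical redirect table, for illustration only.
redirects = {"U.S.": "USA", "USA": "United States", "A": "B", "B": "A"}

resolved = resolve_redirect("U.S.", redirects)           # follows two hops
unchanged = resolve_redirect("United States", redirects)  # not a redirect
cyclic = resolve_redirect("A", redirects)                 # stops instead of looping
```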
BitC is a new systems programming language. It seeks to combine the flexibility, safety, and richness of Standard ML or Haskell with the low-level expressiveness of C.
The GWT Window Manager provides a high-level windowing system for GWT applications. It offers a desktop component, dialog features, free-floating windows and more. Try it yourself and feel free to use it: it's free!
Due to an explosion of data, there has been an increasing demand for scalable machine learning and data mining algorithms in many applications, such as social network analysis, information retrieval, recommendation systems, biology applications, multimedia, and e-commerce. The objective of this special issue is to connect academia and industry on the methods and experiences of large-scale data analysis. We look for scalable machine learning and data mining algorithms, implementations, frameworks and case studies that target real, practical scenarios for large datasets. The focus is to identify the real challenges in large-scale data mining and to investigate scalable methods and practical solutions for the core machine learning and data mining problems from both theoretical and experimental perspectives.
The M-tree is an index structure that can be used for the efficient resolution of similarity queries on complex objects to be compared using an arbitrary metric
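The sketch below is not an M-tree. It only illustrates the triangle-inequality pruning that metric indexes rely on to answer similarity (range) queries without computing every distance. The single pivot, the function name, and the sample points are assumptions made for the example.

```python
import math


def range_query(objects, pivot, dist, query, radius):
    """Return the objects within `radius` of `query`, plus the number of
    query-to-object distances that actually had to be computed."""
    # In a real index these distances would be stored at build time.
    pivot_dists = [dist(obj, pivot) for obj in objects]
    d_query_pivot = dist(query, pivot)

    results, computed = [], 0
    for obj, d_obj_pivot in zip(objects, pivot_dists):
        # Triangle inequality: dist(query, obj) >= |d(query, pivot) - d(obj, pivot)|.
        # If that lower bound already exceeds the radius, skip the object.
        if abs(d_query_pivot - d_obj_pivot) > radius:
            continue
        computed += 1
        if dist(query, obj) <= radius:
            results.append(obj)
    return results, computed


points = [(0, 0), (3, 4), (6, 8), (1, 1), (4, 0)]
matches, computed = range_query(points, pivot=(0, 0), dist=math.dist,
                                query=(3, 3), radius=1.5)
```

Any function that satisfies the metric axioms can be passed as `dist`; Euclidean distance is used here only because it is available in the standard library.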
Consensus clustering has emerged as an important elaboration of the classical clustering problem. Consensus clustering, also called aggregation of clustering (or partitions), refers to the situation in which a number of different (input) clusterings have been obtained for a particular dataset and it is desired to find a single (consensus) clustering which is a better fit in some sense than the existing clusterings. Consensus clustering is thus the problem of reconciling clustering information about the same data set coming from different sources or from different runs of the same algorithm. When cast as an optimization problem, consensus clustering is known as median partition, and has been shown to be NP-complete.
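To make the median-partition formulation concrete, here is a minimal sketch that measures the distance between two clusterings as the number of item pairs on which they disagree, and then picks the input clustering with the smallest total distance to all inputs. Choosing among the inputs is a simple heuristic, not an exact solver for the NP-complete problem; the function names and sample labels are assumptions for illustration.

```python
from itertools import combinations


def disagreement(a, b):
    """Count item pairs that one clustering groups together and the other separates.

    `a` and `b` are lists of cluster labels, one label per item.
    """
    return sum(
        (a[i] == a[j]) != (b[i] == b[j])
        for i, j in combinations(range(len(a)), 2)
    )


def best_input_clustering(clusterings):
    """Return the input clustering with the lowest total disagreement to all inputs."""
    return min(
        clusterings,
        key=lambda c: sum(disagreement(c, other) for other in clusterings),
    )


# Three hypothetical clusterings of the same five items.
clusterings = [
    [0, 0, 1, 1, 2],
    [0, 0, 1, 1, 1],
    [0, 1, 1, 1, 1],
]
consensus = best_input_clustering(clusterings)
```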
OnlyWire syndicates your content and articles to the web's top social networking sites with a single button click. The OnlyWire Bookmark & Share button gives your website and blog visitors the ability to post your content to all of their social networking sites.
Categories are pages that are used to group other pages on similar subjects together. This is done to help users find the pages they are looking for, even if they do not know whether it exists or what it is called.
Every page should belong to at least one category. A page may often be in several categories. However, putting a page in too many categories may not be useful.
Résumé, Curriculum Vitae or simply CV is an important brief about your professional life. It is likely to be one of the first contacts with a prospective employer. Curriculum Vitae means course of life in Latin. So what exactly should a Résumé contain and how detailed should it be? There is no silver bullet answer. ...
MegaMap is a Java implementation of a map (or hashtable) that can store an unbounded amount of data, limited only by the amount of disk space available. Objects stored in the map are persisted to disk. Good performance is achieved by an in-memory cache. The MegaMap can, for all practical purposes, be thought of as a map implementation with unlimited storage space.
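The general idea of a disk-persisted map fronted by an in-memory cache can be sketched as follows. This is not MegaMap's API or implementation (MegaMap is a Java library); it is a Python illustration built on the standard `shelve` module, and the class name, the `cache_size` parameter, and the least-recently-used eviction policy are all assumptions.

```python
import os
import shelve
import tempfile
from collections import OrderedDict


class DiskBackedMap:
    """String-keyed map persisted with shelve, with a small LRU cache in memory."""

    def __init__(self, path, cache_size=2):
        self._store = shelve.open(path)   # on-disk storage
        self._cache = OrderedDict()       # most recently used entries
        self._cache_size = cache_size

    def __setitem__(self, key, value):
        self._store[key] = value          # write through to disk
        self._remember(key, value)

    def __getitem__(self, key):
        if key in self._cache:            # cache hit: no disk access
            self._cache.move_to_end(key)
            return self._cache[key]
        value = self._store[key]          # cache miss: read from disk
        self._remember(key, value)
        return value

    def _remember(self, key, value):
        self._cache[key] = value
        self._cache.move_to_end(key)
        while len(self._cache) > self._cache_size:
            self._cache.popitem(last=False)  # evict the least recently used entry

    def close(self):
        self._store.close()


with tempfile.TemporaryDirectory() as tmp:
    m = DiskBackedMap(os.path.join(tmp, "store"), cache_size=2)
    for i in range(5):
        m[f"key{i}"] = i * 10
    cached_keys = list(m._cache)   # only the two most recent keys stay in memory
    value_from_disk = m["key1"]    # evicted from the cache, so read back from disk
    m.close()
```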
EM has been shown to have favorable convergence properties, automatic satisfaction of constraints, and fast convergence. The next section explains the traditional approach to deriving the EM algorithm and proving its convergence property. Section 3.3 covers the interpretation of the EM algorithm as the maximization of two quantities: the entropy and the expectation of the complete-data likelihood. Then, the K-means algorithm and the EM algorithm are compared. The conditions under which the EM algorithm reduces to K-means are also explained. The discussion in Section 3.4 generalizes the EM algorithm described in Sections 3.2 and 3.3 to problems with partial data and hidden state. We refer to this new type of EM as the doubly stochastic EM. Finally, the chapter is concluded in Section 3.5.
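The relationship between EM and K-means can be illustrated with a small sketch. It assumes a one-dimensional Gaussian mixture with equal weights and a fixed, shared variance, which is one commonly cited setting in which EM with hard assignments coincides with K-means; the chapter's own conditions may be stated differently. The function name, parameters, and data are hypothetical.

```python
import math


def em_means(xs, means, sigma=1.0, hard=False, iters=50):
    """Estimate component means of an equal-weight, fixed-variance 1-D mixture.

    With hard=False the E-step computes soft responsibilities (EM).
    With hard=True each point goes entirely to its nearest mean (K-means).
    """
    means = list(means)
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            d2 = [(x - m) ** 2 for m in means]
            if hard:
                nearest = d2.index(min(d2))
                r = [1.0 if j == nearest else 0.0 for j in range(len(means))]
            else:
                d_min = min(d2)  # subtracted for numerical stability
                w = [math.exp(-(d - d_min) / (2 * sigma ** 2)) for d in d2]
                total_w = sum(w)
                r = [wi / total_w for wi in w]
            resp.append(r)
        # M-step: each mean becomes the responsibility-weighted average.
        for j in range(len(means)):
            total = sum(r[j] for r in resp)
            if total > 0:
                means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / total
    return means


data = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
kmeans_result = em_means(data, [0.0, 12.0], hard=True)
em_result = em_means(data, [0.0, 12.0], sigma=1.0, hard=False)
```

On well-separated data such as this, the soft responsibilities are almost exactly 0 or 1, so both variants settle on nearly the same means.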
In a recent piece called Strong Typing vs. Strong Testing, noted programmer and author Bruce Eckel makes an argument that dynamically typed languages such as Python are superior to statically typed languages such as Java and C++. I've done quite a bit of Python and Java programming, and even a little C++, so I can appreciate his position, but I think the conclusion goes too far. Whether Python is more productive than C++ or Java is one thing; whether static typing in general should be abandoned is quite another.