CLUTO is a software package for clustering low- and high-dimensional datasets and for analyzing the characteristics of the various clusters. CLUTO is well-suited for clustering data sets arising in many diverse application areas including information retrieval, customer purchasing transactions, web, GIS, science, and biology.
Is Ehcache a NoSQL store? No, I would not characterise it as that, but I have seen it used for some NoSQL use cases. In these situations it compared very well — with higher performance and more flexible consistency than the well-known NoSQL stores. Let me explain.
Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids. It is based on a hierarchical design targeted at federations of clusters. It leverages widely used technologies such as XML for data representation, XDR for compact, portable data transport, and RRDtool for data storage and visualization. It uses carefully engineered data structures and algorithms to achieve very low per-node overheads and high concurrency. The implementation is robust, has been ported to an extensive set of operating systems and processor architectures, and is currently in use on thousands of clusters around the world. It has been used to link clusters across university campuses and around the world and can scale to handle clusters with 2000 nodes.
The Globus® Toolkit is an open source software toolkit used for building grids. It is being developed by the Globus Alliance and many others all over the world. A growing number of projects and companies are using the Globus Toolkit to unlock the potential of grids for their cause.
Hazelcast is an open source clustering and highly scalable data distribution platform for Java, which is:
* Lightning-fast; thousands of operations/sec.
* Fail-safe; no data loss after crashes.
* Dynamically scales as new servers are added.
* Super-easy to use; include a single jar.
Hazelcast is pure Java. JVMs that are running Hazelcast will dynamically cluster. Although Hazelcast uses multicast for discovery by default, it can also be configured to use only TCP/IP in environments where multicast is not available or not preferred.
News: all of the few remaining calls to scipy have been replaced with calls to numpy. Versions 0.1.8 and above do not require scipy as a dependency.
Introduction
This library provides Python functions for agglomerative clustering. Its features include:
* generating hierarchical clusters from distance matrices
* computing distance matrices from observation vectors
* computing statistics on clusters
* cutting linkages to generate flat clusters
* and visualizing clusters with dendrograms.
The interface is very similar to MATLAB's Statistics Toolbox API to make code easier to port from MATLAB to Python/Numpy. The core implementation of this library is in C for efficiency.
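As a rough illustration of what agglomerative clustering involves, the following is a minimal pure-Python sketch of single-linkage clustering: it computes a distance matrix from observation vectors, repeatedly merges the two closest clusters, and stops at a requested number of flat clusters. It does not use this library's API; the function names `distance_matrix` and `single_linkage` and the sample points are assumptions made up for the example.

```python
import math
from itertools import combinations


def distance_matrix(points):
    """Pairwise Euclidean distances between observation vectors."""
    n = len(points)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d = math.dist(points[i], points[j])
        dist[i][j] = dist[j][i] = d
    return dist


def single_linkage(dist, num_clusters):
    """Merge the two closest clusters until num_clusters remain."""
    # Start with every observation in its own cluster.
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > num_clusters:
        # Single linkage: the distance between two clusters is the
        # smallest distance between any pair of their members.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: min(
                dist[i][j] for i in clusters[pair[0]] for j in clusters[pair[1]]
            ),
        )
        # a < b always holds, so deleting index b leaves index a valid.
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical sample data: two tight pairs and one isolated point.
points = [(0.0, 0.0), (0.1, 0.1), (5.0, 5.0), (5.2, 4.9), (10.0, 0.0)]
flat = single_linkage(distance_matrix(points), num_clusters=3)
print(flat)  # indices of the observations in each flat cluster
```

This naive version rescans all cluster pairs on every merge, which is fine for a handful of points but far slower than an optimized C implementation on real data.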
ClusterViz is a program for visualizing the clustering process of the k-means family of algorithms. It is free software under the GNU General Public License (GPL). ClusterViz clusters data while displaying a projection of up to three dimensions, rendered using OpenGL. The implemented clustering algorithms are the k-means family, including mixture models.
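For readers unfamiliar with k-means, here is a minimal sketch of the basic variant (Lloyd's algorithm) in pure Python. It is not taken from ClusterViz and does not cover mixture models; the function name `kmeans`, the fixed iteration count, and the sample points and starting centroids are assumptions chosen for illustration.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm: alternate assignment and centroid update steps."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[k] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return labels, centroids


# Hypothetical sample data: two well-separated groups of three points.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
labels, centroids = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
print(labels)
print(centroids)
```

A visualization tool would typically redraw the points and centroids after each pass through the loop, which is what makes the convergence of the algorithm visible.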
The "Clustered Remoting For Spring Framework" (or Cluster4Spring) is an alternative implementation of the remoting subsystem included in the Spring framework.
Clustered remoting scheme
While the implementation of remoting in Spring is great, it has several limitations that are quite important and must be taken into consideration when building a large enterprise-level distributed system.
Briefly, these limitations relate to the point-to-point model of remoting supported by Spring: generally speaking, the client may use only one instance of a remote service. With only this scheme of remoting, it is hard to develop fault-tolerant systems or implement any kind of load balancing.
Another feature currently missing from the remoting subsystem offered by the Spring framework is the ability to dynamically discover remote services.
The main purpose of Cluster4Spring is to extend the remoting system of the Spring framework and overcome the limitations mentioned above.
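To make the contrast with point-to-point remoting concrete, the sketch below shows, in language-neutral terms (written in Python), how a client that knows several instances of a service can combine round-robin load balancing with failover. It is not Cluster4Spring's API or implementation; the `ClusteredClient` class, the callable "endpoints", and the helper functions are all hypothetical stand-ins for remote service proxies.

```python
class ClusteredClient:
    """Hypothetical client that spreads calls over several service endpoints."""

    def __init__(self, endpoints):
        self._endpoints = list(endpoints)
        self._next = 0  # index of the endpoint to try first on the next call

    def call(self, *args):
        # Try each endpoint at most once, starting at the round-robin position.
        errors = []
        for offset in range(len(self._endpoints)):
            index = (self._next + offset) % len(self._endpoints)
            try:
                result = self._endpoints[index](*args)
            except ConnectionError as exc:
                errors.append(exc)  # endpoint unavailable: fail over to the next
                continue
            # Success: the following call starts with the next endpoint.
            self._next = (index + 1) % len(self._endpoints)
            return result
        raise ConnectionError(f"all {len(errors)} endpoints failed")


def healthy(name):
    """Build a fake endpoint that always answers (stand-in for a remote proxy)."""
    return lambda x: f"{name}:{x}"


def broken(x):
    """Fake endpoint that simulates an unreachable service instance."""
    raise ConnectionError("endpoint down")


client = ClusteredClient([healthy("a"), broken, healthy("b")])
results = [client.call(i) for i in range(3)]
print(results)
```

With a single endpoint, as in the point-to-point model, the first `ConnectionError` would surface directly to the caller; with several endpoints the broken instance is simply skipped.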
L. Hunyadi and I. Vajk. Proc. of the 15th International Conference on Systems, Signals and Image Processing (IWSSIP), pp. 197--200. Bratislava, Slovakia, (25--28 June 2008)