The Workflow Patterns initiative is a joint effort of Eindhoven University of Technology (led by Professor Wil van der Aalst) and Queensland University of Technology (led by Professor Arthur ter Hofstede) which started in 1999. The aim of this initiative is to provide a conceptual basis for process technology. In particular, the research provides a thorough examination of the various perspectives (control flow, data, resource, and exception handling) that need to be supported by a workflow language or a business process modelling language. The results can be used for examining the suitability of a particular process language or workflow system for a particular project, assessing relative strengths and weaknesses of various approaches to process specification, implementing certain business requirements in a particular process-aware information system, and as a basis for language and tool development.
Organisatoren
François Bry (LMU München, DE)
Clemens Cap (Universität Rostock, DE)
Ingo Dahm (Microsoft Deutschland GmbH - Unterschleissheim, DE)
Julia Maintz (Microsoft Deutschland GmbH - Köln, DE)
Sebastian Schaffert (Salzburg Research Forschungsgesellschaft m.b.H., AT)
Abstract
We formalize clustering as a partitioning problem with a user-defined internal clustering criterion and present SINICC, an unbiased, empirical method for comparing internal clustering criteria. An application to multi-sensor fusion is described, where the data set is composed of inexact sensor “reports” pertaining to “objects” in an environment. Given these reports, the objective is to produce a representation of the environment, where each entity in the representation is the result of “fusing” sensor reports. Before one can perform fusion, however, the reports must be “associated” into homogeneous clusters. Simulated annealing is used to find a near-optimal partitioning with respect to each of several clustering criteria for a variety of simulated data sets. This method can then be used to determine the “best” clustering criterion for the multi-sensor fusion problem with a given fusion operator.
costs US $ 31.50
CiteSeerX - Document Details (Isaac Councill, Lee Giles): A novel evolution strategy implementing variable length genomes is developed to address the problem of dynamic partitional clustering. As opposed to static, dynamic partitional clustering does not require the a priori specification of the number of clusters. Results of the algorithm are presented and discussed for 2-D touching and non-touching cluster test cases.
Abstract
In this paper we consider the problem of clustering m objects into c clusters. The objects are represented by points in an n-dimensional Euclidean space, and the objective is to classify these m points into c clusters such that the distance between points within a cluster and its center (which is to be found) is minimized. The problem is a nonconvex program that has many local minima. It has been studied by many researchers and the most well-known algorithm for solving it is the k-means algorithm. In this paper, we develop a new algorithm for solving this problem based on a tabu search technique. Preliminary computational experience on the developed algorithm are encouraging and compare favorably with both the k-means and the simulated annealing algorithms.
costs US $ 31.50
Abstract:
In this paper, an algorithm for cluster generation using tabu search approach with simulated annealing is proposed. The main idea of this algorithm is to use the tabu search approach to generate non-local moves for the clusters and apply the simulated annealing technique to select suitable current best solution so that speed the cluster generation. Experimental results demonstrate the proposed tabu search approach with simulated annealing algorithm for cluster generation is superior to the tabu search approach with Generalised Lloyd algorithm. 1 Clustering Clustering is the process of grouping patterns into a number of clusters, each of which contains the patterns that are similar to each other in some way. The existing clustering algorithms can be simply classied into the following two categories: hierarchical clustering and partitional clustering [1]. The hierarchical clustering operates by partitioning the patterns into successively fewer structures. This method gives rise to a d...
CiteSeerX - Document Details (Isaac Councill, Lee Giles): This paper describes a genetically guided approach to optimizing the hard (J1) and fuzzy (Jm) c-means functionals used in cluster analysis. Our experiments show that a genetic algorithm ameliorates the difficulty of choosing an initialization for the c-means clustering algorithms. Experiments use six data sets, including the Iris data, magnetic resonance and color images. The genetic algorithm approach is generally able to find the lowest known Jm value or a Jm associated with a partition very similar to that associated with the lowest Jm value. On data sets with several local extrema, the GA approach always avoids the less desirable solutions. Degenerate partitions are always avoided by the GA approach, which provides an effiective method for optimizing clustering models whose objective function can be represented in terms of cluster centers. The time cost of genetic guided clustering is shown to make a series random initializations of fuzzy/hard c-means, where the partition a...
CiteSeerX - Document Details (Isaac Councill, Lee Giles): Clustering is a hard combinatorial problem and is defined as the unsupervised classification of patterns. The formation of clusters is based on the principle of maximizing the similarity between objects of the same cluster while simultaneously minimizing the similarity between objects belonging to distinct clusters. This paper presents a tool for database clustering using a rule-based genetic algorithm (RBCGA). RBCGA evolves individuals consisting of a fixed set of clustering rules, where each rule includes d non-binary intervals, one for each feature. The investigations attempt to alleviate certain drawbacks related to the classical minimization of square-error criterion by suggesting a flexible fitness function which takes into consideration, cluster asymmetry, density, coverage and homogeny.
CiteSeerX - Document Details (Isaac Councill, Lee Giles): This paper proposes a new evolutionary algorithm for subspace clustering in very large and high dimensional databases. The design includes task-specific coding and genetic operators, along with a non-random initialization procedure. Reported experimental results show the algorithm scales almost linearly with the size and dimensionality of the database as well as the dimensionality of the hidden clusters.
CiteSeerX - Document Details (Isaac Councill, Lee Giles): Clustering categorical databases presents special difficulties due to the absence of natural dissimilarities between objects. We present a solution that overcomes these difficulties that is based on an information-theoretical definition of dissimilarities between partitions of finite sets (applied to partitions of the set of objects to be clustered which are determined by categorical attributes) and makes use of genetic algorithms for finding an acceptable approximative clustering. We tested our method on databases for which the clustering of the rows is known in advance and we show that our proposed method finds the natural clustering of the data with a good classification rate, better than that of the classical algorithm k-means.
CiteSeerX - Document Details (Isaac Councill, Lee Giles): In a database with categorical attributes each attribute denes a partition whose classes can be regarded as natural clusters of rows. The main theme of this paper is nding a partition of the rows of the database that is as close as possible to the partitions associated to each attribute. The classes of this partition will then be treated as clusters of rows. We evaluate the closeness of two partitions by using certain generalizations of the classical conditional entropy. From this perspective, we wish to construct a partition (referred to as the median partition) such that the sum of the dissimilarities between this partition and all the partitions determined by the attributes of the database is minimal. Then, the problem of nding the median partition is an optimization problem over the space of all partitions of the rows of the database for which we give an approximative solution. To search more eciently the space of possible partitions, which can be very large, we are using a genetic algorithm. Partitions are represented by chromosomes and we tested both the classical techniques of mutation and crossover and certain special mutation and crossover methods that contain specic knowledge of the problem domain. Keywords: median partition, Shannon entropy, Gini index, mutations, crossover operations 1
Abstract--Tbe applicability of evolution strategies (ESs), population based stochastic optimization techniques,
to optimize clustering objective functions is explored. Clustering objective functions are categorized into
centroid and non-centroid type of functions. Optimization of the centroid type of objective functions is
accomplished by formulating them as functions of real-valued parameters using ESs. Both hard and fuzzy
clustering objective functions are considered in this study. Applicability of ESs to discrete optimization
problems is extended to optimize the non-centroid type of objective functions. As ESs are amenable to
parallelization, a parallel model (master/slave model) is described in the context of the clustering problem.
Results obtained for selected data sets substantiate the utility of ESs in clustering.
The Seminar on Stochastic Processes is an ongoing series of conferences. Many of the participants work on such topics as Markov processes, Brownian motion, Superprocesses, Stochastic Analysis, or Mathematical Finance. SSP encourages interaction between young and established researchers, and typically devotes half its program to informal and problem sessions.
This webinar will demonstrate the new features found in Modeling projects releasing with Europa, this year’s simultaneous release of Eclipse projects. From the new Java 5 and generics support in EMF, to the new UML2 Tools diagrams and enhancements in OCL and GMF, we will provide an overview of what’s new and exciting in Modeling followed by Q&A.