Non-redundant Subgroup Discovery Using a Closure System
M. Boley, and H. Grosskreutz. Machine Learning and Knowledge Discovery in Databases, (2009)
Abstract
Subgroup discovery is a local pattern discovery task, in which descriptions of subpopulations of a database are evaluated
against some quality function. As standard quality functions are functions of the described subpopulation, we propose to searchfor equivalence classes of descriptions with respect to their extension in the database rather than individual descriptions.These equivalence classes have unique maximal representatives forming a closure system. We show that minimum cardinality representativesof each equivalence class can be found during the enumeration process of that closure system without additional cost, whilefinding a minimum representative of a single equivalence class is NP-hard. With several real-world datasets we demonstrate that search space and output are significantly reduced by consideringequivalence classes instead of individual descriptions and that the minimum representatives constitute a family of subgroupdescriptions that is of same or better expressive power than those generated by traditional methods.
%0 Journal Article
%1 mario2009nonredundant
%A Boley, Mario
%A Grosskreutz, Henrik
%D 2009
%J Machine Learning and Knowledge Discovery in Databases
%K closure ecml09 imported relevancy sg_Discovery
%P 179--194
%T Non-redundant Subgroup Discovery Using a Closure System
%U http://dx.doi.org/10.1007/978-3-642-04180-8_29
%X Subgroup discovery is a local pattern discovery task, in which descriptions of subpopulations of a database are evaluated
against some quality function. As standard quality functions are functions of the described subpopulation, we propose to searchfor equivalence classes of descriptions with respect to their extension in the database rather than individual descriptions.These equivalence classes have unique maximal representatives forming a closure system. We show that minimum cardinality representativesof each equivalence class can be found during the enumeration process of that closure system without additional cost, whilefinding a minimum representative of a single equivalence class is NP-hard. With several real-world datasets we demonstrate that search space and output are significantly reduced by consideringequivalence classes instead of individual descriptions and that the minimum representatives constitute a family of subgroupdescriptions that is of same or better expressive power than those generated by traditional methods.
@article{mario2009nonredundant,
abstract = {Subgroup discovery is a local pattern discovery task, in which descriptions of subpopulations of a database are evaluated
against some quality function. As standard quality functions are functions of the described subpopulation, we propose to searchfor equivalence classes of descriptions with respect to their extension in the database rather than individual descriptions.These equivalence classes have unique maximal representatives forming a closure system. We show that minimum cardinality representativesof each equivalence class can be found during the enumeration process of that closure system without additional cost, whilefinding a minimum representative of a single equivalence class is NP-hard. With several real-world datasets we demonstrate that search space and output are significantly reduced by consideringequivalence classes instead of individual descriptions and that the minimum representatives constitute a family of subgroupdescriptions that is of same or better expressive power than those generated by traditional methods.},
added-at = {2009-09-24T11:06:17.000+0200},
author = {Boley, Mario and Grosskreutz, Henrik},
biburl = {https://www.bibsonomy.org/bibtex/2de68c69146f7d35e58dc6e73894d7646/lemmi},
description = {SpringerLink - Buchkapitel},
interhash = {367a5cf3c5e0e6fdf7d23ef091e07252},
intrahash = {de68c69146f7d35e58dc6e73894d7646},
journal = {Machine Learning and Knowledge Discovery in Databases},
keywords = {closure ecml09 imported relevancy sg_Discovery},
pages = {179--194},
timestamp = {2009-09-24T11:09:17.000+0200},
title = {Non-redundant Subgroup Discovery Using a Closure System},
url = {http://dx.doi.org/10.1007/978-3-642-04180-8_29},
year = 2009
}