MPI-INF/SWS Research Reports 1991-2021



Discovering all most specific sentences by randomized algorithms

Gunopulos, Dimitrios and Mannila, Heikki and Saluja, Sanjeev

September 1996, 23 pages.

Status: available - back from printing

Data mining can in many instances be viewed as the task of computing a representation of a theory of a model or of a database. In this paper we present a randomized algorithm that can be used to compute the representation of a theory in terms of the most specific sentences of that theory. In addition to randomization, the algorithm uses a generalization of the concept of hypergraph transversals. We apply the general algorithm to two problems: discovering maximal frequent sets in 0/1 data and computing minimal keys in relations. We present some empirical results on the performance of these methods on real data. We also show some complexity-theoretic evidence of the hardness of these problems.

  • Attachment: (329 KBytes)


BibTeX:
  AUTHOR = {Gunopulos, Dimitrios and Mannila, Heikki and Saluja, Sanjeev},
  TITLE = {Discovering all most specific sentences by randomized algorithms},
  TYPE = {Research Report},
  INSTITUTION = {Max-Planck-Institut f{\"u}r Informatik},
  ADDRESS = {Im Stadtwald, D-66123 Saarbr{\"u}cken, Germany},
  NUMBER = {MPI-I-96-1-023},
  MONTH = {September},
  YEAR = {1996},
  ISSN = {0946-011X},