ADAPTIVE MICRO PARTITION AND HIERARCHICAL MERGING FOR ACCURATE MIXED DATA CLUSTERING

Abstract: Heterogeneous attribute data (also called mixed data), characterized by attributes with numerical and categorical values, occur frequently across various scenarios. Since the annotation cost is high, clustering has emerged as a favorable technique for analyzing unlabeled mixed data. To address this complex real-world clustering task, this paper proposes a new clustering method called Adaptive Micro Partition and Hierarchical Merging (AMPHM), based on neighborhood rough set theory and a novel hierarchical merging mechanism. Specifically, we present a unified distance metric over numerical and categorical attributes, which lets neighborhood rough sets partition data objects into fine-grained, compact clusters.
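To make the micro-partition idea concrete, here is a minimal sketch in Python. It is not the AMPHM metric or partitioning rule, which the abstract does not spell out. It assumes a Gower-style distance (range-normalized difference for numerical attributes, 0/1 mismatch for categorical ones) and a simple greedy neighborhood grouping. The names `mixed_distance`, `micro_partition`, and the radius `eps` are hypothetical.

```python
from typing import List, Sequence, Tuple

# Each object is (numerical values, categorical values). Illustrative layout only.
Row = Tuple[Sequence[float], Sequence[str]]


def mixed_distance(a: Row, b: Row, ranges: Sequence[float]) -> float:
    """Gower-style stand-in for a unified metric (an assumption, not the paper's).

    Numerical attributes contribute |x - y| / range, categorical attributes
    contribute 1 on mismatch and 0 on match; the result is the average over
    all attributes, so it lies in [0, 1].
    """
    num_a, cat_a = a
    num_b, cat_b = b
    total = 0.0
    for x, y, r in zip(num_a, num_b, ranges):
        total += abs(x - y) / r if r > 0 else 0.0
    total += sum(1.0 for u, v in zip(cat_a, cat_b) if u != v)
    n_attrs = len(num_a) + len(cat_a)
    return total / n_attrs if n_attrs else 0.0


def micro_partition(rows: List[Row], eps: float) -> List[List[int]]:
    """Greedy neighborhood grouping: each still-unassigned object seeds a
    micro cluster made of all unassigned objects within distance eps of it."""
    if not rows:
        return []
    n_num = len(rows[0][0])
    ranges = [
        max(r[0][j] for r in rows) - min(r[0][j] for r in rows)
        for j in range(n_num)
    ]
    assigned = [False] * len(rows)
    clusters: List[List[int]] = []
    for i in range(len(rows)):
        if assigned[i]:
            continue
        members = [
            j for j in range(len(rows))
            if not assigned[j] and mixed_distance(rows[i], rows[j], ranges) <= eps
        ]
        for j in members:
            assigned[j] = True
        clusters.append(members)
    return clusters


# Toy data: one numerical and one categorical attribute per object.
rows = [([1.0], ["a"]), ([1.2], ["a"]), ([9.0], ["b"]), ([9.1], ["b"])]
print(micro_partition(rows, eps=0.2))  # [[0, 1], [2, 3]]
```

A small `eps` yields many compact micro clusters, which is the property the partition step relies on; how the radius is chosen adaptively is left open here.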

Then, we gradually merge the current most similar clusters to avoid incorporating dissimilar objects into the same cluster. The proposed approach thereby overcomes the clustering performance bottleneck caused by a preset number of sought clusters k and by cluster distribution bias, and can thus cluster datasets comprising various combinations of numerical and categorical attributes. Extensive experimental evaluations comparing the proposed AMPHM with state-of-the-art counterparts on various datasets demonstrate its superiority.
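The merging step can be illustrated with a generic agglomerative loop. This sketch is an assumption-laden stand-in, not the AMPHM merging mechanism: it uses average linkage over a precomputed distance matrix and stops at a hypothetical `stop_threshold`, whereas the abstract does not state the similarity measure or the stopping rule. The function names are likewise hypothetical.

```python
from typing import List, Sequence


def average_linkage(c1: Sequence[int], c2: Sequence[int],
                    dist: Sequence[Sequence[float]]) -> float:
    """Mean pairwise distance between the members of two clusters."""
    return sum(dist[i][j] for i in c1 for j in c2) / (len(c1) * len(c2))


def merge_most_similar(clusters: List[List[int]],
                       dist: Sequence[Sequence[float]],
                       stop_threshold: float) -> List[List[int]]:
    """Repeatedly merge the closest pair of clusters.

    Merging stops once the closest pair is farther apart than
    stop_threshold (a hypothetical stopping rule for this sketch).
    """
    clusters = [list(c) for c in clusters]  # do not mutate the caller's lists
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = average_linkage(clusters[a], clusters[b], dist)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        if d > stop_threshold:
            break
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


# Toy example: four objects on a line at positions 0, 1, 10, 11.
points = [0.0, 1.0, 10.0, 11.0]
dist = [[abs(p - q) for q in points] for p in points]
print(merge_most_similar([[0], [1], [2], [3]], dist, stop_threshold=2.0))
# [[0, 1], [2, 3]]
```

Because only the single closest pair is merged per iteration, a dissimilar object is never pulled into a cluster while a better match still exists, which is the intuition behind merging "the current most similar clusters".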
