In this paper we present a clustering technique called DenClust that produces high-quality initial seeds through a deterministic process without requiring user input on the number of clusters k or the radius of the clusters r. The high-quality seeds are given to K-Means as its set of initial seeds to produce the final clusters. DenClust uses a density-based approach for initial seed selection. It calculates the density of each record, where the density of a record is the number of records that have the minimum distance to that record. This approach is expected to produce high-quality initial seeds for K-Means, resulting in high-quality clusters from a dataset. The performance of DenClust is compared with five (5) existing techniques, namely CRUDAW, AGCUK, Simple K-Means (SK), Basic Farthest Point Heuristic (BFPH) and New Farthest Point Heuristic (NFPH), in terms of three (3) external cluster evaluation criteria, namely F-Measure, Entropy and Purity, and two (2) internal cluster evaluation criteria, namely Xie-Beni Index (XB) and Sum of Square Error (SSE). We use three (3) natural datasets that we obtain from the UCI machine learning repository. DenClust performs better than all five existing techniques in terms of all five evaluation criteria for all three datasets used in this study.
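The pipeline described above (score each record by density, pass dense records to K-Means as seeds, then evaluate with SSE) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It rests on two assumptions that the abstract does not confirm: density is read as the number of other records whose nearest neighbour is the given record, and the hypothetical helper `pick_seeds` takes a `num_seeds` argument, whereas DenClust itself determines the number of seeds without user input. All function names are illustrative.

```python
from math import dist  # Euclidean distance, available from Python 3.8


def nearest_neighbour_density(records):
    """Assumed reading of 'density': for each record, count how many
    other records have it as their nearest neighbour."""
    n = len(records)
    density = [0] * n
    for j in range(n):
        # Nearest record to record j; ties go to the lowest index.
        nearest = min((i for i in range(n) if i != j),
                      key=lambda i: dist(records[i], records[j]))
        density[nearest] += 1
    return density


def pick_seeds(records, density, num_seeds):
    """Hypothetical placeholder: take the densest records as seeds.
    DenClust chooses the number of seeds itself; this sketch does not."""
    order = sorted(range(len(records)), key=lambda i: -density[i])
    return [records[i] for i in order[:num_seeds]]


def kmeans_from_seeds(records, seeds, max_iter=100):
    """Plain Lloyd-style K-Means started from the given seeds,
    so the result is deterministic for a fixed seed set."""
    centres = [tuple(s) for s in seeds]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centres)),
                          key=lambda c: dist(r, centres[c]))
                      for r in records]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(len(centres)):
            members = [r for r, l in zip(records, labels) if l == c]
            if members:  # keep the old centre if a cluster is empty
                centres[c] = tuple(sum(col) / len(members)
                                   for col in zip(*members))
    return labels, centres


def sse(records, labels, centres):
    """Sum of Square Error: squared distance of each record to its centre."""
    return sum(dist(r, centres[l]) ** 2 for r, l in zip(records, labels))


if __name__ == "__main__":
    # Two small, well-separated groups of 2-D records (made-up toy data).
    data = [(0, 0), (2, 0), (0, 2), (-2, 0), (0, -2),
            (10, 10), (12, 10), (10, 12), (8, 10), (10, 8)]
    dens = nearest_neighbour_density(data)
    seeds = pick_seeds(data, dens, num_seeds=2)
    labels, centres = kmeans_from_seeds(data, seeds)
    print(dens, seeds, labels, centres, sse(data, labels, centres))
```

Because the seeds are computed from the data rather than drawn at random, repeated runs on the same dataset give the same clusters, which is the practical benefit of a deterministic seeding step.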
|Conference||International Conference on Artificial Intelligence and Soft Computing|
|Period||01/06/14 → 05/06/14|