Abstract

In this paper we present a clustering technique called DenClust that produces high quality initial seeds through a deterministic process, without requiring user input on the number of clusters k or the radius of the clusters r. These seeds are given to K-Means as its set of initial seeds to produce the final clusters. DenClust uses a density based approach for initial seed selection. It calculates the density of each record, where the density of a record is the number of records that have the minimum distances with the record. This approach is expected to produce high quality initial seeds for K-Means, resulting in high quality clusters from a dataset. The performance of DenClust is compared with five (5) existing techniques, namely CRUDAW, AGCUK, Simple K-means (SK), Basic Farthest Point Heuristic (BFPH) and New Farthest Point Heuristic (NFPH), in terms of three (3) external cluster evaluation criteria, namely F-Measure, Entropy and Purity, and two (2) internal cluster evaluation criteria, namely Xie-Beni Index (XB) and Sum of Square Error (SSE). We use three (3) natural datasets that we obtain from the UCI machine learning repository. DenClust performs better than all five existing techniques in terms of all five evaluation criteria for all three datasets used in this study.
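
The seeding idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "the number of records that have the minimum distances with the record" means the number of other records whose nearest neighbour is that record, it assumes Euclidean distance on numerical records, and all function names are hypothetical. The abstract does not say how DenClust decides how many seeds to keep, so the toy usage simply takes the two densest records by hand.

```python
import math

def euclidean(a, b):
    # Assumed distance measure; the abstract does not name one.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_neighbour_density(records):
    """Assumed reading of 'density': how many other records have
    record i as their nearest neighbour (ties go to the lowest index)."""
    n = len(records)
    density = [0] * n
    for j in range(n):
        nearest = min((i for i in range(n) if i != j),
                      key=lambda i: euclidean(records[i], records[j]))
        density[nearest] += 1
    return density

def rank_by_density(records):
    """Deterministic ranking: densest first, index as tie-breaker."""
    density = nearest_neighbour_density(records)
    return sorted(range(len(records)), key=lambda i: (-density[i], i))

def k_means(records, seeds, max_iter=100):
    """Plain Lloyd iterations started from the supplied seeds."""
    centers = [list(s) for s in seeds]
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(len(centers)),
                          key=lambda c: euclidean(r, centers[c]))
                      for r in records]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(len(centers)):
            members = [r for r, l in zip(records, labels) if l == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

def sse(records, labels, centers):
    """Sum of Square Error: squared distance of each record to its center."""
    return sum(euclidean(r, centers[l]) ** 2 for r, l in zip(records, labels))

# Toy data: two well separated groups of three records each.
records = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
density = nearest_neighbour_density(records)  # [2, 1, 0, 2, 1, 0]
order = rank_by_density(records)
seeds = [records[i] for i in order[:2]]       # hand-picked count, see lead-in
labels, centers = k_means(records, seeds)
error = sse(records, labels, centers)
```

Because every step is deterministic, repeated runs on the same dataset give the same seeds and the same final clusters, which is the property the abstract contrasts with randomly seeded K-Means.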
Original language: English
Title of host publication: Artificial Intelligence and Soft Computing
Subtitle of host publication: 13th International Conference, ICAISC 2014, Zakopane, Poland, June 1-5, 2014, Proceedings, Part II
Place of Publication: Switzerland
Publisher: Springer International Publishing
Pages: 784-795
Number of pages: 12
Volume: 8468
Publication status: Published - 2014
Event: International Conference on Artificial Intelligence and Soft Computing - Zakopane, Poland
Duration: 01 Jun 2014 to 05 Jun 2014

Publication series

ISSN (Print): 0302-9743

Conference

Conference: International Conference on Artificial Intelligence and Soft Computing
Country/Territory: Poland
Period: 01/06/14 to 05/06/14

Title: DenClust: A density based seed selection approach for K-Means
