Seed-Detective: A Novel Clustering Technique Using High Quality Seed for K-Means on Categorical and Numerical Attributes

Research output: Book chapter/Published conference paper; Conference paper; peer-review

6 Citations (Scopus)

Abstract

In this paper we present a novel clustering technique called Seed-Detective. It combines modified versions of two existing techniques, namely Ex-Detective and Simple K-Means. Seed-Detective first discovers a set of preliminary clusters using our modified Ex-Detective, which allows a data miner to assign different weights (importance levels) to all attributes, both numerical and categorical. The centers of the preliminary clusters are then used as the initial seeds for the modified Simple K-Means, which, unlike the existing Simple K-Means, does not select the initial seeds randomly. Centers of the preliminary clusters are expected to be better quality seeds than randomly chosen seeds and, given better quality initial seeds as input, the modified Simple K-Means is expected to produce better quality clusters. We compare Seed-Detective with several existing techniques, including Ex-Detective, Simple K-Means, Basic Farthest Point Heuristic (BFPH) and New Farthest Point Heuristic (NFPH), on two publicly available natural data sets. BFPH and NFPH were shown in the literature to be better than Simple K-Means. However, our initial experimental results indicate that Seed-Detective produces better clusters than the other techniques, based on several evaluation criteria including F-measure, entropy and purity. Another contribution of this paper is the experimental evaluation of Ex-Detective, which had not been tested before.
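The seeding idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the weighted distance (absolute difference for numerical attributes scaled to [0, 1], simple mismatch for categorical ones), the mean/mode cluster center, and the names `distance`, `center`, `seeded_kmeans` and `purity` are all assumptions made for illustration. The preliminary clusters are supplied by hand here, standing in for the output of the modified Ex-Detective, which the abstract does not specify.

```python
from collections import Counter


def distance(a, b, is_numeric, weights):
    """Weighted mixed-attribute distance (an assumed form, not the paper's formula)."""
    total = 0.0
    for x, y, num, w in zip(a, b, is_numeric, weights):
        if num:
            total += w * abs(x - y)  # numerical values assumed scaled to [0, 1]
        else:
            total += w * (0.0 if x == y else 1.0)  # simple mismatch for categories
    return total


def center(records, is_numeric):
    """Mean for numerical attributes, most frequent value for categorical ones."""
    columns = zip(*records)
    return tuple(
        sum(col) / len(col) if num else Counter(col).most_common(1)[0][0]
        for col, num in zip(columns, is_numeric)
    )


def seeded_kmeans(records, seeds, is_numeric, weights, max_iter=100):
    """K-Means that starts from the given seeds instead of random ones."""
    centers = list(seeds)
    assignment = None
    for _ in range(max_iter):
        new_assignment = [
            min(range(len(centers)),
                key=lambda k: distance(r, centers[k], is_numeric, weights))
            for r in records
        ]
        if new_assignment == assignment:  # converged: no record changed cluster
            break
        assignment = new_assignment
        for k in range(len(centers)):
            members = [r for r, a in zip(records, assignment) if a == k]
            if members:  # keep the old center if a cluster becomes empty
                centers[k] = center(members, is_numeric)
    return assignment, centers


def purity(assignment, labels):
    """Fraction of records that carry the majority class label of their cluster."""
    majority_total = 0
    for k in set(assignment):
        classes = [lab for a, lab in zip(assignment, labels) if a == k]
        majority_total += Counter(classes).most_common(1)[0][1]
    return majority_total / len(labels)


# Toy data: one numerical attribute (already in [0, 1]) and one categorical attribute.
records = [(0.1, "red"), (0.2, "red"), (0.15, "blue"),
           (0.9, "blue"), (0.8, "blue"), (0.85, "red")]
is_numeric = [True, False]
weights = [1.0, 0.5]  # hypothetical user-assigned attribute weights

# Hand-made stand-in for the preliminary clusters found in the first stage.
preliminary = [[records[0], records[1]], [records[3], records[4]]]
seeds = [center(cluster, is_numeric) for cluster in preliminary]

assignment, centers = seeded_kmeans(records, seeds, is_numeric, weights)
print(assignment, purity(assignment, ["a", "a", "a", "b", "b", "b"]))
```

Because the seeds are fixed rather than drawn at random, repeated runs on the same data give the same clustering, which is one practical consequence of replacing random initialisation.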
Original language: English
Title of host publication: 9th Australasian Data Mining Conference
Subtitle of host publication: AusDM 2011
Editors: V Estivill-Castro, S Simoff
Place of Publication: Sydney, Australia
Publisher: Australian Computer Society Inc
Pages: 211-220
Number of pages: 10
Volume: 121
ISBN (Electronic): 9781921770029
Publication status: Published - 2011
Event: The 9th Australasian Data Mining Conference: AusDM 2011 - University of Ballarat, Ballarat, Australia
Duration: 01 Dec 2011 to 02 Dec 2011

Publication series

Name: Conferences in Research and Practice in Information Technology Series
Publisher: Australian Computer Society
Volume: 121
ISSN (Print): 1445-1336

Conference

Conference: The 9th Australasian Data Mining Conference
Country: Australia
City: Ballarat
Period: 01/12/11 to 02/12/11
