Automatic Selection of High Quality Initial Seeds for Generating High Quality Clusters without requiring any User Inputs

Research output: Thesis (Doctoral Thesis)



Clustering is an important data mining task that groups similar records into the same cluster and dissimilar records into different clusters. It is used in many fields for knowledge discovery and decision making. Many existing clustering techniques have limitations, such as requiring various user inputs (for example, the number of clusters) and getting stuck at local optima. It can be difficult for a user to provide these inputs in advance, and there is room to improve the quality of the clusters these techniques produce. Because clustering is so widely used, it is important to develop techniques that produce better quality clustering results. In this study we propose clustering techniques that produce high quality clusters without requiring any user input on the number of clusters. The proposed techniques produce high quality initial seeds that are then fed into K-Means to produce high quality clusters. We argue that the user should be allowed, but not required, to assign attribute weights in order to suit their clustering purpose. Our techniques therefore allow the user to assign weights, but also permit clustering without any input on the weights. Moreover, we propose a technique that automatically selects attribute weights. Finally, we propose a technique called GenClust that does not require any user input and produces better quality clusters than many existing techniques in terms of six cluster evaluation criteria over the 20 datasets used in our experiments.
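
As a rough illustration of what "feeding initial seeds into K-Means" can look like, the following is a minimal sketch and not the thesis's own method. It assumes the seeds have already been chosen by some earlier step, uses numerical records with Euclidean distance, and ignores attribute weights. The function name `kmeans_from_seeds` and the sample data are hypothetical. The number of clusters is taken from the number of seeds, so the user supplies no value for it.

```python
import math

def kmeans_from_seeds(records, seeds, max_iter=100):
    """Plain Lloyd-style K-Means started from given seeds (hypothetical helper).

    The number of clusters is len(seeds); no separate k is supplied.
    """
    centers = [list(s) for s in seeds]
    labels = None
    for _ in range(max_iter):
        # Assign each record to its nearest center (Euclidean distance).
        new_labels = [
            min(range(len(centers)), key=lambda j: math.dist(r, centers[j]))
            for r in records
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers will not change
        labels = new_labels
        # Move each center to the mean of the records assigned to it.
        for j in range(len(centers)):
            members = [r for r, lab in zip(records, labels) if lab == j]
            if members:  # keep the old center if a cluster ends up empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

# Illustrative data only: two well-separated groups and one seed per group.
records = [(0, 0), (0, 1), (10, 10), (10, 11)]
seeds = [(0, 0), (10, 10)]
labels, centers = kmeans_from_seeds(records, seeds)
```

Because K-Means only refines the centers it is given, the quality of the final clusters in a sketch like this depends heavily on the seeds, which is the motivation the abstract gives for selecting them carefully.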
Original language: English
Qualification: Doctor of Philosophy
Awarding Institution
  • Charles Sturt University
Supervisors
  • Islam, Zahid, Principal Supervisor
  • Bossomaier, Terry, Co-Supervisor
  • Zia, Tanveer, Co-Supervisor
Award date: 01 Mar 2014
Place of Publication: Australia
Publication status: Published - 2014
