Improving clustering via a fine-grained parallel genetic algorithm with information sharing

Storm Bartlett, Md Zahidul Islam

Research output: Book chapter/Published conference paper › Conference paper › peer-review

Abstract

Clustering is a very common unsupervised machine learning task, used to organise datasets into groups that can provide useful insight. Genetic algorithms (GAs) are often applied to the task of clustering as they are effective at finding viable solutions to optimization problems. Parallel genetic algorithms (PGAs) are an existing approach that maximizes the effectiveness of GAs by running them in parallel with multiple independent subpopulations. The subpopulations can also communicate by exchanging information throughout the genetic process, enhancing their overall effectiveness. PGAs offer greater performance by mitigating some of the weaknesses of GAs. Firstly, having multiple subpopulations enables the algorithm to explore the solution space more widely. This can reduce the probability of converging to poor-quality local optima, while increasing the chance of finding high-quality local optima. Secondly, PGAs offer improved execution time, as each subpopulation is processed in parallel on a separate thread. Our technique advances an existing GA-based method called GenClust++ by employing a PGA along with a novel information sharing technique. We also compare our technique with two alternative information sharing functions, as well as with no information sharing. On five commonly researched datasets, our approach consistently yields improved cluster quality and a markedly reduced runtime compared to GenClust++.
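
The general idea of subpopulations that evolve on separate threads and periodically exchange information can be illustrated with a minimal sketch. This is not GenClust++ and not the information sharing technique proposed in the paper, neither of which is specified in the abstract. It is a generic scheme assumed purely for illustration: each subpopulation evolves independently, and after every epoch its best solution replaces the worst solution of the next subpopulation in a ring. All function names, parameters, the 1-D data, and the sum-of-squared-errors fitness are hypothetical choices.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def sse(centers, points):
    """Sum of squared distances from each point to its nearest center (lower is better)."""
    return sum(min((p - c) ** 2 for c in centers) for p in points)

def evolve(island, points, rng, generations):
    """Run a few generations of a tiny elitist GA on one subpopulation."""
    for _ in range(generations):
        island.sort(key=lambda ind: sse(ind, points))
        survivors = island[: len(island) // 2]      # keep the better half unchanged
        children = []
        while len(survivors) + len(children) < len(island):
            a, b = rng.sample(survivors, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            i = rng.randrange(len(child))
            child[i] += rng.gauss(0.0, 0.5)         # mutate one center
            children.append(child)
        island[:] = survivors + children
    return island

def migrate(islands, points):
    """Assumed ring sharing: each island's best replaces the next island's worst."""
    bests = [min(isl, key=lambda ind: sse(ind, points)) for isl in islands]
    for i, isl in enumerate(islands):
        worst = max(range(len(isl)), key=lambda j: sse(isl[j], points))
        isl[worst] = list(bests[i - 1])             # copy from the previous island

def parallel_ga(points, k=2, n_islands=4, island_size=8,
                epochs=5, gens_per_epoch=5, seed=0):
    """Evolve n_islands subpopulations on separate threads, sharing after each epoch."""
    master = random.Random(seed)
    # One independent generator per island keeps results reproducible under threading.
    rngs = [random.Random(master.randrange(2 ** 32)) for _ in range(n_islands)]
    lo, hi = min(points), max(points)
    islands = [[[r.uniform(lo, hi) for _ in range(k)] for _ in range(island_size)]
               for r in rngs]
    with ThreadPoolExecutor(max_workers=n_islands) as pool:
        for _ in range(epochs):
            futures = [pool.submit(evolve, isl, points, r, gens_per_epoch)
                       for isl, r in zip(islands, rngs)]
            islands = [f.result() for f in futures]
            migrate(islands, points)
    return min((ind for isl in islands for ind in isl),
               key=lambda ind: sse(ind, points))

if __name__ == "__main__":
    data = [1.0, 1.2, 0.8, 9.0, 9.3, 8.7]
    best = parallel_ga(data)
    print(best, sse(best, data))
```

In CPython, threads mainly illustrate the structure here, since the interpreter lock limits speed-up for pure-Python work. A runtime gain like the one the abstract reports would depend on an implementation whose threads can actually run concurrently.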

Original language: English
Title of host publication: Data Mining - 17th Australasian Conference, AusDM 2019, Proceedings
Editors: Thuc D. Le, Lin Liu, Kok-Leong Ong, Yanchang Zhao, Warren H. Jin, Sebastien Wong, Graham Williams
Publisher: Springer
Pages: 3-15
Number of pages: 13
ISBN (Print): 9789811516986
Publication status: Published - 23 Nov 2019
Event: 17th Australasian Data Mining Conference (AusDM '19) - InterContinental Adelaide at North Terrace, Adelaide, Australia
Duration: 02 Dec 2019 → 05 Dec 2019
https://web.archive.org/web/20191228213133/https://ausdm19.ausdm.org/index.php (Conference website)
https://web.archive.org/web/20191231040752/http://nugget.unisa.edu.au/AI2019/program.pdf (Conference program)

Publication series

Name: Communications in Computer and Information Science
Volume: 1127 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 17th Australasian Data Mining Conference (AusDM '19)
Country/Territory: Australia
City: Adelaide
Period: 02/12/19 → 05/12/19
Other: Since AusDM’02 the conference has showcased research in data mining, providing a forum for presenting and discussing the latest research and developments. Built on this tradition, AusDM’19 will facilitate the cross-disciplinary exchange of ideas, experience and potential research directions. Specifically, the conference seeks to showcase: Research Prototypes; Industry Case Studies; Practical Analytics Technology; and Research Student Projects. AusDM’19 will be a meeting place for pushing forward the frontiers of data mining in academia and industry.