Abstract
Clustering is a common unsupervised machine learning task that organises a dataset into groups that can provide useful insight. Genetic algorithms (GAs) are often applied to clustering because they are effective at finding viable solutions to optimization problems. Parallel genetic algorithms (PGAs) are an existing approach that increases the effectiveness of GAs by running multiple independent subpopulations in parallel. The subpopulations can also exchange information throughout the genetic process, enhancing their overall effectiveness. PGAs offer greater performance by mitigating some of the weaknesses of GAs. Firstly, having multiple subpopulations enables the algorithm to explore the solution space more widely. This can reduce the probability of converging to poor-quality local optima while increasing the chance of finding high-quality local optima. Secondly, PGAs offer improved execution time, as each subpopulation is processed in parallel on a separate thread. Our technique advances an existing GA-based method called GenClust++ by employing a PGA along with a novel information sharing technique. We also compare our technique with two alternative information sharing functions, as well as with no information sharing. On five commonly researched datasets, our approach consistently yields improved cluster quality and a markedly reduced runtime compared to GenClust++.
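The general idea of a PGA for clustering can be illustrated with a minimal island-model sketch. This is not GenClust++ and not the information sharing technique proposed in the paper: the chromosome encoding (a list of k centroids), the fitness (sum of squared errors), the ring migration rule, and all function and parameter names below are assumptions chosen for illustration only. Note also that in CPython, threads mainly show the structure here; pure-Python work on threads may not run faster than a sequential loop.

```python
import random
from concurrent.futures import ThreadPoolExecutor

def sse(centroids, points):
    """Sum of squared distances from each point to its nearest centroid (lower is better)."""
    return sum(
        min(sum((p - c) ** 2 for p, c in zip(pt, cen)) for cen in centroids)
        for pt in points
    )

def evolve(island, points, rng, generations):
    """Run a few generations of a simple elitist GA on one subpopulation."""
    for _ in range(generations):
        island.sort(key=lambda ch: sse(ch, points))
        survivors = island[: len(island) // 2]
        children = []
        while len(survivors) + len(children) < len(island):
            a, b = rng.sample(survivors, 2)
            # Uniform crossover over whole centroids, then Gaussian mutation of one centroid.
            child = [list(rng.choice(pair)) for pair in zip(a, b)]
            i = rng.randrange(len(child))
            child[i] = [x + rng.gauss(0, 0.1) for x in child[i]]
            children.append(child)
        island[:] = survivors + children
    island.sort(key=lambda ch: sse(ch, points))
    return island

def migrate_ring(islands):
    """Hypothetical sharing rule: each island's best replaces the next island's worst."""
    bests = [[list(c) for c in isl[0]] for isl in islands]
    for i, isl in enumerate(islands):
        isl[-1] = bests[i - 1]

def parallel_ga(points, k, n_islands=4, pop_size=10, epochs=5, gens_per_epoch=5, seed=0):
    # One private random generator per island keeps results reproducible across threads.
    rngs = [random.Random(seed + i) for i in range(n_islands)]
    islands = [
        [[list(r.choice(points)) for _ in range(k)] for _ in range(pop_size)]
        for r in rngs
    ]
    with ThreadPoolExecutor(max_workers=n_islands) as pool:
        for _ in range(epochs):
            # Subpopulations evolve independently, then exchange information.
            islands = list(pool.map(
                evolve, islands, [points] * n_islands, rngs, [gens_per_epoch] * n_islands
            ))
            migrate_ring(islands)
    return min((ch for isl in islands for ch in isl), key=lambda ch: sse(ch, points))
```

Calling `parallel_ga(points, k=2)` on a list of numeric tuples returns the best set of k centroids found across all islands. Setting `epochs=1` with a large `gens_per_epoch` approximates the "no information sharing" baseline in this sketch, since migration then happens only once, after all evolution has finished.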
Original language | English |
---|---|
Title of host publication | Data Mining - 17th Australasian Conference, AusDM 2019, Proceedings |
Editors | Thuc D. Le, Lin Liu, Kok-Leong Ong, Yanchang Zhao, Warren H. Jin, Sebastien Wong, Graham Williams |
Publisher | Springer |
Pages | 3-15 |
Number of pages | 13 |
ISBN (Print) | 9789811516986 |
Publication status | Published - 23 Nov 2019 |
Event | 17th Australasian Data Mining Conference (AusDM '19), InterContinental Adelaide at North Terrace, Adelaide, Australia. Duration: 02 Dec 2019 → 05 Dec 2019. Conference website: https://web.archive.org/web/20191228213133/https://ausdm19.ausdm.org/index.php Conference program: https://web.archive.org/web/20191231040752/http://nugget.unisa.edu.au/AI2019/program.pdf |
Publication series
Name | Communications in Computer and Information Science |
---|---|
Volume | 1127 CCIS |
ISSN (Print) | 1865-0929 |
ISSN (Electronic) | 1865-0937 |
Conference
Conference | 17th Australasian Data Mining Conference (AusDM '19) |
---|---|
Country/Territory | Australia |
City | Adelaide |
Period | 02/12/19 → 05/12/19 |
Other | Since AusDM’02 the conference has showcased research in data mining, providing a forum for presenting and discussing the latest research and developments. Built on this tradition, AusDM’19 will facilitate the cross-disciplinary exchange of ideas, experience and potential research directions. Specifically, the conference seeks to showcase: Research Prototypes; Industry Case Studies; Practical Analytics Technology; and Research Student Projects. AusDM’19 will be a meeting place for pushing forward the frontiers of data mining in academia and industry. |