Abstract

The ability to extract knowledge from data has been the driving force of Data Mining since its inception, and of statistical modeling long before that. Actionable knowledge often takes the form of patterns, also known as rules, in which a set of antecedents is used to infer a consequent. In this paper we offer a solution to the problem of comparing different sets of rules. Our solution allows comparisons between rule lists that were derived using different techniques (such as different classification algorithms) or built from different samples of data (such as temporal data or data perturbed for privacy reasons). We propose using the Jaccard Index to measure the similarity between rule lists, treating each rule as a single element of a set of rules. Our measure aims for conceptual simplicity, computational simplicity, interpretability, and wide applicability. The results of this measure are compared to Prediction Accuracy in the context of a real-world data mining scenario.
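As a rough illustration of the idea in the abstract, the Jaccard Index of two sets A and B is |A ∩ B| / |A ∪ B|. The sketch below applies it to two rule lists. The abstract does not specify how a rule is converted into a set element, so the representation used here (a pair of a frozen set of antecedent strings and a consequent string, matched exactly) is an assumption made for illustration, as are the names `make_rule` and `jaccard_similarity` and the toy rules themselves.

```python
from typing import FrozenSet, Iterable, Tuple

# Assumed representation: a rule is (set of antecedent conditions, consequent).
Rule = Tuple[FrozenSet[str], str]


def make_rule(antecedents: Iterable[str], consequent: str) -> Rule:
    """Build a hashable rule; antecedent order is ignored (an assumption)."""
    return (frozenset(antecedents), consequent)


def jaccard_similarity(rules_a: Iterable[Rule], rules_b: Iterable[Rule]) -> float:
    """Jaccard Index |A & B| / |A | B| over two rule lists treated as sets."""
    set_a, set_b = set(rules_a), set(rules_b)
    union = set_a | set_b
    if not union:
        # Convention chosen here: two empty rule lists count as identical.
        return 1.0
    return len(set_a & set_b) / len(union)


# Hypothetical rule lists, e.g. from two different classifiers.
list_a = [
    make_rule(["age > 30", "income = high"], "buys = yes"),
    make_rule(["age <= 30"], "buys = no"),
]
list_b = [
    make_rule(["income = high", "age > 30"], "buys = yes"),
    make_rule(["student = yes"], "buys = yes"),
]

similarity = jaccard_similarity(list_a, list_b)
print(f"Jaccard similarity: {similarity:.3f}")
```

Here the two lists share one rule out of three distinct rules, so the similarity is one third. Exact matching is the simplest choice; whether near-identical rules (for example, slightly different numeric thresholds) should count as equal is a design decision this sketch does not address.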
Original language: English
Title of host publication: Proceedings of the 14th Australasian Data Mining Conference (AusDM 16)
Place of publication: Australia
Publisher: CRPIT
Pages: 1-8
Number of pages: 8
Publication status: Published - 2016
Event: The 14th Australasian Data Mining Conference: AusDM 2016 - Realm Hotel, Canberra, Australia
Duration: 06 Dec 2016 to 08 Dec 2016
http://ausdm16.ausdm.org/

Title: Measuring the similarity between rule lists