Clustering heterogeneous semi-structured social science datasets for security applications

D. B. Skillicorn, C. Leuprecht

Research output: Book chapter/Published conference paper; Chapter (peer-reviewed)


Social scientists have begun to collect large datasets that are heterogeneous and semi-structured, but the ability to analyze such data has lagged behind its collection. We design a process to map such datasets to a numerical form, apply singular value decomposition clustering, and explore the impact of individual attributes or fields by overlaying visualizations of the clusters. This provides a new path for understanding such datasets, which we illustrate with three real-world examples: the Global Terrorism Database, which records details of every terrorist attack since 1970; a Chicago police dataset, which records details of every drug-related incident over a period of approximately a month; and a dataset describing members of a Hezbollah crime/terror network in the U.S.
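The pipeline described in the abstract (map records to numbers, decompose with SVD, then inspect the low-dimensional picture) can be illustrated with a minimal sketch. It is not the authors' implementation: the encoding choices (z-scoring numeric fields, one-hot encoding categorical fields, mean imputation for missing values), the helper names `encode_records` and `svd_embed`, and the four toy records are all assumptions made here for illustration, and the records are invented rather than drawn from any of the three datasets.

```python
import numpy as np

def encode_records(records, numeric_fields, categorical_fields):
    """Map a list of dicts to a numeric matrix (hypothetical encoding scheme).

    Numeric fields are z-scored; categorical fields are one-hot encoded.
    Missing numeric values are replaced by the column mean.
    """
    columns, names = [], []
    for field in numeric_fields:
        values = np.array([float(r.get(field, np.nan)) for r in records])
        mean = np.nanmean(values)
        values = np.where(np.isnan(values), mean, values)
        std = values.std()
        columns.append((values - mean) / std if std > 0 else np.zeros_like(values))
        names.append(field)
    for field in categorical_fields:
        levels = sorted({r[field] for r in records if field in r})
        for level in levels:
            columns.append(np.array([1.0 if r.get(field) == level else 0.0 for r in records]))
            names.append(f"{field}={level}")
    return np.column_stack(columns), names

def svd_embed(matrix, k=2):
    """Project the rows of a matrix onto its top-k singular directions."""
    centered = matrix - matrix.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :k] * s[:k]

# Invented toy records, for illustration only.
records = [
    {"year": 1975, "count": 2, "kind": "A"},
    {"year": 1976, "count": 0, "kind": "A"},
    {"year": 2010, "count": 15, "kind": "B"},
    {"year": 2012, "count": 11, "kind": "B"},
]

matrix, names = encode_records(records, ["year", "count"], ["kind"])
coords = svd_embed(matrix, k=2)  # one 2-D point per record
```

Plotting `coords` and colouring each point by the value of a single field (here, `kind`) would correspond to the overlay step: it shows whether that attribute lines up with the structure the decomposition has found.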
Original language: English
Title of host publication: Security by design
Subtitle of host publication: Innovative perspectives on complex problems
Editors: Anthony J. Masys
Place of Publication: Cham, Switzerland
Number of pages: 11
ISBN (Electronic): 9783319780214
ISBN (Print): 9783319780207
Publication status: Published - 2018

Publication series

Name: Advanced Sciences and Technologies for Security Applications

