Potha, N.; Maragoudakis, M.; Lyras, D. (2016): A biology-inspired, data mining framework for extracting patterns in sexual cyberbullying data. In: Knowledge-Based Systems, Vol. 96: pp. 134-155
Full text not available from 'Open Access LMU'.

Abstract

With the rapid growth of social media, users, especially adolescents, are spending a significant amount of time on various social networking sites to connect with others, to share information, and to pursue common interests. However, as social networking has become widespread, certain people are finding illegal and unethical ways to use these communities as a means of opening the door to inappropriate online activities. Thus, they are providing an open way for cybercrimes such as cyberbullying. In this paper, we treat the aforementioned issue as a time series modelling problem, aiming at the recognition of bullying patterns within the questions posed by a predator to his victims. Given a set of real-world transcripts (i.e. the whole set of each predator's questions), in which each question is numerically labelled in terms of severity, we first model each set of a predator's questions as a time series. The next step is the main contribution of this paper: changing the representation scheme from time series data into a symbolic representation. More specifically, inspired by the Multiple Sequence Alignment (MSA) method, commonly used in computational biology for identifying conserved regions of similarity among raw molecular data, we represent the set of signals according to a SAX (Symbolic Aggregate approXimation) symbolic representation, transforming each signal into a symbol string. The main rationale behind this adoption lies in the fact that the collected cyberbullying data can be converted to string sequences via SAX conversion, which in turn can be aligned, thus revealing conserved temporal patterns or slight variations in the attacking strategies of the predators. Experimental results, based on the improvement in clustering of the aforementioned data when using the extracted patterns instead of the time series data, justify our claims. © 2015 Elsevier B.V. All rights reserved.
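
The SAX conversion mentioned in the abstract can be illustrated with a minimal sketch. It follows the generic SAX recipe (z-normalisation, piecewise aggregate approximation, then discretisation against equiprobable breakpoints of the standard normal distribution) and is not the authors' implementation. The function name `sax`, its parameters `n_segments` and `alphabet`, and the sample severity values are all hypothetical and chosen only for illustration.

```python
from statistics import NormalDist, mean, pstdev


def sax(series, n_segments, alphabet="abcd"):
    """Convert a numeric series into a SAX symbol string (illustrative sketch)."""
    n = len(series)
    if not 1 <= n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(series)")

    # 1. Z-normalise so that breakpoints of the standard normal apply.
    mu, sigma = mean(series), pstdev(series)
    if sigma == 0:
        normed = [0.0] * n  # a constant series carries no shape information
    else:
        normed = [(x - mu) / sigma for x in series]

    # 2. Piecewise Aggregate Approximation: the mean of each segment.
    paa = []
    for i in range(n_segments):
        start = i * n // n_segments
        end = (i + 1) * n // n_segments
        paa.append(mean(normed[start:end]))

    # 3. Breakpoints that cut N(0, 1) into len(alphabet) equiprobable regions.
    k = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(j / k) for j in range(1, k)]

    # 4. Map each segment mean to the symbol of the region it falls into.
    return "".join(alphabet[sum(v > b for b in breakpoints)] for v in paa)


# Hypothetical severity labels for one predator's sequence of questions.
severity = [1, 1, 2, 3, 5, 5, 4, 2]
print(sax(severity, n_segments=4))
```

Strings produced this way for different transcripts could then be passed to a sequence alignment tool, which is the step the abstract attributes to MSA. The alphabet size and the number of segments control how coarse the symbolic view is, and the values used here are arbitrary.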