D6.4 Open: Profile matching and risk indicators for potential young victims and offenders
This deliverable is the final outcome of Task 6.3 “Profile matching and definition of risk indicators for potential young victims and offenders”, the third task of Work Package (WP) 6 in the RAYUELA project. The task takes as input, from previous tasks, a set of potentially key factors/variables for detecting or probabilistically classifying the participants of the pilots. These variables also support the definition of risk patterns (for offenders and for victims) for the cybercrimes under consideration. The approach taken here, however, differs from that of Task 6.2, which relied on Machine Learning predictions: this task uses an approach based on causality and Bayesian statistics.
More specifically, we analysed the data collected in the RAYUELA pilots using Bayesian Networks, with RAYUELA cyberbullying experts assisting in proposing the network architectures. We then performed a series of causal statistical analyses to identify the key factors/drivers that characterise potential victims and perpetrators. Finally, we discuss the limitations of the techniques used and of the data available so far, which make us cautious about the conclusions that can be drawn from the results.
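As a rough illustration of the kind of reasoning a Bayesian Network supports, the following minimal sketch computes a posterior risk by exact enumeration over a three-node network. Everything in it is an assumption made for illustration only: the variable names (risky_choice, low_supervision, victim), the network structure and all probabilities are hypothetical and are not taken from the RAYUELA data or from the networks proposed by the experts.

```python
from itertools import product

# Hypothetical binary variables and probabilities (illustrative only,
# NOT derived from the RAYUELA pilots).
PRIOR_RISKY_CHOICE = 0.3      # P(risky_choice = 1)
PRIOR_LOW_SUPERVISION = 0.4   # P(low_supervision = 1)
# P(victim = 1 | risky_choice, low_supervision)
CPT_VICTIM = {(0, 0): 0.05, (0, 1): 0.15, (1, 0): 0.20, (1, 1): 0.40}


def bernoulli(p, value):
    """Probability that a binary variable with P(1) = p takes `value`."""
    return p if value == 1 else 1.0 - p


def joint(risky, low_sup, victim):
    """Joint probability factorised along the network:
    risky_choice -> victim <- low_supervision."""
    return (bernoulli(PRIOR_RISKY_CHOICE, risky)
            * bernoulli(PRIOR_LOW_SUPERVISION, low_sup)
            * bernoulli(CPT_VICTIM[(risky, low_sup)], victim))


def posterior_victim(evidence):
    """P(victim = 1 | evidence), summing the joint over all assignments
    that agree with the evidence (exact inference by enumeration)."""
    numerator = 0.0
    denominator = 0.0
    for risky, low_sup, victim in product((0, 1), repeat=3):
        assignment = {"risky_choice": risky,
                      "low_supervision": low_sup,
                      "victim": victim}
        if any(assignment[name] != value for name, value in evidence.items()):
            continue
        p = joint(risky, low_sup, victim)
        denominator += p
        if victim == 1:
            numerator += p
    return numerator / denominator


if __name__ == "__main__":
    print(posterior_victim({"risky_choice": 1}))
    print(posterior_victim({"risky_choice": 0}))
```

Comparing the posterior with and without a given piece of evidence gives a simple indication of how strongly that variable is associated with the outcome under the assumed structure. Enumeration is only practical for very small networks; larger ones would normally rely on a dedicated inference library.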
The reliability of the results depends on the cybercrime in question. Cyberbullying is the only cybercrime considered for which we have a validated psychological questionnaire that players must answer [7]. Of all the data collected, these responses are the closest to a “ground truth” against which results can be validated and evaluated. For this reason, the methods and conclusions drawn from the analysis of cyberbullying will be useful for the study of the other cybercrimes considered in RAYUELA.
The results of this analysis suggest that the variables collected through the RAYUELA serious game are promising for risk estimation. However, when the strength of influence of multiple variables is assessed at the same time (i.e., multifactor analysis), the difference between the variables coming from the video game and those from profiling (demographic and psychological) narrows. The results should be interpreted with caution because of the limited amount of data available for analysis and the noise inherent in social science and video game data. Nevertheless, these initial results suggest that the RAYUELA serious game has the potential to be a valuable tool for social research, and that its capabilities merit further exploration.