Multimodal Deep Learning Violence Detector for Child-Friendly Online Game

Jasson Prestiliano; Azhari Azhari; Arif Nurwidyantoro

doi:10.17083/9k96e890

Authors

Jasson Prestiliano, Universitas Kristen Satya Wacana
Azhari Azhari, Universitas Gadjah Mada
Arif Nurwidyantoro, Universitas Gadjah Mada

Keywords:

multimodal, deep learning, child-friendly rated, online games, visual violence detection, verbal violence detection

Abstract

Violence in child-friendly online games takes both visual and verbal forms. Visual violence occurs when players perform actions that harm their own avatar or another player's avatar. Verbal violence, by contrast, often arises during player interactions, even when no physical action takes place. This study examines whether a multimodal deep learning framework can detect violence more effectively by analyzing visual and verbal signals together, and whether a hybrid late fusion approach outperforms traditional fusion methods. The two modalities are processed independently: the visual model combines 3DCNN, BiLSTM, and attention mechanisms, while the verbal model combines BERT and BiLSTM. The hybrid late fusion then integrates the output of each modality using rule-based decisions together with softmax probabilities. The proposed multimodal model achieves an average accuracy of 96.72%, with 99.14% for the visual modality and 94.30% for the verbal modality, surpassing existing state-of-the-art fusion methods. The novelty of the study lies in the combination of the per-modality models and their integration through hybrid late fusion. The study also outlines the process and stages for incorporating the model into a system suitable for any child-friendly online game, serving as an early warning system for parents.
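
The abstract does not spell out the fusion rules, so the following is only a minimal sketch of how a hybrid late fusion of this kind might look, not the authors' implementation. It assumes each modality emits two logits (index 0 for non-violent, index 1 for violent). The function name `hybrid_late_fusion`, the `rule_threshold` value of 0.9, the "either modality is highly confident" rule, and the averaging fallback are all hypothetical choices made for illustration.

```python
import math
from typing import Sequence, Tuple


def softmax(logits: Sequence[float]) -> list:
    """Convert raw logits into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def hybrid_late_fusion(
    visual_logits: Sequence[float],
    verbal_logits: Sequence[float],
    rule_threshold: float = 0.9,  # hypothetical value, not from the paper
) -> Tuple[str, float]:
    """Fuse two independently computed modality outputs.

    Index 0 is assumed to mean non-violent and index 1 violent.
    Returns the fused label and the violence probability behind it.
    """
    p_visual = softmax(visual_logits)[1]
    p_verbal = softmax(verbal_logits)[1]

    # Rule-based step (assumed): one highly confident modality is enough
    # to raise a warning, even if the other modality sees nothing.
    strongest = max(p_visual, p_verbal)
    if strongest >= rule_threshold:
        return "violent", strongest

    # Probability step (assumed): otherwise average the softmax outputs.
    p_fused = (p_visual + p_verbal) / 2
    label = "violent" if p_fused >= 0.5 else "non-violent"
    return label, p_fused
```

Under these assumptions, a clip with strongly violent visuals and neutral chat, for example `hybrid_late_fusion([0.0, 5.0], [2.0, 0.0])`, is flagged by the rule step alone, while two weakly non-violent outputs fall through to the averaged probability. Because each modality is scored before fusion, either model can be retrained or replaced without touching the other.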
