Multimodal Violence Detection in Videos

Bestagini P.;
2020-01-01

Abstract

Effective tools for detecting violence are in high demand, especially when dealing with video streams. Such tools have a wide range of applications, from forensics and law enforcement to parental control over the ever-increasing amount of video available online. Prior studies have shown that deep learning has great potential for detecting violence, but they focus either on violence in general or on only specific cases of violent behavior. While the concept of violence is broad and highly subjective, simpler concepts such as fights, explosions, and gunshots convey the idea of violence while being more objective. Even though these concepts relate to the same broader idea of violence, they differ widely in whether they convey movement, involve the presence of a specific object, or generate distinctive sounds. In this study, we propose to analyze different concepts related to violence and how to better describe them by exploring visual and auditory cues, in order to reach a robust method for detecting violence.
2020
ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
978-1-5090-6631-5
computer vision
deep learning
forensic computing
multimodal classification
violence classification
Files in this item:
2020_icassp_violence_detection.pdf
Access: open access
Version: Post-Print (draft or Author’s Accepted Manuscript, AAM)
Size: 461.07 kB
Format: Adobe PDF

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11311/1171194
Citations
  • PMC: not available
  • Scopus: 29
  • ISI: 10