Unsupervised feature learning for bootleg detection using deep learning architectures

Buccoli, Michele; Bestagini, Paolo; Zanoni, Massimiliano; Sarti, Augusto; Tubaro, Stefano

doi:10.1109/WIFS.2014.7084316

The widespread diffusion of portable devices capable of capturing high-quality multimedia data, together with the rapid proliferation of media-sharing platforms, has led to a dramatic growth of user-generated content available online. Because this trend is hard to regulate strictly, copyrighted material is often distributed illegally. This is the case for audio bootlegs, i.e., concerts illegally recorded and redistributed by fans. In this paper, we propose a bootleg detector that discriminates among: i) unofficially recorded bootlegs; ii) officially published live concerts; iii) studio recordings from officially released albums. The proposed method is based on audio feature analysis and machine learning techniques. We use a deep learning architecture to learn highly discriminative features from audio excerpts in an unsupervised fashion, and a supervised classifier for detection. The method is validated on a dataset of nearly 500 songs, and results are compared with those of a state-of-the-art detector. The experiments confirm that deep learning techniques can outperform classic feature extraction approaches.
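
The two-stage idea in the abstract (features learned without labels, then a supervised classifier over the three classes) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the architecture, the classifier, or the input features, so everything below is an assumption made for illustration. A single restricted Boltzmann machine layer trained with one-step contrastive divergence stands in for the deep architecture, a nearest-centroid rule stands in for the supervised classifier, and the six-dimensional input vectors are synthetic placeholders rather than real audio descriptors. The names `TinyRBM`, `fit_centroids`, `predict`, and `make_toy_data` are hypothetical.

```python
import math
import random


def sigmoid(x):
    # Clamp the input so math.exp cannot overflow.
    x = max(-50.0, min(50.0, x))
    return 1.0 / (1.0 + math.exp(-x))


class TinyRBM:
    """One restricted Boltzmann machine layer trained with CD-1 (illustrative only)."""

    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = random.Random(seed)
        self.n_visible = n_visible
        self.n_hidden = n_hidden
        self.W = [[self.rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b = [0.0] * n_visible  # visible biases
        self.c = [0.0] * n_hidden   # hidden biases

    def hidden_probs(self, v):
        # Learned feature vector: activation probability of each hidden unit.
        return [sigmoid(self.c[j] + sum(v[i] * self.W[i][j]
                                        for i in range(self.n_visible)))
                for j in range(self.n_hidden)]

    def visible_probs(self, h):
        return [sigmoid(self.b[i] + sum(h[j] * self.W[i][j]
                                        for j in range(self.n_hidden)))
                for i in range(self.n_visible)]

    def fit(self, data, epochs=30, lr=0.1):
        # Unsupervised stage: class labels are never used here.
        for _ in range(epochs):
            for v0 in data:
                ph0 = self.hidden_probs(v0)
                h0 = [1.0 if self.rng.random() < p else 0.0 for p in ph0]
                v1 = self.visible_probs(h0)   # one-step reconstruction
                ph1 = self.hidden_probs(v1)
                for i in range(self.n_visible):
                    for j in range(self.n_hidden):
                        self.W[i][j] += lr * (v0[i] * ph0[j] - v1[i] * ph1[j])
                    self.b[i] += lr * (v0[i] - v1[i])
                for j in range(self.n_hidden):
                    self.c[j] += lr * (ph0[j] - ph1[j])
        return self


def fit_centroids(features, labels):
    # Supervised stage: mean learned-feature vector per class.
    sums, counts = {}, {}
    for f, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(f))
        for k, value in enumerate(f):
            acc[k] += value
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}


def predict(centroids, feature):
    # Assign the class whose centroid is closest in squared Euclidean distance.
    def dist2(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(centroids, key=lambda y: dist2(centroids[y], feature))


def make_toy_data(n_per_class=20, seed=1):
    # Synthetic stand-ins for per-excerpt audio descriptors scaled to [0, 1].
    rng = random.Random(seed)
    prototypes = {
        "bootleg": [0.9, 0.8, 0.1, 0.1, 0.5, 0.5],
        "official_live": [0.1, 0.9, 0.9, 0.1, 0.5, 0.5],
        "studio": [0.1, 0.1, 0.1, 0.9, 0.5, 0.5],
    }
    X, y = [], []
    for label, proto in prototypes.items():
        for _ in range(n_per_class):
            X.append([min(1.0, max(0.0, p + rng.gauss(0.0, 0.05))) for p in proto])
            y.append(label)
    return X, y


X, y = make_toy_data()
rbm = TinyRBM(n_visible=6, n_hidden=4).fit(X)        # stage 1: no labels
features = [rbm.hidden_probs(v) for v in X]
centroids = fit_centroids(features, y)               # stage 2: labels used
predicted = [predict(centroids, f) for f in features]
```

In a real setting the toy vectors would be replaced by descriptors computed from audio excerpts, several layers would typically be stacked, and performance would be measured on held-out songs rather than on the training data.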