Complex Events Detection Using Data-Driven Concepts

Yang Yang and Mubarak Shah

Computer Vision Lab, University of Central Florida, USA
{yyang,shah}@eecs.ucf.edu

Abstract. Automatic event detection in large collections of unconstrained videos is a challenging and important task. The key issue is describing long, complex videos with high-level semantic descriptors that capture the regularity of events within the same category while distinguishing events from different categories. This paper proposes a novel unsupervised approach that discovers data-driven concepts from multi-modality signals (audio, scene, and motion) to describe the high-level semantics of videos. Our method consists of three main components: first, we learn low-level features separately from the three modalities; second, we discover data-driven concepts based on the statistics of the learned features mapped to a low-dimensional space using deep belief nets (DBNs); finally, we learn a compact and robust sparse representation that jointly models the concepts from all three modalities. Extensive experimental results on a large in-the-wild dataset show that the proposed method significantly outperforms state-of-the-art methods.
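For readers who want a concrete picture of the three-stage pipeline summarized above, the following is a minimal sketch, not the authors' implementation. It substitutes stacked BernoulliRBMs from scikit-learn for the DBN, k-means over the low-dimensional codes for concept discovery, and scikit-learn's DictionaryLearning for the joint sparse representation; all feature dimensions, component counts, and the random toy data are illustrative assumptions.

    # Sketch of: (1) per-modality low-level features -> (2) DBN-style
    # low-dim mapping + data-driven concept discovery -> (3) joint
    # sparse representation. Illustrative stand-ins, not the paper's code.
    import numpy as np
    from sklearn.neural_network import BernoulliRBM
    from sklearn.cluster import KMeans
    from sklearn.decomposition import DictionaryLearning

    rng = np.random.RandomState(0)

    def discover_concepts(features, n_hidden=64, n_concepts=50):
        """Map features to a low-dim space with stacked RBMs (a DBN
        stand-in), then cluster to obtain data-driven concepts."""
        rbm1 = BernoulliRBM(n_components=128, learning_rate=0.05,
                            n_iter=20, random_state=0)
        rbm2 = BernoulliRBM(n_components=n_hidden, learning_rate=0.05,
                            n_iter=20, random_state=0)
        codes = rbm2.fit_transform(rbm1.fit_transform(features))
        concepts = KMeans(n_clusters=n_concepts, n_init=10,
                          random_state=0).fit(codes)
        return rbm1, rbm2, concepts

    def concept_histogram(features, rbm1, rbm2, concepts):
        """Describe one video by a histogram over its modality's concepts."""
        codes = rbm2.transform(rbm1.transform(features))
        labels = concepts.predict(codes)
        hist = np.bincount(labels, minlength=concepts.n_clusters).astype(float)
        return hist / max(hist.sum(), 1.0)

    # Toy stand-ins for the audio / scene / motion low-level features
    # (rows = local descriptors pooled over the training videos).
    modalities = {m: rng.rand(500, 256) for m in ("audio", "scene", "motion")}
    models = {m: discover_concepts(X) for m, X in modalities.items()}

    def video_descriptor(video_feats):
        # Concatenate the concept histograms of all three modalities.
        return np.concatenate([concept_histogram(video_feats[m], *models[m])
                               for m in ("audio", "scene", "motion")])

    videos = [{m: rng.rand(80, 256) for m in modalities} for _ in range(40)]
    X = np.vstack([video_descriptor(v) for v in videos])

    # Joint sparse representation over the concatenated concepts.
    dico = DictionaryLearning(n_components=30, alpha=1.0, random_state=0)
    sparse_codes = dico.fit_transform(X)  # final per-video representation

In this sketch the sparse codes would feed a standard classifier (e.g., an SVM) for event detection; the paper's actual feature learning, concept discovery, and joint sparse model differ in their details.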

LNCS 7574, p. 722 ff.


© Springer-Verlag Berlin Heidelberg 2012