A statistical modeling approach to content based video retrieval
Milind R. Naphade, Sankar Basu, et al.
ICPR 2008
This research addresses the problem of automatically extracting semantic video scenes from feature films based on multi-modal information. A three-stage scene detection scheme is proposed. First, we use purely visual information to extract a coarse-level scene structure based on generated shot sinks. Second, audio cues are integrated to refine the scene detection results by considering various kinds of audiovisual scenarios. Finally, we bring users into the process by allowing them to interactively tune the final results to their own satisfaction. The generated scene structure forms a compact yet meaningful abstraction of the video data, which facilitates content access. Preliminary experiments on integrating multiple media cues for movie scene extraction have yielded encouraging results. © 2004 Wiley Periodicals, Inc.