ELM-N: E-learning Media Navigator
Chitra Dorai, Parviz Kermani, et al.
MM 2001
This paper describes a new unified representation for the information in a video. We reduce the dimensionality of each signal, using a singular-value decomposition for the semantic and image data and mel-frequency cepstral coefficients for the audio data, and then concatenate the resulting vectors to form a multi-dimensional representation of the video. Using scale-space techniques, we find large jumps in the video's path through this space, which we call edges. We use these techniques to analyze the temporal properties of the audio and image data in a video. This analysis produces a hierarchical segmentation of the video, in effect a table of contents, from the audio, semantic, and image data.
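The pipeline in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the synthetic data, the function names (`reduce_svd`, `smooth_path`, `find_edges`), the Gaussian smoothing, and the mean-plus-two-standard-deviations threshold are all assumptions made for illustration. The audio matrix stands in for precomputed mel-frequency cepstral coefficients; no MFCC extraction is performed here.

```python
import numpy as np

def reduce_svd(features, k):
    """Project per-frame feature rows onto the top-k right singular vectors."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T

def gaussian_kernel(sigma):
    """Normalized, symmetric Gaussian kernel truncated at about 3 sigma."""
    radius = int(3 * sigma) + 1
    t = np.arange(-radius, radius + 1)
    kernel = np.exp(-t**2 / (2.0 * sigma**2))
    return kernel / kernel.sum()

def smooth_path(path, sigma):
    """Smooth each dimension of the path along the time axis."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    padded = np.pad(path, ((radius, radius), (0, 0)), mode="edge")
    columns = [np.convolve(padded[:, d], kernel, mode="valid")
               for d in range(path.shape[1])]
    return np.stack(columns, axis=1)

def find_edges(path, sigma):
    """Return frame indices where the smoothed path jumps sharply.

    The threshold (mean + 2 * std of the frame-to-frame speed) is an
    assumption of this sketch, not something stated in the abstract.
    """
    smoothed = smooth_path(path, sigma)
    speed = np.linalg.norm(np.diff(smoothed, axis=0), axis=1)
    threshold = speed.mean() + 2.0 * speed.std()
    return [i for i in range(1, len(speed) - 1)
            if speed[i] > threshold
            and speed[i] >= speed[i - 1]
            and speed[i] > speed[i + 1]]

# Synthetic stand-in data: 200 frames with a scene change at frame 100.
rng = np.random.default_rng(0)
n_frames = 200
step = (np.arange(n_frames) >= 100).astype(float)[:, None]
image = 5.0 * step + 0.1 * rng.standard_normal((n_frames, 10))
audio = 5.0 * step + 0.1 * rng.standard_normal((n_frames, 5))  # pretend MFCCs

# Reduce the image features, then concatenate with the audio features.
path = np.concatenate([reduce_svd(image, 3), audio], axis=1)

# Edges found at a fine scale.
edges = find_edges(path, sigma=2.0)
```

Running `find_edges` over a range of `sigma` values gives a crude picture of the hierarchy: jumps that survive at coarse scales mark major boundaries, while those that appear only at fine scales mark minor ones.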
T. Syeda-Mahmood, D. Ponceleon
MM 2001
Erich P. Stuntebeck, John S. Davis II, et al.
HotMobile 2008
M.X. Zhou, S. Pan
MM 2001