Conference paper
The thirteen colors of timbre
Hiroko Terasawa, Malcolm Slaney, et al.
WASPAA 2005
FaceSync is an optimal linear algorithm that finds the degree of synchronization between the audio and image recordings of a human speaker. Using canonical correlation, it finds the best direction to combine all the audio and image data, projecting them onto a single axis. FaceSync uses Pearson's correlation to measure the degree of synchronization between the audio and image data. We derive the optimal linear transform to combine the audio and visual information and describe an implementation that avoids the numerical problems caused by computing the correlation matrices.
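The projection described above can be illustrated with a minimal sketch. It assumes per-frame audio and image features are already available as NumPy arrays with one row per frame and full column rank. The function names `first_canonical_pair` and `sync_score` are hypothetical, and the QR-plus-SVD route is one standard way to compute canonical correlation without forming covariance matrices explicitly; it is not necessarily the implementation used in the paper.

```python
import numpy as np

def first_canonical_pair(audio, video):
    """Return (wa, wv, r): one projection vector per modality and the
    first canonical correlation r between the projected signals.

    audio: (n_frames, p) array, video: (n_frames, q) array.
    Assumes n_frames exceeds p and q and both arrays have full column rank.
    """
    # Center each feature so correlations are measured about the mean.
    A = audio - audio.mean(axis=0)
    V = video - video.mean(axis=0)
    # QR gives orthonormal bases without ever forming A.T @ A or V.T @ V,
    # which is where the conditioning problems would otherwise arise.
    Qa, Ra = np.linalg.qr(A)
    Qv, Rv = np.linalg.qr(V)
    # The singular values of Qa.T @ Qv are the canonical correlations.
    U, s, Vt = np.linalg.svd(Qa.T @ Qv)
    # Map the leading singular vectors back to feature-space weights.
    wa = np.linalg.solve(Ra, U[:, 0])
    wv = np.linalg.solve(Rv, Vt[0, :])
    return wa, wv, s[0]

def sync_score(audio, video, wa, wv):
    """Pearson correlation of the two modalities after projecting each
    onto its single learned axis."""
    a = audio @ wa
    v = video @ wv
    return np.corrcoef(a, v)[0, 1]
```

In this sketch the weights would be fitted on recordings known to be synchronized and then applied with `sync_score` to new audio and image pairs, where a score near zero suggests the two streams are not in sync.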