Video Indexing and Content-Based Retrieval

  1. Shot boundary detection
  2. Keyframe selection
  3. Storyboard as first-generation video summarization
  4. Prototyping of content-based retrieval

What's Unique?

  • A compact metadata description produced by 3-D/2-D wavelet transforms
  • Fast processing because of compact representations and their consistent reuse
  • Hence, real-time indexing
  • Low false-positive rate even at high recall rates

Demonstrations on Shot Boundary Detection

A red bar appears whenever a shot boundary is detected.
Sample Video: Color Harmony at Your Home

Compact Description of a Video Sequence

  • A wavelet transform is applied to obtain a compact description of a video interval. The interval is assumed to be long enough to capture a gradual shot transition, yet short enough that it contains no more than one shot transition.
  • The compact descriptor consists of the n largest wavelet coefficients identified as significant by scanning the coefficients from coarse to fine, and in the order temporal first, horizontal second, and vertical third. Note that every significant coefficient in the fine subbands is quantized to a single bit indicating whether it is positive or negative.
  • No two-dimensional wavelet transform needs to be computed if the target video sequence has already been encoded with a wavelet transform-based codec such as Motion JPEG 2000.
  • What actually has to be computed is a three-dimensional spatio-temporal wavelet transform. It is obtained by applying a one-dimensional temporal wavelet transform to the coarsest subband components of a video interval that has been encoded by a wavelet transform-based video codec. Hence the computational load of the three-dimensional wavelet transform is quite small.
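
The descriptor construction described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a Haar wavelet for the temporal transform, treats every temporal detail band as a "fine" subband, and simply takes the n largest magnitudes instead of performing a true significance scan. The names `haar_1d` and `descriptor` and the entry layout `(temporal index, column, row, value)` are hypothetical.

```python
from typing import List, Tuple

Frame = List[List[float]]            # coarsest subband of one frame (rows x cols)
Entry = Tuple[int, int, int, float]  # (temporal index, column, row, value)


def haar_1d(x: List[float]) -> List[float]:
    """Multi-level Haar transform (averages and half-differences).

    The output is ordered coarse-to-fine: index 0 is the overall average,
    followed by detail coefficients from the coarsest to the finest level.
    """
    out = [float(v) for v in x]
    n = len(out)
    if n & (n - 1):
        raise ValueError("length must be a power of two")
    while n > 1:
        half = n // 2
        avg = [(out[2 * i] + out[2 * i + 1]) / 2 for i in range(half)]
        diff = [(out[2 * i] - out[2 * i + 1]) / 2 for i in range(half)]
        out[:n] = avg + diff
        n = half
    return out


def descriptor(frames: List[Frame], n: int) -> List[Entry]:
    """Keep the n largest-magnitude coefficients of a video interval.

    Assumption: `frames` already holds the coarsest spatial subband of each
    frame (for example, taken from a wavelet-coded stream), so only the
    temporal 1-D transform is computed here.
    """
    rows, cols = len(frames[0]), len(frames[0][0])
    entries: List[Entry] = []
    for r in range(rows):
        for c in range(cols):
            series = [frame[r][c] for frame in frames]
            for t, v in enumerate(haar_1d(series)):
                if v != 0:
                    entries.append((t, c, r, v))
    # Pick the n largest magnitudes, then list them in scan order:
    # temporal index first, column (horizontal) second, row (vertical) third.
    top = sorted(entries, key=lambda e: -abs(e[3]))[:n]
    top.sort(key=lambda e: e[:3])
    # Detail coefficients (t >= 1) are reduced to their sign only.
    return [(t, c, r, v if t == 0 else (1 if v > 0 else -1))
            for t, c, r, v in top]
```

Because only a short 1-D transform per spatial position is computed, the cost grows linearly with the number of coarsest-subband coefficients in the interval, which is consistent with the small computational load noted above.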

Shot Boundary Detection

  • Shot boundaries in a video sequence are detected in a two-step procedure. First, every video interval is compared with the next interval to check whether the two are close. Then, from each significant interval, which is characterized by a sharp feature, a particular frame is selected as the shot boundary.
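
A rough sketch of such a two-step procedure follows. It is only an illustration under stated assumptions: each frame is represented by a plain feature vector, an interval is summarized by the mean of its frame vectors (a stand-in for the wavelet descriptor), closeness is measured with the L1 distance, and `threshold` is a hypothetical tuning parameter. None of these choices are taken from the source.

```python
from typing import List

Vector = List[float]


def l1(a: Vector, b: Vector) -> float:
    """L1 distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def mean_vector(vectors: List[Vector]) -> Vector:
    """Component-wise mean, used here as a stand-in interval descriptor."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def detect_boundaries(features: List[Vector], interval_len: int,
                      threshold: float) -> List[int]:
    """Return frame indices at which a new shot is assumed to start."""
    intervals = [features[i:i + interval_len]
                 for i in range(0, len(features), interval_len)]
    descriptors = [mean_vector(iv) for iv in intervals]
    boundaries: List[int] = []
    for k in range(len(descriptors) - 1):
        # Step 1: skip pairs of adjacent intervals that are close.
        if l1(descriptors[k], descriptors[k + 1]) <= threshold:
            continue
        # Step 2: inside the two intervals, pick the frame with the
        # sharpest frame-to-frame change.
        start = k * interval_len
        end = min(len(features), (k + 2) * interval_len)
        best = max(range(start + 1, end),
                   key=lambda i: l1(features[i - 1], features[i]))
        if best not in boundaries:
            boundaries.append(best)
    return boundaries
```

The coarse interval comparison keeps the per-frame search confined to the few intervals that actually differ from their neighbours.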

Keyframe Selection

  • The number of keyframes is specified by the user.
  • Initial keyframe candidates are the representative frames of the shots, one per shot. Their feature vectors are clustered and subsampled until the set converges to the preferred number of keyframes.
  • As a result, a storyboard is displayed as an abstraction of the video sequence.
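
The reduction of candidates to a user-chosen number of keyframes can be sketched as follows. The clustering scheme is an assumption: the text above does not specify one, so this example uses simple agglomerative merging of the closest centroids and then keeps, from each cluster, the member nearest to the centroid. The function names and the `(frame index, feature vector)` candidate layout are hypothetical.

```python
import math
from typing import List, Tuple

Candidate = Tuple[int, List[float]]  # (frame index, feature vector)


def centroid(cluster: List[Candidate]) -> List[float]:
    """Component-wise mean of the feature vectors in a cluster."""
    dim = len(cluster[0][1])
    return [sum(m[1][d] for m in cluster) / len(cluster) for d in range(dim)]


def select_keyframes(candidates: List[Candidate], k: int) -> List[int]:
    """Reduce one candidate per shot to k keyframes."""
    clusters = [[c] for c in candidates]
    while len(clusters) > k:
        # Find and merge the two clusters whose centroids are closest.
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = math.dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters.pop(j))
    keyframes = []
    for cluster in clusters:
        center = centroid(cluster)
        # Subsample: keep the member closest to the cluster centroid.
        frame, _ = min(cluster, key=lambda m: math.dist(m[1], center))
        keyframes.append(frame)
    return sorted(keyframes)
```

Returning the frame indices in temporal order matches the way a storyboard lays out keyframes along the timeline.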
Screenshots of the storyboard production process

Content-Based Video Retrieval

  • The user presents a reference, which can be a frame picture, a video clip, or a handout picture.
  • The reference is represented by a 3DSD (Three-Dimensional Significance Descriptor), which is generated by 2D/3D wavelet transforms. Target video sequences are described with the same feature structure as the 3DSD. The structure is simple enough to allow fast matching between a reference and the target video sequences.
  • The top 20 keyframes are displayed to the user, who then picks the best-matching target.
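
To show why a sign-only descriptor permits fast matching, here is a small sketch of ranking targets against a reference. The descriptor layout (a mapping from coefficient position to sign) and the similarity score (number of positions where both descriptors hold the same sign) are assumptions made for illustration; the actual 3DSD format and matching rule are not given in the text above.

```python
from typing import Dict, List, Tuple

Position = Tuple[int, int, int]       # (temporal index, column, row)
Descriptor = Dict[Position, int]      # position -> sign (+1 or -1)


def similarity(ref: Descriptor, target: Descriptor) -> int:
    """Count positions that are significant in both with the same sign."""
    return sum(1 for pos, sign in ref.items() if target.get(pos) == sign)


def retrieve(ref: Descriptor, targets: Dict[str, Descriptor],
             top_k: int = 20) -> List[str]:
    """Return the names of the top_k best-matching targets.

    Ties are broken alphabetically so that the ranking is deterministic.
    """
    ranked = sorted(targets,
                    key=lambda name: (-similarity(ref, targets[name]), name))
    return ranked[:top_k]
```

Each comparison costs one dictionary lookup per reference coefficient, so the matching time depends on the descriptor size n rather than on the frame resolution.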
A query picture and the top 20 keyframes (A is for Atom)


  1. S. Hasebe, M. Nagumo, S. Muramatsu, and H. Kikuchi, Two-Step Detection of Video Shot Boundaries in a Wavelet Transform Domain, J. Inst. of Image Electronics Engineers of Japan, Vol. 34, No. 1, pp. 17-26, Jan. 2005.
  2. S. Hasebe, M. Nagumo, S. Muramatsu, and H. Kikuchi, Video Key Frame Selection by Clustering Wavelet Coefficients, EUSIPCO 2004, No. 1679, pp. 2303-2306, Vienna, Austria, Sep. 2004.
  3. S. Hasebe, M. Nagumo, S. Muramatsu, and H. Kikuchi, Wavelet-Based Keyframe Selection Techniques for Video Abstraction, Proc. ITC-CSCC, 6E3L-2, Matsushima, Sendai, July 2004.
  4. S. Hasebe, S. Muramatsu, S. Sasaki, J. Zhou, and H. Kikuchi, Two-Step Algorithm for Detecting Video Shot Boundaries in a Wavelet Transform Domain, International Symp. on Image and Signal Processing and Analysis (ISPA 2003), Rome, Italy, Sep. 2003.
  5. S. Hasebe, S. Muramatsu, S. Sasaki, and H. Kikuchi, Video Querying Based on Three-Dimensional Wavelet Transform, Proc. ITC-CSCC 2001, Hotel Clement Tokushima, Tokushima, pp. 1196-1199, July 10-12, 2001.

Last-modified: 2014-07-29 (Tue) 20:44:06