0% found this document useful (0 votes)

249 views11 pages

Mpeg 2 Compressed Domain Algorithms For Video Analysis

Uploaded by

Asheke Zinab

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

249 views11 pages

Mpeg 2 Compressed Domain Algorithms For Video Analysis

Uploaded by

Asheke Zinab

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 11

Hindawi Publishing Corporation

EURASIP Journal on Applied Signal Processing

Volume 2006, Article ID 56940, Pages 1–11
DOI 10.1155/ASP/2006/56940

MPEG-2 Compressed-Domain Algorithms for

Video Analysis

Wolfgang Hesseler and Stefan Eickeler

Fraunhofer IMK, Schloss Birlinghoven, 53754 Sankt Augustin, Germany

Received 1 September 2004; Revised 2 June 2005; Accepted 6 June 2005

This paper presents new algorithms for extracting metadata from video sequences in the MPEG-2 compressed domain. Three
algorithms for eﬃcient low-level metadata extraction in preprocessing stages are described. The first algorithm detects camera
motion using the motion vector field of an MPEG-2 video. The second method extends the idea of motion detection to a limited
region of interest, yielding an eﬃcient algorithm to track objects inside video sequences. The third algorithm performs a cut
detection using macroblock types and motion vectors.

Copyright © 2006 Hindawi Publishing Corporation. All rights reserved.

1. INTRODUCTION MPEG-7 is a common standard to store the metadata

information that is generated by many automatic video-
The demand for indexing techniques capable of handling in- analysis systems. A video-analysis system that automatically
creasing amounts of video at low cost can only be satisfied produces metadata conforming the MPEG-7 standard [1] is
using automatic methods for metadata extraction. The se- the iFinder, developed by the Fraunhofer IMK. The algo-
mantics of metadata range from low level over mid-level to rithms presented in this paper are included as a new module
high level. Low-level metadata are information that can be into the iFinder, extend its analytic capacity with additional
extracted more or less directly from the video signal. Typical metadata such as camera motion, and speed up the existing
examples of low-level metadata are the color histogram of methods.
an image or spectrogram features of an audio waveform. In Most digital video in high quality, such as DVD or DVB
most cases, the low-level features are used for simple query (digital television), is encoded in the MPEG-2 compression
by example applications or as a preprocessing step to gener- standard. MPEG is an ISO standardized video- and audio-
ate mid- and high-level information. Mid-level metadata are compression format. A central component of MPEG is mo-
more understandable for humans and can be derived from tion compensation. With each pixel block (called a mac-
the images of a video sequence using pattern classification roblock), the encoder looks for a matching pixel block in a
methods. Examples are the location of human faces in video previous or following frame and stores only its position using
sequences or the spoken words in the audio channel. High- a motion vector. This search is computationally very expen-
level metadata give a comprehensive semantic description of sive. Analysis software that generates metadata like camera
the data. General methods to create high-level metadata are motion or object tracking can achieve a large speed improve-
still under development, but techniques for automatic gen- ment by manipulating motion vectors in the compressed do-
eration of high-level metadata are highly advanced for spe- main instead of performing a block matching in the uncom-
cial cases like face recognition which gives the name of a per- pressed domain.
son shown in the video sequence, or the topic recognition The algorithms presented in this paper all work in the
which gives the type of subject being discussed in the au- compressed domain. Instead of transforming the compressed
dio. domain back into the raw pixel domain, the algorithms work
Usually the level of generated metadata increases within in the compressed domain and extract information from the
the processing chain, which starts with low-level preprocess- encoded video (e.g., from the DCT coeﬃcients and motion
ing and ends with high-level methods. Generally, computa- vectors). In most cases, the camera motion detection algo-
tional complexity increases with the level of metadata. It is rithm is usually able to detect a camera motion with pixel
essential to have eﬃcient methods at all levels of process- precision even if the underlying motion vector field con-
ing. tains many outliers due to moving objects or due to large
2 EURASIP Journal on Applied Signal Processing

low-textured areas in the video. In this paper, a camera move- these algorithms are not well suited for video sequences with
ment is a generic name encompassing a pan or a track- large moving objects. Kobla et al. [5] distinguish eight mo-
ing shot, where the camera changes its position. The object tion directions and assume a camera motion if the direc-
tracking algorithm is based on the camera motion detection tion receiving the highest number of vectors receives more
algorithm. Objects like faces can be tracked over several min- than twice as many vectors as the second highest direc-
utes in the ideal case. Finally, we present a cut detection algo- tion.
rithm using macroblock types and motion vectors. The object tracking algorithms that are commonly used
in the uncompressed domain are the mean shift algorithm
2. MPEG-2 VIDEO COMPRESSION [6] and the Kalman filter [7]. Also, block-matching algo-
rithms [8] are used where the object is divided into small
As mentioned above, MPEG-2 compression is the standard blocks of constant size that are searched for in the next frame.
for digital video compression. MPEG-2 is downwardly com- In the compressed domain, Kobla et al. [9] present a method
patible with MPEG-1, which means that the algorithms pre- for determining the flow of a pixel block using motion vec-
sented in this paper are also applicable to MPEG-1. The tors and frame types. However, this method is too imprecise
MPEG-2 algorithm uses two basic methods to compress to use for object tracking. In [10], Khan et al. present experi-
digital video: frequency-domain compression for spatial re- ments with object tracking using compressed-domain MPEG
dundancy (as with JPEG picture compression) and block- data with P-frames only.
based motion compensation for the temporal redundancy. Cut detection algorithms can be divided into compressed
An MPEG-2 video consists of three frame types: I-, P-, and and uncompressed algorithms. Uncompressed-domain algo-
B-frames, which are grouped into GOPs (group of pictures). rithms use pixelwise difference [11], histograms [12], edge
The I-frame is encoded similarly to a JPEG image. It does tracking [13], and so forth. Such algorithms are, how-
not make use of information from any other frame. P-frames ever, computationally expensive. Some compressed-domain
are predicted from the previous I- or P-frame. B-frames use methods use the increasing bit rate as an indicator [14] or ap-
parts of both the previous and the following I- or P-frame. All proximated DC coefficients [15]. Kobla et al. [5], Calic and
three frame types use the DCT to transform the picture from Izquierdo [16] use an approach similar to ours by checking
the spatial domain to the frequency domain, which is better the ratios of forward- and backward-predicted macroblocks.
suited to exploit redundancy. In the I-frames, the DCT data However, experiments show many erroneous detections in
is derived directly from the image; whereas in the P- and B- sections with no or little motion only.
frames, the DCT data is derived from the residual error after
prediction. Each frame is divided into 16 × 16 pixel blocks 3. DETECTING CAMERA MOTION
called macroblocks. In the motion estimation process, the
encoder checks each macroblock of a P-frame if it can be pre- The encoder has already done the computationally most
dicted from a pixel block in the previous reference frame (I- complex part of detecting camera motion by calculating a
or P-frame) with a possible offset to compensate for motion motion vector field. However, the vectors do not always cor-
(the so-called motion vector) and a residual error which is respond to true motion since the encoder optimizes the
DCT transformed. If the coefficients become too big, the en- MPEG encoding for picture quality and not to reflect the ex-
coder decides to intracode the macroblock. The encoding of act motion of the objects covered by the macroblocks. For
B-frames works similarly. The difference is that B-frames can this reason, numerous outliers are to be expected. Moreover,
be predicted from the preceding or the following reference moving objects will result in motion vectors that do not re-
frame or from both frames. flect the camera motion.
The MPEG-2 standard only specifies the bit format and
the decoding of the MPEG-2 stream, the algorithms and 3.1. Algorithm
methods for the encoding, such as those needed for motion
estimation, are not defined in the standard. For this reason, We only use P-frames for the analysis of camera movements
how well the motion vectors reflect the motion of the camera since the distances between B-frames and reference frames
or an object depends on the encoder. vary. This means that the camera motion can only be deter-
mined for every second or third frame. However this restric-
2.1. Related Work tion represents only a minor drawback, since every camera
movement that does not last longer than several frames can
Detecting camera motion in the compressed domain was rather be regarded as camera wobble.
investigated by Wang and Huang [2]. They use a recursive A two-dimensional histogram of motion vectors is used
outlier-rejecting least-square algorithm. Smolic et al. [3] use and the peak is interpreted as reflecting camera motion.
the M-estimator method to underweight outlier motion vec- However, the motion vectors spread widely, especially with
tors. Neither of these algorithms take into account motion a fast movement of either camera or objects. For this reason,
vectors of low-textured macroblocks that do not reflect mo- the histogram of the motion vectors will usually not result in
tion. Kuhn [4] uses the DCT coefficients to determine a list a steep peak, but rather in a peak region. We solve this prob-
of macroblocks with high-texture information and performs lem by using a pyramid filter that includes 2 pixels in each
an IDCT for them in order to calculate their motion. All direction to detect the maximum. This filter is represented as
W. Hesseler and S. Eickeler 3

a weight matrix W with elements wi, j : Texture analysis is only performed on the luminance compo-
nent since the encoder usually also uses only the luminance
⎛ ⎞ for motion estimation. When the motion vector histogram is
1 1 1 1 1 built, the count for each element is not increased by one, but
⎜1 2 2 2 1⎟
⎜
1 ⎜ ⎟ rather by the corresponding weight s(Sx,y ) of the macroblock.
W= ·⎜1 2 3 2 1⎟
⎟. (1) In this way, the low-textured macroblocks are underweighted
3 ⎜
⎝1 2 2 2
⎟
1⎠ or excluded from the analysis.
1 1 1 1 1 Experiments have shown that a simple approach that
classifies all macroblocks as either sufficiently or nonsuffi-
ciently textured yields adequate performance. So we get
hx,y are the accumulated weights of the motion vectors (x, y).
For each vector (x, y), the weighted frequency of the peak ⎧
region called Hx,y is calculated as follows: ⎨1 for Sx,y ≥ SThreshold ,
s(Sx,y ) = ⎩ (3)
0 for Sx,y < SThreshold .

2
2
Hx,y = wi+3, j+3 · hx+i,y+ j . (2) These weights insure that only the motion vectors of mac-
i=−2 j =−2 roblocks that are sufficiently textured are used in the analysis.

The peak region around the vectors (x, y) is then given by 3.1.2. Approximation of the DCT coefficients
the maximum max(Hx,y ). The Hx,y values can be calculated The DCT coefficients of I-frames and intracoded mac-
quickly using the integral image algorithm [17]. roblocks of P-frames can be used directly from the MPEG-
Motion vectors are stored with half-pixel accuracy. Thus, 2 compressed-domain data after extraction. For non-intra-
the vector of the peak region has to be divided by two in or- coded macroblocks, only the motion vector and the differ-
der to obtain pixel accuracy. We chose a computationally less ence with the referenced pixel block are stored as DCT coef-
complex method and divided all motion vectors by two and ficients. For this reason, the DCT coefficients for these mac-
built the histogram with these values. When interlaced mo- roblocks have to be calculated. There are computationally
tion estimation was used, the two motion vectors for each expensive algorithms that can do a precise calculation [18–
macroblock were averaged. 20]. Since we only need information concerning whether the
macroblock is sufficiently textured, an approximation is ad-
3.1.1. Low-textured areas equate. A DCT block of a P-frame can be built up by using
up to four adjacent DCT blocks in a reference frame. If all
Large areas in the video that are low textured, that is, that these four DCT blocks are low textured and the error DCT
do not have any edges, result in long chaotic motion vectors coefficients are small, the composed DCT block will be low
that do not reflect the camera motion. Since the macroblocks textured, too, assuming that there is no edge between the in-
in such areas are similar to each other, the encoder is free to dividual blocks in the spatial domain.
use an arbitrary pixel block as reference, which minimizes the To approximate the DCT matrix of the current 8 × 8 pixel
error, but does not reflect the true motion. Figure 1 shows an block BCur , we calculate the weighted sum of the DCT matri-
example. ces of the pixel blocks B1 − B4 plus the error DCT as shown
A motion vector can be regarded as usable for the de- below:
tection of camera motion if its macroblock is textured, that
is, if its brightness varies. This is reflected by the 63 AC co-
4

efficients of the four DCT blocks of which the macroblock DCT BCur = wi · DCT Bi + DCTErr . (4)
consists. The DC coefficient only gives information on the i=1
average brightness of the DCT block, but does not indicate
the texture. The weights wi are given by the fractions of the areas of the
Increased AC coefficients in the upper left area reflect DCT block that overlap with the reference DCT blocks. Since
variations in brightness of the 8 × 8 pixel block. Most of the we work with half-pixel accuracy, the weights are a multiple
other coefficients are zero or near zero. Experiments showed of 1/256. Figure 2 sketches the procedure.
that it is sufficient to evaluate the nine AC coefficients in the The approximated DCT coefficients will be used for the
top left triangle of the DCT block. calculation of the DCT coefficients of the subsequent P-
In order to be independent of the scanning pattern used, frame. Errors will therefore propagate.
we select the nine AC coefficients after scanning the coeffi- When interlaced motion compensation with two motion
cients in the zigzag or alternate scan order. The absolute val- vectors is used, a special treatment is required. In the frame
ues of the nine AC coefficients of all four DCT blocks com- picture modus, the pixel rows of both 16 × 8 pixel blocks are
prising the macroblock are added. The sum of these 36 coef- interleaved. This is emulated in the DCT domain by averag-
ficients for the macroblock at the macroblock position (x, y) ing each coefficient of the upper and lower DCT blocks. In
reflects its texture and is denoted as Sx,y . This value is used for the field picture modus, the upper and lower 16 × 8 pixel
the weight function s(Sx,y ). Its value range is between 0 and 1. blocks have their own motion vectors. In the DCT domain,
4 EURASIP Journal on Applied Signal Processing

Figure 1: Motion vectors spread in low-textured areas.

BRef
B1 B2
y BCur
x

B3
B4

Figure 2: DCT approximation: BCur uses BRef as reference which is located inside B1 − B4 . The motion vector is (x, y).

the upper two DCT blocks can be approximated using the 3.3. Results
upper motion vector and the lower two DCT blocks can be
approximated using the lower motion vector. The approximated DCT using a simple weighted sum ap-
proach is adequate to be used to determine if the macroblock
is sufficiently textured so that its motion vector will probably
3.2. Experiments
reflect the camera motion. The peak of a two-dimensional
Comparisons of the approximated DCT with the forward histogram using a 5 × 5 pyramid filter is suited to determine
DCT of the decoded data show higher differences especially the camera motion.
if textured and low-textured areas are adjacent. However, an
exact calculation is not important, but only the classifica- 4. OBJECT TRACKING USING MOTION VECTORS
tion as sufficiently and nonsufficiently textured. Experiments
with newscasts video material resulted in 94% correct classi- Encouraged by the results of the camera motion detection,
fication of the macroblocks compared with the classification we performed tests involving the use of motion vectors to
using the forward DCT, which is sufficient. track objects. An object motion can be regarded as “camera”
We used test sequences mainly from newscasts that were motion that is restricted to the object. By using the motion
captured directly from the satellite in MPEG-2 format using vector information, a speed increase could be achieved com-
a Linux system with DVB-S card. The camera motion was pared to pixel domain tracking algorithms.
then determined by hand and compared with the computed
results of our algorithm, the M-estimator approach, and with
a simple average algorithm. The histogram algorithm with 4.1. Differences to camera motion detection
texture analysis was usually able to detect the camera mo-
tion precisely, whereas the M-estimator and the average al- Compared to the detection of camera motion for the whole
gorithms’ vectors were too small, especially with fast cam- frame, there are additional challenges for the object tracking
era movements. The algorithm proved to be highly robust that is restricted to a subimage.
against objects moving in the foreground. An object did not
have an influence on the peak region unless its size increased 4.1.1. Problems using MPEG-2 compressed-
to nearly 50% of the frame size within the artificial test se- domain data
quences. In this case, the object motion was interpreted as
camera motion. Many smaller moving objects could wrongly One problem when using the MPEG-2 compressed-domain
lead to a detection of the 0-vector as camera motion. data is that the motion vectors are not available continuously
W. Hesseler and S. Eickeler 5

for all frames. The number of frames between the current usually influence the peak of the histogram. The majority of
frame and the reference frame varies. Moreover, there are no the motion vectors still reflect the object movement.
motion vectors for I-frames. The object rectangle will in most cases not correspond
Another problem is that the macroblock grid does not to the macroblock grid so that many macroblocks or refer-
match the object boundaries. An edge exact tracking is there- enced pixel blocks will partly contain background. We there-
fore not possible. Instead, a rectangle of constant size con- fore introduce a weight for each macroblock or referenced
taining the object is tracked. The macroblocks and the refer- pixel block, respectively. If all 16 × 16 pixels are located inside
enced pixel blocks will partly contain background. the object rectangle, the weight is 1, otherwise, the weight is
Object tracking using motion vectors has a few require- proportional to the number of overlapping pixels.
ments concerning the object to be tracked: The analysis is done with half-pixel accuracy. This means
that the histogram is built using half-pixel without rounding
(i) object must be textured but not periodically textured;
to pixels beforehand. The object rectangle’s coordinates are
(ii) object must be of suﬃcient size;
also internally calculated with half-pixel accuracy. The two
(iii) object rectangle must contain no or small portions of
motion vectors of macroblocks using interlaced motion es-
background only;
timation are both used with half-weight when building the
(iv) object must not change its appearance abruptly, its size
histogram.
should remain constant.
hx,y are the accumulated weights of the motion vectors
If the object is not textured, the algorithm used to detect low- (x, y). For each vector (x, y) the weighted frequency of the
textured areas, as previously described, can be used. We pri- peak region called Hx,y is calculated. Because of the half-pixel
marily performed experiments with faces. If faces are suﬃ- accuracy, an extended range of three half-pixels (compared
ciently large, they meet the requirements on the object listed to two pixels with the detection of camera motion) is used
above and it is possible to track them using motion vec- to determine the peak region. The weight matrix W with el-
tors. ements wi, j is represented as
⎛ ⎞
1 1 1 1 1 1 1
4.1.2. The start object rectangle ⎜1
⎜ 2 2 2 2 2 1⎟
⎟
⎜1 2 3 3 3 2 1⎟
The first step to find the object rectangle that should be 1 ⎜
⎜
⎟
W= ·⎜ 1 2 3 4 3 2 1⎟
⎟ (5)
tracked is best performed in the uncompressed domain. For 4 ⎜
⎜1 2 3 3 3 2
⎟
1⎟
this purpose, the video must be decoded at least for the first ⎜ ⎟
⎝1 2 2 2 2 2 1⎠
frame after a cut. The object rectangle to be tracked will ei-
ther be determined by an automated system or by a human. 1 1 1 1 1 1 1
After the initial determination of the rectangle, the object and the peak region Hx,y is calculated as follows:
tracking algorithm only uses compressed-domain data. The
algorithm is passed the coordinates of the top left corner and
3
3
the size of the object rectangle. The size of the object will re- Hx,y = wi+4, j+4 · hx+i,y+ j . (6)
main constant throughout the whole tracking process. More- i=−3 j =−3

over, the edges of the object rectangle will always be located The calculation of the object rectangle is dependent on the
parallel to the image edges. frame type. Intracoded macroblocks are always ignored.
The motion vectors of forward-predicted macroblocks
4.2. Algorithm specify the position in the frame from which the macroblock
originates, whereas motion vectors of backward-predicted
The tracking algorithm applies the histogram in a manner macroblocks specify to which position in the next reference
similar to the camera motion detection, plus some additional frame the macroblock will move.
steps to increase the accuracy.
4.2.2. Backward search with P-frames
4.2.1. Histogram to determine the object’s
motion vector P-frames can only have motion vectors of forward-predicted
macroblocks. They specify the position of the pixel block that
Since the results of the object rectangle’s calculations are used is referred to in the previous reference frame. The number
again with subsequent frames, a high accuracy is required. As of pixels by which this pixel block and the object rectangle’s
seen with the camera motion detection, the histogram algo- overlap is calculated. The overlap specifies the weight of the
rithm with texture analysis can usually determine the exact motion vector used to build the motion vector histogram.
number of pixels. Since we assume that the object is suﬃ- The vector of the peak region is calculated by applying a
ciently textured, texture analysis is not required. We rely on a method similar to that used in the camera motion approach.
two-dimensional histogram which includes only the motion Adding this vector to the coordinates of the object rectan-
vectors of macroblocks that are at least partly located inside gle in the reference frame gives the coordinates of the object
the object rectangle. A slight and slow appearance change rectangle in the current P-frame. This rectangle is used as the
of the object like the mouth movement of a face does not reference rectangle for the following frames.
6 EURASIP Journal on Applied Signal Processing

4.2.3. Backward search and prediction with B-frames predicted macroblocks and backward- portions of bidirec-
tional motion vectors is built.
The determination of the current object rectangle for B- The algorithm does not support rotating objects. The ob-
frames is done in the same way that it was with P-frames, ject rectangle’s orientation will always remain the same so
except that the forward portions of bidirectional motion vec- that a 90-degree rotation will have the effect that the object
tors are included in the histogram as well. The reference rect- rectangle contains background from that point on. This is
angle is not updated since B-frames are never used as refer- also the case if the object size decreases. However, a slow in-
ence. crease of size will usually not result in problems. It is essential
No motion vectors are available, either for I-frames or for that the encoder uses B-frames without preferring the previ-
P-frames, in which all macroblocks that are part of the object ous or the following reference frame.
rectangle are intracoded. We therefore use the backward ref-
erences of B-frames to make a prediction concerning where 4.3. Experiments
the object rectangle will move in the next reference frame.
This is done again by using a histogram, but this time the We primarily performed experiments with faces. For com-
histogram is built up using the motion vectors of backward- parison, the ground-truth data of the position of the rect-
predicted macroblocks and the backward portions of bidi- angle containing the face were determined using the Rowley
rectional motion vectors. The sum of the peak-region vector face detection algorithm [21]. The rectangle position in the
and the object rectangle’s coordinates of the B-frame gives first frame of the sequence together with its size was used
the object rectangle’s coordinates in the next reference frame. as start object rectangle for the tracking algorithm. Another
Consequently, for a P-frame in the sequence B1 B2 P, face detection method, which could have been used to estab-
usually three calculations of the object rectangle are avail- lish ground truth, is presented in [22].
able which are identical in the ideal case: the two predictions Figure 4 shows the tracking of a speaker in the German
of the B-frames and the calculation for the P-frame. We use parliament. The object rectangle in the first frame was deter-
the latter one because it is most reliable since it is calculated mined using the Rowley algorithm. It is the starting point of
in one step. In contrast, a prediction of a B-frame requires the tracking. The white-dotted rectangle shows the reference
two steps, the B-frame calculation and the B-frame predic- tracking by the Rowley algorithm, whereas the solid rectan-
tion. However, experiments showed that the predictions that gle depicts our tracked object rectangle. The speaker’s head
were calculated with the B-frames are more precise than the moves several times quickly up and down and sidewards. The
P-frame calculations if the motion vectors spread, especially Rowley network was only trained on frontal faces and can-
in the case of fast object movements. Therefore, an additional not therefore always find the face with sufficient confidence.
pseudomotion vector with weight 2 is added to the P-frames Our tracking algorithm can link these face detections so that
histogram. It is determined by the difference between the co- it is possible to automatically determine that it is the same
ordinates of the object rectangle in the reference frame and face. The differences between the positions of the Rowley al-
the prediction. This pseudomotion vector does not influence gorithm and ours are only a few pixels. Even if the head is
the maximum in cases in which no fast movement occurs. slightly tilted, the face can still be tracked. The size of the
rectangle that was determined using the Rowley algorithm
varies, whereas the size of our rectangle is constant.
4.2.4. Using predicted coordinates with I-frames Experiments with low-quality video material, such as
recordings from security cameras, gave poor results. The mo-
For I-frames, the prediction of the object rectangle derived tion vectors do not reflect the object motion adequately. This
from the previous B-frame is used. This prediction is also is also the case if the object moves too fast or changes its ap-
used for P-frames in which all macroblocks that are part of pearance quickly. For this reason, the quality of the tracking
the object rectangle are intracoded. The reference rectangle is highly dependent on the quality of the motion vectors. A
is updated with these coordinates. high-quality video with low noise is required as well as an
Figure 3 shows an example sequence with three consecu- encoder with a good motion estimation.
tive frames with type P, B, and I. The object rectangle in the P- Table 1 shows the tracking results of five high-quality
frame (depicted with solid lines in the left picture) is known video sequences.
when the B-frame is analyzed. Each macroblock in the B-
frame is checked as to whether a pixel block of the object 4.4. Results
rectangle in the P-frame is used as reference. The macroblock
depicted with dashed lines is located completely inside the The results of the presented object tracking using MPEG-2
object rectangle so that its weight is 1 whereas the weight of motion vectors are very good if the video material is of high
the macroblock depicted with broken lines is smaller since it quality and the MPEG-2 encoders have a good motion esti-
is located partly inside the object rectangle. The object mo- mation as they do with newscasts and parliament transmis-
tion vector is calculated using the histogram. By subtracting sions. Consequently, these are the fields of application of ob-
the motion vector from the coordinates of the object rect- ject tracking using motion vectors. If the video material is of
angle in the reference frame, its position in the B-frame can low quality, especially noisy and blurry as with security cam-
be determined. To predict the object rectangle’s position in eras, the motion vectors do not reflect the object motion so
the I-frame, a histogram of the motion vectors of backward that object tracking using motion vectors is not possible.
W. Hesseler and S. Eickeler 7

P B I

Figure 3: Example of object tracking: the object rectangle (with solid lines) is calculated for the B-frame shown in the middle. An object’s
motion vector is calculated relative to the object rectangle’s position in the P-frame. After this, an object’s motion vector is calculated to
predict the object rectangle’s position in the I-frame. The grid depicts the macroblock boundaries, the squares with dashed and broken
squares depict two examples: macroblocks and the referenced pixel blocks, respectively.

Figure 4: Face tracking for 53 seconds of a parliament speaker. The solid rectangle localizes the tracked object. The dotted rectangle reflects
the reference data which were determined using the Rowley algorithm. In several cases, the reference algorithm failed to detect a face with
suﬃcient confidence.

5. CUT DETECTION Cut detection algorithms using the ratio of forward-

and backward-predicted macroblocks rely on the fact that
The first step of any video sequence analysis is usually the cut B-frames are “average” frames between two reference frames.
detection. A cut is defined as an abrupt scene change between For this reason, the number of macroblocks using the pre-
two consecutive frames. ceding and the following reference frames should be roughly
8 EURASIP Journal on Applied Signal Processing

Table 1: Tracking test results of five video sequences. Example frames of test sequence 3 are shown in Figure 4. Rectangle’s average overlap
means the average overlap of the tracked object rectangle and the face rectangle determined by the Rowley face detector. In test sequence 5
the face is lost when the speaker tilts his head too quickly. The rectangle overlap then falls below 50% so that the object can be regarded as
lost.

Test sequence Tracked object Duration Total frames Rectangle’s average overlap
1 Anchor person 50 s 1250 97.8%
2 Anchor person 52 s 1303 94.1%
3 Parliament speaker 53 s 1335 89%
4 Parliament speaker 140 s 3492 91%
5 Parliament speaker Lost after 17 s 428 82.4%

identical. A significant change in the forward/backward ra- R1 | B1 B2 R2

tio indicates a cut. Referential dependencies across cuts tend
to lead to large motion vectors since frames on opposite sides
of cuts are typically markedly diﬀerent. These motion vectors
can be expected to spread widely.
(a)

5.1. Algorithm
R1 B1 | B2 R2
We use a combined evaluation of B-frame references and
motion vectors. Figure 5 shows the three possible cut types.
(b)
R1 and R2 stand for reference frames, in between are the B-
frames B1 and B2 . The arrows indicate the references of the
majority of the macroblocks. The number of B-frames be- R1 B1 B2 |R2
tween the reference frames is arbitrary, but at least one. We
use two B-frames in the example, which give a GOP such as
IBBPBBPBBPBB.
(i) In the first case, most macroblocks from B1 use pixel (c)
blocks from R2 as reference. This indicates a cut be-
tween R1 and B1 . Figure 5: The three cut types. R1 and R2 are reference frames, B1
(ii) In the second and third cases, most macroblocks from and B2 are the two B-frames, | indicates the cut position. The arrows
B1 use the reference frame R1 . At the moment of ana- indicate the references of the majority of the macroblocks.
lyzing B1 , we can conclude whether the following cut
occurs before or at R2 . It is necessary to check all of the
following B-frames to determine if the macroblock-
type ratio has inverted in order to localize the cut more which direction a reference is established depends on the en-
exactly. coder’s preferences. For this reason, the cut detection method
(iii) In the second case, the macroblock-type ratio has in- described above will always detect a cut. We therefore discard
verted in B2 . Therefore, a cut is detected at that frame. macroblocks whose motion vectors are 0. The idea is that af-
(iv) If the ratio has not inverted by the time we reach R2 , ter a cut, the position of the macroblock most probably dif-
the cut must be at R2 . This is shown in the 3rd case. fers from the position of the referenced pixel block. Bidirec-
(v) If there was no significant ratio of forward/backward tional macroblocks are always used because they imply that
types, it is assumed that no cut occurred in within that no cut has occurred.
sequence. Experiments showed that the results improve if all mac-
roblocks with motion vectors whose absolute values are nei-
Note that after a cut has been detected at a B-frame, no further vertically nor horizontally greater than 1 are not used.
ther cut can be detected until the next reference frame. Such vectors are half-pixel motion vectors that are primarily
used because of coding eﬃciency. Eﬀectively, we only con-
5.1.1. Problem: little motion activity sider the forward/backward predicted ratio of macroblocks
that have moved or will move compared to the previous or
There are general problems with cut detection using mac- following reference frame.
>1
roblock types if there is no or little motion as is the case with BFor is the number of forward-predicted macroblocks
video depicting static graphics. The encoder does not need whose motion vectors are greater than 1 for at least one di-
>1
to use references to both reference frames, but uses a refer- rection, BBack is the analogous value for backward-predicted
ence to the previous or following reference frame only; in macroblocks. The number of bidirectional macroblocks is
W. Hesseler and S. Eickeler 9

denoted as BBi and the number of intracoded macroblocks 5.1.2. Cut validation using the following P-frame
as BIntra .
The ratios AFor and ABack are defined as To lower the number of false detections such as the ones oc-
curing frequently with static computer-generated graphics,
the P-frame after a detected cut can be used to validate the
>1
BFor + BBi + R cut. The encoder will probably not be able to take over pixel
AFor = >1 >1 ,
BFor + BBack + BBi + R blocks in the P-frame from the same position in the previous
(7) reference frame. The ratio of nonmoving motion vectors is
B>1 + BBi + R
ABack = >1 Back>1 , calculated as
BFor + BBack + BBi + R
>1
BTotal − BFor − BBi − BIntra
where R is a value that compensates the noise in the distri- AFor
0/1
= (15)
BTotal
bution of macroblock types. It increases the ratio if the num-
ber of considered macroblocks is low. R depends on the total and compared with a threshold constant tValid . A cut hypoth-
number of macroblocks BTotal and a constant r as follows: esis occuring since the last reference frame can be assumed to
be unlikely if
R = BTotal · r. (8)
AFor
0/1
> tValid , (16)
A cut of type 1 (see Figure 5) is assumed if and can therefore be rejected. The cuts can only be verified
if the reference frame after cut is a P-frame. In case of an I-
AFor < tFor , (9) frame, no information can be obtained. Some encoders en-
force an I-frame after a cut. This is, however, not a problem
where tFor is a constant. If the ratio is in an expanded range since only false cuts need to be identified.
AFor with
5.2. Experiments
AFor < tFor
+
, (10)
The tests were primarily done with newscast video mate-
a cut hypothesis with lower probability is assumed. We there- rial. They contain a broad test spectrum including, for ex-
fore check the spread of the forward portion of motion vec- ample, speakers with static background, static computer-
tors of the considered macroblocks using the average devia- generated graphics, camera flashes, fast camera movement,
tion from the mean of the motion vectors. This value is de- reports with bad picture quality, and so forth. Table 2 shows
noted as SFor XY and is compared with the constant SDeviation . the results of five test sequences. The cut positions were de-
A cut is assumed if termined by hand.
The average percentage of detected cuts varies from
SFor XY > SDeviation . (11) 94.8% to 100.0% depending on the encoder and the video
content. The average percentage of incorrect detected cuts
The idea is that in case the reference frame is located before (false alarms) varies from 0% to 23.0%. Up to 4.1% of the
the cut, the forward-predicted macroblocks will change their detected cuts diﬀered from the actual positions by one frame.
positions drastically without preferring one direction. Parliament transmissions generally give better results than
A cut of type 2 or 3 (see Figure 5) is assumed if newscasts, since they do not contain typical problems.
The causes for nondetected cuts are, in descending order
ABack < tBack . (12) of importance as follows.

Again, if the ratio is in an extended range with (i) The encoder did not use the characteristic distribution
of macroblock types.
(ii) The frames before and after the cut are too similar.
ABack < tBack
+
, (13)
(iii) The cut occurred between two half-pictures.
we check the spread of the considered motion vectors, here The causes for cuts detected where no cut actually occurred
the backward-predicted portions. Similarly, if are in descending order of importance as follows:
(i) the consecutive frames are too similar, there are (al-
SBack XY > SDeviation , (14)
most) no changes so that it was suﬃcient for the en-
coder to use references to one reference frame only;
a cut is assumed. However, in these cases we can only iden-
(ii) camera flashes;
tify a cut up until the next reference frame. For this reason,
(iii) the encoder did not use the characteristic distribution
each following B-frame is checked to see if the number of
of macroblock types;
forward-predicted macroblocks is smaller than the number
(iv) very quick camera motion.
of backward-predicted macroblocks and a cut is assumed if
this is the case. If no cut position was found by the next ref- No or little motion only is still a problem. This especially
erence frame, this reference frame must be the cut position. applies to computer-generated graphics since they do not
10 EURASIP Journal on Applied Signal Processing

Table 2: Results of cut detection test for five videos. # reference cuts is the number of cuts in the ground truth. B-frames is the num-
ber of B-frames between two reference frames. Deletion = number of missed cuts/divided
by number of reference cuts, insertion = num-
ber of falsely detected cuts/divided by number of detected cuts, and inaccuracy = displacement of cut/divided by number of detected cuts.
Video type Duration Total frames # Ref. cuts B-frames Deletion Insertion Inaccuracy
Newscast 15 min 22500 132 1 1.5% 2.9% 0.0 frames
Newscast 15 min 22500 167 1 1.2% 2.3% 0.0 frames
Newscast 15 min 22500 96 2 5.2% 11.1% 0.009 frames
Newscast 15 min 22500 97 2 4.1% 23% 0.032 frames
Parliament 54.7 min 82054 81 2 0% 0% 0.0 frames

contain the noise that leads to small differences between the the bottleneck. For this reason, an algorithm which can de-
frames. In order to avoid the repeated detection of a cut, only tect shots faster than real time is highly desirable.
the macroblocks with motion vectors of at least two half-
pixels are considered. However, imposing this limitation can 6. CONCLUSIONS
lead to a situation in which the number of remaining mac-
roblocks is small, causing the result to be unreliable. In this In this paper, three different algorithms for low-level meta-
case, both a nondetected cut and a erroneously detected cut data extraction were presented. All three algorithms work in
are possible. Camera flashes lead to increased brightness in the MPEG-2 compressed domain, and are therefore efficient,
a single frame or a half-frame only. If a flash occurs in a ref- saving time for the further processing steps that create meta-
erence frame that frame, will likely have less referential de- data on a higher semantic level. The algorithms operate in
pendencies, which may lead to an erroneously detected cut. realtime on a computer with an 866 MHz Pentium III pro-
Cuts between two half-pictures can occur if the video ma- cessor under Linux.
terial was produced with analog equipment. The first half- Detection of cuts and camera motion can be performed
picture of a frame belongs to a scene and the second half- immediately after parsing a frame in bitstream order with-
picture belongs to the next scene. Very quick camera motion out having to wait for the data of the following frames. This
can also lead to false detection of a cut since the motion vec- procedure allows on-the-fly cut and camera motion detec-
tors spread. tion, for example, with live videos from a satellite. The object
Generally, the number of nondetected cuts can be re- tracking algorithm requires the frames to be in display or-
duced by adjusting the parameters so that more cuts are de- der, which gives a delay of up to three frames (less than 100
tected, also resulting in more false alarms. All cut hypotheses milliseconds).
could be checked using the DC coefficients of the approx- Considering that the encoder’s task is to encode the best
imated DCT as was shown with the camera motion detec- picture quality with a given bit rate and not with the cam-
tion. Such checking does not compromise speed improve- era or object motion in mind, the results of the compressed-
ment over uncompressed-domain techniques. domain algorithms are convincing. The results show that it
The performance of the algorithm presented here de- is not necessary to expend computing power on decoding in
pends on the encoder. The parameters have to be adjusted order to perform low-level metadata extraction in the pixel
to accommodate the particular encoder. It is necessary that domain.
the encoder uses B-frames and does not always prefer one The algorithms presented in this paper are not funda-
reference frame like the next I-frame. mentally limited to MPEG-2. They can be extended to all
video compression methods that use motion compensation
5.3. Results and bidirectional frames, for example, MPEG-4 simple and
advanced profile (also known as DivX).
The cut detection algorithm using compressed-domain data
only can detect 94.8%–100% of all cuts in a video. The per- REFERENCES
formance is dependent on the encoder and the video mate-
rial. Only sequences with no or little motion remain prob- [1] B. S. Manjunath, P. Salembier, and T. Sikora, Introduction to
lematic despite special treatment and in such cases, the de- MPEG-7: Multimedia Content Description Interface, John Wi-
tection is not reliable. Also, encoders do not always encode ley & Sons, New York, NY, USA, 2002.
the video as presumed by the proposed algorithms but are [2] R. Wang and T. Huang, “Fast camera motion analysis in
rather programmed with the coding efficiency in mind. MPEG domain,” in Proceedings of IEEE International Confer-
ence on Image Processing (ICIP ’99), vol. 3, pp. 691–694, Kobe,
A typical application of the cut detection method would Japan, October 1999.
be integration in distributed online television monitoring [3] A. Smolic, M. Hoeynck, and J.-R. Ohm, “Low-complexity
systems. The basic units for processing of video sequences are global motion estimation from P-frame motion vectors for
the shots. These shots are distributed to the different nodes MPEG-7 applications,” in Proceedings of IEEE International
of a computer cluster for a fast processing. The detection of Conference on Image Processing (ICIP ’00), vol. 2, pp. 271–274,
the shots must occur before the distribution and is as such Vancouver, BC, Canada, September 2000.
W. Hesseler and S. Eickeler 11

[4] P. M. Kuhn, “Camera motion estimation using feature points [21] H. A. Rowley, S. Baluja, and T. Kanade, “Neural network-
in MPEG compressed domain,” in Proceedings of IEEE Inter- based face detection,” IEEE Transactions on Pattern Analysis
national Conference on Image Processing (ICIP ’00), vol. 3, pp. and Machine Intelligence, vol. 20, no. 1, pp. 23–38, 1998.
596–599, Vancouver, BC, Canada, September 2000. [22] H. Wang and S.-F. Chang, “A highly eﬃcient system for auto-
[5] V. Kobla, D. Doermann, and A. Rosenfeld, “Compressed do- matic face region detection in MPEG video,” IEEE Transactions
main video segmentation,” Tech. Rep. CAR-TR-839 (CS-TR- on Circuits and Systems for Video Technology, vol. 7, no. 4, pp.
3688), University of Maryland, College Park, Md, USA, 1996. 615–628, 1997.
[6] D. Comaniciu and P. Meer, “Mean shift analysis and applica-
tions,” in Proceedings of 7th IEEE International Conference on
Computer Vision (ICCV ’99), vol. 2, pp. 1197–1203, Kerkyra, Wolfgang Hesseler received his Diploma
Greece, September 1999. in computer science from the University
[7] B. Danette Allen and G. Bishop, “Tracking: Beyond 15 Min- of Bonn, Germany. The algorithms pre-
utes of Thought,” in SIGGRAPGH 2001 Course 11 Booklet, Los sented in this paper were developed dur-
Angeles, Calif, USA, August 2001. ing his work at the Fraunhofer IMK in
[8] A. Gyaourova, C. Kamath, and S.-C. Cheung, “Block matching Birlinghoven, Germany, where he wrote
for object tracking,” Tech. Rep. UCRL-TR-200271, Lawrence his Diploma thesis about using MPEG-2
Livermore National Laboratory, Livermore, Calif, USA, 2003. compressed-domain data for computer vi-
[9] V. Kobla, D. Doermann, K.-I. Lin, and C. Faloutsos, “Fea- sion. His areas of interest are video indexing
ture normalization for video indexing and retrieval,” Tech. and video analysis.
Rep. CAR-TR-847 (CS-TR-3732), University of Maryland,
Stefan Eickeler received his Dipl.-Ing. de-
College Park, Md, USA, 1996.
gree and Dr.-Ing. degree in electrical engi-
[10] J. I. Khan, Z. Guo, and W. Oh, “Motion based object tracking neering from the University of Duisburg,
in MPEG-2 stream for perceptual region discriminating rate Germany, in 1996 and 2001, respectively. He
transcoding,” in Proceedings of 9th ACM International Confer- joined the Fraunhofer Institute for Media
ence on Multimedia (ACM Multimedia ’01), pp. 572–576, Ot- Communication (Fraunhofer IMK) in the
tawa, Ontario, Canada, September–October 2001. year 2001. Since 2003 he has taught as a
[11] H. Zhang, A. Kankanhalli, and S. W. Smoliar, “Automatic lecturer at the Bonn-Aachen International
partitioning of full-motion video,” ACM Multimedia Systems, Centre for Information Technology. His re-
vol. 1, no. 1, pp. 10–28, 1993. search interests are statistical pattern recog-
[12] A. Nagasaka and Y. Tanaka, “Automatic video indexing and nition and signal processing with the applications face and gesture
full-video search for object appearances,” in Proceedings of recognition and automatic media and document analysis.
IFIP 2nd Working Conference on Visual Database Systems, pp.
113–127, North-Holland, September–October 1991.
[13] R. Zabih, J. Miller, and K. Mai, “A feature-based algorithm for
detecting and classifying scene breaks,” in Proceedings of 3rd
ACM International Conference on Multimedia (ACM Multime-
dia ’95), pp. 189–200, San Francisco, Calif, USA, November
1995.
[14] G. Bozdagi and H. T. Sencar, “Preprocessing tool for com-
pressed video editing,” in Proceedings of IEEE 3rd Workshop on
Multimedia Signal Processing, pp. 283–288, Copenhagen, Den-
mark, September 1999.
[15] A. Bovik, Ed., Handbook of Image and Video Processing, Aca-
demic Press, New York, NY, USA, 2000.
[16] J. Calic and E. Izquierdo, “Towards real-time shot detection in
the MPEG-compressed domain,” in Proceedings of Workshop
on Image Analysis for Multimedia Interactive Services (WIAMIS
’01), pp. 95–100, Tampere, Finland, May 2001.
[17] P. Viola and M. J. Jones, “Robust real-time object detection,”
Tech. Rep. CRL 2001/01, Cambridge Research Laboratory,
Cambridge, Mass, USA, 2001.
[18] S.-F. Chang and D. G. Messerschmitt, “Manipulation and
compositing of MC-DCT compressed video,” IEEE Journal on
Selected Areas in Communications, vol. 13, no. 1, pp. 1–11,
1995.
[19] K. Shen and E. J. Delp, “A fast algorithm for video parsing us-
ing MPEG compressed sequences,” in Proceedings of IEEE In-
ternational Conference on Image Processing (ICIP ’95), vol. 2,
pp. 252–255, Washington, DC, USA, October 1995.
[20] B.-L. Yeo and B. Liu, “On the extraction of DC sequence from
MPEG compressed video,” in Proceedings of IEEE International
Conference on Image Processing (ICIP ’95), vol. 2, pp. 260–263,
Washington, DC, USA, October 1995.

A Survey On The Automatic Indexing of Video Data 1999 Journal of Visual Communication and Image Representation
No ratings yet
A Survey On The Automatic Indexing of Video Data 1999 Journal of Visual Communication and Image Representation
35 pages
Ijcet: International Journal of Computer Engineering & Technology (Ijcet)
No ratings yet
Ijcet: International Journal of Computer Engineering & Technology (Ijcet)
11 pages
Moditroduction Multimedia Database
No ratings yet
Moditroduction Multimedia Database
39 pages
Module 1 - Introduction To Multimedia Databases
No ratings yet
Module 1 - Introduction To Multimedia Databases
39 pages
A Review On Design and Implementation of Compensated Frame Prediction & Reconstruction
No ratings yet
A Review On Design and Implementation of Compensated Frame Prediction & Reconstruction
4 pages
Data Mining-Multimedia Datamining
No ratings yet
Data Mining-Multimedia Datamining
8 pages
Simplified Block Matching Algorithm For Fast Motion Estimation in Video Compression
No ratings yet
Simplified Block Matching Algorithm For Fast Motion Estimation in Video Compression
8 pages
Video Coding Fundamentals Explained
No ratings yet
Video Coding Fundamentals Explained
117 pages
Moving Object Detection and Tracking in Videos
No ratings yet
Moving Object Detection and Tracking in Videos
25 pages
MTEK Lect-Wavelet Filt
No ratings yet
MTEK Lect-Wavelet Filt
23 pages
Efficient Video Indexing For Fast-Motion Video
No ratings yet
Efficient Video Indexing For Fast-Motion Video
16 pages
MPEG Video Compression Case Study
No ratings yet
MPEG Video Compression Case Study
12 pages
Itc - Mpeg Case Study
No ratings yet
Itc - Mpeg Case Study
29 pages
Audio-Visual Format Documentation Project Background Paper
No ratings yet
Audio-Visual Format Documentation Project Background Paper
14 pages
CH: 7 Fundamentals of Video Coding
No ratings yet
CH: 7 Fundamentals of Video Coding
13 pages
Evaluation of Available MPEG-7 Annotation Tools: Mario Döller, Nikolaus Lefin, Harald Kosch University of Passau, Germany
No ratings yet
Evaluation of Available MPEG-7 Annotation Tools: Mario Döller, Nikolaus Lefin, Harald Kosch University of Passau, Germany
30 pages
MPEG-7 Overview and Applications Guide
No ratings yet
MPEG-7 Overview and Applications Guide
44 pages
Mpeg 7
No ratings yet
Mpeg 7
10 pages
Multimedia Mining Research An Overview 250429 112356
No ratings yet
Multimedia Mining Research An Overview 250429 112356
9 pages
Formal Specification in Software Reuse Designs: An Object-Oriented Database Example
No ratings yet
Formal Specification in Software Reuse Designs: An Object-Oriented Database Example
18 pages
Performance Enhancement of Video Compression Algorithms With SIMD
No ratings yet
Performance Enhancement of Video Compression Algorithms With SIMD
80 pages
Vast Audio Warehouses.
No ratings yet
Vast Audio Warehouses.
1 page
Action Event Retrieval From Cricket Video Using Audio Energy Feature For Event Summarization
No ratings yet
Action Event Retrieval From Cricket Video Using Audio Energy Feature For Event Summarization
8 pages
Image and Video Compression Techniques in Image Processesing An Overview
No ratings yet
Image and Video Compression Techniques in Image Processesing An Overview
8 pages
Book Chapter-Ip
No ratings yet
Book Chapter-Ip
21 pages
Wk9 MPEG Part2
No ratings yet
Wk9 MPEG Part2
30 pages
Compression Techniques
No ratings yet
Compression Techniques
7 pages
FILE: DIP Books EE5356: Digital Image / Video Coding / Compression
No ratings yet
FILE: DIP Books EE5356: Digital Image / Video Coding / Compression
10 pages
Motion Estimtion and Motion Compensated (Video) Coding
No ratings yet
Motion Estimtion and Motion Compensated (Video) Coding
41 pages
Softwares Video Forense PDF
No ratings yet
Softwares Video Forense PDF
10 pages
Comparison of H.264 and Motion-JPEG2000 Compression For Video Telemetry Final
No ratings yet
Comparison of H.264 and Motion-JPEG2000 Compression For Video Telemetry Final
6 pages
Multimedia Content Description Interface: MPEG-7
No ratings yet
Multimedia Content Description Interface: MPEG-7
23 pages
MPEG7 Algorithms Implemented For A Multimedia Management System
No ratings yet
MPEG7 Algorithms Implemented For A Multimedia Management System
6 pages
Introduction To Mpeg-7
No ratings yet
Introduction To Mpeg-7
17 pages
MPEG-7 for Multimedia Professionals
100% (1)
MPEG-7 for Multimedia Professionals
58 pages
Audio Indexing: Gaël Richard
No ratings yet
Audio Indexing: Gaël Richard
1 page
Mining Multimedia Data Bases
No ratings yet
Mining Multimedia Data Bases
17 pages
Multimedia Database Management System: Paper On
No ratings yet
Multimedia Database Management System: Paper On
24 pages
Content-Based Audio Retrieval Using A Generalized Algorithm
No ratings yet
Content-Based Audio Retrieval Using A Generalized Algorithm
13 pages
Information Systems, Technology and Management INFS3608 Data & Information Management Semester 2, 2008 Assignment Cover Page
No ratings yet
Information Systems, Technology and Management INFS3608 Data & Information Management Semester 2, 2008 Assignment Cover Page
9 pages
Multimedia Database Management Systems
No ratings yet
Multimedia Database Management Systems
6 pages
Video Database Management Techniques
No ratings yet
Video Database Management Techniques
5 pages
Overview of MPEG Standards and History
No ratings yet
Overview of MPEG Standards and History
2 pages
Motion Estimation Algorithm Using Block-Matching and Harmony Search Optimization
No ratings yet
Motion Estimation Algorithm Using Block-Matching and Harmony Search Optimization
33 pages
Chad Fogg - Didier J. LeGall - Joan L. Mitchell - William B. Pennebaker-MPEG Video Compression Standard (Digital Multimedia Standards Series) (Digital Multimedia Standards Series) - Springer (1996)
No ratings yet
Chad Fogg - Didier J. LeGall - Joan L. Mitchell - William B. Pennebaker-MPEG Video Compression Standard (Digital Multimedia Standards Series) (Digital Multimedia Standards Series) - Springer (1996)
509 pages
Performance Comparison of Optical Flow and Background Subtraction and Discrete Wavelet Transform Methods For Moving Objects
No ratings yet
Performance Comparison of Optical Flow and Background Subtraction and Discrete Wavelet Transform Methods For Moving Objects
10 pages
Multimedia Data Mining Guide
No ratings yet
Multimedia Data Mining Guide
19 pages
Object Tracking Using Background Subtraction and M PDF
No ratings yet
Object Tracking Using Background Subtraction and M PDF
11 pages
Efficient Motion Estimation by Fast Threestep Search X8ir95p9y3
No ratings yet
Efficient Motion Estimation by Fast Threestep Search X8ir95p9y3
6 pages
Wk8 MPEG Part1
No ratings yet
Wk8 MPEG Part1
36 pages
Chapter 2-Multimedia Storage Techniques
No ratings yet
Chapter 2-Multimedia Storage Techniques
62 pages
An Overview of The Mpeg Compression Algorithm: Technical Note
No ratings yet
An Overview of The Mpeg Compression Algorithm: Technical Note
7 pages
An Overview of The Mpeg Compression Algorithm: Technical Note
No ratings yet
An Overview of The Mpeg Compression Algorithm: Technical Note
8 pages
INGss LES
No ratings yet
INGss LES
4 pages
A New Forensic Video Database For Source Smartphone Identification Description and Analysis
No ratings yet
A New Forensic Video Database For Source Smartphone Identification Description and Analysis
12 pages
MPEG Unit 5
No ratings yet
MPEG Unit 5
9 pages
In Communications Based On MATLAB: A Practical Course
No ratings yet
In Communications Based On MATLAB: A Practical Course
10 pages
Improved Error Detection and Data Recovery Architecture For Motion Estimation Testing Applications
No ratings yet
Improved Error Detection and Data Recovery Architecture For Motion Estimation Testing Applications
7 pages
Engineering Control Course Guide
No ratings yet
Engineering Control Course Guide
14 pages
Abunetekelehaimanotamharic 2
100% (1)
Abunetekelehaimanotamharic 2
42 pages
Amhara Housing Scheme Amended
No ratings yet
Amhara Housing Scheme Amended
30 pages
06 Philipsious
No ratings yet
06 Philipsious
17 pages
Ethiochurchmenfesawiermeja
No ratings yet
Ethiochurchmenfesawiermeja
97 pages
03 Autocephali
No ratings yet
03 Autocephali
11 pages
2 MediaRepresentation Audio
No ratings yet
2 MediaRepresentation Audio
76 pages
Review Article: Cybersecurity and Countermeasures at The Time of Pandemic
No ratings yet
Review Article: Cybersecurity and Countermeasures at The Time of Pandemic
19 pages
Abba Petros
100% (1)
Abba Petros
335 pages
GeoNode User Account Guide
No ratings yet
GeoNode User Account Guide
80 pages
INSA Presentation Branding Mockups
No ratings yet
INSA Presentation Branding Mockups
2 pages
7629-Questions and Answers Sem I
50% (2)
7629-Questions and Answers Sem I
14 pages
02 Autocephali
No ratings yet
02 Autocephali
36 pages
Sensors: A Comprehensive Survey On Signcryption Security Mechanisms in Wireless Body Area Networks
No ratings yet
Sensors: A Comprehensive Survey On Signcryption Security Mechanisms in Wireless Body Area Networks
37 pages
ስንክሳር (1)
86% (7)
ስንክሳር (1)
1,044 pages
03 Yemebetachin Weladiteamlak 01
No ratings yet
03 Yemebetachin Weladiteamlak 01
22 pages
Unit 2 - Week 1
No ratings yet
Unit 2 - Week 1
5 pages
Visual Perception for Art Students
No ratings yet
Visual Perception for Art Students
3 pages
Research Seminar Titles
No ratings yet
Research Seminar Titles
3 pages
Validation and Filters in ASP.NET MVC
No ratings yet
Validation and Filters in ASP.NET MVC
21 pages
Unit 1-Information Representation and Multimedia
No ratings yet
Unit 1-Information Representation and Multimedia
35 pages
Artifacts FSI
No ratings yet
Artifacts FSI
11 pages
Frame Rate Benchmark Results Summary
No ratings yet
Frame Rate Benchmark Results Summary
1 page
Fourier Transform in MRI Imaging
No ratings yet
Fourier Transform in MRI Imaging
2 pages
Projection Lenses Questions and Answers With Formulae PDF
No ratings yet
Projection Lenses Questions and Answers With Formulae PDF
4 pages
Geoinformatics: Principles and Applications
No ratings yet
Geoinformatics: Principles and Applications
1 page
ETech Image Placement
No ratings yet
ETech Image Placement
34 pages
3D Holographic Projection Technique
0% (1)
3D Holographic Projection Technique
26 pages
Manual de Instruções Epson PowerLite S5 (112
No ratings yet
Manual de Instruções Epson PowerLite S5 (112
2 pages
压缩版本-Umy Medical Catalogue from KellyMa20250507
No ratings yet
压缩版本-Umy Medical Catalogue from KellyMa20250507
26 pages
Facial Plastic, Reconstructive and Trauma Surgery - 1st Edition Full Text Download
100% (8)
Facial Plastic, Reconstructive and Trauma Surgery - 1st Edition Full Text Download
17 pages
Bayer Pattern Detector MTF Analysis
No ratings yet
Bayer Pattern Detector MTF Analysis
7 pages
DH-SD49225-HC-LA: 2MP 25x Starlight IR PTZ HDCVI Camera
No ratings yet
DH-SD49225-HC-LA: 2MP 25x Starlight IR PTZ HDCVI Camera
3 pages
Color 1 Color 2 Color 3 Color 4 Color 5
No ratings yet
Color 1 Color 2 Color 3 Color 4 Color 5
8 pages
Difference Between LCD and CRT Displays
No ratings yet
Difference Between LCD and CRT Displays
3 pages
Full HD Car DVR User Manual: 1, Certifications
100% (1)
Full HD Car DVR User Manual: 1, Certifications
13 pages
Course Ra
No ratings yet
Course Ra
1 page
Multimedia Image Formats Overview
No ratings yet
Multimedia Image Formats Overview
9 pages
DSLR Camera Features & Image Sensors
No ratings yet
DSLR Camera Features & Image Sensors
38 pages
Typhoon H Software Release 3.0 Software Release 3.0: W Hat'S New?
No ratings yet
Typhoon H Software Release 3.0 Software Release 3.0: W Hat'S New?
5 pages
DJI Mini 4 Pro Drone Review
No ratings yet
DJI Mini 4 Pro Drone Review
28 pages
Acr
No ratings yet
Acr
14 pages
BSC Medical Imaging Technology
No ratings yet
BSC Medical Imaging Technology
3 pages
The Power To Freeze Time
No ratings yet
The Power To Freeze Time
11 pages
Computer Graphics (BEG375CO)
No ratings yet
Computer Graphics (BEG375CO)
1 page
Chapter 1: Introduction: Digital Image Processing, 3rd Edition, by R. C. Gonzalez and R. E. Woods, Prentice Hall, 2008
No ratings yet
Chapter 1: Introduction: Digital Image Processing, 3rd Edition, by R. C. Gonzalez and R. E. Woods, Prentice Hall, 2008
13 pages
Yizhan Camera Manual de instruções 说明书 - 英文版
No ratings yet
Yizhan Camera Manual de instruções 说明书 - 英文版
11 pages
Police PhotoGraphy
86% (7)
Police PhotoGraphy
233 pages
OPERA T90 Sharp - Eng
No ratings yet
OPERA T90 Sharp - Eng
8 pages
SunPak DigiFlash 3000 Manual
No ratings yet
SunPak DigiFlash 3000 Manual
1 page

Mpeg 2 Compressed Domain Algorithms For Video Analysis

Uploaded by

Mpeg 2 Compressed Domain Algorithms For Video Analysis

Uploaded by

Hindawi Publishing Corporation

EURASIP Journal on Applied Signal Processing

MPEG-2 Compressed-Domain Algorithms for

Wolfgang Hesseler and Stefan Eickeler

Fraunhofer IMK, Schloss Birlinghoven, 53754 Sankt Augustin, Germany

Received 1 September 2004; Revised 2 June 2005; Accepted 6 June 2005

Copyright © 2006 Hindawi Publishing Corporation. All rights reserved.

1. INTRODUCTION MPEG-7 is a common standard to store the metadata

Figure 1: Motion vectors spread in low-textured areas.

5. CUT DETECTION Cut detection algorithms using the ratio of forward-

identical. A significant change in the forward/backward ra- R1 | B1 B2 R2

You might also like