| United States Patent | 6175592 |
| Link to this page | http://www.wikipatents.com/6175592.html |
| Inventor(s) | Kim; Hee-Yong (Plainsboro, NJ);
Meyer; Edwin Robert (Bensalem, PA);
Egawa; Ren (Princeton, NJ) |
| Abstract | A discrete cosine transform (DCT) domain filter for lowpass filtering a
high resolution encoded video image represented as frequency-domain
coefficient values, such as macroblocks, before decimation of the video
image in the spatial domain. The DCT filter masks or weights the DCT
coefficients of the video image macroblocks before processing by an
inverse DCT. The filter may be implemented as a block mirror filter in the
frequency domain, and the filter values may be combined with the IDCT
coefficient values. Original motion vectors of the high resolution encoded
video image are translated because low resolution reference images used by
the decoder are not equivalent to the original high resolution images.
Therefore, motion vectors are scaled to retrieve low resolution prediction
blocks which are up-sampled to generate the original pixel and half-pixel
values in the spatial domain. The up-sampled prediction block is added to
the DCT filtered inverse-DCT transformed pixel values if the current
macroblock is part of a non-intraframe encoded image. After motion
compensation processing of the original macroblock, the reconstructed
macroblock in the lower resolution is decimated accordingly. |
| Title | Frequency domain filtering for down conversion of a DCT encoded picture |
| Publication Date | January 16, 2001 |
| Filing Date | March 12, 1997 |
U.S. References
| Patent No. | Inventor | Class | Date |
| 5841479 | Van Gestel | 375/240.01 | Nov. 1998 |
| 5835151 | Sun | 348/441 | Nov. 1998 |
| 5784494 | Strongin | | Jul. 1998 |
| 5774206 | Wasserman | 709/247 | Jun. 1998 |
| 5737019 | Kim | 375/240.25 | Apr. 1998 |
| 5726711 | Boyce | 348/408.1 | Mar. 1998 |
| 5614957 | Boyce | 348/567 | Mar. 1997 |
| 5613084 | Hau | 711/219 | Mar. 1997 |
| 5528301 | Hau | 348/441 | Jun. 1996 |
| 5489903 | Wilson | 341/144 | Feb. 1996 |
| 5483474 | Arbeiter | 708/300 | Jan. 1996 |
| 5481568 | Yada | 375/340 | Jan. 1996 |
| 5389923 | Iwata | | Feb. 1995 |
| 5331346 | Shields | 348/441 | Jul. 1994 |
| 5327235 | Richards | 348/441 | Jul. 1994 |
| 5274372 | Luthra | 341/61 | Dec. 1993 |
| 5262854 | Ng | 375/240.24 | Nov. 1993 |
| 5057911 | Stec | 348/642 | Oct. 1991 |
| 4908874 | Gabriel | 382/277 | Mar. 1990 |
| 4870661 | Yamada | 375/240 | Sep. 1989 |
| 4774581 | Shiratsuchi | 348/561 | Sep. 1988 |
| 4652908 | Fling | 348/625 | Mar. 1987 |
| 4631750 | Gabriel | 382/277 | Dec. 1986 |
| 4536745 | Yamaguchi | 341/61 | Aug. 1985 |
| 4472785 | Kasuga | 708/270 | Sep. 1984 |
| 4472732 | Bennett | 348/452 | Sep. 1984 |
| 4468688 | Gabriel | 348/580 | Aug. 1984 |
| 3997772 | Crochiere | 708/313 | Dec. 1976 |
Claims
What is claimed:
1. An apparatus for forming a low resolution video signal from an encoded
video signal representing a video image, the encoded video signal being a
frequency-domain transformed high resolution video signal, the apparatus
comprising:
means for receiving and for providing the encoded video signal as a
plurality of high resolution frequency-domain video coefficient values;
down conversion filter means for receiving and weighting selected ones of
the plurality of high resolution frequency-domain video coefficient values
to form a set of filtered frequency-domain video coefficients, wherein the
down conversion filter means is a lowpass filter represented by a set of
frequency domain filter coefficients, and the down conversion filter means
weights the selected ones of the plurality of high resolution
frequency-domain video coefficient values by multiplying the set of
frequency domain filter coefficients with the plurality of high resolution
frequency-domain video coefficient values;
inverse-transform means for receiving and transforming the filtered
frequency-domain video coefficients into a set of filtered pixel sample
values; and
decimating means for deleting selected ones of the set of filtered pixel
sample values to provide the low resolution video signal.
2. Apparatus for forming a low resolution video signal as recited in claim
1, wherein the frequency-domain transformed high resolution video signal
is transformed by a discrete cosine transform (DCT) operation, and the
inverse-transform means transforms the frequency domain video coefficients
by an inverse discrete cosine transform (IDCT) operation.
3. Apparatus for forming a low resolution video signal as recited in claim
1, wherein the down conversion filter means is a lowpass filter having a
cutoff frequency determined by a sampling frequency of the encoded video
signal divided by a decimation ratio.
4. Apparatus for forming a low resolution video signal as recited in claim
1, wherein the down conversion filter means includes a plurality of
frequency domain coefficients of a lowpass block mirror filter having a
predetermined number of taps.
5. The apparatus for forming a low resolution video signal as recited in
claim 1, wherein the decimating means down-samples the set of filtered
pixel sample values according to a decimation ratio.
6. An apparatus for forming a low resolution video signal from an encoded
video signal representing a video image, the encoded video signal being a
discrete cosine transformed (DCT) high resolution video signal, the
apparatus comprising:
means for receiving and for providing the encoded video signal as a
plurality of DCT video coefficient values;
inverse-transform means including
means for weighting a set of the plurality of discrete cosine transform
(DCT) coefficient values with a set of multi-bit down-conversion filtering
coefficients by multiplying each DCT coefficient in the set of DCT
coefficients with a respectively different one of the set of multi-bit
down-conversion filtering coefficients to form a set of weighted DCT
coefficients; and
means for transforming, by an inverse DCT (IDCT) operation from a DCT
domain to a spatial domain, the weighted DCT coefficients into a set of
filtered pixel sample values; and
decimating means for deleting selected ones of the set of filtered pixel
sample values to provide the low resolution video signal.
7. A method of forming a lower resolution video signal from an encoded
video signal representing a video image, the encoded video signal being a
frequency-domain transformed video signal, comprising the steps of:
a) providing the encoded video signal as a plurality of discrete cosine
transform (DCT) coefficient values;
b) weighting selected ones of the plurality of DCT coefficient values with
a plurality of frequency domain coefficients representing a lowpass block
mirror filter having a predetermined number of taps to form a set of
filtered DCT coefficient values;
c) transforming the filtered DCT coefficient values according to an inverse
discrete cosine transform (IDCT) operation to obtain a set of filtered
pixel sample values; and
d) retaining selected ones of the set of filtered pixel sample values to
provide the lower resolution video signal.
8. An apparatus for forming a lower resolution video signal from an encoded
video signal representing a video image, the encoded video signal being a
frequency-domain transformed video signal, comprising:
means for receiving and for providing the encoded video signal as a
plurality of frequency-domain video coefficient values;
combining means for combining the plurality of frequency domain video
coefficient values with a set of filtering inverse-transform coefficients
to produce a set of filtered pixel sample values, wherein the filtering
inverse-transform coefficients are formed by multiplying a set of
weighting coefficients for down-conversion and a set of inverse-transform
coefficients for conversion from the frequency domain to the spatial
domain;
decimating means for deleting selected ones of the set of filtered pixel
sample values to produce a set of decimated pixel sample values; and
means for storing the set of decimated filtered pixel sample values and for
providing the stored set of decimated filtered pixel sample values as the
lower resolution video signal.
9. A method of receiving an encoded video signal representing a video
image, the encoded video signal being a frequency-domain transformed video
signal, and forming a lower resolution video signal, the method comprising
the steps of:
a) providing the encoded video signal as a plurality of frequency-domain
video coefficient values;
b) combining the plurality of frequency domain video coefficient values
with a set of filtering inverse-transform coefficients to produce a set of
low resolution pixel sample values, wherein the filtering
inverse-transform coefficients are formed by multiplying a set of
weighting coefficients for low-pass filtering and a set of
inverse-transform coefficients for conversion from the frequency domain to
the spatial domain;
c) decimating selected ones of the set of low resolution pixel sample
values; and
d) storing the selected ones of the set of low resolution pixel sample
values to provide the stored pixel sample values as the lower resolution
video signal.
10. Apparatus for receiving an encoded video signal representing a video
image, the encoded video signal being a compressed frequency-domain
transformed video signal, and forming a lower resolution video signal, the
apparatus comprising:
means for providing the encoded video signal as a plurality of DCT
coefficient values and a motion vector;
down-conversion filter means for receiving and weighting, based on a
decimation value, selected ones of the plurality of DCT coefficient values
with a plurality of frequency domain coefficients representing a lowpass
block mirror filter having a predetermined number of taps to form a set of
filtered DCT coefficients;
inverse-transform means for receiving and transforming the filtered DCT
coefficients using an inverse discrete cosine transform (IDCT) operation
to obtain a set of filtered compressed pixel sample values;
translation means for receiving the motion vector and scaling the motion
vector based on the decimation value;
prediction block generating means for receiving the scaled motion vector
and a previous set of filtered pixel sample values, and forming a set of
prediction pixel sample values;
combining means for combining the set of filtered compressed pixel sample
values with the set of prediction pixel sample values to form a set of
filtered pixel sample values; and
decimating means for receiving and for retaining selected ones of the set
of filtered pixel sample values based on the decimation value, wherein the
decimating means provides the selected ones of the set of filtered pixel
sample values as the lower resolution video signal.
11. Apparatus for forming a low resolution video signal as recited in claim
10, wherein the down conversion filter means is a lowpass filter having a
cutoff frequency proportional to a sampling frequency of the encoded video
signal divided by the decimation value.
12. Apparatus for forming a lower resolution video signal as recited in
claim 10, wherein the down conversion filter means is a lowpass filter
represented by a set of frequency-domain filter coefficient values, and
the down conversion filter means weights the selected ones of the
plurality of DCT coefficient values by multiplying the set of
frequency-domain filter coefficients with respective ones of the plurality
of DCT coefficient values.
13. Apparatus for forming a lower resolution video signal as recited in
claim 10, wherein the prediction block generating means further includes:
memory means for storing at least one reference frame, the reference frame
being a previously decoded video signal represented as the previous set of
filtered pixel sample values,
up-sampling means for receiving and up-sampling the reference frame, the
up-sampling means and the memory means being responsive to the scaled
motion vector; and
half-pixel generating means for generating a plurality of half-pixel
interpolated values from the up-sampled reference frame, the half-pixel
generating means providing the plurality of half-pixel interpolated values
as the set of prediction pixel sample values.
14. A method of receiving an encoded video signal representing a video
image, the encoded video signal being a compressed frequency-domain
transformed video signal, and forming a low resolution video signal, the
method comprising the steps of:
a) providing the encoded video signal as a plurality of compressed high
resolution DCT coefficient values and a motion vector;
b) weighting, based on a decimation value, selected ones of the plurality
of compressed high resolution DCT coefficient values with a plurality of
frequency domain coefficients representing a lowpass block mirror filter
having a predetermined number of taps to form a set of filtered compressed
DCT coefficient values;
c) transforming the filtered compressed DCT coefficient values using an
inverse discrete cosine transform (IDCT) operation to obtain a set of
filtered compressed pixel sample values;
d) scaling the motion vector based on the decimation value; and
e) forming a set of prediction pixel sample values from the scaled motion
vector and a previous set of filtered pixel sample values;
f) combining the set of filtered compressed pixel sample values with the
set of prediction pixel sample values to form a set of filtered pixel
sample values;
g) deleting selected ones of the set of filtered pixel sample values based
on the decimation value to form the lower resolution video signal; and
h) storing the pixel sample values of the low resolution video signal.
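
The combining step recited in claims 8 and 9 can be illustrated with a short sketch. It is not taken from the patent: the 8-point orthonormal IDCT basis, the weight values, the sample coefficients and the 2:1 decimation are all assumptions chosen for illustration. The sketch folds a set of down-conversion weights into the inverse-transform coefficients once, and then checks that a single combined transform gives the same pixels as weighting followed by a separate IDCT.

```python
import math

N = 8  # assumed block length along one dimension


def idct_basis():
    """basis[n][k] is the contribution of coefficient k to pixel n
    for an orthonormal 8-point inverse DCT."""
    rows = []
    for n in range(N):
        row = []
        for k in range(N):
            scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
            row.append(scale * math.cos(math.pi * (2 * n + 1) * k / (2 * N)))
        rows.append(row)
    return rows


def filtering_idct(weights):
    """Multiply each inverse-transform coefficient by the weight of its
    frequency index: combined[n][k] = basis[n][k] * weights[k]."""
    return [[b * w for b, w in zip(row, weights)] for row in idct_basis()]


def transform(matrix, coeffs):
    """Apply a matrix of transform coefficients to one row of DCT values."""
    return [sum(m * c for m, c in zip(row, coeffs)) for row in matrix]


# Hypothetical lowpass weights and DCT coefficients, for illustration only.
weights = [1.0, 0.9, 0.7, 0.4, 0.0, 0.0, 0.0, 0.0]
coeffs = [64.0, -12.0, 5.0, 3.0, -2.0, 1.0, 0.5, -0.25]

combined = transform(filtering_idct(weights), coeffs)
two_step = transform(idct_basis(), [w * c for w, c in zip(weights, coeffs)])
decimated = combined[::2]  # assumed 2:1 decimation: keep every second pixel
```

Because the weights are folded into the transform coefficients ahead of time, the per-block work in this sketch is the same as for an unfiltered IDCT.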
Description
FIELD OF THE INVENTION
This invention relates to a decoder having a filter for down conversion of
frequency domain encoded signals, e.g. MPEG-2 encoded video signals, and
more specifically to a decoder which converts a high resolution video
signal to a low resolution video signal by filtering the frequency domain
signals.
BACKGROUND OF THE INVENTION
In the United States a standard has been proposed for digitally encoded
high definition television signals (HDTV). A portion of this standard is
essentially the same as the MPEG-2 standard, proposed by the Moving
Picture Experts Group (MPEG) of the International Organization for
Standardization (ISO). The standard is described in an International
Standard (IS) publication entitled, "Information Technology--Generic
Coding of Moving Pictures and Associated Audio, Recommendation H.262",
ISO/IEC 13818-2, IS, 11/94 which is available from the ISO and which is
hereby incorporated by reference for its teaching on the MPEG-2 digital
video coding standard.
The MPEG-2 standard is actually several different standards. In MPEG-2
several different profiles are defined, each corresponding to a different
level of complexity of the encoded image. For each profile, different
levels are defined, each level corresponding to a different image
resolution. One of the MPEG-2 standards, known as Main Profile, Main Level
is intended for coding video signals conforming to existing television
standards (i.e., NTSC and PAL). Another standard, known as Main Profile,
High Level is intended for coding high-definition television images.
Images encoded according to the Main Profile, High Level standard may have
as many as 1,152 active lines per image frame and 1,920 pixels per line.
The Main Profile, Main Level standard, on the other hand, defines a maximum
picture size of 720 pixels per line and 567 lines per frame. At a frame
rate of 30 frames per second, signals encoded according to this standard
have a data rate of 720*567*30 or 12,247,200 pixels per second. By
contrast, images encoded according to the Main Profile, High Level
standard have a maximum data rate of 1,152*1,920*30 or 66,355,200 pixels
per second. This data rate is more than five times the data rate of image
data encoded according to the Main Profile Main Level standard. The
standard proposed for HDTV encoding in the United States is a subset of
this standard, having as many as 1,080 lines per frame, 1,920 pixels per
line and a maximum frame rate, for this frame size, of 30 frames per
second. The maximum data rate for this proposed standard is still far
greater than the maximum data rate for the Main Profile, Main Level
standard.
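
The pixel-rate figures quoted above can be checked with a few lines of arithmetic. The sketch below uses only the frame sizes and the 30 frames per second rate stated in the text; the function and variable names are illustrative.

```python
def pixel_rate(pixels_per_line, lines_per_frame, frames_per_second):
    """Return the uncompressed pixel rate in pixels per second."""
    return pixels_per_line * lines_per_frame * frames_per_second


# Frame sizes as quoted in the text, all at 30 frames per second.
main_level = pixel_rate(720, 567, 30)    # Main Profile, Main Level
high_level = pixel_rate(1920, 1152, 30)  # Main Profile, High Level
us_hdtv = pixel_rate(1920, 1080, 30)     # proposed U.S. HDTV subset

print(main_level, high_level, us_hdtv)
print(round(high_level / main_level, 2), round(us_hdtv / main_level, 2))
```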
The MPEG-2 standard defines a complex syntax which contains a mixture of
data and control information. Some of this control information is used to
enable signals having several different formats to be covered by the
standard. These formats define images having differing numbers of picture
elements (pixels) per line, differing numbers of lines per frame or field
and differing numbers of frames or fields per second. In addition, the
basic syntax of the MPEG-2 Main Profile defines the compressed MPEG-2 bit
stream representing a sequence of images in six layers, the sequence
layer, the group of pictures layer, the picture layer, the slice layer,
the macroblock layer, and the block layer. Each of these layers is
introduced with control information. Finally, other control information,
also known as side information (e.g., frame type, macroblock pattern,
image motion vectors, coefficient zig-zag patterns and dequantization
information), is interspersed throughout the coded bit stream.
Down-conversion of high resolution Main Profile, High Level pictures to
Main Profile, Main Level pictures, or other lower resolution picture
formats, has gained increased importance for reducing implementation costs
of HDTV. Down conversion allows replacement of expensive high definition
monitors used with Main Profile, High Level encoded pictures with
inexpensive existing monitors which have a lower picture resolution to
support, for example, Main Profile, Main Level encoded pictures, such as
NTSC or 525 progressive monitors. Down conversion converts a high
definition input picture into a lower resolution picture for display on the
lower resolution monitor.
To effectively receive the digital images, a decoder should process the
video signal information rapidly. To be optimally effective, the coding
systems should be relatively inexpensive and yet have sufficient power to
decode these digital signals in real time.
One method of down conversion of the prior art simply low pass filters and
decimates the decoded high resolution, Main Profile, High Level picture to
form an image suitable for display on a conventional television receiver.
Consequently, using existing techniques, a decoder employing
down-conversion may be implemented using a single processor having a
complex design, considerable memory, and operating on the spatial domain
image at a high data rate to perform this function. The high resolution
and high data rate, however, require very expensive circuitry, which
would be contrary to the implementation of a decoder in a consumer
television receiver in which cost is a major factor.
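
As a rough illustration of the prior-art approach described above, the following sketch lowpass filters one line of already decoded pixels in the spatial domain and then decimates it. The 3-tap kernel, the edge handling by pixel replication and the 2:1 ratio are assumptions made for the example; the text does not specify them.

```python
def lowpass_and_decimate(line, ratio=2, taps=(0.25, 0.5, 0.25)):
    """Filter one line of decoded pixels, then keep every `ratio`-th sample.

    `taps` is a hypothetical symmetric lowpass kernel.
    """
    half = len(taps) // 2
    last = len(line) - 1
    filtered = []
    for i in range(len(line)):
        acc = 0.0
        for k, tap in enumerate(taps):
            # Clamp the index so edge pixels are replicated past the borders.
            j = min(max(i + k - half, 0), last)
            acc += tap * line[j]
        filtered.append(acc)
    return filtered[::ratio]


# A full-resolution line of 1,920 pixels becomes 960 pixels at 2:1.
low_res_line = lowpass_and_decimate([128] * 1920)
```

Every full-resolution pixel is decoded and filtered before any are discarded, which is why this approach operates on the spatial domain image at the high data rate.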
SUMMARY OF THE INVENTION
An apparatus for forming a decimated video signal receives an encoded video
signal representing a video image, the encoded video signal being a
frequency-domain transformed video signal. The apparatus includes means
for providing the encoded video signal as a plurality of high resolution
frequency-domain video coefficient values. The apparatus further includes
down-conversion filter means for receiving and weighting selected ones of
the plurality of high resolution frequency-domain video coefficient values
to form a set of filtered frequency-domain video coefficients; and
inverse-transform means for receiving and transforming the filtered
frequency-domain video coefficients into a set of low resolution pixel
sample values. The apparatus also includes a decimating processor for
receiving and retaining selected ones of the set of low resolution pixel
sample values to provide the decimated video signal.
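
The sequence summarized above (weight the frequency-domain coefficients, inverse transform, then retain selected pixels) can be sketched in one dimension as follows. This is a minimal sketch under stated assumptions, not the patented implementation: it assumes an 8-point orthonormal DCT, a hypothetical weight set that simply masks the upper four coefficients, and 2:1 decimation. The `dct` helper exists only to produce example input, since a decoder would receive the coefficients from the encoded signal.

```python
import math

N = 8  # assumed block length along one dimension


def dct(x):
    """Orthonormal 1-D DCT-II, used here only to build example input."""
    out = []
    for k in range(N):
        scale = math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
        out.append(scale * sum(
            x[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
            for n in range(N)))
    return out


def idct(X):
    """Inverse of dct(): orthonormal 1-D DCT-III."""
    out = []
    for n in range(N):
        s = math.sqrt(1 / N) * X[0]
        s += sum(
            math.sqrt(2 / N) * X[k]
            * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
            for k in range(1, N))
        out.append(s)
    return out


def down_convert(coeffs, weights, ratio):
    """Weight the DCT coefficients, inverse transform, then decimate."""
    filtered = [w * c for w, c in zip(weights, coeffs)]
    pixels = idct(filtered)
    return pixels[::ratio]  # retain every `ratio`-th pixel


weights = [1, 1, 1, 1, 0, 0, 0, 0]      # hypothetical lowpass mask
row = [10, 12, 14, 16, 18, 20, 22, 24]  # example pixel row
low_res = down_convert(dct(row), weights, 2)
```

In this sketch the lowpass step costs one multiplication per coefficient, and it is applied before the inverse transform rather than to the decoded pixels.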
BRIEF DESCRIPTION OF THE DRAWINGS
These and other features and advantages of the present invention will
become apparent from the following detailed description, taken in
conjunction with the accompanying drawings, wherein:
FIG. 1 is a high level block diagram of a video decoding system of the
prior art.
FIG. 2A is a high level block diagram of the down conversion system of one
exemplary embodiment of the present invention.
FIG. 2B is a high level block diagram of the down conversion system of a
second exemplary embodiment of the present invention employing an
inexpensive horizontal and vertical filtering implementation.
FIG. 3A illustrates subpixel positions and corresponding predicted pixels
for the 3:1 and 2:1 exemplary embodiments of the present invention.
FIG. 3B shows the upsampling process which is performed for each row of an
input macroblock for an exemplary embodiment of the present invention.
FIG. 4 illustrates the multiplication pairs for the first and second output
pixel values of an exemplary embodiment of a block mirror filter.
FIG. 5 illustrates an exemplary implementation of the filter for
down-conversion for a two-dimensional system processing the horizontal and
vertical components implemented as cascaded one-dimensional IDCTs.
FIG. 6A shows the input and decimated output pixels for a 4:2:0 video signal
using 3:1 decimation.
FIG. 6B shows the input and decimated output pixels for a 4:2:0 video signal
using 2:1 decimation.
FIG. 7A is a high level block diagram illustrating a vertical programmable
filter of one embodiment of the present invention.
FIG. 7B illustrates the spatial relationships between coefficients and
pixel sample space of lines of the vertical programmable filter of FIG.
7A.
FIG. 8A is a high level block diagram illustrating a horizontal
programmable filter of one embodiment of the present invention.
FIG. 8B illustrates spatial relationships between