Description
CROSS-REFERENCE TO RELATED APPLICATION
This application is related to four other applications filed on this date.
Their titles are:
1. A High Definition Television Coding Arrangement With Graceful
Degradation
2. An Adaptive Leak HDTV Encoder
3. HDTV Encoder with Forward Estimation and Constant Rate Motion Vectors
4. HDTV Receiver
BACKGROUND OF THE INVENTION
This invention relates to encoding of signals, and more specifically, to
encoding of signals with a controlled quantizer.
When image signals are digitized and linearly quantized, a transmission
rate of about 100 Mbits per second is necessary for images derived from
standard TV signals. For HDTV, with its greater image size and greater
resolution, a much higher transmission rate would be required. When
terrestrial transmission is desired, and when the option of allocating
greater bandwidth per channel is unavailable, it is necessary to compress
the HDTV signal within the allocated bandwidth. Having an allocated
bandwidth is tantamount to having a certain number of information bits
that can be transmitted in each selected time interval, such as an image
frame. All of the compression techniques of HDTV signals must of necessity
consider this bit budget.
In order to reduce the bit rate, or to fit the encoded signal of an image
frame within the allocated bit budget, various coding schemes have been
studied. One is the so-called "differential pulse code modulation" (DPCM)
coding approach. By this method, the value of a particular pixel at any
moment is predicted on the basis of values of pixels which have been
already coded. The necessary number of bits is reduced by coding the
difference (error) between the predicted value and the value of the
particular pixel at that moment. According to this differential pulse code
modulation, it is possible to reduce the number of bits per pixel by
approximately half.
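The DPCM principle described above can be sketched as follows. This is a minimal illustration only: the predictor (the previously reconstructed pixel on the same line), the uniform quantizer, and the `step` parameter are assumptions chosen for brevity and are not taken from this application.

```python
def dpcm_encode(pixels, step=4):
    """Encode one line of pixels as quantized prediction errors.

    Assumptions (illustrative only): the predictor is the previously
    reconstructed pixel, and the quantizer is uniform with size `step`.
    """
    codes = []
    prediction = 0  # assumed starting prediction for the first pixel
    for value in pixels:
        error = value - prediction             # prediction error
        code = round(error / step)             # quantized prediction error
        codes.append(code)
        prediction = prediction + code * step  # mirror the decoder's reconstruction
    return codes


def dpcm_decode(codes, step=4):
    """Rebuild the pixel line from the quantized prediction errors."""
    pixels = []
    prediction = 0
    for code in codes:
        prediction = prediction + code * step
        pixels.append(prediction)
    return pixels
```

Because the encoder predicts from its own reconstruction rather than from the original pixels, the quantization error does not accumulate along the line; each reconstructed pixel stays within half a step of the original.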
Still, that number is much larger than can be accepted for terrestrial
transmission, so a second look is typically taken at the quantizer itself.
Clearly, if the step size of the quantizer is made large enough, the
number of bits generated can be reduced to an acceptable level. Alas, the
quantization error resulting therefrom would yield a picture that is far
from acceptable.
Experiments show, and it is quite logical, that the effect of quantization
error is different for different types of picture and, expectedly,
artisans have tried to tailor quantizers to the pictures being encoded.
U.S. Pat. No. 4,802,232 issued Jan. 31, 1989, for example, describes one
such attempt. In accordance with the described approach the picture to be
encoded is divided into regions and each region of the input image is
classified into one of a preset number of classes. The signal of each
class is quantized in a specified manner, and the manner of quantization
of the different classes, of course, differs. The limitation of this
approach is the hard decision boundaries between the classes.
Nill in "A Visual Model Weighted Cosine Transform for Image Compression and
Quality Assessment", IEEE Transactions on Communications, Vol. COM-33, No.
6, June, 1985, pg. 551-557 and Saghri et al. "Image quality measure based
on a human visual system model", Optical Engineering, Vol. 28, No. 7, July
1989, pg. 813-818, describe very similar approaches. These approaches
utilize fixed global thresholds for each frequency band, where the
thresholds are set based on a model of the human visual system's (HVS)
modulation transfer function (MTF), and the transform utilized. The
limitation of this approach is that the quantization cannot adapt to local
characteristics of the image.
In "Adaptive Quantization of Picture Signals using Spatial Masking",
Proceedings of IEEE, Vol. 65, April 1977, pg. 536-548, Netravali et al.
describe a means of designing non-uniform quantizers to incorporate
spatial masking and brightness correction for predictive coding systems.
The limitation of their approach, again, is that the quantization cannot
adapt to local characteristics of the image.
In "Design of Statistically Based Buffer Control Policies for Compressed
Digital Video", Zdepski et al., an IEEE conference, 1989, pg. 1343-1349,
describe an approach where the quantizer in the DPCM loop interacts with
an adaptive mode control circuit. The circuit measures the number of bits
generated by the quantizer and, based on preselected thresholds, decides
for the next frame on one of eight possible quantizer step sizes. The
selected step size is employed for the next frame. A similar approach is
described in "Digital Pictures" by A. N. Netravali and B. G. Haskell,
Plenum Press, 1988, pg. 537 et seq.
The deficiency of these methods is that the quantization steps are altered
in discrete jumps and that no account is taken of a global distortion
target.
It is an object of this invention, therefore, to develop means for
controlling the quantizer so that the number of bits generated per frame
is, on the average, within the allocated bit budget.
It is another object of this invention to control the quantizer in a manner
that is sensitive to the characteristics of the input signal to which
human viewers are sensitive.
It is still another object of this invention to control the quantizer in a
manner that spreads as evenly as possible the unavoidable quantization
noise.
SUMMARY OF THE INVENTION
These and other objects are achieved with a quantizer control mechanism
that is responsive to both the input signal and the fullness of the output
buffer. More specifically, the input image is divided into blocks and the
signal of each block is DCT transformed. The transformed signal is
analyzed to develop a brightness correction and to evaluate the texture of
the image and the change in texture in the image. Based on these, and in
concert with the human visual perception model, perception threshold
signals are created for each subband of the transformed signal.
Concurrently, scale factors for each subband of the transformed signal are
computed, and a measure of variability in the transformed input signal is
calculated. A measure of the fullness of the buffer to which the quantizer
sends its encoded results is obtained, and that measure is combined with
the calculated signal variability to develop a correction signal. The
correction signal modifies the perception threshold signals to develop
threshold control signals that are applied to the quantizer. The scale
factors are also applied to the quantizer, as well as a global target
distortion measure.
The quantizer pre-multiplies the input signal by the scale factors and
quantizes the signal in a manner that is sensitive to the threshold
control signals. The quantization itself is performed a number of times by
scanning a codebook that specifies the exact manner of quantizing of each
of the subbands of the transformed signal. Accounting for the global
target distortion, the best quantization result is selected and sent to
the aforementioned buffer.
BRIEF DESCRIPTION OF THE DRAWING
FIG. 1 presents a block diagram of a forward estimation section of an HDTV
digital encoder;
FIG. 2 presents a block diagram of an encoding loop section of an HDTV
encoder that interacts with the forward estimation section of FIG. 1;
FIG. 3 depicts a hardware organization for a coarse motion vector detector;
FIG. 4 depicts a hardware organization for a fine motion vector detector
that takes into account the output of the coarse motion vector detector;
FIG. 5 illustrates the spatial relationship of what is considered a "slice"
of image data;
FIG. 6 shows one way for selecting a mix of motion vectors to fit within a
given bit budget;
FIG. 7 presents one embodiment for evaluating a leak factor, .alpha.;
FIG. 8 illustrates the arrangement of a superblock that is quantized in QVS
block 38 of FIG. 2;
FIG. 9 depicts how a set of selection error signals is calculated in
preparation for codebook vector selection;
FIG. 10 presents a block diagram of QVS block 38;
FIG. 11 is a block diagram of inverse quantization block 39;
FIG. 12 presents the structure of perceptual coder 49;
FIG. 13 illustrates the structure of perceptual processor 93;
FIG. 14 is a block diagram of texture processors 96 and 98;
FIG. 15 presents a block diagram of a digital HDTV receiver;
FIG. 16 presents a modified forward estimation block that chooses a leak
factor from a set of two fixed leak factors; and
FIG. 17 presents a frame buffer circuit that includes a measure of temporal
filtering.
DETAILED DESCRIPTION
The motion estimation principles of this invention start with the
proposition that for a given bit budget, a much better overall prediction
error level can be attained by employing blocks of variable sizes.
Starting with large sized blocks that handle large image sections for
which a good translation vector can be found, the motion vector selection
in accordance with this invention then replaces some of the motion vectors
for the large blocks with, or adds to them, motion vectors for the small
blocks.
While these principles of the invention can be put to good use in the
prediction of other types of signals, they are clearly quite useful
in connection with the encoding of HDTV signals. However, in the HDTV
application there is simply not enough time within the encoding loop to
compute the best motion vectors and to then select the best mix of motion
vectors. Accordingly, a forward estimation section is incorporated in the
encoder, and that section creates as many of the signals used within the
encoder as possible. The input image signal is, of course,
appropriately delayed so that the need for the signals that are computed
within the forward estimation section does not precede their creation.
In order to appreciate the improvement that the motion vector selection and
encoding processes of this invention provide, the following
describes the entire coder section of an HDTV transmitter. The principles
of this invention are, of course, primarily described in connection with
the motion vector estimation and selection circuits depicted in FIG. 1.
Quantization of the signals is one of the most important tasks that the
HDTV transmitter must perform. The quantizer, however, is but a part of
the entire encoding loop. Moreover, the encoding loop, generally, and the
quantizer, in particular, require a number of signals that need a
substantial period of time for computation. Fortunately, it is possible to
compute these signals in a forward estimation section, which is "off line"
as far as the encoding loop is concerned.
To better understand the function and workings of the quantizer of this
invention, the following describes the entire coder section of an HDTV
transmitter, i.e., both the forward estimation section and the encoding
loop. The detailed principles of this invention are primarily described in
connection with the perceptual encoder circuit, the quantizer circuit, and
the associated circuits.
In FIG. 1, the input signal is applied at line 10. It is a digitized video
signal that arrives in sequences of image frames. This input signal is
applied to frame-mean processor 11, to buffer 12, and to motion vector
generator 13. The output of buffer 12 is also applied to motion vector
generator block 13. Frame-mean processor 11 develops the mean value of
each incoming frame. That value is delayed in buffers 24 and 25, and
applied to a number of elements within FIG. 1, as described below. It is
also sent to the encoding loop of FIG. 2 through buffer 52. Motion vector
generator 13 develops motion vectors which are applied to motion vector
selector/encoder 14 and, thereafter, through buffers 15 and 32, wherefrom
the encoded motion vectors are sent to the encoding loop of FIG. 2. The
unencoded output of motion vector selector/encoder 14 is also applied to
motion compensator block 16, and to buffer 17 followed by buffer 50,
wherefrom the unencoded motion vectors are sent to the encoding loop of
FIG. 2.
The output of buffer 12 is applied to buffer 18 and thereafter to buffers
19 and 51, wherefrom it is sent to the encoding loop of FIG. 2. The output
of buffer 18 is applied to buffer 22 and to leak factor processor 20, and
the output of buffer 19 is applied to motion compensator circuit 16. The
output of motion compensator 16 is applied to buffer 21 and to leak factor
processor 20.
The frame-mean signal of buffer 25 is subtracted from the output of buffer
21 in subtracter 26 and from the output of buffer 22 in subtracter 27. The
outputs of subtracter 26 and leak processor 20 are applied to multiplier
23, and the output of multiplier 23 is applied to subtracter 28. The
output of leak processor 20 is also sent to the encoding loop of FIG. 2
via buffer 31. Element 28 subtracts the output of multiplier 23 from the
output of subtracter 27 and applies the result to DCT transform circuit
30. The output of transform circuit 30 is applied to processor 53 which
computes scale factors S.sub.ij and signal standard deviation .sigma. and
sends its results to FIG. 2. The output of subtracter 27 is applied to DCT
transform circuit 29, and the output of DCT circuit 29 is sent to the
encoding loop of FIG. 2.
To get a sense of the timing relationship between the various elements in
FIG. 1, it is useful to set a benchmark, such as by asserting that the
input at line 10 corresponds to the image signal of frame t; i.e., that
the input signal at line 10 is frame I(t). All of the buffers in FIG. 1
store and delay one frame's worth of data. Hence, the output of buffer 12
is I(t-1), the output of buffer 18 is I(t-2), the output of buffer 19 is
I(t-3), and the output of buffer 51 is I(t-4).
Motion vector generator 13 develops motion vectors MV(t) that (elsewhere in
the encoder circuit and in the decoder circuit) assist in generating an
approximation of frame I(t) based on information of frame I(t-1). It takes
some time for the motion vectors to be developed (an internal delay is
included to make the delay within generator 13 equal to one frame delay).
Thus, the output of generator 13 (after processing delay) corresponds to a
set of motion vectors MV(t-1). Not all of the motion vectors that are
created in motion vector generator 13 are actually used, so the output of
generator 13 is applied to motion vector selector/encoder 14 where a
selection process takes place. Since the selection process also takes
time, the outputs of selector/encoder 14 are MV(t-2) and the CODED MV(t-2)
signals, which are the motion vectors, and their coded representations,
that assist in generating an approximation of frame I(t-2) based on
information of frame I(t-3). Such an I(t-2) signal is indeed generated in
motion compensator 16, which takes the I(t-3) signal of buffer 19 and the
motion vectors of selector/encoder 14 and develops therefrom a displaced
frame signal DF(t-2) that approximates the signal I(t-2). Buffers 17 and
50 develop MV(t-4) signals, while buffers 15 and 32 develop the CODED
MV(t-4) signals.
As indicated above, processor 11 develops a frame-mean signal. Since the
mean signal cannot be known until the frame terminates, the output of
processor 11 relates to frame t-1. Stated differently, the output of
processor 11 is M(t-1) and the output of buffer 25 is M(t-3).
Leak factor processor 20 receives signals I(t-2) and DF(t-2). It also takes
time to perform its function (an internal delay is included to ensure
that it has a delay of exactly one frame), hence the output signal of
processor 20 corresponds to the leak factor of frame (t-3). The output of
processor 20 is, therefore, designated L(t-3). That output is delayed in
buffer 31, causing L(t-4) to be sent to the encoding loop.
Lastly, the processes within elements 26-30 are relatively quick, so the
transformed image (I.sub.T) and displaced frame difference (DFD.sub.T)
outputs of elements 29 and 30 correspond to frame I.sub.T (t-3) and
DFD.sub.T (t-3), respectively, and the output of processor 53 corresponds
to S.sub.ij (t-4) and .sigma.(t-4).
FIG. 2 contains the encoding loop that utilizes the signals developed in
the forward estimation section of FIG. 1. The loop itself comprises
elements 36, 37, 38, 39, 40, 41, 54, 42, 43, 44 and 45. The image signal
I(t-4) is applied to subtracter 36 after the frame-mean signal M(t-4) is
subtracted from it in subtracter 35. The signal developed by subtracter 36
is the difference between the image I(t-4) and the best estimation of
image I(t-4) that is obtained from the previous frame's data contained in
the encoding loop (with the previous frame's frame-mean excluded via
subtracter 44, and with a leak factor that is introduced via multiplier
45). That frame difference is applied to DCT transform circuit 37 which
develops 2-dimensional transform domain information about the frame
difference signal of subtracter 36. That information is encoded into
vectors within quantizer-and-vector-selector (QVS) 38 and forwarded to
encoders 46 and 47. The encoding carried out in QVS 38 and applied to
encoder 47 is reversed to the extent possible within inverse quantizer 39
and applied to inverse DCT circuit 40.
The output of inverse DCT circuit 40 approximates the output of subtracter
36. However, it does not quite match the signal of subtracter 36 because only
a portion of the encoded signal is applied to element 39 and because it is
corrupted by the loss of information in the encoding process of element
38. There is also a delay in passing through elements 37, 38, 39, and 40.
That delay is matched by the delay provided by buffer 48 before the
outputs of buffer 48 and inverse DCT transform circuit 40 are combined in
adder 41 and applied to adder 54. Adder 54 adds the frame-mean signal
M(t-4) and applies the results to buffer 42. Buffer 42 complements the
delay provided by buffer 48 less the delay in elements 43, 44 and 45 (to
form a full one frame delay) and delivers it to motion compensator 43.
Motion compensator 43 is responsive to the motion vectors MV(t-4). It
produces an estimate of the image signal I(t-4), based on the
approximation of I(t-5) offered by buffer 42. As stated before, that
approximation is diminished by the frame-mean of the previous frame,
M(t-5), through the action of subtracter 44. The previous frame's
frame-mean is derived from buffer 55 which is fed by M(t-4). The results
of subtracter 44 are applied to multiplier 45 which multiplies the output
of subtracter 44 by the leak factor L(t-4). The multiplication results
form the signal to the negative input of subtracter 36.
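The signal path just described can be summarized in a single expression. The symbols D (the output of subtracter 36), \hat{I} (the approximation held in buffer 42) and MC (the action of motion compensator 43) are shorthand introduced here for illustration; they do not appear in the figures.

```latex
D(t-4) \;=\; \bigl[\, I(t-4) - M(t-4) \,\bigr]
\;-\; L(t-4)\,\Bigl[\, \mathrm{MC}_{MV(t-4)}\bigl(\hat{I}(t-5)\bigr) - M(t-5) \,\Bigr]
```

The first bracket is the output of subtracter 35, the inner difference of the second bracket is the output of subtracter 44, and the product with L(t-4) is the output of multiplier 45.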
It may be noted in passing that the action of motion compensator 43 is
linear. Therefore, when the action of buffer 42 is also linear--which
means that it does not truncate its incoming signals--then adder 54 and
subtracter 44 (and buffer 55) are completely superfluous. They are used
only when buffer 42 truncates its incoming signal to save on the required
storage.
In connection with buffer 42, another improvement is possible. When the
processing within elements 36, 37, 38, 39, and 40 and the corresponding
delay of buffer 48 are less than the vertical frame retrace interval, the
output of buffer 42 can be synchronized with its input, in the sense that
pixels of a frame enter the buffer at the same time that corresponding
pixels of the previous frame exit the buffer. Temporal filtering can then
be accomplished at this point by replacing buffer 42 with a buffer circuit
42 as shown in FIG. 17. In buffer circuit 42, the incoming pixel is
compared to the outgoing pixel. When their difference is larger than a
certain threshold, the storage element within circuit 42 is loaded with
the average of the two compared pixels. Otherwise, the storage element
within circuit 42 is loaded with the incoming pixel only.
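The per-pixel rule stated above can be sketched as follows. The function name, the integer averaging, and the threshold value used in any call are assumptions made for this illustration; the text does not specify them.

```python
def temporal_filter_update(incoming, outgoing, threshold):
    """Return the value loaded into the storage element for one pixel.

    Follows the rule stated in the text. Integer averaging and the
    choice of `threshold` are assumptions of this sketch.
    """
    if abs(incoming - outgoing) > threshold:
        return (incoming + outgoing) // 2  # average of the two compared pixels
    return incoming                        # otherwise the incoming pixel only
```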
QVS 38 is also responsive to perceptual coder 49 and to S.sub.ij (t-4).
That coder is responsive to signals I.sub.T (t-4) and .sigma.(t-4).
Signals S.sub.ij (t-4) are also sent to inverse quantization circuit 39
and to buffer fullness and formatter (BFF) 56. BFF block 56 also receives
information from encoders 46 and 47, the leak signal L(t-4) and the CODED
MV(t-4) information from buffer 32 in FIG. 1. BFF block 56 sends fullness
information to perceptual coder 49 and all of its received information to
subsequent circuitry, where the signals are amplified, appropriately
modulated and, for terrestrial transmission, applied to a transmitting
antenna.
BFF block 56 serves two closely related functions. It packs the information
developed in the encoders by applying the appropriate error correction
codes and arranging the information, and it feeds information to
perceptual coder 49, to inform it of the level of output buffer fullness.
The latter information is employed in perceptual coder 49 to control QVS
38 and inverse quantizer 39 and, consequently, the bit rate of the next
frame.
The general description above provides a fairly detailed exposition of the
encoder within the HDTV transmitter. The descriptions below delve in
greater detail into each of the various circuits included in FIGS. 1 and
2.
FRAME-MEAN CIRCUIT 11
The mean, or average, signal within a frame is obtainable with a simple
accumulator that merely adds the values of all pixels in the frame and
divides the sum by a fixed number. When the number of pixels added is a
power of two, the division reduces to a bit shift, which is the easiest
implementation, but division by any other number is also possible with
some very simple and conventional hardware (e.g., a look-up table).
Because of this simplicity, no further description is
offered herein of circuit 11.
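For illustration only, the accumulator and the shift-based division might be sketched as below. The representation of a frame as a list of rows of integer pixel values is an assumption of this sketch, as is the restriction to pixel counts that are a power of two.

```python
def frame_mean(frame):
    """Mean pixel value of a frame whose pixel count is a power of two.

    `frame` is assumed to be a list of rows of integer pixel values.
    """
    count = sum(len(row) for row in frame)
    if count == 0 or count & (count - 1):
        raise ValueError("pixel count must be a power of two for shift division")
    total = sum(sum(row) for row in frame)  # the accumulator
    shift = count.bit_length() - 1          # log2 of the pixel count
    return total >> shift                   # division performed as a right shift
```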
MOTION VECTOR GENERATOR 13
The motion vector generator compares the two sequential images I(t) and
I(t-1), with an eye towards detecting regions, or blocks, in the current
image frame, I(t), that closely match regions, or blocks, in the previous
image frame, I(t-1). The goal is to generate relative displacement
information that permits the creation of an approximation of the current
image frame from a combination of the displacement information and the
previous image frame.
More specifically, the current frame is divided into n.times.n pixel
blocks, and a search is conducted for each block in the current frame to
find an n.times.n block in the previous frame that matches the current
frame block as closely as possible.
If one wishes to perform an exhaustive search for the best displacement of
an n.times.n pixel block in a neighborhood of a K.times.K pixel array, one
has to test all of the possible displacements, of which there are
(K-n).times.(K-n). For each of those displacements one has to determine
the magnitude of the difference (e.g., in absolute, RMS, or square sense)
between the n.times.n pixel array in the current frame and the n.times.n
portion of the K.times.K pixel array in the previous frame that
corresponds to the selected displacement. The displacement that
corresponds to the smallest difference is the preferred displacement, and
that is what we call the motion vector.
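The exhaustive search just described can be sketched as follows, using the absolute-difference measure. The function names, the frame layout (a list of rows), the sign convention for the displacement (r, s), and the choice to skip displacements that fall outside the previous frame are all assumptions of this sketch.

```python
def sad(cur, prev, bx, by, n, r, s):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `prev` displaced by (r, s)."""
    return sum(
        abs(cur[by + j][bx + i] - prev[by + j + s][bx + i + r])
        for j in range(n)
        for i in range(n)
    )


def exhaustive_search(cur, prev, bx, by, n, max_disp):
    """Test every displacement within +/- max_disp that keeps the block
    inside `prev`; return (smallest_error, (r, s))."""
    height, width = len(prev), len(prev[0])
    best = None
    for s in range(-max_disp, max_disp + 1):
        for r in range(-max_disp, max_disp + 1):
            if not (0 <= bx + r and bx + r + n <= width):
                continue  # displaced block leaves the frame horizontally
            if not (0 <= by + s and by + s + n <= height):
                continue  # displaced block leaves the frame vertically
            err = sad(cur, prev, bx, by, n, r, s)
            if best is None or err < best[0]:
                best = (err, (r, s))
    return best
```

Every candidate displacement costs n x n absolute differences, which is why the number of displacements tested dominates the cost of the search.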
One important issue in connection with a hardware embodiment of the
above-described search process is the sheer volume of calculations that
needs to be performed in order to find the absolutely optimum motion
vector. For instance, if the image were subdivided into blocks of
8.times.8 pixels and the image contains 1024.times.1024 pixels, then the
total number of blocks that need to be matched would be 2.sup.14. If an
exhaustive search over the entire image were to be performed in
determining the best match, then the number of searches for each block
would be approximately 2.sup.20. The total count (for all the blocks)
would then be approximately 2.sup.34 searches. This "astronomical" number
is just too many searches!
One approach for limiting the required number of searches is to limit the
neighborhood of the block whose motion vector is sought. In addition to
the direct reduction in the number of searches that must be undertaken,
this approach has the additional benefit that a more restricted
neighborhood limits the number of bits that are required to describe the
motion vectors (smaller range), and that reduces the transmission burden.
With those reasons in mind, we limit the search neighborhood in both the
horizontal and vertical directions to .+-.32 positions. That means, for
example, that when a 32.times.16 pixel block is considered, then the
neighborhood of search is 96.times.80 pixels, and the number of searches
for each block is approximately 2.sup.12 (compared to 2.sup.20).
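The counts quoted in the two preceding paragraphs can be checked with a few lines of arithmetic. The function below is only a worked example; its name and parameters are introduced here for illustration.

```python
def search_counts(image_size=1024, block=8, max_disp=32):
    """Return (blocks, exhaustive_per_block, exhaustive_total, limited_per_block)."""
    blocks = (image_size // block) ** 2        # 128 * 128 = 2**14 blocks
    exhaustive_per_block = image_size ** 2     # roughly one test per pixel position
    exhaustive_total = blocks * exhaustive_per_block
    limited_per_block = (2 * max_disp) ** 2    # 64 * 64, the (K-n) x (K-n) count
    return blocks, exhaustive_per_block, exhaustive_total, limited_per_block
```

With the default arguments the four values are 2^14, 2^20, 2^34 and 2^12, matching the figures in the text.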
As indicated above, the prediction error can be based on a sum of squares
of differences, but it is substantially simpler to deal with absolute
values of differences. Accordingly, the motion vector generator herein
compares blocks of pixels in the current frame with those in the previous
frame by forming prediction error signals that correspond to the sum over
the block of the absolute differences between the pixels.
To further reduce the complexity and size of the search, a two-stage
hierarchical motion estimation approach is used. In the first stage, the
motion is estimated coarsely, and in the second stage the coarse
estimation is refined. Matching in a coarse manner is achieved in the
first stage by reducing the resolution of the image by a factor of 2 in
both the horizontal and the vertical directions. This reduces the search
area by a factor of 4, yielding only 2.sup.12 blocks in a 1024.times.1024
image array. The motion vectors generated in the first stage are then
passed to the second stage, where a search is performed in the
neighborhood of the coarse displacement found in the first stage.
FIG. 3 depicts the structure of the first (coarse) stage in the motion
vector generator. In FIG. 3 the input signal is applied to a
two-dimensional, 8 pixel by 8 pixel low-pass filter 61. Filter 61
eliminates frequencies higher than half the sampling rate of the incoming
data. Subsampler 62 follows filter 61. It subsamples its input signal by a
2:1 factor. The action of filter 61 insures that no aliasing results from
the subsampling action of element 62 since it eliminates signals above the
Nyquist rate for the subsampler. The output of subsampler 62 is an image
signal with half as many pixels in each line of the image, and half as
many lines in the image. This corresponds to a four-fold reduction in
resolution, as discussed above.
In FIG. 1, motion vector generator 13 is shown to be responsive to the I(t)
signal at input line 10 and to the I(t-1) signal at the output of buffer
12. This was done for expository purposes only, to make the operation of
motion vector generator 13 clearly understandable in the context of the FIG. 1
description. Actually, it is advantageous to have motion vector generator
13 be responsive solely to I(t), as far as the connectivity of FIG. 1 is
concerned, and have the delay of buffer 12 be integrated within the
circuitry of motion vector generator 13.
Consonant with this idea, FIG. 3 includes a frame memory 63 which is
responsive to the output of subsampler 62. The subsampled I(t) signal at
the input of frame memory 63 and the subsampled I(t-1) signal at the
output of frame memory 63 are applied to motion estimator 64.
The control of memory 63 is fairly simple. Data enters motion estimator
block 64 in sequence, one line at a time. With every sixteen lines of the
subsampled I(t), memory 63 must supply to motion estimator block 64
sixteen lines of the subsampled I(t-1), offset forward by sixteen
lines. The 32 other (previous) lines of the subsampled I(t-1) that are
needed by block 64 are already in block 64 from the previous two sets of
sixteen lines of the subsampled I(t) signal that were applied to motion
estimator block 64.
Motion estimator 64 develops a plurality of prediction error signals for
each block in the image. The plurality of prediction error signals is
applied to best-match calculator 65 which identifies the smallest
prediction error signal. The displacement corresponding to that prediction
error signal is selected as the motion vector of the block.
Expressed in more mathematical terms, if a block of width w and height h in
the current frame is denoted by b(x,y,t), where t is the current
frame and x and y are the north-west corner coordinates of the block, then
the prediction error may be defined as the sum of absolute differences of
the pixel values:
##EQU1##
where r and s are the displacements in the x and y directions,
respectively.
The motion vector that gives the best match is the displacement (r,s) that
gives the minimum prediction error.
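The equation itself is not reproduced in this text; only its placeholder survives. One form that is consistent with the definitions given above is sketched below. The symbol PE, the pixel notation p(x,y,t), and the sign convention for the displacement are assumptions made here and may differ from the original equation.

```latex
PE(x,y,r,s) \;=\; \sum_{i=0}^{w-1} \sum_{j=0}^{h-1}
\bigl|\, p(x+i,\; y+j,\; t) \;-\; p(x+i+r,\; y+j+s,\; t-1) \,\bigr|,
\qquad
(\hat{r},\hat{s}) \;=\; \arg\min_{(r,s)} PE(x,y,r,s)
```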
The selection of the motion vector is performed in calculator 65. In cases
where there are a number of vectors that have the same minimum error,
calculator 65 selects the motion vector (displacement) with the smallest
magnitude. For this selection purpose, magnitudes are defined in
calculator 65 as the sum of the magnitudes of the horizontal and vertical
displacement, i.e., .vertline.r.vertline.+.vertline.s.vertline..
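The selection rule of calculator 65, including the tie-break on the magnitude of the displacement, can be sketched as follows. The representation of the candidates as (error, (r, s)) pairs is an assumption of this sketch.

```python
def select_motion_vector(candidates):
    """Pick the displacement with the smallest prediction error; among
    equal errors, prefer the smallest |r| + |s|.

    `candidates` is an iterable of (error, (r, s)) pairs, a data layout
    assumed for this sketch.
    """
    best = min(candidates, key=lambda c: (c[0], abs(c[1][0]) + abs(c[1][1])))
    return best[1]
```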
In the second stage of motion vector generator 13, a refined determination
is made as to the best displacement value that can be selected, within the
neighborhood of the displacement selected in the first stage. The second
stage differs from the first stage in three ways. First, it performs a
search that is directed to a particular neighborhood. Second, it evaluates
prediction error values for 8.times.8 blocks and a 4.times.2 array of
8.times.8 blocks (which in effect is a 32.times.16 block). And third, it
interpolates the end result to 1/2 pixel accuracy.
FIG. 4 presents a general block diagram of the second stage of generator
13. As in FIG. 3, the input signal is applied to frame memory 66. The
input and the output of memory 66 are applied to motion estimator 67, and
the output of motion estimator 67 is applied to best match calculator 68.
Estimator 67 is also responsive to the coarse motion vector estimation
developed in the first stage of generator 13, whereby the estimator is
caused to estimate motion in the neighborhood of the motion vector
selected in the first stage of generator 13.
Calculator 68 develops output sets with 10 signals in each set. It develops
eight 8.times.8 block motion vectors, one 32.times.16 motion vector that
encompasses the image area covered by the eight 8.times.8 blocks, and a
measure of the improvement in motion specification (i.e., a lower
prediction error) that one would get by employing the eight 8.times.8
motion vectors in place of the associated 32.times.16 motion vector. The
measure of improvement can be developed in any number of ways, but one
simple way is to maintain the prediction errors of the 8.times.8 blocks,
develop a sum of those prediction errors, and subtract the developed sum
from the prediction error of the 32.times.16 motion vector.
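The simple measure of improvement described above amounts to one subtraction. In the sketch below the function name and argument layout are assumptions introduced for illustration.

```python
def improvement_measure(error_32x16, errors_8x8):
    """Prediction-error reduction gained by using the eight 8x8 motion
    vectors in place of the single 32x16 motion vector."""
    if len(errors_8x8) != 8:
        raise ValueError("a 32x16 block contains exactly eight 8x8 blocks")
    return error_32x16 - sum(errors_8x8)
```

A larger value indicates a 32.times.16 block for which the finer 8.times.8 vectors buy more reduction in prediction error.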
The motion vector outputs of calculator 68 are applied in FIG. 4 to half
pixel estimator 69. Half pixel motion is deduced from the changes in the
prediction errors around the region of minimum error. The simple approach
used in estimator 69 is to derive the half pixel motion independently in
the x and y directions by fitting a parabola to the three points around
the minimum, solving the parabola equation, and finding the position of
the parabola's minimum. Since all that is desired is 1/2 pixel accuracy,
this process simplifies to performing the following comparisons:
##EQU2##
where p.sub.x is the prediction error at x, and x' is the deduced half
pixel motion vector.
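The comparisons themselves are not reproduced in this text. The sketch below derives one set of comparisons from the parabola fit described above, assuming the integer-position minimum lies at x; it may differ in form from the original equation. Fitting a parabola through the errors at x-1, x and x+1 places the minimum at an offset of (p(x-1) - p(x+1)) / (2 (p(x-1) - 2 p(x) + p(x+1))), and rounding that offset to the nearest half pixel yields the two tests used in the code. The function and argument names are hypothetical.

```python
def half_pixel_offset(p_minus, p_zero, p_plus):
    """Return -0.5, 0.0 or +0.5, the half-pixel correction along one axis.

    p_minus, p_zero and p_plus are the prediction errors at x-1, x and
    x+1, where x is the integer position of minimum error (so p_zero is
    assumed to be no larger than the other two).
    """
    # Parabola minimum lies more than 1/4 pixel towards x+1.
    if p_minus - p_plus > 2 * (p_plus - p_zero):
        return 0.5
    # Parabola minimum lies more than 1/4 pixel towards x-1.
    if p_plus - p_minus > 2 * (p_minus - p_zero):
        return -0.5
    return 0.0
```

The same function is applied independently to the x and y directions, in keeping with the text.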
The searches in both stages of motion vector generator 13 can extend over
the edges of the image to allow for the improved prediction of an object
entering the frame. The values of the pixels outside the image should be
set to equal the value of the closest known pixel.
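The edge rule stated above, in which pixels outside the image take the value of the closest known pixel, can be sketched by clamping the coordinates. The function name and the frame layout (a list of rows) are assumptions of this sketch.

```python
def pixel_with_edge_extension(frame, x, y):
    """Read the pixel at column x, row y, substituting the closest known
    pixel when the coordinates fall outside the image."""
    height, width = len(frame), len(frame[0])
    cx = min(max(x, 0), width - 1)   # clamp the column into the image
    cy = min(max(y, 0), height - 1)  # clamp the row into the image
    return frame[cy][cx]
```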
The above describes the structure of motion vector generator 13. All of the
computational processes can be carried out with conventional processors.
The processes that can most benefit from special purpose processors are
the motion estimation processes of elements 64 and 67; simply because of
the number of operations called for. These processes, however, can be
realized with special purpose chips from LSI Logic Corporation, which
offers a video motion estimation processor (L64720). A number of these can
be combined to develop a motion estimation for any sized block over any
sized area. This combining of L64720 chips is taught in an LSI Logic
Corporation Application Note titled "L64720 (MEP) Video Motion Estimation
Processor".
MOTION VECTOR SELECTOR/ENCODER 14
The reason for creating the 32.times.16 blocks is rooted in the expectation
that the full set of motion vectors for the 8.times.8 blocks cannot be
encoded in the bit budget allocated for the motion vectors. On the other
hand, sending only 32.times.16 block motion vectors requires 28,672
bits--which results from multiplying the 14 bits per motion vector (7 bits
for the horizontal displacement and 7 bits for the vertical displacement)
by 32 blocks in the horizontal direction and 64 blocks in the vertical
direction. In other words, it is expected that the final set of motion
vectors would be a mix of 8.times.8 block motion vectors and 32.times.16
block motion vectors. It follows, therefore, that a selection must be made
of the final mix of motion vectors that are eventually sent by the HDTV
transmitter, and that selection must fit within a preassigned bit budget.
Since the number of bits that define a motion vector depends on the
efficacy of compression encoding that may be applied to the motion
vectors, it follows that the selection of motion vectors and the
compression of th