WikiPatents - Community Patent Review
Adaptive non-linear quantizer    
United States Patent 5136377
Link to this page: http://www.wikipatents.com/5136377.html
Inventor(s): Johnston; James D. (Warren, NJ); Knauer; Scott C. (Mountainside, NJ); Matthews; Kim N. (Watchung, NJ); Netravali; Arun N. (Westfield, NJ); Petajan; Eric D. (Watchung, NJ); Safranek; Robert J. (New Providence, NJ); Westerink; Peter H. (Newark, NJ)
Abstract: A quantizer, with quantization control that is sensitive to input signal characteristics and to output buffer fullness responds to an input signal that is divided into blocks and DCT transformed. The transformed signal is analyzed to develop a brightness correction and to evaluate the texture of the image and the change in texture in the image. Based on these, and in concert with the human visual perception model, perception threshold signals are created for each subband of the transformed signal. Concurrently, scale factors for each subband of the transformed signal are computed, and a measure of variability in the transformed input signal is calculated. A measure of the fullness of the buffer to which the quantizer sends its encoded results is obtained, and that measure is combined with the calculated signal variability to develop a correction signal. The correction signal modifies the perception threshold signals to develop threshold control signals that are applied to the quantizer. The scale factors are also applied to the quantizer, as well as a global target distortion measure.



Owner/Assignee     AT&T Bell Laboratories (Murray Hill, NJ)
Publication Date     August 4, 1992
Application Number     07/626,279
Filing Date     December 11, 1990
US Classification     375/240.12 348/441 348/469 375/240.05
Int'l Classification     H04N 007/12 H04N 007/01 H04N 007/04
Examiner     Peng; John K.
Attorney/Law Firm     Brendzel; Henry T.
USPTO Field of Search     358/135 358/136 358/140 358/141 358/12
U.S. References

Patent No.   Inventor        U.S. Class    Date
4689672      Furukawa        375/240.12    Aug. 1987
4672441      Hoelzlwimmer    348/400.1     Jun. 1987
4562468      Koga            375/240.14    Dec. 1985
4460923      Hirano          348/413.1     Jul. 1984
4202011      Koga            348/411.1     May 1980
4051530      Kuroda          375/240.12    Sep. 1977
Claims
 


We claim:

1. An encoder including a coder for developing encoder output signals from frame difference signals, prediction means responsive to said encoder output signals for predicting a next frame's signals, and means for developing said frame difference signals from applied next frame signals of an image frame and from output signals of said prediction means, the improvement comprising:

said coder including controllable quantizer means that quantizes said difference signals in accordance with a quantization schema that varies with the dictates of a control signal; and

said coder including means, responsive to said applied next frame signals, to develop said control signal, which control signal varies throughout said applied next frame with changes in at least one selected characteristic of said applied next frame signals.

2. The encoder of claim 1 wherein said control signal is further responsive to a selected bit rate target level for said encoder output signals.

3. The encoder of claim 2 wherein said control signal forms a selected quantization error target level for said encoder output signals.

4. The encoder of claim 1 wherein said selected characteristic is a measure of texture in said applied next frame signals.

5. The encoder of claim 4 wherein said measure of texture is a combination of texture measured horizontally, texture measured vertically, and texture measured over a selected area of said image frame.

6. The encoder of claim 4 wherein said measure of texture is a combination of a texture measure of said applied next frame signals and of previously applied next frame signals.

7. The encoder of claim 1 wherein said selected characteristic is a measure of brightness in said applied next frame signals.

8. The encoder of claim 1 further comprising an output buffer for receiving said encoder output signals, and said coder comprising means for receiving signals from said output buffer that indicate the level of buffer fullness of said output buffer.

9. The encoder of claim 8 wherein selected characteristic is a measure of buffer fullness of said output buffer.

10. The encoder of claim 8 wherein selected characteristic is a combination of said buffer fullness of said output buffer, brightness of said applied next frame signals and texture of said applied next frame signals.

11. The encoder of claim 8 wherein said characteristic is related to the buffer fullness of said output buffer in the previous frame.

12. The encoder of claim 8 wherein said characteristic is related to the buffer fullness of said output buffer in the previous frame and in the frame previous to the previous frame.

13. The encoder of claim 1 wherein said coder is responsive to applied scale factor signals to control the effective range of said quantizer means.

14. The encoder of claim 1 further comprising:

forward estimation means including means for processing said next frame signals to develop said control signal within a processing interval; and

means for delaying said next frame signals by said processing interval.

15. The encoder of claim 14 wherein said forward estimation means further includes means for developing said measure of brightness within said processing interval.

16. The encoder of claim 14 wherein said forward estimation means further includes means for developing said measure of texture within said processing interval.

17. The encoder of claim 1 wherein said next frame signals comprise a sequence of signal sections, each of which is related to a transform of at least one block of said frame difference signals, and each of which including a collection of N transform element signals, and said control signal comprising N control signal cells, where N is a constant, and each control signal cell controls the quantization schema for a different one of said N transform element signals.

18. The encoder of claim 17 wherein said control signal having N control signal cells is a control signal vector that is selected from a look-up table that contains a collection of control signal vectors.

19. The encoder of claim 18 wherein the selection from said look-up table is based on an error level that is obtained by coding said frame difference signal under control of each vector and said selected characteristic, and by developing a quantization error signal that is subtracted from a quantization error target level.

20. The encoder of claim 19 wherein said selected characteristic is scale factors that are related to the power at various subbands in said next frame's signals.

21. The encoder of claim 19 wherein said quantization error target level is related to signal characteristics of said next frame's signals.

22. The encoder of claim 19 wherein said quantization error target level is further related to buffer fullness of a buffer that stores said encoder output signals.

23. The encoder of claim 19 wherein said selection is further based on the number of bits created by said coder quantizer means.

24. The encoder of claim 18 wherein the selection from said look-up table is based on an error level that is obtained by scaling said frame difference signal, coding the scaled frame difference signal under control of each vector and said selected characteristic and by developing a quantization error signal that is subtracted in an absolute sense from a quantization error target level.

25. The encoder of claim 24 wherein said selected characteristic is related to texture of said next frame signal.

26. An encoder comprising:

prediction means responsive to output signals of said encoder, for developing frame prediction signals;

means for developing frame difference signals in response to said frame prediction means and applied frame signals;

coder means, responsive to said frame difference signals and to a control signal, for encoding frame difference signals under direction of said control signal, where said coder means codes different portions of said frame difference signals with different coding schemas, where different coding schemas yield different numbers of bits when coding any given signal, said coder means thereby generates a number of bits when encoding said applied frame signals; and

control means for developing said control signal in response to said encoder output signals, to control the number of bits generated by said coder means while encoding said applied frame signals.

27. The encoder of claim 26 wherein said control means develops a control signal that controls said coder means to develop a number of bits for each frame of said applied frame signals that approaches a preselect number.

28. The encoder of claim 26 wherein said control means develops a control signal that controls said coder means to develop a number of bits for each frame of said applied frame signals that on average approaches a preselect number.

29. The encoder of claim 26 where said control means is further responsive to said applied frame signals and modifies its developed control signal based on said applied frame signals to control coding error signals created in said coder in the course of coding of said frame difference signals.

30. The encoder of claim 27 where said control means modifies said control signal in response to said applied frame signal to equalize coding error signals throughout a frame of said applied frame signals.

31. The encoder of claim 27 where said control means develops said control signal in accordance with a visual perception model for humans, to cause the creation of encoder outputs signals that, when said encoder output signals are decoded and a frame image is created and displayed, the coding error signals found in said created and displayed frame image are perceived, in accordance with said visual perception model, to be essentially equally visible throughout said frame.
Description
 


CROSS-REFERENCE TO RELATED APPLICATION

This application is related to four other applications filed on this date. Their titles are:

1. A High Definition Television Coding Arrangement With Graceful Degradation

2. An Adaptive Leak HDTV Encoder

3. HDTV Encoder with Forward Estimation and Constant Rate Motion Vectors

4. HDTV Receiver

BACKGROUND OF THE INVENTION

This invention relates to encoding of signals, and more specifically, to encoding of signals with a controlled quantizer.

When image signals are digitized and linearly quantized, a transmission rate of about 100 Mbits per second is necessary for images derived from standard TV signals. For HDTV, with its greater image size and greater resolution, a much higher transmission rate is required. When terrestrial transmission is desired, and when the option of allocating greater bandwidth per channel is unavailable, it is necessary to compress the HDTV signal within the allocated bandwidth. Having an allocated bandwidth is tantamount to having a certain number of information bits that can be transmitted in each selected time interval, such as an image frame. All of the compression techniques of HDTV signals must of necessity consider this bit budget.

In order to reduce the bit rate, or to fit the encoded signal of an image frame within the allocated bit budget, various coding schemes have been studied. One is the so-called "differential pulse code modulation" (DPCM) coding approach. By this method, the value of a particular pixel at any moment is predicted on the basis of values of pixels which have been already coded. The necessary number of bits is reduced by coding the difference (error) between the predicted value and the value of the particular pixel at that moment. According to this differential pulse code modulation, it is possible to reduce the number of bits per pixel by approximately half.
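As a rough illustration of the DPCM idea described above, the following is a minimal sketch and not the coder of any cited system. It assumes the simplest possible predictor (the previously reconstructed pixel) and a uniform quantizer; the function names and the `step` parameter are hypothetical.

```python
def dpcm_encode(pixels, step):
    """Code each pixel as the quantized difference from a prediction.

    Assumption: the prediction is the previously reconstructed pixel,
    and the difference is quantized uniformly with the given step.
    """
    codes = []
    prediction = 0
    for value in pixels:
        error = value - prediction        # prediction error
        code = round(error / step)        # quantized error index
        codes.append(code)
        prediction += code * step         # track what the decoder will see
    return codes


def dpcm_decode(codes, step):
    """Rebuild pixel values by accumulating the dequantized differences."""
    recon = []
    prediction = 0
    for code in codes:
        prediction += code * step
        recon.append(prediction)
    return recon
```

Because the encoder predicts from its own reconstruction rather than from the original pixels, the quantization error does not accumulate: each reconstructed pixel stays within half a step of the original.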

Still, that number is much larger than can be accepted for terrestrial transmission, so a second look is typically taken at the quantizer itself. Clearly, if the step size of the quantizer is made large enough, the number of bits generated can be reduced to an acceptable level. Alas, the quantization error resulting therefrom would yield a picture that is far from acceptable.

Experiments show, and it is quite logical, that the effect of quantization error is different for different types of picture and, expectedly, artisans have tried to tailor quantizers to the pictures being encoded.

U.S. Pat. No. 4,802,232 issued Jan. 31, 1989, for example, describes one such attempt. In accordance with the described approach the picture to be encoded is divided into regions and each region of the input image is classified into one of a preset number of classes. The signal of each class is quantized in a specified manner, and the manner of quantization of the different classes, of course, differs. The limitation of this approach is the hard decision boundaries between the classes.

Nill in "A Visual Model Weighted Cosine Transform for Image Compression and Quality Assessment", IEEE Transactions on Communications, Vol. COM-33, No. 6, June, 1985, pg. 551-557 and Saghri et al. "Image quality measure based on a human visual system model", Optical Engineering, Vol. 28, No. 7, July 1989, pg. 813-818, describe very similar approaches. These approaches utilize fixed global thresholds for each frequency band, where the thresholds are set based on a model of the human visual system's (HVS) modulation transfer function (MTF), and the transform utilized. The limitation of this approach is that the quantization cannot adapt to local characteristics of the image.

In "Adaptive Quantization of Picture Signals using Spatial Masking", Proceedings of IEEE, Vol. 65, April 1977, pg. 536-548, Netravali et al. describe a means of designing non-uniform quantizers to incorporate spatial masking and brightness correction for predictive coding systems. The limitation of their approach, again, is that the quantization cannot adapt to local characteristics of the image.

In "Design of Statistically Based Buffer Control Policies for Compressed Digital Video", Zdepski et al., an IEEE conference, 1989, pg. 1343-1349, describe an approach where the quantizer in the DPCM loop interacts with an adaptive mode control circuit. The circuit measures the number of bits generated by the quantizer and, based on preselected thresholds, decides for the next frame on one of eight possible quantizer step sizes. The selected step size is employed for the next frame. A similar approach is described in "Digital Pictures" by A. N. Netravali and B. G. Haskell, Plenum Press, 1988, pg. 537 et seq.
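The threshold-based step size selection described above can be sketched as follows. This is only an illustration of the general idea: the eight step sizes and the seven bit-count thresholds are made-up values, and `next_step_size` is a hypothetical name, not something taken from the cited paper.

```python
import bisect

# Hypothetical values: eight quantizer step sizes, coarser toward the end,
# and seven bit-count thresholds that separate them.
STEP_SIZES = [1, 2, 4, 6, 8, 12, 16, 24]
BIT_THRESHOLDS = [100_000, 200_000, 300_000, 400_000,
                  500_000, 600_000, 700_000]


def next_step_size(bits_generated):
    """Pick the step size for the next frame from the bits just generated.

    The more bits the current frame produced, the coarser the step
    chosen for the next frame.
    """
    index = bisect.bisect_right(BIT_THRESHOLDS, bits_generated)
    return STEP_SIZES[index]
```

The sketch makes the criticized behavior visible: the step size can only move between a handful of preset values, so the quantization changes in discrete jumps.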

The deficiency of these methods is that the quantization steps are altered in discrete jumps and that no accounting is made of a global distortion target.

It is an object of this invention, therefore, to develop means for controlling the quantizer so that the number of bits generated per frame is, on the average, within the allocated bit budget.

It is another object of this invention to control the quantizer in a manner that is sensitive to the characteristics of the input signal to which human viewers are sensitive.

It is still another object of this invention to control the quantizer in a manner that spreads as evenly as possible the unavoidable quantization noise.

SUMMARY OF THE INVENTION

These and other objects are achieved with a quantizer control mechanism that is responsive to both the input signal and the fullness of the output buffer. More specifically, the input image is divided into blocks and the signal of each block is DCT transformed. The transformed signal is analyzed to develop a brightness correction and to evaluate the texture of the image and the change in texture in the image. Based on these, and in concert with the human visual perception model, perception threshold signals are created for each subband of the transformed signal.

Concurrently, scale factors for each subband of the transformed signal are computed, and a measure of variability in the transformed input signal is calculated. A measure of the fullness of the buffer to which the quantizer sends its encoded results is obtained, and that measure is combined with the calculated signal variability to develop a correction signal. The correction signal modifies the perception threshold signals to develop threshold control signals that are applied to the quantizer. The scale factors are also applied to the quantizer, as well as a global target distortion measure.

The quantizer pre-multiplies the input signal by the scale factors and quantizes the signal in a manner that is sensitive to the threshold control signals. The quantization itself is performed a number of times by scanning a codebook that specifies the exact manner of quantizing of each of the subbands of the transformed signal. Accounting for the global target distortion, the best quantization result is selected and sent to the aforementioned buffer.
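The selection step summarized above can be sketched as below. This is a simplified illustration under several assumptions that are not stated in the text: each codebook vector is taken to be a list of uniform step sizes (one per subband), the error is a mean squared error weighted by the threshold control signals, and "best" is read, following claims 19 and 24, as the vector whose error is closest to the target distortion. All names are hypothetical.

```python
def quantize(value, step):
    """Uniform quantizer: snap a value to the nearest multiple of step."""
    return round(value / step) * step


def select_codebook_vector(coeffs, scale_factors, thresholds,
                           codebook, target_distortion):
    """Scan the codebook and return (index, quantized values) of the
    vector whose weighted error lies closest to the target distortion."""
    # Pre-multiply the input by the scale factors.
    scaled = [c * s for c, s in zip(coeffs, scale_factors)]
    best = None
    for index, steps in enumerate(codebook):
        recon = [quantize(x, step) for x, step in zip(scaled, steps)]
        # Assumed error measure: squared error relative to each
        # subband's threshold, averaged over the subbands.
        error = sum(((x - r) / t) ** 2
                    for x, r, t in zip(scaled, recon, thresholds)) / len(scaled)
        gap = abs(error - target_distortion)
        if best is None or gap < best[0]:
            best = (gap, index, recon)
    return best[1], best[2]
```

For example, with `coeffs=[10.0, 3.0]`, unit scale factors and thresholds, and `codebook=[[1, 1], [4, 4], [8, 8]]`, a target of 0.0 selects the finest vector, while larger targets select progressively coarser ones.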

BRIEF DESCRIPTION OF THE DRAWING

FIG. 1 presents a block diagram of a forward estimation section of an HDTV digital encoder;

FIG. 2 presents a block diagram of an encoding loop section of an HDTV encoder that interacts with the forward estimation section of FIG. 1;

FIG. 3 depicts a hardware organization for a coarse motion vector detector;

FIG. 4 depicts a hardware organization for a fine motion vector detector that takes into account the output of the coarse motion vector detector;

FIG. 5 illustrates the spatial relationship of what is considered a "slice" of image data;

FIG. 6 shows one way for selecting a mix of motion vectors to fit within a given bit budget;

FIG. 7 presents one embodiment for evaluating a leak factor, α;

FIG. 8 illustrates the arrangement of a superblock that is quantized in QVS 38 block of FIG. 2;

FIG. 9 depicts how a set of selection error signals is calculated in preparation for codebook vector selection;

FIG. 10 presents a block diagram of QVS block 38;

FIG. 11 is a block diagram of inverse quantization block 39;

FIG. 12 presents the structure of perceptual coder 49;

FIG. 13 illustrates the structure of perceptual processor 93;

FIG. 14 is a block diagram of texture processors 96 and 98;

FIG. 15 presents a block diagram of a digital HDTV receiver;

FIG. 16 presents a modified forward estimation block that chooses a leak factor from a set of two fixed leak factors; and

FIG. 17 presents a frame buffer circuit that includes a measure of temporal filtering.

DETAILED DESCRIPTION

The motion estimation principles of this invention start with the proposition that for a given bit budget, a much better overall prediction error level can be attained by employing blocks of variable sizes. Starting with large sized blocks that handle large image sections for which a good translation vector can be found, the motion vector selection in accordance with this invention then replaces, or adds to, some of the motion vectors for the large blocks with motion vectors for the small blocks.

While these principles of the invention can be put to good use in the prediction of other types of signals, it is clear that they are quite useful in connection with the encoding of HDTV signals. However, in the HDTV application there is simply not enough time within the encoding loop to compute the best motion vectors and to then select the best mix of motion vectors. Accordingly, a forward estimation section is incorporated in the encoder, and that section creates as many of the signals used within the encoder as possible. The input image signal is, of course, appropriately delayed so that the need for the signals that are computed within the forward estimation section does not precede their creation.

In order to appreciate the operation of the motion vector selection and encoding processes that the improvement of this invention serves, the following describes the entire coder section of an HDTV transmitter. The principles of this invention are, of course, primarily described in connection with the motion vector estimation and selection circuits depicted in FIG. 1.

DETAILED DESCRIPTION

Quantization of the signals is one of the most important tasks that the HDTV transmitter must perform. The quantizer, however, is but a part of the entire encoding loop. Moreover, the encoding loop, generally, and the quantizer, in particular, require a number of signals that need a substantial period of time for computation. Fortunately, it is possible to compute these signals in a forward estimation section, which is "off line" as far as the encoding loop is concerned.

To better understand the function and workings of the quantizer of this invention, the following describes the entire coder section of an HDTV transmitter, i.e., both the forward estimation section and the encoding loop. The detailed principles of this invention are primarily described in connection with the perceptual encoder circuit, the quantizer circuit, and the associated circuits.

In FIG. 1, the input signal is applied at line 10. It is a digitized video signal that arrives in sequences of image frames. This input signal is applied to frame-mean processor 11, to buffer 12, and to motion vector generator 13. The output of buffer 12 is also applied to motion vector generator block 13. Frame-mean processor 11 develops the mean value of each incoming frame. That value is delayed in buffers 24 and 25, and applied to a number of elements within FIG. 1, as described below. It is also sent to the encoding loop of FIG. 2 through buffer 52. Motion vector generator 13 develops motion vectors which are applied to motion vector selector/encoder 14 and, thereafter, through buffers 15 and 32, wherefrom the encoded motion vectors are sent to the encoding loop of FIG. 2. The unencoded output of motion vector selector/encoder 14 is also applied to motion compensator block 16, and to buffer 17 followed by buffer 50, wherefrom the unencoded motion vectors are sent to the encoding loop of FIG. 2.

The output of buffer 12 is applied to buffer 18 and thereafter to buffers 19 and 51, wherefrom it is sent to the encoding loop of FIG. 2. The output of buffer 18 is applied to buffer 22 and to leak factor processor 20, and the output of buffer 19 is applied to motion compensator circuit 16. The output of motion compensator 16 is applied to buffer 21 and to leak factor processor 20.

The frame-mean signal of buffer 25 is subtracted from the output of buffer 21 in subtracter 26 and from the output of buffer 22 in subtracter 27. The outputs of subtracter 26 and leak processor 20 are applied to multiplier 23, and the output of multiplier 23 is applied to subtracter 28. The output of leak processor 20 is also sent to the encoding loop of FIG. 2 via buffer 31. Element 28 subtracts the output of multiplier 23 from the output of subtracter 27 and applies the result to DCT transform circuit 30. The output of transform circuit 30 is applied to processor 53 which computes scale factors S_ij and signal standard deviation σ and sends its results to FIG. 2. The output of subtracter 27 is applied to DCT transform circuit 29, and the output of DCT circuit 29 is sent to the encoding loop of FIG. 2.
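The arithmetic performed by subtracters 26, 27 and 28 and multiplier 23 can be sketched as below. The sketch treats a frame as a flat list of pixel values and leaves out the DCT circuits that follow; the function and parameter names are hypothetical.

```python
def forward_differences(image, displaced, frame_mean, leak):
    """Return the two signals that feed DCT circuits 29 and 30.

    image      -- frame from buffer 22
    displaced  -- displaced frame from buffer 21
    frame_mean -- frame-mean signal from buffer 25
    leak       -- leak factor from leak processor 20
    """
    # Subtracter 27: image with the frame mean removed (input to DCT 29).
    image_zero_mean = [p - frame_mean for p in image]
    # Subtracter 26: displaced frame with the frame mean removed.
    displaced_zero_mean = [p - frame_mean for p in displaced]
    # Multiplier 23: apply the leak factor to the displaced frame.
    leaked = [leak * p for p in displaced_zero_mean]
    # Subtracter 28: displaced frame difference (input to DCT 30).
    dfd = [a - b for a, b in zip(image_zero_mean, leaked)]
    return image_zero_mean, dfd
```

With `image=[10, 12]`, `displaced=[9, 13]`, `frame_mean=11` and `leak=0.5`, the mean-removed image is `[-1, 1]` and the displaced frame difference is `[0.0, 0.0]`.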

To get a sense of the timing relationship between the various elements in FIG. 1, it is useful to set a benchmark, such as by asserting that the input at line 10 corresponds to the image signal of frame t; i.e., that the input signal at line 10 is frame I(t). All of the buffers in FIG. 1 store and delay one frame's worth of data. Hence, the output of buffer 12 is I(t-1), the output of buffer 18 is I(t-2), the output of buffer 19 is I(t-3), and the output of buffer 51 is I(t-4).
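The frame bookkeeping above can be checked with a small simulation. This is a toy model, assuming each buffer simply holds one frame and releases it one frame time later; frames are represented only by their index t, and the class and function names are hypothetical.

```python
class FrameBuffer:
    """Holds one frame and releases it one frame time later."""

    def __init__(self):
        self._stored = None

    def push(self, frame):
        """Store the incoming frame and return the previously stored one."""
        out, self._stored = self._stored, frame
        return out


def run_delay_chain(num_frames):
    """Feed frame indices through buffers 12, 18, 19 and 51 in series.

    Returns, for each frame time t, a dict mapping each buffer's
    reference numeral to the frame index at its output.
    """
    order = (12, 18, 19, 51)
    buffers = {name: FrameBuffer() for name in order}
    history = []
    for t in range(num_frames):
        signal = t                      # input at line 10 is frame I(t)
        outputs = {}
        for name in order:
            signal = buffers[name].push(signal)
            outputs[name] = signal
        history.append(outputs)
    return history
```

At frame time t = 6, for instance, the outputs of buffers 12, 18, 19 and 51 are frames 5, 4, 3 and 2, which matches I(t-1), I(t-2), I(t-3) and I(t-4).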

Motion vector generator 13 develops motion vectors MV(t) that (elsewhere in the encoder circuit and in the decoder circuit) assist in generating an approximation of frame I(t) based on information of frame I(t-1). It takes some time for the motion vectors to be developed (an internal delay is included to make the delay within generator 13 equal to one frame delay). Thus, the output of generator 13 (after processing delay) corresponds to a set of motion vectors MV(t-1). Not all of the motion vectors that are created in motion vector generator 13 are actually used, so the output of generator 13 is applied to motion vector selector/encoder 14 where a selection process takes place. Since the selection process also takes time, the outputs of selector/encoder 14 are MV(t-2) and the CODED MV(t-2) signals, which are the motion vectors, and their coded representations, that assist in generating an approximation of frame I(t-2) based on information of frame I(t-3). Such an approximation of I(t-2) is indeed generated in motion compensator 16, which takes the I(t-3) signal of buffer 19 and the motion vectors of selector/encoder 14 and develops therefrom a displaced frame signal DF(t-2) that approximates the signal I(t-2). Buffers 17 and 50 develop MV(t-4) signals, while buffers 15 and 32 develop the CODED MV(t-4) signals.

As indicated above, processor 11 develops a frame-mean signal. Since the mean signal cannot be known until the frame terminates, the output of processor 11 relates to frame t-1. Stated differently, the output of processor 11 is M(t-1) and the output of buffer 25 is M(t-3).

Leak factor processor 20 receives signals I(t-2) and DF(t-2). It also takes time to perform its function (an internal delay is included to ensure that it has a delay of exactly one frame), hence the output signal of processor 20 corresponds to the leak factor of frame (t-3). The output of processor 20 is, therefore, designated L(t-3). That output is delayed in buffer 31, causing L(t-4) to be sent to the encoding loop.

Lastly, the processes within elements 26-30 are relatively quick, so the transformed image (I_T) and displaced frame difference (DFD_T) outputs of elements 29 and 30 correspond to frame I_T(t-3) and DFD_T(t-3), respectively, and the output of processor 53 corresponds to S_ij(t-4) and σ(t-4).

FIG. 2 contains the encoding loop that utilizes the signals developed in the forward estimation section of FIG. 1. The loop itself comprises elements 36, 37, 38, 39, 40, 41, 54, 42, 43, 44 and 45. The image signal I(t-4) is applied to subtracter 36 after the frame-mean signal M(t-4) is subtracted from it in subtracter 35. The signal developed by subtracter 36 is the difference between the image I(t-4) and the best estimation of image I(t-4) that is obtained from the previous frame's data contained in the encoding loop (with the previous frame's frame-mean excluded via subtracter 44, and with a leak factor that is introduced via multiplier 45). That frame difference is applied to DCT transform circuit 37 which develops 2-dimensional transform domain information about the frame difference signal of subtracter 36. That information is encoded into vectors within quantizer-and-vector-selector (QVS) 38 and forwarded to encoders 46 and 47. The encoding carried out in QVS 38 and applied to encoder 47 is reversed to the extent possible within inverse quantizer 39 and applied to inverse DCT circuit 40.

The output of inverse DCT circuit 40 approximates the output of subtracter 36. However, it does not quite match the signal of subtracter 36 because only a portion of the encoded signal is applied to element 39 and because it is corrupted by the loss of information in the encoding process of element 38. There is also a delay in passing through elements 37, 38, 39, and 40. That delay is matched by the delay provided by buffer 48 before the outputs of buffer 48 and inverse DCT transform circuit 40 are combined in adder 41 and applied to adder 54. Adder 54 adds the frame-mean signal M(t-4) and applies the results to buffer 42. Buffer 42 complements the delay provided by buffer 48 less the delay in elements 43, 44 and 45 (to form a full one frame delay) and delivers it to motion compensator 43.

Motion compensator 43 is responsive to the motion vectors MV(t-4). It produces an estimate of the image signal I(t-4), based on the approximation of I(t-5) offered by buffer 42. As stated before, that approximation is diminished by the frame-mean of the previous frame, M(t-5), through the action of subtracter 44. The previous frame's frame-mean is derived from buffer 55 which is fed by M(t-4). The results of subtracter 44 are applied to multiplier 45 which multiplies the output of subtracter 44 by the leak factor L(t-4). The multiplication results form the signal to the negative input of subtracter 36.
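One pass around the encoding loop can be sketched as follows. This is a heavily simplified model and rests on several assumptions: the motion compensator is replaced by the identity (zero motion), the DCT, QVS, inverse quantizer and inverse DCT (elements 37 through 40) are collapsed into a single uniform quantizer with a hypothetical `step`, and buffer 48 is assumed to carry the prediction signal to adder 41. The names are illustrative only.

```python
def encode_frame(image, frame_mean, prev_recon, prev_mean, leak, step):
    """Return (decoded difference, reconstructed frame) for one frame.

    image      -- current frame I(t-4) as a flat list of pixels
    frame_mean -- frame-mean signal M(t-4)
    prev_recon -- previous reconstructed frame held in buffer 42
    prev_mean  -- previous frame's mean M(t-5) from buffer 55
    leak       -- leak factor L(t-4)
    """
    # Subtracter 35: remove the frame mean from the incoming image.
    zero_mean = [p - frame_mean for p in image]
    # Motion compensator 43 (identity here), subtracter 44, multiplier 45.
    prediction = [leak * (p - prev_mean) for p in prev_recon]
    # Subtracter 36: frame difference to be coded.
    difference = [a - b for a, b in zip(zero_mean, prediction)]
    # Stand-in for elements 37 to 40: quantize and dequantize.
    decoded = [round(d / step) * step for d in difference]
    # Adder 41 restores the prediction; adder 54 restores the frame mean.
    recon = [d + p + frame_mean for d, p in zip(decoded, prediction)]
    return decoded, recon
```

With a leak factor of 1 and an unchanged image, the coded difference is all zeros. With a leak factor of 0 the prediction vanishes and the frame is coded on its own, which is how the leak limits the lifetime of any error held in the loop.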

It may be noted in passing that the action of motion compensator 43 is linear. Therefore, when the action of buffer 42 is also linear--which means that it does not truncate its incoming signals--then adder 54 and subtracter 44 (and buffer 55) are completely superfluous. They are used only when buffer 42 truncates its incoming signal to save on the required storage.

In connection with buffer 42, another improvement is possible. When the processing within elements 36, 37, 38, 39, and 40 and the corresponding delay of buffer 48 are less than the vertical frame retrace interval, the output of buffer 42 can be synchronized with its input, in the sense that pixels of a frame enter the buffer at the same time that corresponding pixels of the previous frame exit the buffer. Temporal filtering can then be accomplished at this point by replacing buffer 42 with a buffer circuit 42 as shown in FIG. 17. In buffer circuit 42, the incoming pixel is compared to the outgoing pixel. When their difference is larger than a certain threshold, the storage element within circuit 42 is loaded with the average of the two compared pixels. Otherwise, the storage element within circuit 42 is loaded with the incoming pixel only.
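The load rule of buffer circuit 42, as stated in the paragraph above, can be sketched per pixel as follows. The function name `temporal_filter_store` and the choice of a plain arithmetic average are assumptions made for illustration; FIG. 17 is not reproduced here.

```python
def temporal_filter_store(incoming, outgoing, threshold):
    """Return the value loaded into the storage element (sketch).

    incoming  : pixel entering buffer circuit 42
    outgoing  : corresponding pixel of the previous frame leaving it
    threshold : comparison threshold (value not given in the text)
    """
    if abs(incoming - outgoing) > threshold:
        # Difference exceeds the threshold: store the average.
        return (incoming + outgoing) / 2
    # Otherwise store the incoming pixel unchanged.
    return incoming
```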

QVS 38 is also responsive to perceptual coder 49 and to S.sub.ij (t-4). That coder is responsive to signals I.sub.T (t-4) and .sigma.(t-4). Signals S.sub.ij (t-4) are also sent to inverse quantization circuit 39 and to buffer fullness and formatter (BFF) 56. BFF block 56 also receives information from encoders 46 and 47, the leak signal L(t-4) and the CODED MV(t-4) information from buffer 32 in FIG. 1. BFF block 56 sends fullness information to perceptual coder 49 and all of its received information to subsequent circuitry, where the signals are amplified, appropriately modulated and, for terrestrial transmission, applied to a transmitting antenna.

BFF block 56 serves two closely related functions. It packs the information developed in the encoders by applying the appropriate error correction codes and arranging the information, and it feeds information to perceptual coder 49, to inform it of the level of output buffer fullness. The latter information is employed in perceptual coder 49 to control QVS 38 and inverse quantizer 39 and, consequently, the bit rate of the next frame.

The general description above provides a fairly detailed exposition of the encoder within the HDTV transmitter. The descriptions below delve in greater detail into each of the various circuits included in FIGS. 1 and 2.

FRAME-MEAN CIRCUIT 11

The mean, or average, signal within a frame is obtainable with a simple accumulator that merely adds the values of all pixels in the frame and divides the sum by a fixed number. Adding a number of pixels that is a power of two offers the easiest division implementation (a simple bit shift), but division by any other number is also possible with some very simple and conventional hardware (e.g., a look-up table). Because of this simplicity, no further description is offered herein of circuit 11.
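By way of illustration, an accumulator of the kind described can be sketched as below for the power-of-two case, where the division reduces to a right shift. The function name `frame_mean` and the integer (truncating) result are assumptions, not details taken from circuit 11.

```python
def frame_mean(pixels):
    """Accumulate all pixels and divide by their count with a shift (sketch).

    Assumes integer pixel values and a pixel count that is a power of two.
    """
    n = len(pixels)
    if n == 0 or n & (n - 1):
        raise ValueError("pixel count must be a power of two")
    shift = n.bit_length() - 1      # log2(n) for a power of two
    return sum(pixels) >> shift     # truncating division by n
```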

MOTION VECTOR GENERATOR 13

The motion vector generator compares the two sequential images I(t) and I(t-1), with an eye towards detecting regions, or blocks, in the current image frame, I(t), that closely match regions, or blocks, in the previous image frame, I(t-1). The goal is to generate relative displacement information that permits the creation of an approximation of the current image frame from a combination of the displacement information and the previous image frame.

More specifically, the current frame is divided into n.times.n pixel blocks, and a search is conducted for each block in the current frame to find an n.times.n block in the previous frame that matches the current frame block as closely as possible.

If one wishes to perform an exhaustive search for the best displacement of an n.times.n pixel block in a neighborhood of a K.times.K pixel array, one has to test all of the possible displacements, of which there are (K-n).times.(K-n). For each of those displacements one has to determine the magnitude of the difference (e.g., in absolute, RMS, or square sense) between the n.times.n pixel array in the current frame and the n.times.n portion of the K.times.K pixel array in the previous frame that corresponds to the selected displacement. The displacement that corresponds to the smallest difference is the preferred displacement, and that is what we call the motion vector.

One important issue in connection with a hardware embodiment of the above-described search process is the sheer volume of calculations that needs to be performed in order to find the absolutely optimum motion vector. For instance, if the image were subdivided into blocks of 8.times.8 pixels and the image contains 1024.times.1024 pixels, then the total number of blocks that need to be matched would be 2.sup.14. If an exhaustive search over the entire image were to be performed in determining the best match, then the number of searches for each block would be approximately 2.sup.20. The total count (for all the blocks) would then be approximately 2.sup.34 searches. This "astronomical" number is just too many searches!
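The counts quoted above follow from simple arithmetic, which the short sketch below reproduces. The function name `exhaustive_search_counts` is hypothetical, and the per-block count treats every pixel position in the image as a candidate, which is the approximation used in the text.

```python
def exhaustive_search_counts(image_size=1024, block_size=8):
    """Return (blocks, searches per block, total searches) for a square image."""
    blocks = (image_size // block_size) ** 2      # 128 x 128 blocks
    searches_per_block = image_size * image_size  # one candidate per pixel position
    return blocks, searches_per_block, blocks * searches_per_block
```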

One approach for limiting the required number of searches is to limit the neighborhood of the block whose motion vector is sought. In addition to the direct reduction in the number of searches that must be undertaken, this approach has the additional benefit that a more restricted neighborhood limits the number of bits that are required to describe the motion vectors (smaller range), and that reduces the transmission burden. With those reasons in mind, we limit the search neighborhood in both the horizontal and vertical directions to .+-.32 positions. That means, for example, that when a 32.times.16 pixel block is considered, then the neighborhood of search is 96.times.80 pixels, and the number of searches for each block is 2.sup.12 (compared to 2.sup.20).
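The effect of the restricted neighborhood on both the search count and the motion vector word length can be sketched as follows. The function name `limited_search_cost` is hypothetical. The search count uses 2 x 32 positions per axis, matching the approximation in the text, while the bit count assumes that both end displacements (-32 and +32) must be representable.

```python
def limited_search_cost(max_disp=32):
    """Return (searches per block, bits per displacement component)."""
    positions_per_axis = 2 * max_disp        # text's approximation; the end point adds one
    searches = positions_per_axis ** 2
    # Displacements -max_disp..+max_disp map onto 0..2*max_disp.
    bits_per_component = (2 * max_disp).bit_length()
    return searches, bits_per_component
```

For a range of .+-.32 this gives 2.sup.12 searches and 7 bits per component, i.e. 14 bits per motion vector.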

As indicated above, the prediction error can be based on a sum of squares of differences, but it is substantially simpler to deal with absolute values of differences. Accordingly, the motion vector generator herein compares blocks of pixels in the current frame with those in the previous frame by forming prediction error signals that correspond to the sum over the block of the absolute differences between the pixels.

To further reduce the complexity and size of the search, a two-stage hierarchical motion estimation approach is used. In the first stage, the motion is estimated coarsely, and in the second stage the coarse estimation is refined. Matching in a coarse manner is achieved in the first stage by reducing the resolution of the image by a factor of 2 in both the horizontal and the vertical directions. This reduces the search area by a factor of 4, yielding only 2.sup.12 blocks in a 1024.times.1024 image array. The motion vectors generated in the first stage are then passed to the second stage, where a search is performed in the neighborhood of the coarse displacement found in the first stage.

FIG. 3 depicts the structure of the first (coarse) stage in the motion vector generator. In FIG. 3 the input signal is applied to a two-dimensional, 8 pixel by 8 pixel low-pass filter 61. Filter 61 eliminates frequencies higher than half the sampling rate of the incoming data. Subsampler 62 follows filter 61. It subsamples its input signal by a 2:1 factor. The action of filter 61 ensures that no aliasing results from the subsampling action of element 62 since it eliminates signals above the Nyquist rate for the subsampler. The output of subsampler 62 is an image signal with half as many pixels in each line of the image, and half as many lines in the image. This corresponds to a four-fold reduction in resolution, as discussed above.
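The filter-then-subsample structure can be illustrated with the sketch below. Note that it substitutes a 2.times.2 box average for the 8 pixel by 8 pixel low-pass filter 61, purely to keep the example short; the coefficients of filter 61 are not given in the text. The function name `downsample_2to1` is hypothetical, and the image is assumed to be a list of rows with even dimensions.

```python
def downsample_2to1(image):
    """Low-pass (2x2 box average) and subsample 2:1 in both directions."""
    height, width = len(image), len(image[0])
    out = []
    for y in range(0, height - 1, 2):
        row = []
        for x in range(0, width - 1, 2):
            total = (image[y][x] + image[y][x + 1]
                     + image[y + 1][x] + image[y + 1][x + 1])
            row.append(total / 4)   # one output pixel per 2x2 input cell
        out.append(row)
    return out
```

The output has half as many pixels per line and half as many lines, as described for subsampler 62.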

In FIG. 1, motion vector generator 13 is shown to be responsive to the I(t) signal at input line 10 and to the I(t-1) signal at the output of buffer 12. This was done for expository purposes only, to make the operation of motion vector generator 13 clearly understandable in the context of the FIG. 1 description. Actually, it is advantageous to have motion vector generator 13 be responsive solely to I(t), as far as the connectivity of FIG. 1 is concerned, and have the delay of buffer 12 be integrated within the circuitry of motion vector generator 13.

Consonant with this idea, FIG. 3 includes a frame memory 63 which is responsive to the output of subsampler 62. The subsampled I(t) signal at the input of frame memory 63 and the subsampled I(t-1) signal at the output of frame memory 63 are applied to motion estimator 64.

The control of memory 63 is fairly simple. Data enters motion estimator block 64 in sequence, one line at a time. With every sixteen lines of the subsampled I(t), memory 63 must supply to motion estimator block 64 sixteen lines of the subsampled I(t-1), offset forward by sixteen lines. The 32 other (previous) lines of the subsampled I(t-1) that are needed by block 64 are already in block 64 from the previous two sets of sixteen lines of the subsampled I(t) signal that were applied to motion estimator block 64.

Motion estimator 64 develops a plurality of prediction error signals for each block in the image. The plurality of prediction error signals is applied to best-match calculator 65 which identifies the smallest prediction error signal. The displacement corresponding to that prediction error signal is selected as the motion vector of the block.

Expressed in more mathematical terms, if a block of width w and height h in the current frame is denoted by b(x,y,t), where t is the current frame and x and y are the north-west corner coordinates of the block, then the prediction error may be defined as the sum of absolute differences of the pixel values: ##EQU1## where r and s are the displacements in the x and y directions, respectively.

The motion vector that gives the best match is the displacement (r,s) that gives the minimum prediction error.

The selection of the motion vector is performed in calculator 65. In cases where there are a number of vectors that have the same minimum error, calculator 65 selects the motion vector (displacement) with the smallest magnitude. For this selection purpose, magnitudes are defined in calculator 65 as the sum of the magnitudes of the horizontal and vertical displacement, i.e., .vertline.r.vertline.+.vertline.s.vertline..
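The prediction error and the selection rule of calculator 65 can be sketched as follows. This is a minimal illustration, not the circuit itself. Images are lists of rows, the displaced block is taken at (x+r, y+s) in the previous frame (a sign convention assumed here), displacements that would fall outside the image are simply skipped, and the names `sad` and `best_motion_vector` are hypothetical.

```python
def sad(cur, prev, x, y, w, h, r, s):
    """Sum of absolute differences between the current block and the
    previous-frame block displaced by (r, s)."""
    total = 0
    for j in range(h):
        for i in range(w):
            total += abs(cur[y + j][x + i] - prev[y + j + s][x + i + r])
    return total


def best_motion_vector(cur, prev, x, y, w, h, search):
    """Return ((r, s), error) with minimum error; ties are broken by the
    smallest |r| + |s|, as described for calculator 65."""
    rows, cols = len(prev), len(prev[0])
    best = None
    for s in range(-search, search + 1):
        for r in range(-search, search + 1):
            inside = (0 <= x + r and x + r + w <= cols
                      and 0 <= y + s and y + s + h <= rows)
            if not inside:
                continue  # simplification: ignore out-of-image candidates
            key = (sad(cur, prev, x, y, w, h, r, s), abs(r) + abs(s))
            if best is None or key < best[0]:
                best = (key, (r, s))
    return best[1], best[0][0]
```

Comparing the pair (error, magnitude) lexicographically implements the tie-break without a second pass over the candidates.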

In the second stage of motion vector generator 13, a refined determination is made as to the best displacement value that can be selected, within the neighborhood of the displacement selected in the first stage. The second stage differs from the first stage in three ways. First, it performs a search that is directed to a particular neighborhood. Second, it evaluates prediction error values for 8.times.8 blocks and a 4.times.2 array of 8.times.8 blocks (which in effect is a 32.times.16 block). And third, it interpolates the end result to 1/2 pixel accuracy.

FIG. 4 presents a general block diagram of the second stage of generator 13. As in FIG. 3, the input signal is applied to frame memory 66. The input and the output of memory 66 are applied to motion estimator 67, and the output of motion estimator 67 is applied to best match calculator 68. Estimator 67 is also responsive to the coarse motion vector estimation developed in the first stage of generator 13, whereby the estimator is caused to estimate motion in the neighborhood of the motion vector selected in the first stage of generator 13.

Calculator 68 develops output sets with 10 signals in each set. It develops eight 8.times.8 block motion vectors, one 32.times.16 motion vector that encompasses the image area covered by the eight 8.times.8 blocks, and a measure of the improvement in motion specification (i.e., a lower prediction error) that one would get by employing the eight 8.times.8 motion vectors in place of the associated 32.times.16 motion vector. The measure of improvement can be developed in any number of ways, but one simple way is to maintain the prediction errors of the 8.times.8 blocks, develop a sum of those prediction errors, and subtract the developed sum from the prediction error of the 32.times.16 motion vector.
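The "simple way" of forming the measure of improvement amounts to a single subtraction, sketched below. The function name `improvement_measure` is hypothetical.

```python
def improvement_measure(error_32x16, errors_8x8):
    """Prediction error of the 32x16 vector minus the summed prediction
    errors of its eight constituent 8x8 blocks (sketch)."""
    if len(errors_8x8) != 8:
        raise ValueError("expected eight 8x8 prediction errors")
    return error_32x16 - sum(errors_8x8)
```

A larger result indicates that replacing the single 32.times.16 vector with the eight 8.times.8 vectors lowers the prediction error by a greater amount.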

The motion vector outputs of calculator 68 are applied in FIG. 4 to half pixel estimator 69. Half pixel motion is deduced from the changes in the prediction errors around the region of minimum error. The simple approach used in estimator 69 is to derive the half pixel motion independently in the x and y directions by fitting a parabola to the three points around the minimum, solving the parabola equation, and finding the position of the parabola's minimum. Since all that is desired is 1/2 pixel accuracy, this process simplifies to performing the following comparisons: ##EQU2## where p.sub.x is the prediction error at x, and x' is the deduced half pixel motion vector.
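One way to carry out the parabola fit for a single direction is sketched below. It is not a reproduction of the comparisons referred to above, which are not available here. It fits a parabola through the prediction errors at x-1, x and x+1, locates the vertex, and rounds the vertex position to the nearest half pixel. The function name `half_pixel_offset` and the handling of the flat case are assumptions.

```python
def half_pixel_offset(p_minus, p_zero, p_plus):
    """Return -0.5, 0.0 or +0.5, the half-pixel correction to add to x.

    p_minus, p_zero, p_plus are the prediction errors at x-1, x and x+1,
    where x is the integer position of minimum error.
    """
    curvature = p_minus - 2 * p_zero + p_plus
    if curvature <= 0:
        return 0.0                  # flat or degenerate: no refinement
    # Vertex of the parabola through (-1, p_minus), (0, p_zero), (1, p_plus).
    vertex = (p_minus - p_plus) / (2 * curvature)
    if vertex > 0.25:
        return 0.5
    if vertex < -0.25:
        return -0.5
    return 0.0
```

The same function would be applied independently to the x and y directions.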

The searches in both stages of motion vector generator 13 can extend over the edges of the image to allow for the improved prediction of an object entering the frame. The values of the pixels outside the image should be set to equal the value of the closest known pixel.
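Setting out-of-image pixels to the value of the closest known pixel can be realized by clamping each coordinate to the image bounds, as in the sketch below. The function name `pixel_clamped` is hypothetical, and the image is assumed to be a non-empty list of rows.

```python
def pixel_clamped(image, x, y):
    """Return the pixel at (x, y), or the closest image pixel when (x, y)
    lies outside the image."""
    rows, cols = len(image), len(image[0])
    cx = min(max(x, 0), cols - 1)   # clamp column index
    cy = min(max(y, 0), rows - 1)   # clamp row index
    return image[cy][cx]
```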

The above describes the structure of motion vector generator 13. All of the computational processes can be carried out with conventional processors. The processes that can most benefit from special purpose processors are the motion estimation processes of elements 64 and 67, simply because of the number of operations called for. These processes, however, can be realized with special purpose chips from LSI Logic Corporation, which offers a video motion estimation processor (L64720). A number of these can be combined to develop a motion estimation for any sized block over any sized area. This combining of L64720 chips is taught in an LSI Logic Corporation Application Note titled "L64720 (MEP) Video Motion Estimation Processor".

MOTION VECTOR SELECTOR/ENCODER 14

The reason for creating the 32.times.16 blocks is rooted in the expectation that the full set of motion vectors for the 8.times.8 blocks cannot be encoded in the bit budget allocated for the motion vectors. On the other hand, sending only 32.times.16 block motion vectors requires 28,672 bits--which results from multiplying the 14 bits per motion vector (7 bits for the horizontal displacement and 7 bits for the vertical displacement) by 32 blocks in the horizontal direction and 64 blocks in the vertical direction. In other words, it is expected that the final set of motion vectors would be a mix of 8.times.8 block motion vectors and 32.times.16 block motion vectors. It follows, therefore, that a selection must be made of the final mix of motion vectors that are eventually sent by the HDTV transmitter, and that selection must fit within a preassigned bit budget. Since the number of bits that define a motion vector depends on the efficacy of compression encoding that may be applied to the motion vectors, it follows that the selection of motion vectors and the compression of th