BACKGROUND OF THE INVENTION
The present invention pertains to a method and apparatus for the compression and transmission of video data. More particularly, the present invention pertains to regulating a transmit buffer in a transmission subsystem and a quantizer selector
in a digital video system environment.
A video compression/transmission system that is known in the art is shown in FIG. 1. A camera 11 is provided that generates video image data for a video capture component 13. The video capture component 13 "captures" the video image data from
the camera one frame at a time in a known manner and at a predetermined rate (e.g., approximately 30 frames per second). The video capture component 13 transfers the video frame data to a video compressor 15 which may compress the video image data for
the frame according to a bit-rate control algorithm. Such a bit-rate control algorithm typically includes a compression algorithm such as any of a variety of block transform based video compression algorithms such as H.261 (International
Telecommunication Union--Telecommunications Standardization Sector ("ITU-T"), March, 1993), H.263 (ITU-T, Dec. 5, 1995), JPEG ("Joint Photographic Expert Group")(International Organization for Standardization/International Electrotechnical Commission
("ISO/IEC") 10918-1), MPEG-I and MPEG-II ("Motion Picture Expert Group")(ISO/IEC 11172-2 and 13818-2). The compressed video frame data is then sent to a transmitter 19 via a video controller 17, and the data is stored temporarily in a transmit buffer 20
under the control of a buffer regulator. The transmitter 19 then pulls data from the transmit buffer sequentially and adds the appropriate protocol information and transmits the data to a transmission medium 21 (e.g., POTS (Plain Old Telephone
Service)).
In systems such as ProShare® systems and Intel Smart Video Recorder® systems (Intel Corporation, Santa Clara, Calif.), the bit rate control algorithm of compressor 15 operates separately from the transfer of data from transmit buffer
20 to transmission medium 21 by the buffer regulator. Because of this separation, the operation of the bit rate control algorithm can only estimate the state of the transmit buffer 20 (i.e., how much data is contained in the transmit buffer 20). Also,
the buffer regulator of the transmitter 19 typically requires that the video compressor 15 produce the same amount of compressed data for each frame. This separation leads to inaccuracies in that the transmit buffer 20 is incorrectly filled (i.e., not
filled with enough data which reduces the frame rate over the transmission medium 21 or filled with too much data causing delay or latency).
SUMMARY OF THE INVENTION
According to a first embodiment of the present invention, the video compressor receives a succession of video frames, each frame including uncompressed video frame data divided into N macroblocks, where each macroblock defines a spatial area of
each of the frames. The video compressor compresses a macroblock of video frame data based on a quantization parameter, which is supplied by a quantizer selector of a bit rate controller. The quantizer selector calculates the quantization parameter for
an nth one of the macroblocks of a current video frame based on a cumulative amount of compressed video image data generated for the first n-1 macroblocks of the current frame and a previous frame. Since the quantization parameter has a direct effect on
the number of compressed bits generated per macroblock, the bit rate controller can control the total number of compressed bits generated for a frame on a macroblock-by-macroblock basis.
According to a second embodiment of the present invention, a video compressor generates compressed video frame data to be sent to a transmit buffer (which, in turn, is sent over a transmission medium). The video compressor operates under the
control of a buffer regulator of the bit rate controller which schedules the compression of video frames captured by a video capture component coupled to the video compressor. When the number of bits remaining in the transmit buffer falls below a
threshold, the buffer regulator schedules a compression of a new frame from the video capture component. In this manner, the effective bandwidth of the transmission medium is used efficiently in that compressed video frame data is sent to the transmit
buffer as the data is needed.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram of a video compression and transmission system that is known in the art.
FIG. 2 is a block diagram of a video compression and transmission system constructed according to an embodiment of the present invention.
FIG. 3 is a block diagram of a video controller and video compressor operated according to an embodiment of the present invention.
FIG. 4 is a flow diagram of an operation of a quantizer selector constructed according to an embodiment of the present invention.
FIG. 5 is a flow diagram of an operation of a buffer regulator constructed according to an embodiment of the present invention.
DETAILED DESCRIPTION
Referring to FIG. 2, a block diagram of a video compression and transmission system 30 including an embodiment of the bit rate controller of the present invention is shown. In this example, the bit rate controller can include a buffer regulator
34 (in a video controller 35) and a quantizer selector 36 (in a video compressor 37). As in FIG. 1, a camera 31 is provided which supplies video image data to a video capture component 33. The video capture component supplies video image data at a rate
of approximately 30 frames per second (FPS) to the video controller 35. The video controller 35 via the buffer regulator 34 schedules the compression of video image data from the video capture component 33 by supplying it to a video compressor 37. The
video compressor 37 then compresses the video image data under the control of the quantizer selector 36. The compressed video image data is then supplied by the video controller 35 to a transmitter 39 which stores this data in a transmit buffer 40.
Compressed video data is then pulled from the transmit buffer 40 and sent to a transmission medium 41 such as a Plain Old Telephone Service (POTS).
In an example of the present invention using POTS, the bandwidth of the phone lines is limited to approximately 25-28 kilobits per second (kbps). Due to limitations in the bandwidth of the transmission medium, video image data is most often
transmitted at less than the video capture rate of 30 FPS. Accordingly, not every frame of video image data generated by the video capture component 33 is compressed by the video compressor 37 (i.e., not every frame of video image data is scheduled to
be compressed by the video controller 35). Given the bandwidth limitations of the transmission medium 41, the system 30 attempts to operate at a target frame rate (i.e., the number of frames of compressed video image data sent over the transmission
medium each second). For this example, it is assumed that the target frame rate for the system 30 is 5 FPS. Given that there is 25-28 kbps bandwidth over the transmission medium 41, it is assumed that 20 kbps is available for video data generated by
the system 30. Accordingly, the video controller 35 has a target frame rate of 5 FPS and a bandwidth of 20 kbps, which leads to an average of 4 kbits of compressed video data for each frame sent over the transmission medium (i.e., 20 kilobits/sec ÷ 5
frames/sec = 4 kilobits/frame). Under the control of the quantizer selector 36, the video compressor 37 will attempt to compress video frame data from the video capture component 33 to a target frame size of 4 kbits. The compressed video data is then
sent by the video controller 35 to the transmitter 39 and placed in the transmit buffer 40. The quantity of data contained in the transmit buffer 40 (e.g., measured in bits or bytes) is supplied to the buffer regulator 34 of the video controller 35 as
the value, Bit Count (either periodically or in response to a query by the video controller 35) in this example. The buffer regulator 34 compares the Bit Count to a threshold value or "Low Water Mark" and schedules another frame of video data from the
video capture component 33 to be compressed by the video compressor 37 when Bit Count is less than the threshold value. The low water mark value is high enough so that by the time all of the remaining bits have been drained from the transmit buffer 40,
the video compressor 37 will have finished compressing the next frame and is ready to place it in the transmit buffer 40.
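The arithmetic and the scheduling test described above can be illustrated with a minimal sketch. The function names are hypothetical and chosen only for illustration, and 1 kbit is assumed to mean 1000 bits.

```python
def target_frame_size(bit_rate_bps: float, frame_rate_fps: float) -> float:
    """Average number of bits available for each compressed frame."""
    if frame_rate_fps <= 0:
        raise ValueError("frame rate must be positive")
    return bit_rate_bps / frame_rate_fps


def should_schedule_compression(bit_count: int, low_water_mark: int) -> bool:
    """A new frame is scheduled once Bit Count falls below the low water mark."""
    return bit_count < low_water_mark


# Example from the text: 20 kbps of video bandwidth at a 5 FPS target.
tfs_bits = target_frame_size(20_000, 5)
```

With these inputs the target frame size comes out to 4000 bits, matching the 4 kbits per frame given in the example.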
If the transmitter 39 is transmitting compressed video data from the transmit buffer 40 to the transmission medium 41 at the expected rate of 20 kbps (in this example), and the video compressor 37 is returning compressed video frames of 4 kbits,
then the target frame rate of 5 FPS will be achieved. However, the transmission bandwidth can vary (e.g., when there is no audio data being transmitted over the transmission medium 41, more bandwidth is available for video data) and the video compressor
will most likely generate compressed video frames that contain approximately, rather than exactly, 4 kbits. In other words, the frame rate will vary around the target frame rate. If the video controller 35 detects that the transmit buffer is being drained at a rate
different than originally specified (e.g., 20 kbps) over an extended period of time, then the video compression and transmission system 30 can recalculate a target frame size based on the new transmission bandwidth to maintain the same frame rate, and
can vary the low water mark to minimize latency.
The operation of the video compressor 37 will be described with reference to FIG. 3. When compressing raw video data of a frame, the video controller 35 (via the buffer regulator) calculates the target frame size (i.e., target number of bits per
frame) that corresponds to the target frame rate and the expected bandwidth. As in the example above, the target bits per frame is calculated by dividing the expected bandwidth by the target frames-per-second value. Accordingly, in this example, the
video controller 35 will determine that the target frame size is 4 kbits per frame. The video compressor 37 includes a coder/decoder (codec) 53 which employs a bit rate control algorithm on a per-frame basis to attempt to achieve the target frame size.
Compression algorithms such as H.263 require that raw video data be divided into a plurality of rows of macroblocks. Each macroblock is of the same size (e.g., in H.263, each macroblock has a size of 16 by 16 pixels).
The quantizer selector 36 includes a processor 50 that receives the target frame size from the buffer regulator 34 as well as command information (e.g., to schedule a frame compression). When the buffer regulator 34 schedules a frame to be
compressed, it is sent to the video compressor 37 and can be stored in an uncompressed data queue 54. The data in queue 54 is sent to the codec 53 which compresses the video frame data under the control of the processor 50. According to an embodiment of
the present invention, part of that control is the providing of a quantization parameter (newQP) to the codec 53.
The Quantization Parameter (QP) value in a hybrid Discrete Cosine Transform (DCT) compression algorithm can vary between a low value of 0 and a high value of 31 according to the H.263 specification. After the DCT is completed, the transformed
coefficients are quantized with the QP value. How many bits are needed to transmit the quantized coefficients for a macroblock depends on the value of the QP. When the QP value is small, a large number of bits are usually needed. When the QP value is
large, fewer bits are needed. Accordingly, the number of compressed bits generated for each macroblock varies inversely with the QP value.
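The inverse relationship between QP and bit usage can be illustrated with a simplified sketch. It assumes a plain uniform quantizer with a step size of 2*QP and uses the count of nonzero quantized levels as a rough proxy for bits; it is not the exact quantization rule of the H.263 specification, and the coefficient values are made up for illustration.

```python
def quantize(coefficients, qp):
    """Uniform quantizer with step 2*qp (simplified; not the exact H.263 rule)."""
    step = 2 * qp
    # int() truncates toward zero, so small coefficients collapse to 0.
    return [int(c / step) for c in coefficients]


def nonzero_levels(coefficients, qp):
    """Count of nonzero quantized levels, used here as a rough proxy for bits."""
    return sum(1 for level in quantize(coefficients, qp) if level != 0)


# Hypothetical transform coefficients for one block.
SAMPLE_COEFFS = [310, -142, 95, -60, 41, -27, 18, -11, 7, -4, 2, 1]

counts = {qp: nonzero_levels(SAMPLE_COEFFS, qp) for qp in (2, 10, 31)}
```

For this sample block, fewer levels survive quantization as QP rises, which is the behavior the paragraph above describes.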
In this compression algorithm, it is assumed that consecutive frames in a video sequence will have similarities (e.g., similar backgrounds) and will have similar bit usage distributions. A goal of the compression algorithm is to match the bit
usage distribution in the current frame with the previous frame. To avoid rapid fluctuations, and to handle scene changes, certain limitations will be applied. It is assumed that there are N macroblocks in a frame. Prior to encoding macroblock n of N,
the following calculations are performed to determine the value for newQP to be used when encoding the nth macroblock. A flow diagram showing an exemplary method of setting the value for newQP is shown in FIG. 4. First, the target frame size is
computed in the video controller 35 as discussed above:
target_frame_size=bit_rate/frame_rate Eq. 1
where target_frame_size would be 4 kbits, bit_rate is nominally 20 kbps, and frame_rate is 5 fps in this example. Also, the number of bits that should have been used up in encoding the first n-1 macroblocks is
calculated as follows (block 100):
bit_usage_target=target_frame_size*prev_bit_usage[n]/prev_bit_usage[N] Eq. 2
where bit_usage_target is a target value for the number of bits that have already been created for the current compressed frame (i.e., after encoding the first n-1 macroblocks), prev_bit_usage[n] is the cumulative number of bits used in the first
n-1 macroblocks of the previous compressed frame, and prev_bit_usage[N] is the total number of bits used in the previous compressed frame. The previous bit usage array can be stored in a memory 51 in the quantizer selector 36.
As suggested in Video Codec Test Model, TMN5 (ITU-T, Study Group 15, Working Party 15/1, Expert's Group on Very Low Bitrate Visual Telephony (Jan. 31, 1995), the disclosure of which is hereby incorporated by reference in its entirety), the value
for the first version of the QP value is selected according to Equations 3-6.
bit_usage_delta=bit_usage_until_now-bit_usage_target Eq. 3
(block 102) where bit_usage_until_now is the total number of bits that have been used to compress the current frame up to the current macroblock. This value can be supplied by a compressed data queue 52 coupled to the output of the codec 53.
(block 104). The quantization parameter for the nth macroblock has a value between 0 and 31 according to the H.263 specification. The mean value for QP for the previous picture (frame) can also be stored in memory 51 of the quantizer selector
36.
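The bookkeeping of blocks 100 and 102 can be sketched as follows. The proportional form used for the target is an assumption drawn from the definitions of prev_bit_usage[n] and prev_bit_usage[N] given above: the target frame size is scaled by the fraction of the previous frame's bits that had been spent before macroblock n. The function and parameter names are hypothetical.

```python
def bit_usage_target(target_frame_size, prev_bits_before_n, prev_frame_total):
    """Bits the current frame should have used before macroblock n.

    Assumed form: the target frame size scaled by the share of the previous
    frame's bits spent on its first n-1 macroblocks.
    """
    if prev_frame_total <= 0:
        raise ValueError("previous frame must have used at least one bit")
    return target_frame_size * prev_bits_before_n / prev_frame_total


def bit_usage_delta(bit_usage_until_now, target):
    """Eq. 3: positive when the current frame is running over its target."""
    return bit_usage_until_now - target


# Hypothetical numbers: the previous frame used 4800 bits in total, 1200 of
# them before macroblock n; the current frame has used 1150 bits so far.
target = bit_usage_target(4000, 1200, 4800)
delta = bit_usage_delta(1150, target)
```

A positive delta indicates that more bits have been spent than the previous frame's distribution would suggest, which is the signal used to raise the quantization parameter.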
According to an embodiment of the present invention, the quantization parameter is changed so as to control the size of the video frame data, and the amount by which the quantization parameter can fluctuate is limited to prevent degradations in quality. The quantization parameter can be controlled so that it is held between an upper and a lower limit for each row of macroblocks. For example, if the value for newQP calculated above is greater than a selected value, A (see decision block 107), then newQP
is set to A (block 108). Also, if newQP is less than a second selected value, B (see decision block 111), then newQP is set to B (block 112). The value for A can be selected based on the previous value for QP as shown in equations 7-10 (block 106).
The value for B can be selected based on equation 11 (block 110):
In this example, using equations 6-11, the quantization parameter is prevented from increasing more than four quantization values over one row of macroblocks and is prevented from decreasing more than two quantization values over a row of
macroblocks (as compared to the mean quantization parameter value for the previous row of macroblocks). Because a larger increase than decrease is allowed, the system reacts more quickly to increases in complexity in the video sequence. Thus, large fluctuations in the QP value, which would degrade quality, are reduced in
the video frame currently being compressed. By controlling the quantization parameter in such a manner, the system allocates bits more accurately to different parts of the video frame according to a past history of bit
allocation. As with other values described above, mean QP values for previous rows of macroblocks can be stored in memory 51 of the quantizer selector 36.
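The row-level limiting described above can be sketched as a clamp around the mean QP of the previous row. This is a minimal illustration that assumes the limits A and B reduce to "previous row mean plus four" and "previous row mean minus two", as the prose states; the exact expressions for A and B are not reproduced here, and the names are hypothetical.

```python
QP_MIN, QP_MAX = 0, 31  # QP range as quoted in the text


def clamp_to_row_window(new_qp, prev_row_mean_qp, max_increase=4, max_decrease=2):
    """Hold new_qp within [mean - max_decrease, mean + max_increase]."""
    upper = min(QP_MAX, prev_row_mean_qp + max_increase)  # plays the role of A
    lower = max(QP_MIN, prev_row_mean_qp - max_decrease)  # plays the role of B
    return max(lower, min(upper, new_qp))


# With a previous row mean of 12, requests of 20 and 5 are pulled back
# to the edges of the allowed window, while 13 passes through unchanged.
limited = [clamp_to_row_window(qp, 12) for qp in (20, 5, 13)]
```

The asymmetric window is what lets the quantizer coarsen faster than it refines.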
The value for newQP can be set to a selected value C (block 116), if newQP is less than C (see decision block 115), where
(block 114) where startQP is the QP for the first macroblock calculated as:
Limiting the value of newQP so that it never decreases below 2/3 of the average of all quantization parameters for the previous frame prevents an excessively large frame size (e.g., having a number of compressed bits far in excess of the target
frame size) and large fluctuations in the QP value from video frame to video frame which can also degrade quality.
It is essential that the bit count for the frame currently being compressed does not exceed the allocated buffer size for the codec 53 (as specified by the block transform based video compression algorithm specification, such as the H.263
specification). If it does, then there is a possibility that the codec 53 (storing buffer_size bits) will overflow resulting in lost video frame data. Accordingly, if bit_usage_until_now>(n/N)*D*buffer_size (see decision block 117), then newQP is
set to the selected value of E (block 118). In this example, E is newQP+4 and D is 0.75.
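The two safeguards described above can be sketched as follows. The floor is assumed to be exactly two thirds of the previous frame's mean QP, as the prose states, and the role of startQP is not modeled. The overflow guard uses the example values D = 0.75 and E = newQP+4. The function names are hypothetical, and any final clipping of the result to the valid QP range is left out.

```python
def apply_frame_floor(new_qp, prev_frame_mean_qp):
    """Keep new_qp from dropping below 2/3 of the previous frame's mean QP."""
    return max(new_qp, 2 * prev_frame_mean_qp / 3)


def apply_overflow_guard(new_qp, bit_usage_until_now, n, total_macroblocks,
                         buffer_size, d=0.75, step=4):
    """Raise QP by `step` when bit usage outruns the codec buffer budget.

    The budget at macroblock n is (n / N) * D * buffer_size.
    """
    budget = (n / total_macroblocks) * d * buffer_size
    if bit_usage_until_now > budget:
        return new_qp + step
    return new_qp


# Halfway through a 100-macroblock frame with a 10000-bit buffer the budget
# is 3750 bits, so 5000 bits used so far triggers the guard.
guarded = apply_overflow_guard(10, 5000, 50, 100, 10_000)
```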
The target frame size in kbits can be changed by the buffer regulator 34 to compensate for the ability of the video system to fill the bandwidth of the transmission medium 41. Prior to sending a frame of compressed data to the transmitter 39 (or
after a number of compressed frames are sent), the Bit Count value can be checked from the transmitter 39. At this time, the target_frame_size ("TFS") is multiplied by a TFS adjustment variable (TFSadj) which can have an initial value of 1.0. If the
Bit Count value has gone to zero at the time it is checked, the frames that are being sent may be too small. This is checked by keeping track of whether a compression is skipped because the video compressor 37 is busy at the time. This essentially
means that if a compression is skipped because the video compressor 37 is not finished compressing the previous frame, and the subsequently compressed frame is sent on a Bit Count of 0, the system 30 is not able to compress video frame data fast enough.
Therefore, the target_frame_size variable should be increased by raising the TFSadj variable by an increment (TFSadjInc) (e.g., 0.01). So that target frame size does not become excessive, a maximum for TFSadj can be set to 2.0. Another reason for
changing the target_frame_size value is when every video frame being captured by the video capture component is being compressed, but the Bit Count value is 0 before the next frame of compressed video data is sent to the transmitter 39. This may be an
indication that the capture rate may be too slow for the target frame size. To correct this, the target_frame_size variable is decremented by decreasing the value for TFSadj by the value TFSadjDec (e.g., 0.03). The target frame size adjustment value
can have a minimum value of 1.0.
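The adjustment of TFSadj can be sketched as below, using the example increment, decrement, and limits given above. The two boolean inputs and the order in which the two cases are tested are assumptions made for illustration; the text does not say which case takes precedence if both hold.

```python
TFS_ADJ_INC = 0.01
TFS_ADJ_DEC = 0.03
TFS_ADJ_MIN = 1.0
TFS_ADJ_MAX = 2.0


def update_tfs_adj(tfs_adj, bit_count, skipped_while_busy, compressing_every_frame):
    """Return the new TFSadj after one check of the transmit buffer."""
    if bit_count != 0:
        return tfs_adj  # buffer still holds data: leave the target alone
    if skipped_while_busy:
        # Compressor cannot keep up: make each frame larger.
        return min(TFS_ADJ_MAX, tfs_adj + TFS_ADJ_INC)
    if compressing_every_frame:
        # Capture rate is the bottleneck: make each frame smaller.
        return max(TFS_ADJ_MIN, tfs_adj - TFS_ADJ_DEC)
    return tfs_adj


def adjusted_target_frame_size(target_frame_size, tfs_adj):
    """The target frame size is multiplied by the adjustment variable."""
    return target_frame_size * tfs_adj
```

Because the minimum is 1.0, the decrement only has an effect after earlier increments have pushed TFSadj above its initial value.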
Whether the compressor 37 is to compress a video frame from the video capture component 33 can be based on a comparison between the Bit Count value from the transmitter 39 and the threshold value or "low water mark." The low water mark is just
high enough so that by the time all remaining bits have been drained from the transmit buffer 40, the video compressor 37 will have finished compressing the next frame and is ready to place it in the transmit buffer 40. If the low water mark is too low,
the buffer will be empty for some period of time before the next frame is finished compressing, and thus the bandwidth over the transmission medium 41 will be wasted. On the other hand, if it is too high, then the next frame of video will have sat in
the transmit buffer 40 longer than necessary and thus latency will be added to the system 30.
To compute the threshold value (Low Water Mark or LWM), two values are used in this embodiment of the present invention. The first is the amount of time left before the video compressor 37 would be able to compress a raw video frame from the
video capture component 33 if it did not compress the one currently available from the component 33 (i.e., the next send time (NST)). Since the amount of time for compressing an image varies, an estimate can be used that is 20% greater than the average
of the last y compressions. Accordingly, for a compress time (CT_i) for the ith frame, the current compress time (CCT) at the ith frame would be:
CCT_i=(CT_(i-y+1)+ . . . +CT_i)/y Eq. 14
If VCI is the video capture interval for the video capture component 33 (e.g., 1/29.97 fps or 33.37 ms), the estimate for the next send time (NST) is computed as:
NST=VCI+1.2*CCT Eq. 15
It can be seen from the above, that the rate at which video frames are captured by the video capture component 33 has an effect on latency in the system 30. Thus, the faster that video frames are captured, the less time the video compressor 37
will be waiting for a raw video frame data from the video capture component 33.
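The NST estimate can be sketched with a sliding window of recent compress times. The class name and window handling are assumptions for illustration; the text only states that CCT is an average over the last y compressions and that Eq. 15 applies a 20% margin.

```python
from collections import deque


class CompressTimeEstimator:
    """Tracks the average of the last `window` compress times (CCT)."""

    def __init__(self, window: int):
        if window <= 0:
            raise ValueError("window must be positive")
        self._times = deque(maxlen=window)  # old samples fall off automatically

    def record(self, compress_time_ms: float) -> None:
        self._times.append(compress_time_ms)

    def current_compress_time(self) -> float:
        if not self._times:
            raise ValueError("no compress times recorded yet")
        return sum(self._times) / len(self._times)


def next_send_time(vci_ms: float, cct_ms: float, margin: float = 1.2) -> float:
    """Eq. 15: NST = VCI + 1.2 * CCT."""
    return vci_ms + margin * cct_ms


estimator = CompressTimeEstimator(window=3)
for t in (40.0, 50.0, 60.0, 70.0):  # hypothetical compress times in ms
    estimator.record(t)
nst_ms = next_send_time(33.37, estimator.current_compress_time())
```

With a window of three, only the 50, 60, and 70 ms samples contribute, so a shorter capture interval or faster compression both lower the estimate.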
The second value is the current bit rate (CBR) which is the rate at which bits of compressed data are being read from the transmit buffer 40. The CBR after the ith frame is computed as an average of the last z frames sent as:
where L_i is the length in bits of the ith compressed frame of video data and T_i is a time stamp (in elapsed milliseconds) associated with frame i. Given these two values, the selection of a value for the low water mark threshold is:
As system parameters change, such as transmission medium bandwidth and target frame rate, the LWM threshold may be changed to compensate. The LWM threshold can be adjusted after each frame (or after a set number of frames) by multiplying
it by an LWM adjustment (LWMadj), which would have an initial value of 1. If the Bit Count from the transmit buffer 40 has gone to 0 before the next compressed frame is sent to the transmit buffer 40, then the LWMadj variable is modified by adding an
incremental value LWMadjInc (e.g., 0.01). If the Bit Count from the transmit buffer 40 has not reached 0 when the next compressed frame is sent to the transmit buffer 40, then the LWMadj variable is modified by subtracting a decremental value LWMadjDec
(e.g., 0.001). By setting LWMadjInc to a value much greater than LWMadjDec, the system is able to raise the LWM quickly and so keep the Bit Count value from reaching 0 often. To minimize latency, LWM should be as low as possible. Using the
LWMadj value as modified by LWMadjInc and LWMadjDec, the effect is that LWM will reach a state of equilibrium (i.e., going up and down in value by the same amount each period) and remain in equilibrium unless a change in the system shifts the LWM to
another point (e.g., where video content begins to change extensively from frame to frame). To prevent the LWM from becoming too high, a maximum can be set for LWMadj (e.g., 2.0).
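The LWMadj update can be sketched as below, using the example values given above. The lower bound of 0.0 is an assumption added so the multiplier cannot go negative; the text specifies only a maximum.

```python
LWM_ADJ_INC = 0.01
LWM_ADJ_DEC = 0.001
LWM_ADJ_MAX = 2.0
LWM_ADJ_MIN = 0.0  # assumed floor; not stated in the text


def update_lwm_adj(lwm_adj: float, buffer_ran_dry: bool) -> float:
    """Raise LWMadj quickly when the buffer emptied, lower it slowly otherwise."""
    if buffer_ran_dry:
        return min(LWM_ADJ_MAX, lwm_adj + LWM_ADJ_INC)
    return max(LWM_ADJ_MIN, lwm_adj - LWM_ADJ_DEC)


def adjusted_low_water_mark(base_lwm_bits: float, lwm_adj: float) -> float:
    """The threshold is the base LWM multiplied by the adjustment."""
    return base_lwm_bits * lwm_adj
```

With these example values, one increment offsets ten decrements, so under this sketch the multiplier settles where the buffer runs dry on roughly one frame in eleven.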
The buffer regulator 34 can also be used in controlling the generation of so-called PB-frames using the low water mark threshold. As detailed in the H.263 specification, a PB frame includes one P-frame which is predicted from the previously
decoded P-frame and one B-frame which is predicted both from the previous decoded P-frame and the P-frame currently being decoded (thus, bidirectionally). If a B-frame is expected at the current compression time, that compression can be scheduled even
if the Bit Count value is above the LWM threshold. This is because, even if the B-frame is compressed, it is not sent immediately (i.e., like a P-frame) since the encoder holds on to it and returns it with the appropriate P-frame as a PB-frame.
The operation of the buffer regulator 34 as it relates to the low water mark threshold and the target frame size is shown in FIG. 5 in flow diagram form. In block 61, the low water mark threshold is computed and adjusted as described above. If
a B-frame is expected in decision block 63 control passes to block 67 where the next frame of video data is queued for compression (e.g., in queue 54 of FIG. 3). If a B-frame is not expected, then the Bit Count value is checked against the low water
mark threshold (decision block 63) and the compression is skipped if the Bit Count value is too high (block 65). Otherwise, if the video compressor 37 is not busy (decision block 66), then the video frame data is queued for compression (block 67). In
block 71, the target_frame_size variable is adjusted as described above and the video frame data is compressed by codec 53 (block 73). In block 75 the CCT and CBR value are updated and the Bit Count value is once again checked in decision block 77. If
the Bit Count value is 0 then the LWMadj and TFSadj values are decreased (block 79), and if the Bit Count value is not 0, the same variables are increased (block 78). In block 80, the compressed frame data is sent to the transmit buffer 40 via the video
controller 35 and control returns to block 61 to handle the next frame of video data.
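The scheduling decision at the top of the FIG. 5 flow can be sketched as a single function. This is a minimal reading of the prose above, with hypothetical names and string return values; treating a busy compressor as a skipped compression follows the earlier discussion of the target frame size.

```python
def schedule_decision(b_frame_expected: bool, bit_count: int,
                      low_water_mark: int, compressor_busy: bool) -> str:
    """Return "queue" or "skip" for the frame currently available."""
    if b_frame_expected:
        # A B-frame is held by the encoder and returned later with its
        # P-frame, so it is queued regardless of the buffer level.
        return "queue"
    if bit_count >= low_water_mark:
        return "skip"  # buffer still holds enough data
    if compressor_busy:
        return "skip"  # previous frame is still being compressed
    return "queue"


decisions = [
    schedule_decision(True, 9000, 2000, False),
    schedule_decision(False, 9000, 2000, False),
    schedule_decision(False, 1000, 2000, True),
    schedule_decision(False, 1000, 2000, False),
]
```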
With the video compression/transmission system of the present invention, an improved quality of video data is achieved in that the system adapts quickly to changing bandwidth at the transmission medium 41. If the video controller 35 detects that
the transmit buffer is being drained at a rate that is different than originally specified over an extended period of time, then it can recalculate the target frame size based on the new transmission bandwidth to maintain the same frame rate, and can
vary the "low water mark" threshold to minimize latency.
The present invention has been described with respect to a POTS transmission medium; however, the embodiments of the bit rate controller described above can be used at a broad range of data rates. Although several embodiments are specifically
illustrated and described herein, it will be appreciated that modifications and variations of the present invention are covered by the above teachings and within the purview of the appended claims without departing from the spirit and intended scope of
the invention.