Claims
What is claimed is:
1. A method for determining whether portions of an image represented by
image data obtained by scanning said image correspond to text or
continuous tone, said method comprising the steps of:
partitioning said image data into contiguous image data blocks;
determining an average grayscale value for each of said image data blocks;
determining at least one additional characteristic for each of said image
data blocks;
utilizing each possible average grayscale value as an indexing variable;
generating a one dimensional Gaussian distribution to distribute said at
least one additional characteristic for each of said average grayscale
values for text data;
determining the mean and standard deviation for each text data
distribution;
calculating the Mahalanobis distance for each additional characteristic
data value for each image data block to determine a measure of the
distance from the mean for that characteristic of the corresponding data
block and, hence, a representative text probability of whether said data
block corresponds to a text portion of said image; and
determining whether said data blocks correspond to text or continuous tone
portions of said image based on said representative text probabilities.
2. A method as claimed in claim 1 further comprising the steps of:
generating a one dimensional Gaussian distribution to distribute said at
least one additional characteristic for each of said average grayscale
values for continuous tone data;
determining the mean and standard deviation for each continuous tone data
distribution; and
calculating the Mahalanobis distance for each additional characteristic
data value for each image data block to determine a measure of the
distance from the mean for that characteristic of the corresponding data
block and, hence, a representative continuous tone probability of whether
said data block corresponds to a continuous tone portion of said image
wherein the step of determining whether said data blocks correspond to
text or continuous tone portions of said image is based on said
representative text and continuous tone probabilities.
3. A method as claimed in claim 2 wherein the step of determining whether
said data blocks correspond to text or continuous tone portions of said
image based on said representative text and continuous tone probabilities
comprises normalizing the text probability by dividing it by the sum of
the two probabilities and using the normalized probability to make the
determination.
4. A method as claimed in claim 3 further comprising the step of applying
said normalized text probability to a first decision filter to update said
normalized text probability based on the normalized text probabilities of
the surrounding contiguous data blocks prior to determining whether said
data blocks correspond to text or continuous tone portions of said image.
5. A method as claimed in claim 4 further comprising the step of applying
the output signals from said first decision filter to a second decision
filter to finally determine whether said data blocks correspond to text or
continuous tone portions of said image, said second decision filter
evaluating the text/continuous tone decisions for 45 data blocks
comprising a cross-shaped outline of data blocks including a centered
three-by-three matrix of data blocks and the immediately adjacent
three-by-three matrices of data blocks to the top, bottom and sides
thereof, the preceding text/continuous tone decisions being summed and
compared to a predetermined threshold to finally determine whether the
center data block of the centered three-by-three matrix of data blocks
corresponds to a text portion or a continuous tone portion of said image.
6. A method as claimed in claim 3 wherein said at least one additional
characteristic comprises a block variance.
7. A method as claimed in claim 6 further comprising the step of
determining a block edge count for each of said image data blocks as a
second additional characteristic.
8. A method as claimed in claim 7 further comprising the step of
determining a block agreement count for each of said image data blocks as
a third additional characteristic.
9. A method as claimed in claim 8 further comprising the step of
determining a block text average for each of said image data blocks as a
fourth additional characteristic.
10. A method for duplicating a document wherein grayscale image data is
separated into text data and continuous tone data such that it can be
suitably processed for use in a bilevel printing device, said method
comprising the steps of:
scanning a document to generate grayscale image data representative of
individual pels of said document;
partitioning said image data into contiguous image data blocks;
determining an average grayscale value for each of said image data blocks;
determining at least one additional characteristic for each of said image
data blocks;
utilizing each possible average grayscale value as an indexing variable;
generating a one dimensional Gaussian distribution to distribute said at
least one additional characteristic for each of said average grayscale
values for text data;
determining the mean and standard deviation for each text data
distribution;
calculating the Mahalanobis distance for each additional characteristic
data value for each image data block to determine a measure of the
distance from the mean for that characteristic of the corresponding data
block and, hence, a representative text probability of whether said data
block corresponds to a text portion of said image;
determining whether said data blocks correspond to text image data or
continuous tone portions of said image based on said representative text
probabilities;
processing said grayscale image data as either text image data or
continuous tone image data to generate binary image data; and
printing a duplicate of said document based on said binary image data.
11. A method as claimed in claim 10 further comprising the steps of:
generating a one dimensional Gaussian distribution to distribute said at
least one additional characteristic for each of said average grayscale
values for continuous tone data;
determining the mean and standard deviation for each continuous tone data
distribution; and
calculating the Mahalanobis distance for each additional characteristic
data value for each image data block to determine a measure of the
distance from the mean for that characteristic of the corresponding data
block and, hence, a representative continuous tone probability of whether
said data block corresponds to a continuous tone portion of said image
wherein the step of determining whether said data blocks correspond to
text or continuous tone portions of said image is based on said
representative text and continuous tone probabilities.
12. A method as claimed in claim 11 wherein the step of determining whether
said data blocks correspond to text or continuous tone portions of said
image based on said representative text and continuous tone probabilities
comprises normalizing the text probability by dividing it by the sum of
the two probabilities and using the normalized probability to make the
determination.
13. A method as claimed in claim 12 further comprising the step of applying
said normalized text probability to a first decision filter to update said
normalized text probability based on the normalized text probabilities of
the surrounding contiguous data blocks prior to determining whether said
data blocks correspond to text or continuous tone portions of said image.
14. A method as claimed in claim 13 further comprising the step of applying
the output signals from said first decision filter to a second decision
filter to finally determine whether said data blocks correspond to text or
continuous tone portions of said image, said second decision filter
evaluating the text/continuous tone decisions for 45 data blocks
comprising a cross-shaped outline of data blocks including a centered
three-by-three matrix of data blocks and the immediately adjacent
three-by-three matrices of data blocks to the top, bottom and sides
thereof, the preceding text/continuous tone decisions being summed and
compared to a predetermined threshold to finally determine whether the
center data block of the centered three-by-three matrix of data blocks
corresponds to a text portion or a continuous tone portion of said image.
15. A method as claimed in claim 12 wherein said at least one additional
characteristic comprises a block variance.
16. A method as claimed in claim 15 further comprising the step of
determining a block edge count for each of said image data blocks as a
second additional characteristic.
17. A method as claimed in claim 16 further comprising the step of
determining a block agreement count for each of said image data blocks as
a third additional characteristic.
18. A method as claimed in claim 17 further comprising the step of
determining a block text average for each of said image data blocks as a
fourth additional characteristic.
19. Apparatus for determining whether portions of an image represented by
image data obtained by scanning said image correspond to text or
continuous tone, said apparatus comprising:
first circuit means for partitioning said image data into contiguous image
data blocks;
second circuit means for determining an average grayscale value for each of
said image data blocks in response to the image data comprising said image
data blocks;
third circuit means for determining at least one additional characteristic
for each of said image data blocks in response to image data comprising
said image data blocks;
probability generating means responsive to said average grayscale value and
said at least one additional characteristic for generating a normalized
text probability based on the Mahalanobis distance of each additional
characteristic data value distributed as a one dimensional Gaussian
distribution for each of said average grayscale values; and
fourth circuit means for determining whether said data blocks correspond to
text or continuous tone portions of said image based on said normalized
text probabilities.
20. Apparatus as claimed in claim 19 wherein said fourth circuit means
comprises first decision filter circuit means for updating said normalized
text probabilities based on the normalized text probabilities of the
surrounding contiguous data blocks and second decision filter circuit
means for finally determining whether said data blocks correspond to text
or continuous tone portions of said image by processing the output signals
from said first decision filter for 45 data blocks comprising a
cross-shaped outline of data blocks including a centered three-by-three
matrix of data blocks and the immediately adjacent three-by-three matrices
of data blocks to the top, bottom and sides thereof, the preceding
text/continuous tone decisions from said first decision filter being
summed and compared to a predetermined threshold to finally determine
whether the center data block of the centered three-by-three matrix of
data blocks corresponds to a text portion or a continuous tone portion of
said image.
Description
BACKGROUND OF THE INVENTION
This invention relates generally to processing image data obtained by
scanning a picture, document or other image and, more particularly, to
determining whether portions of the image represented by the image data
correspond to text or continuous tone such that the image data may be
appropriately processed for storage, duplication or display of the image.
Image data is obtained by scanning an image with a scanner comprising, for
example, a plurality of charge coupled devices (CCD's). The scanner
effectively divides the image into a finite number of small picture
elements which are referred to as pixels or pels. Each resulting pel of
the image is converted into a number representative of the grayscale value
of the pel as detected by one of the scanner CCD's.
The image data is often applied to a bilevel device, i.e., a device which
reproduces each pel as one of two grayscale levels, typically black or
white, for duplication or display of the image. For application in a
bilevel device, the image data representative of each pel is processed
into a binary number signifying whether the pel is to be black or white,
e.g., a print or no print picture element, respectively, for an image
duplication system. The determination of which pels of image data are to
be printed or left not printed for a bilevel image duplication system
depends to some extent on the characteristics of the image to be
duplicated.
For example, portions of the image may be broadly classified as comprising
either continuous tone or text. Photographs and certain half tone images
are examples of continuous tone while text is exemplified by line drawings
and letter images. Different techniques are normally applied to process
the two different types of image data.
For text image data, the conversion from grayscale values to binary values
is often accomplished by establishing a threshold to which the grayscale
value of each pel is compared. The result of the comparison is that if the
grayscale value exceeds the threshold, a black or print representation is
selected for the pel and, conversely, if the threshold is not exceeded,
the pel is left white or not printed.
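The thresholding rule described above can be illustrated with a minimal sketch. The function name `threshold_text_pels` and the example threshold of 7 are assumptions made for illustration; the convention that higher grayscale values are darker follows the 0 (white) to 15 (black) scale used later in this description.

```python
def threshold_text_pels(gray, threshold):
    """Convert grayscale pels to binary print decisions.

    A pel whose grayscale value exceeds the threshold becomes 1 (black, print);
    otherwise it becomes 0 (white, no print). The threshold is supplied by the
    caller; no particular value is prescribed by the text.
    """
    return [[1 if value > threshold else 0 for value in row] for row in gray]


# Hypothetical 2 x 3 patch of 4-bit grayscale values and an assumed threshold of 7.
patch = [[0, 8, 15],
         [7, 9, 3]]
binary = threshold_text_pels(patch, 7)
```

Note that a pel exactly equal to the threshold is left white, since the comparison is "exceeds" rather than "equals or exceeds".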
For continuous tone image data, alternate techniques are applied. For
example, a variety of pel block patterns may be selected to represent
blocks of the continuous tone image data dependent upon the composite
grayscale value of the blocks. For continuous tone image data, pel block
patterns may be shifted, rotated or otherwise varied to prevent the
appearance of interference patterns such as Moire patterns.
One known prior art technique for distinguishing between text and
continuous tone image data employs an electronic filter. The filter is
applied to a block of pel data surrounding a particular pel to be
evaluated. The difference between the grayscale value of the pel being
evaluated and the average filtered grayscale value for the block of pels
is determined. If the absolute value of the difference is above a preset
threshold, text image data is presumed; and, if the absolute value of the
difference is below or equal to the preset threshold, continuous tone
image data is presumed. This technique is more fully disclosed in U.S.
Pat. No. 4,194,221.
While the known prior art arrangement provides a varying level of
effectiveness dependent upon the selection of coefficients for use in the
electronic filter, improved techniques for distinguishing between text and
continuous tone image data are always needed to advance the art of image
processing and provide effective and inexpensive alternatives.
SUMMARY OF THE INVENTION
In accordance with the present invention, image data is partitioned into
contiguous image data blocks which are processed to determine whether the
portions of the image represented by the data blocks correspond to text or
continuous tone portions of the image. An average grayscale value is
determined for each of the image data blocks with the average grayscale
value being used as an indexing variable for at least one additional
characteristic which is determined for each of the image data blocks. A
one dimensional Gaussian distribution is generated to distribute the data
values of the one or more additional characteristics for each possible
grayscale value of an image data block. The mean and standard deviation
are determined for each data distribution and the Mahalanobis distance for
each additional characteristic data value is determined. The Mahalanobis
distance effectively measures the distance of the data value from the mean
of the distribution for the characteristic of the corresponding data block
and, hence, is a representative probability for whether the image data
block corresponds to a text portion of the image or a continuous tone
portion of the image.
Preferably, Gaussian distributions are generated for both text data and
continuous tone data such that probabilities for both text and continuous
tone can be estimated independently for each image data block. The text
probability is then normalized by dividing it by the sum of the two
probabilities. The normalized text probability is then utilized to
determine whether the data blocks correspond to text or continuous tone
portions of the image. Among the additional characteristics which may be
determined for each image data block are the following which are listed in
order of preference: an image data block variance, an image data block
edge count, an image data block agreement count, and an image data block
text average.
It is, therefore, an object of the present invention to provide an improved
method and apparatus for determining whether the image data correspond to
text or continuous tone portions of an image represented by the data.
It is another object of the present invention to provide an improved method
and apparatus for processing image data to determine whether the data
correspond to text or continuous tone portions of the image wherein the
image data is partitioned into contiguous image data blocks with the
average grayscale value being determined for each of the data blocks and
used as an indexing variable for the formation of a one dimensional
Gaussian distribution for at least one additional characteristic of the
data block with the mean and standard deviation being determined for each
data distribution and the Mahalanobis distance for each additional
characteristic data value being determined as a representative probability
for whether the data block corresponds to a text portion of the image or a
continuous tone portion of the image.
Other objects and advantages of the invention will be apparent from the
following description, the accompanying drawings and the appended claims.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a graphical representation of a normal or Gaussian distribution.
FIG. 2 is a graphical representation of the Mahalanobis distance associated
with a Gaussian distribution.
FIG. 3A is a block diagram of an implementation of a low-pass image data
filter.
FIG. 3B is a matrix equation for the low-pass filter of FIG. 3A.
FIG. 4A is a block diagram of an implementation for determining a block
average characteristic.
FIG. 4B shows the elements of a four pel by four pel image data block.
FIG. 5 is a block diagram of an implementation for determining a block
variance characteristic.
FIG. 6 is a block diagram of an implementation for determining block edge
indications.
FIG. 7 shows an implementation for determining text average and agreement
count characteristics.
FIG. 8 is a block diagram of an implementation for accumulating the block
text average, block agreement counts and block edge counts.
FIGS. 9 and 10 are block diagrams of two implementations for generating a
normalized text probability.
FIG. 11 is a block diagram of an implementation of a first decision filter.
FIG. 12 is a 100 element matrix showing the cross-shaped formation of
blocks processed by a second decision filter (see drawing sheet 5).
FIGS. 13A and 13B together form a block diagram for an implementation of
the second decision filter.
FIG. 14 is a block diagram of a document duplicator system.
DETAILED DESCRIPTION OF THE INVENTION
In accordance with the present invention, image data obtained by scanning a
picture, document or other image are processed to determine selected
characteristics of the image data. Based on the selected characteristics,
a decision is made as to whether the image data correspond to text
portions of the image or continuous tone portions of the image.
For processing, the image data are subdivided into contiguous image data
blocks each of which preferably comprises four pels by four pels. Each of
the individual pels of an image data block is represented by its
corresponding grayscale value which is the raw data obtained from the
scanner.
The grayscale values are processed, as will be described hereinafter, to
obtain the following characteristics of the individual image data blocks:
a block average which is essentially the average grayscale value of the
pels represented by an image data block; a block variance which is
essentially the statistical variance of the grayscale values of the pels
represented by an image data block; a block edge count which is a
determination of the number of pels which are associated with an edge of a
figure in the portion of the image represented by an image data block; a
block text average which is the summation of the number of pels in an
image data block having a grayscale value exceeding a text threshold; and,
a block agreement count which is formed by initially comparing the
grayscale values of the pels of an image data block to an alternate text
threshold and counting the number of pels for which the determination was
the same as in the text average determination. These image data block
characteristics will be more fully described hereinafter.
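Several of the characteristics just listed can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `block_characteristics`, the two threshold values (8 and 6), and the use of the population variance are hypothetical choices, and the block edge count is omitted because its derivation is described elsewhere.

```python
def block_characteristics(block, text_threshold=8, alt_threshold=6):
    """Return (average, variance, text_average, agreement_count) for one block.

    block          -- 4 x 4 list of grayscale values (0 = white .. 15 = black)
    text_threshold -- assumed text threshold (hypothetical value)
    alt_threshold  -- assumed alternate text threshold (hypothetical value)
    """
    pels = [value for row in block for value in row]
    n = len(pels)
    average = sum(pels) / n
    # Population variance; the text only says "essentially the statistical variance".
    variance = sum((value - average) ** 2 for value in pels) / n
    text_bits = [value > text_threshold for value in pels]
    alt_bits = [value > alt_threshold for value in pels]
    # Block text average: number of pels exceeding the text threshold.
    text_average = sum(text_bits)
    # Block agreement count: pels for which both thresholds gave the same answer.
    agreement_count = sum(t == a for t, a in zip(text_bits, alt_bits))
    return average, variance, text_average, agreement_count


# A block split evenly between white and black pels, as at a sharp text edge.
edge_block = [[0, 0, 15, 15]] * 4
characteristics = block_characteristics(edge_block)
```

A high-contrast block such as `edge_block` gives a large variance and full agreement between the two thresholds, whereas a uniform mid-gray block gives zero variance and may give no agreement at all.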
Applicants have determined that the block average, i.e., the average
grayscale value of an image data block, can be used as an independent or
indexing variable for the remaining block characteristics: the block
variance, the block edge count, the block text average and the block
agreement count. With the block average used as an independent or indexing
variable, the remaining characteristics are distributed approximately as a
normal or Gaussian distribution as shown in FIG. 1 and defined by the
equation:
$$f(X) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(X-\eta)^{2}}{2\sigma^{2}}}$$
where η is the mean of the distribution and σ is the standard
deviation.
The Mahalanobis distance (MHD) as defined by the equation:
$$\mathrm{MHD} = \frac{|X - \eta|}{\sigma}$$
and shown in FIG. 2, may be utilized as a scaled measure of the distance
of a given data value X from the mean η of a distribution. The smaller
the Mahalanobis distance, the more likely the given data value X
corresponds to the data type represented by the distribution. Hence, the
Mahalanobis distance of a characteristic data value for a given data block
is a representative measure of whether the given data block is a text
portion or a continuous tone portion of the corresponding image. Whether
the Mahalanobis distance represents a text probability or a continuous
tone probability is dependent upon the distribution being applied, i.e., a
text distribution gives text probabilities and a continuous tone
distribution gives continuous tone probabilities.
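The relationship between the Gaussian distribution and the Mahalanobis distance can be sketched as follows. The sketch assumes the usual one-dimensional form of the distance, |X − η|/σ; the function names and the example mean and standard deviation are hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a one dimensional Gaussian with the given mean and std."""
    return math.exp(-((x - mean) ** 2) / (2 * std ** 2)) / (std * math.sqrt(2 * math.pi))


def mahalanobis_1d(x, mean, std):
    """Distance of x from the mean, scaled by the standard deviation.

    Assumes the standard one dimensional form |x - mean| / std.
    """
    return abs(x - mean) / std


# Hypothetical text distribution of block variance for one block average value.
text_mean, text_std = 3.0, 2.0
near = mahalanobis_1d(3.5, text_mean, text_std)   # close to the mean
far = mahalanobis_1d(9.0, text_mean, text_std)    # far from the mean
```

A data value with the smaller distance (`near`) also has the larger Gaussian density, which is why the distance can stand in for a probability of belonging to the distribution.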
The values of the mean η and the standard deviation σ are
determined for each of the data distributions for both text and continuous
tone data for each block average value. Thus, given a block average and at
least one additional characteristic for an image data block, both a text
probability and a continuous tone probability can be determined.
A convenient way to implement the present invention is to form tables for
each of the characteristics with the tables being entered by means of the
block average and a second block characteristic data value. For example,
the following tables can be formed for both text and continuous tone:
average/variance; average/edge count; average/text average; and,
average/agreement count.
In one embodiment of the present invention, the Mahalanobis distance tables
for both text and continuous tone data are stored in programmable
read-only memories (PROM's). When a programmable read-only memory (PROM)
is addressed by a corresponding block average and characteristic data
value, the probability that the combination of the two characteristics
will result from text or continuous tone image data (depending upon the
portion of the PROM addressed) is generated at the output of the PROM.
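The table arrangement can be approximated in software as a two-level lookup. The sketch below is an assumption-laden illustration, not the PROM contents of the working embodiment: it fits a mean and standard deviation per block average from hypothetical labelled samples of one data type, then precomputes a distance for every (block average, characteristic value) address. The names `fit_stats` and `build_table` are invented here, and the table stores the distance |X − η|/σ rather than a probability, since the mapping from distance to probability is not specified in this passage.

```python
from collections import defaultdict
from statistics import mean, pstdev


def fit_stats(samples):
    """Group (block_average, value) pairs by block average and fit mean/std.

    samples -- labelled training data for ONE data type (text or continuous tone).
    """
    groups = defaultdict(list)
    for block_average, value in samples:
        groups[block_average].append(value)
    return {avg: (mean(vals), pstdev(vals)) for avg, vals in groups.items()}


def build_table(stats, n_values):
    """Precompute the distance for every characteristic value 0..n_values-1.

    The result is addressed as table[block_average][value], mirroring a PROM
    addressed by the block average and a second characteristic data value.
    Block averages whose fitted std is zero are skipped to avoid division by zero.
    """
    return {
        avg: [abs(value - m) / s for value in range(n_values)]
        for avg, (m, s) in stats.items()
        if s > 0
    }


# Hypothetical training samples: (block average, block variance) pairs for text.
text_samples = [(3, 2), (3, 4), (3, 6), (5, 10), (5, 14)]
text_table = build_table(fit_stats(text_samples), 17)
distance = text_table[5][8]   # block average 5, variance value 8
```

A second table built the same way from continuous tone samples would give the continuous tone distance for the same address.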
For convenience, the Mahalanobis distances are stored in logarithmic form.
Hence, the logarithmic Mahalanobis distances or probabilities for each
distributed characteristic which is utilized can then be added together
and converted, for example, by means of an antilogarithmic look-up table,
to form representative probabilities. The resulting probabilities are then
analyzed to determine whether the image data block being evaluated
corresponds to a text portion of the scanned image or a continuous tone
portion of the scanned image. Since the probabilities for text and
continuous tone are independently calculated, the sum of the two need not
be equal to one. Hence, the more probable of the two may have a
probability of less than 0.50. Likewise, the less probable of the two may
have a probability of more than 0.50.
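Adding logarithmic values and then taking the antilogarithm is equivalent to multiplying the individual probabilities. A minimal sketch follows, assuming natural logarithms and hypothetical per-characteristic probabilities; the base of the logarithm and the function name `combine_log_probabilities` are not taken from the text.

```python
import math


def combine_log_probabilities(log_probabilities):
    """Sum per-characteristic log probabilities and convert back (antilog)."""
    return math.exp(sum(log_probabilities))


# Hypothetical per-characteristic probabilities for one image data block.
text_logs = [math.log(0.5), math.log(0.4)]   # e.g. variance and edge count, text tables
tone_logs = [math.log(0.3), math.log(0.2)]   # same characteristics, continuous tone tables

p_text = combine_log_probabilities(text_logs)
p_tone = combine_log_probabilities(tone_logs)
```

In this example the two results are approximately 0.20 and 0.06. They do not sum to one, and the more probable of the two is itself below 0.50, which is the situation described above.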
Preferably, a normalized text probability is formed by dividing the text
probability by the sum of the text probability and the continuous tone
probability. The normalized text probability thus determined for an image
data block and the normalized text probabilities of each of the blocks
contiguous to it are then applied to a first decision filter which updates
the normalized text probability to increase it or decrease it based on the
probabilities of the surrounding blocks. Thus, if all the surrounding
blocks are text, text will be highly favored for the block, whereas if all
the surrounding blocks are continuous tone, continuous tone will be highly
favored for the block. Based on the first decision filter, blocks
identified as text are given a value of "one" and blocks identified as
continuous tone are given a value of "zero".
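The normalization step follows directly from the description. The neighborhood update does not: this passage does not give the exact rule by which the first decision filter adjusts a block's probability. The sketch below therefore uses a purely hypothetical rule (average the block with its available neighbors and compare the result to 0.5) only to show the kind of behavior described. Both function names are invented for illustration.

```python
def normalize_text_probability(p_text, p_tone):
    """Text probability divided by the sum of the text and tone probabilities."""
    total = p_text + p_tone
    # Assumed fallback: treat a block with no evidence either way as undecided.
    return p_text / total if total > 0 else 0.5


def first_decision(normalized, row, col):
    """HYPOTHETICAL first decision filter: mean over the 3 x 3 neighborhood.

    Returns 1 (text) if the mean normalized text probability of the block and
    its in-bounds neighbors exceeds 0.5, otherwise 0 (continuous tone).
    """
    rows, cols = len(normalized), len(normalized[0])
    values = [
        normalized[r][c]
        for r in range(max(0, row - 1), min(rows, row + 2))
        for c in range(max(0, col - 1), min(cols, col + 2))
    ]
    return 1 if sum(values) / len(values) > 0.5 else 0


# A weakly "tone" block surrounded by strongly "text" blocks is pulled toward text.
grid = [[0.9, 0.9, 0.9],
        [0.9, 0.2, 0.9],
        [0.9, 0.9, 0.9]]
decision = first_decision(grid, 1, 1)
```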
Finally, a second decision filter is applied to 45 image data blocks
arranged in three by three matrices of blocks with the block being
evaluated being in the center of a centered three by three matrix with the
immediately adjacent three by three matrices to either side and to the top
and bottom of the matrix containing the block being evaluated, i.e., a
cross-shaped formation of four pel by four pel blocks as shown in FIG. 12.
In the second decision filter, each of the individual block values from the
first decision filter, i.e., "one" or "zero", are counted up for the
entire 45 block area. If the count is greater than a defined threshold,
the final result is text. If the count is not greater than the defined
threshold, the final result is continuous tone. Hysteresis is used in the
threshold in that if the decision from the first decision filter is text,
the threshold is set lower than if the decision from the first decision
filter is continuous tone. In a working embodiment of the present
invention, the threshold for a previous text decision was set equal to 18,
thus favoring a decision of text, and the threshold for a previous
continuous tone decision was set equal to 26, thus favoring continuous
tone.
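The cross-shaped count and the hysteresis thresholds can be sketched as follows. The thresholds 18 and 26 are those of the working embodiment described above. Two points are assumptions: blocks falling outside the image are counted as zero, and the "previous decision" is read as the first decision filter's output for the block being evaluated.

```python
def cross_offsets():
    """Offsets of the 45 blocks: a centered 3 x 3 plus the four adjacent 3 x 3s."""
    offsets = set()
    for d_row in range(-4, 5):          # vertical bar: top, center, bottom matrices
        for d_col in range(-1, 2):
            offsets.add((d_row, d_col))
    for d_row in range(-1, 2):          # horizontal bar: left, center, right matrices
        for d_col in range(-4, 5):
            offsets.add((d_row, d_col))
    return sorted(offsets)


CROSS = cross_offsets()


def second_decision(decisions, row, col, text_threshold=18, tone_threshold=26):
    """Final text (1) or continuous tone (0) decision for the block at (row, col).

    decisions -- 2-D list of 1/0 outputs from the first decision filter.
    Blocks outside the grid are treated as 0 (an assumption).
    """
    rows, cols = len(decisions), len(decisions[0])
    count = sum(
        decisions[row + d_row][col + d_col]
        for d_row, d_col in CROSS
        if 0 <= row + d_row < rows and 0 <= col + d_col < cols
    )
    # Hysteresis: a block already judged text needs fewer supporting blocks.
    threshold = text_threshold if decisions[row][col] == 1 else tone_threshold
    return 1 if count > threshold else 0


# 9 x 9 grid of first-filter decisions, all text.
all_text = [[1] * 9 for _ in range(9)]
final = second_decision(all_text, 4, 4)
```

With between 19 and 26 text blocks in the cross, the outcome depends on the first-filter decision for the center block, which is the hysteresis effect described above.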
Exemplary characteristics of image data blocks to be processed in
accordance with the present invention will now be defined and illustrative
embodiments for their generation will be described. It is noted that while
the defined characteristics have been utilized in a working embodiment of
the invention, various combinations of the defined characteristics and/or
other characteristics of the image data blocks can be applied in
accordance with the present invention. Accordingly, this disclosure should
be interpreted to include such variations as will be suggested to those
skilled in the art.
As previously noted, the image data blocks are preferably four pel by four
pel blocks with each of the pels being represented by a grayscale value
determined by an image scanner. In the illustrative embodiments, 16
differing grayscale values are utilized ranging from zero for a white pel
to 15 for a black pel.
For the determination of the block average and block variance, preferably
the input grayscale value image data is passed through a low-pass filter
to reduce extraneous noise which may be present in the data. One low-pass
filter which may be used is defined by a three by three matrix which is
applied to each individual pel and all the pels contiguous thereto. The
function of the low-pass filter may be defined mathematically as:
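$$LP(x, y) = \frac{1}{16}\sum_{i=-1}^{1}\sum_{j=-1}^{1} M(i, j)\, I(x+i,\, y+j)$$
where I is the input grayscale value of the corresponding pel and the
matrix M(i,j) is the matrix 121 242 121 as follows:
$$M = \begin{pmatrix} 1 & 2 & 1 \\ 2 & 4 & 2 \\ 1 & 2 & 1 \end{pmatrix}$$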
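A direct evaluation of this filter for a single interior pel might look like the sketch below. The integer (truncating) division by 16 and the restriction to interior pels are assumptions; the handling of pels at the image border is not addressed in this passage.

```python
# The 3 x 3 weighting matrix "121 242 121"; its entries sum to 16.
M = [[1, 2, 1],
     [2, 4, 2],
     [1, 2, 1]]


def low_pass(image, row, col):
    """Low-pass filtered value of the interior pel at (row, col).

    Each neighbor is weighted by M and the total is divided by 16.
    Truncating integer division is assumed here.
    """
    total = sum(
        M[i + 1][j + 1] * image[row + i][col + j]
        for i in (-1, 0, 1)
        for j in (-1, 0, 1)
    )
    return total // 16


# An isolated dark pel (noise) on a white background is strongly attenuated.
noisy = [[0, 0, 0],
         [0, 15, 0],
         [0, 0, 0]]
filtered_center = low_pass(noisy, 1, 1)
```

Because the weights sum to 16, a uniform region passes through unchanged, while the isolated pel in `noisy` is reduced from 15 to 3.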
A block diagram for an implementation of this low-pass filter is shown in
FIG. 3A. An understanding of the operation of the low-pass filter of FIG.
3A is facilitated by reviewing a 3×3 input matrix comprising the
elements V₀₀ through V₂₂ as shown in FIG. 3B. Each column of the
matrix is applied to the low-pass filter as signified by the first column
of the matrix V₀₀, V₁₀ and V₂₀. V₀₀ and V₂₀ are
added together by an adder circuit 300, while V₁₀ is multiplied by
two by a multiplier circuit 302. The outputs of the adder circuit 300 and
the multiplier 302 are passed to a second adder circuit 304. Thus, the
first and third elements of the first column are added to twice the second
element with the result appearing as the output of the adder circuit 304.
The result is stored in a latch circuit 306 and the next column V₀₁,
V₁₁ and V₂₁ is passed to the input of the low-pass filter. The
same operation is performed on the second column and the results of the
multiplication and addition of the first column are passed from the latch
circuit 306 to a latch circuit 308 with the results of the processing of
the second column being stored in the latch circuit 306.
The third column is then processed as were the first and second columns
with the results remaining at the output of the adder circuit 304. The
contents of the latch circuit 308, i.e., the first column weighted by the
factors 1, 2, 1, are added by means of an adder circuit 310 to the third
column, which has also been weighted by the factors 1, 2, 1. The contents
of the latch circuit 306, i.e., the center column weighted by the factors
1, 2, 1, are then multiplied by two by a multiplier circuit 312, which
provides at its output the center column weighted by the factors 2, 4, 2.
All three of the appropriately weighted columns are then added by an adder
circuit 314.
The result is then divided by 16 by a divider circuit 316 and stored in a
latch circuit 318. The latch circuit 318 then holds the low-pass filtered
value for the center pel V₁₁ of the three by three matrix of pels
shown in FIG. 3B. The latch circuits 306, 308 and 318 are all activated by
a load low-pass LDLP signal which stores the low-pass value for the
current pel into the latch circuit 318 and also loads the latch circuits
306 and 308 in preparation for determining the low-pass value of the pel
V₁₂.
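The column-at-a-time data path of FIG. 3A can be mirrored in software to show that it yields the same result as the full three by three weighting. In the sketch below the comments map each step to the circuit elements named above. The function names are hypothetical, the latches are modeled simply as local variables, and truncating division is assumed for the divider circuit 316.

```python
def column_sum(column):
    """Weight one column by the factors 1, 2, 1.

    (top + bottom) models adder 300, 2 * middle models multiplier 302,
    and their sum models adder 304.
    """
    top, middle, bottom = column
    return (top + bottom) + 2 * middle


def low_pass_by_columns(columns):
    """Filter the center pel given the three columns of a 3 x 3 neighborhood."""
    first = column_sum(columns[0])    # ends up in latch 308
    center = column_sum(columns[1])   # ends up in latch 306
    third = column_sum(columns[2])    # still at the output of adder 304
    outer = first + third             # adder 310
    doubled_center = 2 * center       # multiplier 312
    total = outer + doubled_center    # adder 314
    return total // 16                # divider 316 (truncation assumed)


# Columns of the matrix with rows (1, 2, 3), (4, 5, 6), (7, 8, 9).
columns = [(1, 4, 7), (2, 5, 8), (3, 6, 9)]
center_value = low_pass_by_columns(columns)
```

Processing by columns lets each column sum be reused as the window slides one pel to the right, which is the purpose served by the latch circuits 306 and 308.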
The block average of an image data block is determined by summing the
low-pass filtered grayscale values of all of the pels of the block and
dividing that sum by 16. The block average is defined by the equation:
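$$\text{block average} = \frac{1}{16}\sum_{i=0}^{3}\sum_{j=0}^{3} LP(i, j)$$
where LP(i,j) is the low-pass filtered grayscale value of the pel in row i
and column j of the image data block.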
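Combining the low-pass filter with the block average gives the following minimal sketch. It assumes that the block lies in the interior of the image, so that every pel has eight neighbors, and that both divisions by 16 truncate to integers; the function name and its arguments are hypothetical.

```python
# Low-pass weighting matrix "121 242 121".
KERNEL = ((1, 2, 1),
          (2, 4, 2),
          (1, 2, 1))


def block_average(image, top, left):
    """Average of the low-pass filtered values of the 4 x 4 block at (top, left).

    The block must be at least one pel away from every image border.
    Integer division is assumed for both the filter and the average.
    """
    total = 0
    for row in range(top, top + 4):
        for col in range(left, left + 4):
            filtered = sum(
                KERNEL[i + 1][j + 1] * image[row + i][col + j]
                for i in (-1, 0, 1)
                for j in (-1, 0, 1)
            ) // 16
            total += filtered
    return total // 16


# A uniform 6 x 6 region: the interior 4 x 4 block keeps the same average.
uniform = [[10] * 6 for _ in range(6)]
average = block_average(uniform, 1, 1)
```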
A block diagram of an implementation for determining the average grayscale
value, i.e., the block average, for an image data block is shown in FIG.
4A. Video image data from the low-pass filter previously described is
passed to the input 348 of the block average circuit of FIG. 4A on a scan
line