WikiPatents - Community Patent Review
Create Free Account  |  License or Sell Your Patent  |  WikiPatents Marketplace  |  WikiPatents Blog
Username:  Password:  
    
Advanced Search
Apparatus and method for use in image processing    
United States Patent5583949   
Link to this pagehttp://www.wikipatents.com/5583949.html
Inventor(s)Smith; Raymond W. (Bristol, GB2); Robson; Christopher J. (Bristol, GB2)
AbstractOptical character recognition is achieved by a system which comprises a scanner for scanning a document, an edge extractor for identifying edges in the image produced by the scanner to produce an outline of each object identified in the image, a segmentation facility for grouping the object outlines into blocks, means for identifying features of the outlines, and a final classification stage for providing data in an appropriate format representative of the characters in the image. Also disclosed are a novel edge extractor, a novel page segmentation facility and a novel feature extraction facility.
   














 Title Information Submit all comments and votes
 
Patent Text Patent PDF Print Page Summary File History
Plain text PDF images Print Summary File History
Drawing from US Patent 5583949
Apparatus and method for use in image processing - US Patent 5583949 Drawing
Apparatus and method for use in image processing
Inventor     Smith; Raymond W. (Bristol, GB2); Robson; Christopher J. (Bristol, GB2)
Owner/Assignee     Hewlett-Packard Company (Palo Alto, CA)
Patent assignment
All assignments
Publication Date     December 10, 1996
Application Number     08/468,517
PAIR File History     Application Data   Transaction History
Image File Wrapper   Patent Term   Fees
Litigation
Filing Date     June 6, 1995
US Classification     382/199 382/203
Int'l Classification     G06K 009/48
Examiner     Couso; Yon J.
Assistant Examiner    
Attorney/Law Firm    
Address
Parent Case     This is a continuation of application Ser. No. 07/956,593, filed Oct. 5, 1992 now abandoned, which is a continuation of application Ser. No. 373,137, filed Jun. 28, 1989, now abandoned.
Priority Data     Mar 03, 1989[EP]89302122
USPTO Field of Search     382/199 382/198 382/200 382/190 382/203 382/204 358/455 358/456
Patent Tags     image processing
   
Enter a comma (,) or semicolon (;) between multiple tag words/phrases.
Describe this patent:
 Amusing   
 Clever   
 Complex   
 Efficient   
 Historic   
 Important   
 Innovative   
 Interesting   
 Practical   
 Simple   
[no votes]
Patent WIKI

Share information and news about this patent, including information and news about the technology, inventors, company, ligation and licensing.

 References Submit all comments and votes
 
*references marked with an asterisk below are user-added references
 U.S. References
 
Add a new US reference:  
ReferenceRelevancyCommentsReferenceRelevancyComments
5185809
Kennedy
382/131
Feb,1993

[0 after 0 votes]
5131053
Bernzott
382/176
Jul,1992

[0 after 0 votes]
4899225
Sasuga
358/448
Feb,1990

[0 after 0 votes]
4833721
Okutomi
382/197
May,1989

[0 after 0 votes]
4754489
Bokser
382/230
Jun,1988

[0 after 0 votes]
4712248
Hongo
382/199
Dec,1987

[0 after 0 votes]
4680805
Scott
382/197
Jul,1987

[0 after 0 votes]
4566124
Yamamoto
382/185
Jan,1986

[0 after 0 votes]
4504969
Suzuki
382/175
Mar,1985

[0 after 0 votes]
4213150
Robinson
348/26
Jul,1980

[0 after 0 votes]
3925760
Mason
382/310
Dec,1975

[0 after 0 votes]
 Foreign References
 Other References
 Market Review Submit all comments and votes
   
Market Size
Estimate the gross annual revenues of the relevant market sector:
> $10B
$5B - $10B
$2B - $5B
$500M - $2B
$100M - $500M
$10M - $100M
$1M - $10M
$500K - $1M
$100K - $500K
< $100K
[No votes]
$0
 
$0   $2.5B   $5B   $7.5B   $10B
Market Share
Estimate the percentage of the relevant market sector this invention will capture:
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%
Reasonable Royalty
What percentage of gross sales should the inventor or assignee be paid?
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%
Public's "Guesstimation" of Royalty Value
Market SizeN/A[No votes]
xMarket ShareN/A[No votes]
xReasonable RoyaltyN/A[No votes]

N/A

License Availablity
If you are NOT the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
License Availablity
If you ARE the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
Competitive Advantage
Does this invention have a significant competitive advantage over similar technologies?
Yes

No



[No votes]
Most helpful competitive advantage comment
[No comments]

Commercial Alternatives
Are there viable commercial alternatives for this invention?
Yes

No



[No votes]
Most helpful commercial alternative comment
[No comments]

 Technical Review Submit all comments and votes
 Claims Submit all comments and votes
 


We claim:

1. An optical character recognition system comprising:

scanning means for optically scanning a document to produce a grey level image thereof;

edge extractor means comprising:

identifier means for identifying points along an edge within said grey level image using grey level values so that said points so identified represent substantially the strongest edge;

tracking means for automatically tracking the edge using grey level values to determine if the edge forms a closed loop and if so defining the edge as an outline,

said identifier means identifying alternate points of the edge if the edge does not form a closed loop and said tracking means automatically tracking an alternate edge associated with said alternate points together with at least some of said points on said strongest edge and determining whether the alternate edge forms a closed loop and if so defining the alternate edge as the outline; and

means for producing data indicative of an object based on at least one outline identified in said image, each outline comprising at least a part of one character; and

processing means for processing the data provided by said edge extractor means to produce an output representative of the characters in said image.

2. The system as claimed in claim 1, wherein the outlines identified by said edge extractor means are subjected to polygonal approximation means for processing the data prior to further processing by (1) following the outline of each object to identify a series of points on said outline having a predetermined characteristic, (2) measuring the depth of the concavity between each adjacent pair of said points, and (3) removing concavities having depths less than a predetermined value, whereby the output of said polygonal approximation means approximates the outlines of the characters in said image by a sequence of straight lines.

3. The system as claimed in claim 1, wherein the grey level image comprises a pixel array, said edge extractor being arranged to locate edges by means of an edge operator which generates edge values for each pixel from the grey level value of pixels adjacent to said pixel.

4. The system as claimed in claim 3, wherein the edge operator process a plurality of grey level values and is defined with respect to a 3.times.3 matrix neighborhood of x as follows:

______________________________________ 5 6 7 4 X 0 3 2 1 ______________________________________

in which the numbers represent the chain codes of adjacent points, the edge value of an adjacent point with chain code n being:

grey level (n+1) (modulo 8)-grey level (n-1) (modulo 8).

5. The system as claimed in claim 4, wherein each of the edges is traced in accordance with an algorithm in which, for a point with chain code n, edge values are derived for points with chain codes n-2, n-1, n, n+1, and n+2 from the current position X, the highest edge value is marked as the current position X and defines the strongest point, and the process is repeated to identify and mark successive strongest points.

6. The system as claimed in claim 1, wherein said processing means includes a page segmentation facility for arranging the data indicative of the outlines into a desired order and means for identifying particular features of said outlines.

7. The system as claimed in claim 6, wherein said page segmentation facility identifies groups of said outlines to form said outlines into blocks of outlines and arranges said blocks into said desired order.

8. The system as claimed in claim 7, wherein the page segmentation facility can identify blocks as text or non-text blocks by determining whether the proportion of spurious characters in the block exceeds a defined threshold value.

9. The system as claimed in claim 8, wherein the block type is determined on the basis of a classification of each outline.

10. The system as claimed in claim 9, wherein data representative of the classification of each outline is provided by an object outline classification process.

11. The system as claimed in claim 8, wherein the block order is determined in accordance with the manner in which the blocks nest.

12. The system as claimed in claim 6, wherein the output of said page segmentation facility comprises data representative of at least one possible class to which the object represented by said at least one outline belongs together with a weighting value, said weighting value indicating the probability that the outline is associated with a particular class.

13. The system as claimed in claim 1, wherein said edge extractor means further comprises a feature extraction facility for processing said data to identify particular features of each outline and to classify the outlines into classes in accordance with the identified features.

14. The system as claimed in claim 13, wherein the features comprise any one of concavity, closure, axis, symmetry or line either singly or in combination.

15. The system as claimed in claim 6, wherein said processing means includes a final classification stage for making the final classification of each of said outlines, wherein said final classification stage operates on the data output by the page segmentation facility using lexical rules, Markov chain analysis and dictionary search techniques.

16. The system as claimed in claim 15, wherein the output of said final classification stage is in ASCII code.

17. The system as claimed in claim 1, wherein the operation of the edge extractor derives an edge value for a pixel based on the grey level values of pixels immediately adjacent to the pixel based on the direction of tracking.

18. An image processing system comprising an edge extractor for automatically producing data indicative of the outline of objects in a grey level image generated by an optical scanning device, wherein said outline comprises at least part of one or more characters, said grey level image comprising a pixel array, said edge extractor comprising means for sequentially scanning said pixel array to locate a first strongest edge and for tracing the outline of one of the objects starting with said first strongest edge and by locating a plurality of additional strongest edges contiguous with one another to determine whether or not the entire outline can be traced using only said first and additional strongest edges and automatically locating alternate edges that are used to complete the entire outline when said additional strongest edges do not complete the outline, said edge extractor further comprising edge operator means for generating edge values for each pixel using grey level values, wherein the operation of the edge extractor is such that the edge value for a current pixel is derived from the relative values of the grey level values of the pixels which are immediately adjacent to the current pixel based on the direction of tracing.

19. The system as claimed in claim 18, wherein the edge operator processes a plurality of grey level values and is defined with respect to a 3.times.3 neighborhood matrix of x as follows: ##STR1## in which the members represent the chain codes of adjacent points, the edge value of an adjacent point with chain code n being:

grey level (n+1) (modulo 8)-grey level (n-1) (modulo 8).

20. The system as claimed in claim 19, wherein said outline is traced in accordance with an algorithm in which for a point with chain code n, edge values are derived for points with chain codes (n-2), (n-1), n, (n+1), and (n+2) from the current position, the point with the strongest edge values is marked, and the process is repeated to identify and mark successive strongest edge values.

21. The system according to claim 18, wherein the edge extractor further comprises a polygonal approximation facility for approximating the outline of an object generated by the edge extractor by at least one polygon.

22. An edge extractor according to claim 18, wherein the edge outline is represented by a closed loop.

23. An image processing system comprising means for scanning a document to generate an image represented by grey level values and automatically producing data representative of an outline of an object in the image comprising an edge extractor for tracing said outline by generating an edge value for a pixel derived from the grey level values of the pixels which are immediately adjacent to said pixel based on the direction of tracing, said edge extractor locating a strongest edge based on said edge values and tracing said outline by locating additional strongest edges if said additional strongest edges in combination complete the outline, said edge extractor automatically locating an alternate edge to complete the outline when said additional strongest edges do not complete the outline, said system further comprising a page segmentation facility for identifying groups of outlines to form said groups into blocks of outlines and for arranging said blocks into order, wherein said outlines are representative of at least a portion of one or more characters.

24. The system as claimed in claim 23, wherein the page segmentation facility can identify blocks as text or non-text by determining whether a proportion of spurious characters in the block exceeds a defined threshold value.

25. The system as claimed in claim 23, wherein the block type is determined on the basis of a classification of each outline.

26. The system as claimed in claim 25, wherein the data representing the classification of each outline is provided by an object outline classification process.

27. The system as claimed in any claim 24, wherein the block order is determined in accordance with the manner in which the blocks nest.

28. An image processing system comprising means for scanning a document to generate an image represented by grey level values and automatically producing output data representative of an outline of an object in the image, the system comprising an edge extractor for tracing said outline by generating an edge value for a pixel derived from the grey level values of pixels immediately adjacent to said pixel relative to the direction of tracing, said edge extractor locating a strongest edge based on said edge values and tracing said outline by locating additional strongest edges if said additional strongest edges in combination complete the outline, said edge extractor automatically locating an alternate edge to complete the outline when said additional strongest edges do not complete the outline, said system further comprising a facility for processing said data to identify particular features of each outline and to classify the outlines into classes of characters likely to represent the character for each outline in accordance with the identified features, and for classifying as spurious outlines those outlines not representing a character, wherein the features identified comprise one or more of concavity, closure, axis, symmetry and line.

29. The system as claimed in claim 28, wherein said output data is produced by said edge extractor.

30. The system as claimed in claim 28, wherein the classification data includes a rating or weighting value indicative of the strength of the match of the outline to the class with which the object has been associated.

31. A method of processing a document to produce data representative of characters on said document, said method comprising the steps of:

scanning said document to produce a pixel array of grey level values representing a grey level image of the document;

scanning the grey level values and for each grey level value comparing the grey level values of pixels immediately on either side of the pixel being scanned;

locating a first edge associated with one of the pixels based on the grey level values so compared;

automatically locating additional edges and tracing a strongest edge in said grey level image representing an outline comprising at least a portion of one or more characters in said image by comparing the grey level values immediately adjacent the pixel associated with each said additional edge; and

determining whether or not said strongest edge forms a complete outline;

automatically identifying an alternate edge that does form a complete outline when said strongest edge does not form a complete outline, wherein the alternate edge is identified by comparing grey level values;

processing data representing said strongest and said alternate edges that form complete outlines to provide data representative of the characters in said image, said processing step comprising automatically segmenting and classifying the data into blocks representative of a plurality of outlines, each outline comprising at least a portion of one or more characters.

32. The method as claimed in claim 31, wherein the data is subjected to a polygonal approximation step.

33. The method as claimed in claim 31, wherein said processing step comprises identifying a feature or features of said complete outlines and providing data representative of a class to which the object represented by said complete outlines belong.

34. A method as claimed in claim 33, wherein said features include concavities and closures.
 Description Submit all comments and votes
 


This invention relates generally to the field of optical character recognition.

BACKGROUND OF THE INVENTION

Optical character recognition devices read text from a printed page and produce an output representative of the text in an appropriate format, e.g. ASCII code. There are two basic techniques available in optical character recognition. One is based on matrix matching and the other on feature extraction. Matrix matching involves comparing a bit mapped image of a character with a set of templates. A major disadvantage of matrix matching is that it is limited to certain fonts and is very susceptible to character spacing. This is a particular drawback in relation to proportionally spaced characters.

Feature extraction involves recognition of selected features of characters and comparison of those features with stored versions of the features on which the system has been trained. Techniques based on feature extraction have the advantage that they are better able to read many different fonts. However known feature extraction methods are complex to implement and have a slow operating speed.

The present invention provides novel aspects which have particular application to optical character recognition based on feature extraction. Some of these methods and apparatus have more general application in the field of image processing.

SUMMARY OF THE INVENTION

A first aspect of the present invention provides an optical character recognition system comprising scanning means for optically scanning a document to produce an image thereof, means for identifying edges within said image and for identifying edge paths to produce data indicative of the outline of each object identified in said image, and means for processing the data provided by said edge identifying means and arranged to provide data, in an appropriate format, representative of the characters in said image.

A second aspect of the present invention provides an edge extractor for producing data indicative of the outline of objects in a grey level image generated by an optical scanning device and which comprises a pixel array, said extractor being arranged to sequentially scan said image array for an edge and for each edge located trace an outline of the object associated with said edge, said edge extractor being arranged to locate edges by means of an edge operator which generates edge values for each pixel.

A third aspect of the present invention provides an image processing system of the type which scans a document and which produces data representative of the outline of each object identified on the document, said system including a page segmentation facility for identifying groups of said object outlines to form said outlines into blocks of outlines and for arranging said blocks into order.

A fourth aspect of the present invention provides an image processing system of the type which scans a document and which produces data representative of the outline of each object identified on the document, said system including a facility for processing said data to identify particular features of each outline and to classify the outlines into classes in accordance with the identified features.

A fifth aspect of the present invention provides a system for processing a document to produce data representative of characters on said document in which said document is scanned to produce an image thereof and an edge identifying means processes the image data to produce data representative of the outline object identified in the image, the system including a polygonal approximation facility for reducing the number of irregularities in each outline said polygonal approximation facility operating by following the outline of each object to identify a series of points on said outline having a predetermined characteristic, measuring the depth of the concavity between each adjacent pair of points and removing concavities having depths less than a given value.

A sixth aspect of the present invention provides a method of processing a document to produce data representative of characters on said document, said method comprising scanning said document to produce an image thereof, identifying edge paths in said image, and processing data representing said edge paths to provide data, in an appropriate format, representative of the characters in said image.

A seventh aspect of the present invention provides a method for processing data representing the edge outlines in an image of a document to produce polygonal approximation of said outline, said method comprising following each outline to identify a series of points along said outline which have a predetermined characteristic, measuring the depth of the concavity between each adjacent pair of points and removing concavities having depths less than a given value.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention will be described now by way of example only with particular reference to the accompanying drawings. In the drawings:

FIG. 1 is a block schematic diagram of an optical character recognition system in accordance with the present invention;

FIGS. 2(A), 2(b), 3(a), 3(b) and 4 illustrate the operation of the system of FIG. 1;

FIG. 5 is a block schematic diagram of an edge extractor which is used in the system of FIG. 1;

FIGS. 6(a)-6(c), 7(a)-7(d), 8(a)-8(d), 9(a)-9(d), 10(a), 10(b), 11(a) and 11(b) illustrate the operation of the edge extractor of FIG. 5;

FIG. 12 is a block schematic of the text ordering part of the system of FIG. 1;

FIGS. 13 and 14 illustrate the operation of the polygonal approximation facility;

FIGS. 15(a), 15(b), 169(a), 16(b), 17(a), 17(b), 18(a), 18(b), 19(a), 19(b), 20, 21(a) and 21C. illustrate the way in which character features are identified;

FIG. 22 is a block schematic diagram of the feature extraction, text ordering and segmentation, and final classification part of the system of FIG. 1;

FIGS. 23(a)-23(d) show several examples of the way in which a letter "E" can be represented;

FIG. 24 is a functional representation of the template generation facility;

FIGS. 25(a), 25(b), 26, 27(a)-27(d) illustrate the operation of the template generation facility;

FIGS. 28(a), 28(b), 29, 30 and 31 illustrate the operation of the segmentation and final classification stages.

DETAILED DESCRIPTION

Referring to FIG. 1 an optical character recognition system comprises a scanner and scanner interface 10, an edge extractor 11, a polygonal approximation means 13, a text ordering facility 12 which operates in conjunction with a feature extractor 14 and a merger/segmenter 15 and a final classifier 16. The scanner can be any suitable commercially available device capable of producing a grey level image. One such device which can be used is a Hewlett-Packard HP9I90A scan-jet device. This device produces a 16 grey level image with a resolution of 300 pixels per inch. At this resolution each A4 page can be read in approximately 30 seconds and requires a storage capability in excess of 4 megabytes (Mbytes). This type of device makes use of a linear charge couple device array as the light sensing element and a typical grey level image for the letter "a" is shown in FIG. 2. FIG. 2A shows the image form and FIG. 2B the form of the image in hexadecimal values. In these values 0 represents black and F represents white, the values 1 to 9, A to E in between representing various levels of grey.

The edge extractor 11 operates by raster scanning the grey level image to locate edges. When an edge is found it is followed all the way around until a closed loop is formed thereby forming an outline of an object, e.g. character associated with the edge. If the edge does not form a closed loop it is ignored. A significant feature of the present optical character recognition arrangement is the use of a grey level image in conjunction with an edge extractor 11. The following are important advantages provided by this arrangement.

1. Looking for edges rather than black regions provides the ability to read any color of text on the background of any color provided that the contrast is sufficiently great. This is not possible by thresholding without manually selecting for positive or negative print.

2. In threshold type arrangements, when an image is thresholded non-text portions of the image form the same kind of black regions on a white-background as text. This causes the character cell location algorithm to be confused. With a grey level image, the vast majority of edges in the non-text region do not form closed loops and are therefore rejected. Any edges which do form closed loops will either not look like any known character, or will fail to form a reasonable text line. Hence grey level images provide discrimination between text and non-text.

3. Thresholded images have inadequate quality for use either in document storage or in new documents. Half-toning algorithms for scanning destroy text; therefore to obtain the text and images from a page in one scan, grey scale is essential.

4. Apart from a simple raster scan of the entire image from the scanner, the edge extractor examines only the edges of characters. The number of pixels in the image that are analyzed in detail is thus reduced by about 90%, thereby making it possible to process a page in a reasonable time. The output of the edge extractor is a long sequence of steps between neighboring pixels representing the outlines of the characters on a page which have been scanned. Prior to transmission to the text ordering facility this output from the edge extractor 11 is subjected to a polygonal approximation step in which the outline of the characters are approximated by sequences of straight line segments, i.e. polygons. There are two main reasons for approximating the outlines prior to passing them on to the next stage.

1. Both polygonal approximation and the processes used in feature extraction consume linear time with respect to the number of points on the input polygon. The constant of proportionality however is much larger in the case of feature extraction. Consequently, polygonal approximation results in considerable time saving.

2. It will become apparent that one of the most important features used in character recognition is the presence of concave regions in the outline. Polygonal approximation straightens out the zig-zags in the outline which would otherwise produce false concavities and confuse the recognition process.

An example of the polygonal approximation is shown in FIG. 3A and 3B. FIG. 3A represents the typical edge path of a character "a" and FIG. 3B is the equivalent outline after It has been subjected to polygonal approximation.

The output of the edge extractor 11 can be considered as a series of image blobs each of which represents the edge path of an object typically a character together with information on its bounding rectangle. This is the main input to the text ordering process 12, feature extractor 14, and merger segmenter 15. The function of the feature extractor 14 is to extract from the outline of the characters certain fundamental features. The particular features which have been selected for extraction by the present arrangement are as follows:

1. Concavities. In general terms a concavity is an area that can be filled and which may include some convex points. Three concavities are illustrated by circles 20 in FIG. 4.

2. Closure. A closure is a region of the background which is completely enclosed by the character. In FIG. 4 a closure is shown by circle 21. In certain fonts characters such as "p" and "e", are printed with an incomplete closure. It is the presence of functional closure rather than physical closure which is to be detected.

3. Lines. A line is a straight part of the outline. Line detection is complicated by the polygonal approximation since curves are converted into a small number of straight lines. They are thus rendered less distinct from actual straight lines.

4. Axes. An axis is sought for objects which have no concavity or closure. The axis feature measures the ratio of the lengths of the major and minor axes of a convex object. It is used to distinguish between characters such as a dot and a straight comma.

5. Symmetry. Symmetry is measured by comparing parts of the character on either side of an axis. Measurement of symmetry is difficult in the case of italicized text, where the axis of symmetry is not perpendicular to the direction of measurement.

In broad outline, using the features set out above, an unknown character is matched by the feature extractor 14 with a generic character class, i.e. it produces a class or set of ASCII characters which the blob could be. The mapping of generic character classes to ASCII characters is not one-to-one. A small number of characters appear in more than one fundamental shape, for example "g". Such characters must have more than one class which map to the same ASCII character.

A more common case is the single class which can map one of several characters. In certain fonts for instance, the characters "1" (one), "l" (lower case l) and "I" (capital I) are identical. The only way to distinguish them is by context. At the feature extraction stage there will be several generic classes to cover the various physical shapes of these three characters. Most of these classes will map to one of two or three ASCII characters, depending on the particular shape.

The main function of the text ordering process 12 is to identify the document structure and extract the correct ordering of the text within the document. The principal steps in text ordering are as follows:

A. Identify separate regions (blocks) within the document.

B. Classify the blocks into text or graphics.

C. Order text blocks into logical reading sequence.

D. Order text blocks into character order.

E. Deduce white space characters.

F. Associate parts of deliberately broken characters.

G. Identify potentially merged characters.

H. Resolve intrinsic character ambiguities

(those which are a feature of the font) by geometry.

The merger segmenter 15 makes an attempt to merge the outlines of two or more blobs which have been identified as parts of an unintentionally broken character or to segment the outline of a blob which has been identified as the ligature of two or more characters. This process gives the system the capability of reading relatively poor copies such as that output from a photocopier.

The final classifier 16 makes the final decision as to the ASCII character associated with each blob by resolving any outstanding ambiguities, for example, the final classification can discriminate upper and lower case characters by using contextual information. It can also use context to resolve difficult ambiguities such as distinguishing between the number "1" a small letter "l" and a capital "I". The techniques available to the final classifier 16 include lexical rules (such as no numbers within words), Markov chain analysis and dictionary search.

The above is a general outline of the overall structure of the optical character recognition system. Certain elements of this overall structure will now be described in more detail.

As has been explained earlier an important element in the present system is the edge extractor. The edge extractor is represented functionally in FIG. 5 of the drawings. The input to the edge extractor is a grey level image which comprises an array of pixels which are stored as shown at 28. This array is raster scanned as indicated at 30 to locate an edge, that is a significant difference in grey level between neighboring pixels. Once an edge is located the edge extractor operates to trace an outline as indicated at 31. The tracing continues until a closed loop edge path has been located as indicated at 32. This closed loop path is approximated as indicated at 33 to produce a polygonal approximation. The outlines are associated as indicated at 34 to produce a blob outline of an object which is fed to the text ordering process. The edge extractor 11 uses a novel edge operator to locate and follow edges around outlines until they form a closed loop. This operator is designed to find the edge strength perpendicular to the direction of travel between neighboring pixels. FIG. 6 illustrates the way in which the edge extractor 11 works. FIG. 6A shows a 3.times.3 array of pixels it being understood that this is a small part of the image of a document which has been scanned which will typically be 2,400.times.3,300 pixels. As has been indicated the edge extractor raster scans the image and "X" represents the current point on the scan. It will be seen that on FIG. 6A the chain code of each point in the neighborhood of "X" is given by a number in the box corresponding to that pixel. If "X" is the present position then the edge value at the next position with chain code "N" is the difference between the grey levels at points with chain codes (N+1) (modulo 8) and (N-1) (modulo 8). Thus referring to FIG. 6A the edge value at the point with chain code 0 is found by subtracting the grey level at 7 from the grey level at 1. This step is illustrated in FIG. 6B; FIG. 6C shows a similar representation of the operator for chain code 7. It should be noted that there is only a single operator for each direction and that the result is signed. An edge is indicated when there is an edge value greater than a specific threshold.

Once an edge is located an edge following algorithm comes into effect. The main process is to apply an edge operator corresponding to chain code 4 in a raster scan of the image. The raster scan thus locates the top edge of a new object which typically will be a character of text. When such a point is located with an edge value greater than a specific starting threshold and of either sign, the raster scan repeats a check one pixel lower in the image. When an edge is located there must be an edge of the same sign and strength greater than the starting threshold over a 2 by 2 cell-to ensure that the edge is significant. The light and dark grey levels from the strongest of these edges are used to determine a center value which is the average of the light and dark grey levels. On detection of an edge raster scanning is suspended and the line tracking process is started from the point with the higher edge value. The double check is provided to ensure that the edge is significant and that tracking starts at the strongest edge value. A region of grey values centered on the center value will be regarded during outline following as "grey", those above as "white" and those below as "black".

The tracking algorithm operates essentially as follows. Line tracking commences by tracking left from the starting point, i.e. from the point shown as chain code 4 on FIG. 6 and attempts to complete an anti-counter track of an outline. Given the chain code N of the last direction taken, edge values are found at each of the five points with chain codes N-2, N-1, N, N+1 and N+2 (modulo 8) from the current position. We define the five or less points which have the correct sign and are above the threshold, the edge set corresponding to this position and direction. From the edge set the point with the strongest edge value is found. The choice of the next point is non-trivial, but in most cases is the point with the strongest edge value.

This is illustrated in FIG. 7 of the drawings. In FIG. 7A there is shown a 5.times.5 array of pixels and the values in each square of the array represents the actual grey level values for part the character "a" which was illustrated in FIG. 2 of the drawings. Assuming that the current point is that circled, i.e. point 80 and the last move was in direction 4 in FIG. 6, then the edge values which would be calculated at the circle 80 are shown in FIG. 7B. The next step is to the rounded square 81 corresponding to edge value 13 shown in FIG. 7B. The edge values shown in FIG. 7C are those calculated at this square. FIG. 7D demonstrates how the calculated edge values change with the last chain code and represent the edge values at the diamond 82.

FIG. 7 shows that at each step there are several useful edge values. As long as the edge following process is in progress, points on the edge path are marked current to indicate that they form part of the outline being processed. If only one point is marked at each step the raster scan will later find unused points on the same edge and attempt to follow the outline again. Therefore as many points as possible (up to a limit of three) are marked subject to the following constraints:

A. The strongest edge is always marked.

B. All points to be marked must be in the edge set and have edge values over the tracking threshold.

C. All marked points must be adjacent.

Thus in FIG. 7 the points marked will be 13, 10 in FIG. 7B, 11, 12 in FIG. 7C, and 12 in FIG. 7D assuming a threshold of 4.

If at any time no points are found over the tracking threshold then the edge is considered to have faded, the object is discarded and the whole edge is remarked as deleted. The tracking process continues until the edge fades, a closed loop is completed, or the edge coincides with a previously completed outline. When an edge closes a satisfactory loop, all points on the loop are remarked used. The reason for changing all the marks when an edge is complete is to enable the discrimination between collisions with other lines and closing up the loop.

FIG. 7 also illustrates that there are instances when the next step is not an obvious choice. For example at the rounded square in FIG. 7C the highest edge values are 12 and 11. In many cases the choice has an insignificant impact on the result, although this is not necessarily always the case. The choice can have an effect in the following situations.

A. Line cross over: In some fonts lines are of variable width. At the thinnest part of the line it may be possible to follow an edge path across the line and thus incorrectly break-up the character.

B. Object merger: Some characters may be printed in very close juxtaposition. At the point when the two characters almost touch, there can be a two-way decision to be made when traversing either character. Both decisions must be correct otherwise the outline of one will collide with the other, resulting in the loss of one or both characters.

C. Fading: Where an edge is blurred it can be necessary to take a path corresponding to a weaker edge value in order to avoid the edge fading at a subsequent step.

The problems are exemplified in FIG. 8 of the drawings. FIG. 8A shows a portion of the character "a" which is illustrated in FIG. 2 of the drawings, particularly that part of the "a" at the point where the top of the loop meets the stem. The darkest pixels are shaded. FIG. 8B gives the edge values at the circled point 90, showing two points with an edge value 11. The edge sets from each of these points are shown in FIGS. 8C and 8D with the possible destinations indicated by a rounded square and a diamond in FIG. 8A. The rounded square is the correct path and the diamond is on the wrong side of the loop.

Problem B referred to above is illustrated in FIG. 9. FIG. 9A shows the correct outlines for a pair of closely spaced characters. The starting point for following each outline is marked with a +sign. In FIG. 9B a different choice was made while tracing the "t" resulting in a merging of the characters. The same path was followed in FIG. 9C, but with a different decision on the second path through the difficult region. The result is that the "i" is completed, but the edge did not close at the start of the loop and there fore both characters were lost. FIG. 9D shows the situation where the "t" was followed correctly, but the same route was taken for the "i" as in FIG. 90 so that it collided with the complete "t" and was deleted. A solution to this problem is to look ahead several steps to edge values beyo