WikiPatents - Community Patent Review
Automatic training of character templates using a text line image, a text line transcription and a line image source model    
United States Patent 5594809
Link to this page: http://www.wikipatents.com/5594809.html
Inventor(s): Kopec; Gary E. (Belmont, CA); Chou; Philip A. (Menlo Park, CA); Niles; Leslie T. (Palo Alto, CA)
Abstract: A technique for automatically producing, or training, a set of bitmapped character templates defined according to the sidebearing model of character image positioning uses as input a text line image of unsegmented characters, called glyphs, as the source of training samples. The training process also uses a transcription associated with the text line image, and an explicit, grammar-based text line image source model that describes the structural and functional features of a set of possible text line images that may be used as the source of training samples. The transcription may be a literal transcription of the line image, or it may be nonliteral, for example containing logical structure tags for document formatting and layout, such as found in markup languages. Spatial positioning information modeled by the text line image source model and the labels in the transcription are used to determine labeled image positions identifying the location of glyph samples occurring in the input line image, and the character templates are produced using the labeled image positions. In another aspect of the technique, a set of character templates defined by any character template model, such as a segmentation-based model, is produced using the grammar-based text line image source model and specifically using a tag transcription containing logical structure tags for document formatting and layout. Both aspects of the training technique may represent the text line image source model and the transcription as finite state networks.

Inventor     Kopec; Gary E. (Belmont, CA); Chou; Philip A. (Menlo Park, CA); Niles; Leslie T. (Palo Alto, CA)
Owner/Assignee     Xerox Corporation (Stamford, CT)
Publication Date     January 14, 1997
Application Number     08/431,253
Filing Date     April 28, 1995
US Classification     382/161 382/228
Int'l Classification     G06K 009/62
Examiner     Boudreau; Leo
Assistant Examiner     Tran; Phuoc
Attorney/Law Firm     Bares; Judith C.
USPTO Field of Search     382/161 382/177 382/228 382/229 382/230 382/309 382/310 395/2.49 395/2.51 395/2.64 395/2.65
U.S. References
Patent No.     Inventor     Class     Date
5526444     Kopec     382/233     Jun. 1996
5333275     Wheatley     704/243     Jul. 1994
5321773     Kopec     382/209     Jun. 1994
5303313     Mark     382/235     Apr. 1994
5237627     Johnson     382/198     Aug. 1993
5020112     Chou     382/226     May 1991
4599692     Tan     706/12     Jul. 1986

Claims

What is claimed:

1. A method of operating a machine to train a set of bitmapped character templates for use in a recognition system; each of the bitmapped character templates being based on a character template model defining character image positioning referred to as a sidebearing model of character image positioning; the machine including a processor and a memory device for storing data; the data stored in the memory device including instruction data the processor executes to operate the machine; the processor being connected to the memory device for accessing the data stored therein; the method comprising:

operating the processor to receive and store an image definition data structure defining an image including a plurality of glyphs indicating a single line of text, hereafter referred to as a text line image source of glyph samples; each glyph occurring in the text line image source of glyph samples being an image instance of a respective one of a plurality of characters in a character set, hereafter referred to as a glyph sample character set; each one of the set of bitmapped character templates being trained representing a respective one of the plurality of characters in the glyph sample character set;

operating the processor to receive and store in the memory device a text line image source model data structure, hereafter referred to as a text line image source model; the text line image source model modeling as a grammar a spatial image structure of a set of text line images; the text line image source of glyph samples being one of the set of text line images modeled by the text line image source model; the text line image source model including spatial positioning data modeling spatial positioning of the plurality of glyphs occurring in the text line image source of glyph samples;

operating the processor to determine, for each respective glyph occurring in the text line image source of glyph samples, an image coordinate position in the text line image source of glyph samples indicating an image origin position of the respective glyph using the spatial positioning data included in the text line image source model; each image coordinate position being hereafter referred to as a glyph sample image origin position;

operating the processor to produce a glyph label data item paired with each glyph sample image origin position determined for the respective glyphs occurring in the text line image source of glyph samples; each glyph label data item being hereafter referred to as a respectively paired glyph label; each respectively paired glyph label indicating the character in the glyph sample character set represented by the respective glyph; the processor, in producing each respectively paired glyph label, using mapping data included in the text line image source model mapping respective ones of the glyphs occurring in the text line image source of glyph samples to respectively paired glyph labels; the processor, further in producing each respectively paired glyph label, using a text line transcription data structure associated with the text line image source of glyph samples, hereafter referred to as a transcription, including an ordered arrangement of transcription label data items; the processor using the transcription and the mapping data to pair each glyph label with the respective glyph sample image origin position of a respective glyph occurring in the text line image source of glyph samples; and

operating the processor to produce the set of bitmapped character templates using the text line image source of glyph samples, the glyph sample image origin positions and the respectively paired glyph labels; the processor determining, in each bitmapped character template produced, an image pixel position included therein indicating a template image origin position; each bitmapped character template produced having a characteristic image positioning property such that, when a second bitmapped character template is positioned in an image with the template image origin position thereof displaced from the template image origin position of a preceding first bitmapped character template by a character set width thereof, and when a first bounding box entirely containing the first bitmapped character template overlaps in the image with a second bounding box entirely containing the second bitmapped character template, the first and second bitmapped character templates have substantially nonoverlapping foreground pixels.
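
As an illustration of the positioning property recited at the end of claim 1, the following Python sketch places two toy templates with their origins separated by the first template's set width, then checks that the bounding boxes overlap while the foreground pixels do not. The `Template` class, its field names, and the bitmaps `F` and `J` are assumptions made for this example only; they do not come from the patent.

```python
from dataclasses import dataclass


@dataclass
class Template:
    """Toy bitmapped template (hypothetical structure, not from the patent)."""
    rows: list        # strings of equal length; '#' is foreground, '.' is background
    origin_col: int   # column of the template origin inside the bitmap
    set_width: int    # horizontal distance from this origin to the next origin

    def foreground(self, origin_x):
        """Image (x, row) positions of foreground pixels for an origin at origin_x."""
        return {(origin_x + c - self.origin_col, r)
                for r, row in enumerate(self.rows)
                for c, ch in enumerate(row) if ch == "#"}

    def bbox(self, origin_x):
        """Inclusive horizontal extent (left, right) of the bitmap in the image."""
        left = origin_x - self.origin_col
        return left, left + len(self.rows[0]) - 1


def check_pair(first, second, x0=0):
    """Place `second` one set width after `first`.

    Returns (bounding boxes overlap?, number of shared foreground pixels).
    Both bitmaps cover the same rows, so only the horizontal extent is compared.
    """
    x1 = x0 + first.set_width
    l0, r0 = first.bbox(x0)
    l1, r1 = second.bbox(x1)
    boxes_overlap = l1 <= r0 and l0 <= r1
    shared = first.foreground(x0) & second.foreground(x1)
    return boxes_overlap, len(shared)


# An "f"-like shape that overhangs to the right, followed by a "j"-like shape
# whose tail extends to the left of its origin.
F = Template(rows=[".###", ".#..", ".#..", "...."], origin_col=0, set_width=3)
J = Template(rows=["...", ".#.", ".#.", "#.."], origin_col=1, set_width=2)

print(check_pair(F, J))
```

In this toy case the two boxes share columns 2 and 3, yet no pixel position is foreground in both templates, which is the situation the claim describes.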

2. The method of claim 1 of operating the machine to train bitmapped character templates wherein operating the processor to produce the set of bitmapped character templates includes

determining, for each bitmapped character template, a collection of sample image regions, referred to as a glyph sample collection, occurring in the text line image source of glyph samples using the glyph sample image origin positions and the respectively paired glyph labels; each sample image region including one of the glyph sample image origin positions; and

producing the set of bitmapped character templates contemporaneously using the glyph sample collections by assigning foreground pixel color values to selected template pixel positions in respective ones of the bitmapped character templates; one of the selected template pixel positions in a first one of the set of bitmapped character templates being selected for assigning a foreground pixel color value thereto on the basis of template contribution measurements computed using sample pixel positions included in the glyph sample collections identified for the character represented by the first bitmapped character template.

3. The method of claim 2 wherein

each bitmapped character template is represented as a template image region data structure, referred to as a template image region, including a template pixel position designated as a template origin pixel position; the template image region indicating a respective one of the characters in the glyph sample character set being paired with the glyph sample collection identified for the respective character, and being referred to as a respectively paired template image region;

each sample image region in one of the glyph sample collections has a vertical and horizontal size dimension determined relative to the template origin pixel position in the respectively paired template image region so that the glyph sample image origin position determined for each glyph sample is positioned in the respective sample image region at the sample pixel position identical in pixel location to the template origin pixel position in the respectively paired template image region; whereby the sample image regions in each glyph sample collection are effectively aligned at respective glyph sample image origin positions and are hereafter referred to as aligned sample image regions; each sample pixel position in a first one of the aligned sample image regions being respectively paired with the sample pixel position in the same pixel location in a second one of the aligned sample image regions; and

the step of assigning foreground pixel color values to template pixel positions uses respectively paired sample pixel positions included in the aligned sample image regions for all of the glyph sample collections and further includes

(a) computing the template contribution measurement for each template pixel position using each respectively paired sample pixel position included in the sample image region;

(b) selecting the template pixel position having the highest positive template contribution measurement as a selected template pixel position;

(c) assigning a foreground pixel color value to the selected template pixel;

(d) modifying each sample pixel position paired with the selected template pixel position to indicate a background pixel color value; and

(e) repeating steps (a) through (d) while at least one of the template contribution measurements being computed is positive.
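
Steps (a) through (e) of claim 3 can be sketched as a greedy loop over all templates at once. The claim does not define the template contribution measurement in this section, so the score used below (the number of "on" sample pixels minus a fraction `theta` of the number of samples) is an assumption, as are the function name, the argument layout, and the toy line image.

```python
def train_templates(image, origins, width, theta=0.5):
    """Greedy pixel assignment over every template canvas at once.

    image   : list of rows of 0/1 pixels (a copy is modified, as in step (d))
    origins : dict mapping a character label to the x origins of its samples
    width   : canvas width in pixels, measured rightward from the origin
    theta   : assumed fraction of samples that must be 'on' for a positive score
    """
    img = [row[:] for row in image]
    height, img_w = len(img), len(img[0])
    templates = {c: [[0] * width for _ in range(height)] for c in origins}

    def score(c, dx, y):
        # (a) assumed contribution measure for template pixel (dx, y) of label c
        xs = origins[c]
        on = sum(img[y][x + dx] for x in xs if x + dx < img_w)
        return on - theta * len(xs)

    while True:
        candidates = ((score(c, dx, y), c, dx, y)
                      for c in origins for dx in range(width) for y in range(height))
        s, c, dx, y = max(candidates, key=lambda t: t[0], default=(0, None, 0, 0))
        if s <= 0:                    # (e) stop once no measurement is positive
            break
        templates[c][y][dx] = 1       # (b), (c) turn the best-scoring pixel on
        for x in origins[c]:          # (d) clear the paired sample pixels
            if x + dx < img_w:
                img[y][x + dx] = 0
    return templates


# Toy line "a b b a": each glyph is 2 pixels wide, but the canvas is 3 wide,
# so every sample region overlaps its right-hand neighbour.
LINE = [[1, 1, 0, 1, 0, 1, 1, 1],
        [1, 0, 1, 1, 1, 1, 1, 0]]
ORIGINS = {"a": [0, 6], "b": [2, 4]}

print(train_templates(LINE, ORIGINS, width=3))
```

Because step (d) clears sample pixels as soon as one template claims them, pixels in the overlapping third column are not also credited to the neighbouring template, and each toy template comes out 2 pixels wide.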

4. The method of claim 1 of operating the machine to train bitmapped character templates further including operating the processor to determine, for each bitmapped character template produced, a character set width using the text line image source of glyph samples, the glyph sample image origin positions and the respectively paired glyph labels; the character set width indicating an image distance measurement between the template image origin position of the bitmapped character template and a template image origin position of a next adjacent bitmapped character template.
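
One simple way to picture the set-width determination of claim 4 is to measure, for each labeled glyph origin, the distance to the next origin on the line and summarize those distances per character. The median estimator, the function name, and the assumption that no extra inter-character spacing occurs are all choices made for this sketch; the claim itself does not specify an estimator.

```python
from collections import defaultdict
from statistics import median


def estimate_set_widths(labeled_origins):
    """Estimate a set width per character from (label, origin_x) pairs.

    Assumes the distance from one glyph origin to the next equals the set
    width of the first glyph (no additional spacing), and uses the median
    of the observed distances for each label.
    """
    ordered = sorted(labeled_origins, key=lambda pair: pair[1])
    gaps = defaultdict(list)
    for (label, x), (_, next_x) in zip(ordered, ordered[1:]):
        gaps[label].append(next_x - x)
    return {label: median(distances) for label, distances in gaps.items()}


samples = [("a", 0), ("b", 5), ("a", 12), ("b", 17), ("a", 24)]
print(estimate_set_widths(samples))
```

The last glyph on the line has no following origin, so it contributes no distance in this sketch.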

5. The method of claim 1 of operating the machine to train bitmapped character templates wherein

the transcription associated with the text line image source of glyph samples is a tag transcription including at least one nonliteral transcription label, hereafter referred to as a tag, indicating at least one character code representing a character with which a respective glyph in the text line image source of glyph samples cannot be paired by visual inspection thereof; the at least one character code indicated by the tag indicating markup information about the text line image source of glyph samples; the markup information, when interpreted by a document processing operation, producing at least one display feature included in the text line image source of glyph samples perceptible as a visual formatting characteristic of the text line image source of glyph samples; and

the processor, in producing the respectively paired glyph labels using the transcription and the mapping data,

uses the spatial positioning information about the plurality of glyphs to identify at least one glyph sample in the text line image of glyph samples related to the tag, and

uses the at least one character code indicated by the tag in producing the respectively paired glyph label paired with the at least one glyph sample identified.
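
To make the role of a tag concrete, the sketch below turns a tag transcription into glyph labels in which the tag contributes no glyph of its own but changes the label of the glyphs it governs. The angle-bracket tag syntax, the `FONT_TAGS` table, and the `(character, font)` label format are hypothetical. The sketch also works on the transcription alone, whereas the claim additionally relies on spatial positioning information to find the affected glyph samples.

```python
import re

# Assumed tag syntax: <name> opens a font, </name> closes it.
TOKEN = re.compile(r"</?[a-z]+>|.")
FONT_TAGS = {"i": "italic", "b": "bold"}   # hypothetical tag-to-font table


def glyph_labels(tag_transcription):
    """Return (character, font) labels for the visible glyphs of a line."""
    fonts = ["roman"]
    labels = []
    for tok in TOKEN.findall(tag_transcription):
        if len(tok) > 1:                     # a tag: it has no glyph of its own
            name = tok.strip("</>")
            if name not in FONT_TAGS:
                continue                     # unknown tags are ignored here
            if tok.startswith("</"):
                if len(fonts) > 1:
                    fonts.pop()
            else:
                fonts.append(FONT_TAGS[name])
        elif not tok.isspace():
            labels.append((tok, fonts[-1]))
    return labels


print(glyph_labels("a <i>bc</i> d"))
```

With labels of this kind, samples of an italic "b" and a roman "b" would be collected for separate templates.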

6. The method of claim 1 wherein

the text line image source model is represented as a stochastic finite state network data structure indicating a regular grammar, hereafter referred to as a text line image source network; the text line image source network modeling the text line image source of glyph samples as a series of nodes and transitions between pairs of the nodes; the text line image source network representing the mapping data mapping a respective one of the glyphs occurring in the text line image source of glyph samples to a glyph label as at least one sequence of transitions from a first node to a final node, called a path, through the text line image source network; the path indicating path data items associated therewith and accessible by the processor; the path data items associated with the path indicating a pairing between substantially each one of the plurality of glyphs occurring in the text line image source of glyph samples and a glyph label indicating a character in the glyph sample character set;

the transcription data structure associated with the text line image source of glyph samples is represented as a finite state network data structure, hereafter referred to as a transcription network, modeling a set of transcriptions as a series of transcription nodes and a sequence of transcription transitions between pairs of the transcription nodes; each transcription transition having a transcription label associated therewith; at least one sequence of transcription transitions, called a transcription path, through the transcription network indicating the ordered arrangement of the transcription labels in one of the transcriptions included in the set of transcriptions; and

the processor, in determining the glyph sample image origin positions and the respectively paired glyph labels, merges the series of nodes of the text line image source network with the series of transcription nodes of the transcription network to produce a transcription-image network indicating modified mapping data mapping a respective one of the transcription labels included in the transcription to a respective glyph sample image origin position and to a respectively paired glyph label indicating the character in the glyph sample character set; the transcription-image network representing the modified mapping data as at least one complete transcription-path through the transcription-image network; the at least one complete transcription-path indicating the path data items; performs a decoding operation on the text line image source of glyph samples using the transcription-image network to produce the at least one complete transcription-path; and obtains the glyph sample image origin positions and the respectively paired glyph labels using the path data items indicated by the at least one complete transcription-path.
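
The merging step of claim 6 resembles composing two finite state networks. The sketch below is a minimal version under stated assumptions: the image network is a list of `(src, dst, message, template, displacement)` tuples, the transcription network is a list of `(src, dst, label)` tuples, and transitions with an empty message (white space, for example) consume no transcription label. These tuple layouts and the function name are illustrative only.

```python
def merge_networks(image_net, transcription_net):
    """Pair image-network transitions with transcription transitions.

    Nodes of the merged network are (image node, transcription node) pairs.
    A transition survives only if its message matches a transcription label,
    or if it has an empty message and therefore consumes no label.
    """
    t_nodes = {n for src, dst, _ in transcription_net for n in (src, dst)}
    merged = []
    for i_src, i_dst, message, template, dx in image_net:
        if message == "":
            # Copy label-free transitions at every transcription node.
            for n in sorted(t_nodes):
                merged.append(((i_src, n), (i_dst, n), "", template, dx))
        else:
            for t_src, t_dst, label in transcription_net:
                if label == message:
                    merged.append(((i_src, t_src), (i_dst, t_dst),
                                   message, template, dx))
    return merged


# One-node image network that accepts any string over {a, b} plus white space.
IMAGE_NET = [("n", "n", "a", "T_a", 5),
             ("n", "n", "b", "T_b", 7),
             ("n", "n", "", None, 1)]
# Linear transcription network for the single transcription "ab".
TRANSCRIPTION_NET = [(0, 1, "a"), (1, 2, "b")]

for transition in merge_networks(IMAGE_NET, TRANSCRIPTION_NET):
    print(transition)
```

The merged network keeps the templates and displacements of the image network but only admits paths whose messages spell out the transcription in order.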

7. The method of claim 6 wherein

the path data items associated with the complete transcription-path include a message string, a character template and an image displacement;

the transcription labels associated with transitions in the transcription network are message strings in the transcription-image network such that the transcription-image network models a relationship between each transcription label and a glyph occurring in the text line image of glyph samples; and

the processor determines, for transitions in the transcription-path having non-null character templates associated therewith, the glyph sample image origin position of each glyph by using the image displacement associated with the respective transition, and determines the respectively paired glyph label using a character label indicated by the non-null character template indicating the character in the glyph sample character set represented by the respective glyph sample.
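
The bookkeeping described in claim 7 can be sketched as a single walk along the decoded path. The sketch assumes each transition is a `(message, template, displacement)` tuple, that a template is represented simply by its character label, and that a template is placed at the current position before the displacement is applied. The last point is an assumption about ordering that this section does not state.

```python
def samples_from_path(path, start_x=0):
    """Return (glyph label, origin_x) pairs for transitions with a template.

    path : list of (message, template, displacement) tuples, where template
           is None for transitions that image nothing (e.g. white space)
    """
    x = start_x
    samples = []
    for _message, template, displacement in path:
        if template is not None:
            samples.append((template, x))   # label comes from the template
        x += displacement                   # advance to the next origin
    return samples


path = [("a", "a", 5), ("", None, 2), ("b", "b", 7)]
print(samples_from_path(path))
```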

8. The method of claim 6 wherein operating the processor to perform the decoding operation on the text line image source of glyph samples to produce the complete transcription-path includes

producing a plurality of complete transcription-paths through the transcription-image network; each complete transcription-path indicating a target text line ideal image;

computing a target image value for each one of the plurality of target text line ideal images by comparing, for each target text line ideal image, color values indicated by pixels defining the text line image source of glyph samples with color values of respectively paired pixels defining the target text line ideal image; and

determining one of the plurality of complete transcription-paths as a best complete transcription-path using the target image values.
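
Claim 8 selects a path by comparing the observed line with the ideal image each candidate path would produce. The sketch below uses one-row images and a plain count of agreeing pixels as the target image value; that scoring rule, the function names, and the toy data are assumptions, since the claim only requires that pixel color values be compared.

```python
def render(path, templates, width):
    """Draw the ideal one-row image for a path of (label, origin_x) pairs."""
    ideal = [0] * width
    for label, x0 in path:
        for dx, value in enumerate(templates[label]):
            if value and 0 <= x0 + dx < width:
                ideal[x0 + dx] = 1
    return ideal


def best_path(observed, paths, templates):
    """Return the candidate path whose ideal image agrees most with `observed`."""
    def target_image_value(path):
        ideal = render(path, templates, len(observed))
        return sum(o == i for o, i in zip(observed, ideal))
    return max(paths, key=target_image_value)


TEMPLATES = {"a": [1, 1, 0], "b": [1, 0, 1]}
OBSERVED = [1, 1, 0, 1, 0, 1]
CANDIDATES = [
    [("a", 0), ("b", 3)],
    [("a", 1), ("b", 4)],
    [("a", 0), ("b", 2)],
]

print(best_path(OBSERVED, CANDIDATES, TEMPLATES))
```

Enumerating every path is only practical for toy inputs; the dynamic programming formulation of claim 9 avoids the enumeration.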

9. The method of claim 6 wherein operating the processor to perform a decoding operation on the text line image source of glyph samples to produce the complete transcription-path includes

performing a dynamic programming based decoding operation to compute an optimum score at each of a plurality of lattice nodes in a decoding lattice data structure representing the transcription-image network; the dynamic programming based decoding operation producing and storing an optimizing transition identification data item for each lattice node in the decoding lattice; the optimizing transition identification data item being produced as a result of computing the optimum score and indicating one of a plurality of possible transitions into a respective one of the lattice nodes that optimizes the score for the respective lattice node; and

performing a backtracing operation to retrieve a sequence of transitions indicating a decoding lattice path; the backtracing operation starting with a final lattice node and ending with a first lattice node in the decoding lattice path; the sequence of transitions being retrieved using the optimizing transition identification data item produced for each lattice node as a result of computing the optimum scores; the decoding lattice path indicating the complete transcription-path through the transcription-image network.
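
The two operations of claim 9 (a dynamic programming pass that stores the best incoming transition at each lattice node, then a backtrace from the final node) can be sketched for a one-row image as follows. Lattice nodes are `(k, x)` pairs meaning "k transcription labels consumed, current position x". The match score, the one-pixel white-space transition with an ink penalty, and all names are assumptions chosen to keep the example small.

```python
def align(observed, transcription, templates, set_widths):
    """Forced alignment of a transcription to a one-row image.

    Returns (label, origin_x) pairs recovered by backtracing.
    """
    W, K = len(observed), len(transcription)
    NEG = float("-inf")
    score = [[NEG] * (W + 1) for _ in range(K + 1)]
    back = [[None] * (W + 1) for _ in range(K + 1)]   # best incoming transition
    score[0][0] = 0.0

    def match(label, x):
        # Assumed score: +1 per agreeing pixel, -1 per disagreeing pixel.
        return sum(1 if observed[x + i] == v else -1
                   for i, v in enumerate(templates[label]))

    for k in range(K + 1):
        for x in range(W + 1):
            s = score[k][x]
            if s == NEG:
                continue
            if x < W:                                  # white-space transition
                cand = s - observed[x]                 # penalise skipped ink
                if cand > score[k][x + 1]:
                    score[k][x + 1] = cand
                    back[k][x + 1] = (k, x, None)
            if k < K:                                  # glyph transition
                label = transcription[k]
                nx = x + set_widths[label]
                if x + len(templates[label]) <= W and nx <= W:
                    cand = s + match(label, x)
                    if cand > score[k + 1][nx]:
                        score[k + 1][nx] = cand
                        back[k + 1][nx] = (k, x, label)

    # Backtrace from the final node (K, W) to the first node (0, 0).
    path, node = [], (K, W)
    while back[node[0]][node[1]] is not None:
        k, x, label = back[node[0]][node[1]]
        if label is not None:
            path.append((label, x))
        node = (k, x)
    return path[::-1]


TEMPLATES = {"a": [1, 1, 0], "b": [1, 0, 1]}
SET_WIDTHS = {"a": 3, "b": 3}
OBSERVED = [0, 1, 1, 0, 0, 1, 0, 1]

print(align(OBSERVED, "ab", TEMPLATES, SET_WIDTHS))
```

Rows of the lattice are processed in order of `k` and positions left to right, so every node's score is final before its outgoing transitions are evaluated.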

10. The method of claim 6 wherein

the character template associated with a transition in the transcription-image network is one of a plurality of initial character templates representing a respective character in the glyph sample character set;

the decoding operation uses the plurality of initial character templates to produce the complete transcription-path; and

after producing the set of bitmapped character templates using the text line image source of glyph samples, the glyph sample image origin positions and the respectively paired glyph labels thereof, performing at least one additional iteration of the steps of performing the decoding operation, obtaining the glyph sample image origin positions and respectively paired glyph labels, and producing the set of bitmapped character templates; wherein the at least one additional iteration of the decoding operation uses the set of bitmapped character templates produced in a prior iteration as the plurality of initial character templates.

11. The method of claim 1 wherein the processor, prior to determining the glyph sample image origin positions and the respectively paired glyph labels, produces the text line image source of glyph samples by performing a text-line segmentation operation on an input two-dimensional (2D) image source of glyph samples.
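
Claim 11 does not say how the text-line segmentation is performed. As one common possibility, the sketch below splits a binary page image into lines with a horizontal projection profile: consecutive rows containing any foreground pixel form one line. The technique, the function name, and the toy page are assumptions introduced for illustration and may differ from what the patent describes elsewhere.

```python
def segment_lines(page):
    """Return (top, bottom) row ranges, bottom exclusive, for each text line.

    page : list of rows of 0/1 pixels; a row belongs to a line if it
           contains at least one foreground pixel
    """
    lines, start = [], None
    for y, row in enumerate(page):
        has_ink = any(row)
        if has_ink and start is None:
            start = y                       # a new line begins
        elif not has_ink and start is not None:
            lines.append((start, y))        # the current line ends
            start = None
    if start is not None:                   # line touching the bottom edge
        lines.append((start, len(page)))
    return lines


PAGE = [[0, 0, 0],
        [1, 0, 1],
        [0, 1, 0],
        [0, 0, 0],
        [0, 0, 0],
        [1, 1, 0]]

print(segment_lines(PAGE))
```

Each returned range can then be cropped from the page and used as a text line image source of glyph samples.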

12. A method of operating a machine to train a set of bitmapped character templates for use in a recognition system; the machine including a processor and a memory device for storing data; the data stored in the memory device including instruction data the processor executes to operate the machine; the processor being connected to the memory device for accessing the data stored therein; the method comprising:

operating the processor to receive and store an image definition data structure defining an image including a plurality of glyphs occurring therein indicating a single line of text, hereafter referred to as a text line image source of glyph samples; each glyph occurring in the text line image source of glyph samples being an image instance of a respective one of a plurality of characters in a character set, hereafter referred to as a glyph sample character set; each one of the set of bitmapped character templates being trained representing a respective one of the plurality of characters in the glyph sample character set;

operating the processor to receive and store in the memory device a text line image source model data structure, hereafter referred to as a text line image source model; the text line image source model modeling the text line image source of glyph samples as a grammar and including spatial positioning data modeling spatial positioning of the plurality of glyphs occurring in the text line image source of glyph samples;

operating the processor to determine a plurality of glyph samples occurring in the text line image source of glyph samples using the spatial positioning data included in the text line image source model;

operating the processor to produce a glyph label data item, hereafter referred to as a respectively paired glyph label, paired with each glyph sample occurring in the text line image source of glyph samples; the respectively paired glyph label indicating the respective one of the characters in the glyph sample character set represented by the glyph sample; the processor, in producing each respectively paired glyph label, using mapping data included in the text line image source model mapping a respective one of the glyphs occurring in the text line image source of glyph samples to a glyph label indicating the character in the glyph sample character set represented by the respective glyph; the processor, further in producing each respectively paired glyph label, using a text line transcription data structure associated with the text line image source of glyph samples including an ordered arrangement of transcription label data items; the text line transcription data structure including at least one nonliteral transcription label, hereafter referred to as a tag, indicating at least one character code representing a character with which a respective glyph in the text line image source of glyph samples cannot be paired by visual inspection thereof; the at least one character code indicated by the tag indicating markup information about the text line image source of glyph samples; the markup information, when interpreted by a document processing operation, producing at least one display feature included in the text line image source of glyph samples perceptible as a visual formatting characteristic of the text line image source of glyph samples; the text line transcription data structure being hereafter referred to as a tag transcription; the processor, in producing the respectively paired glyph label using the tag transcription and the mapping data,

using the spatial positioning information about the plurality of glyphs to identify the glyph sample in the text line image of glyph samples related to the tag, and

using the tag in producing the respectively paired glyph label paired with the glyph sample identified; and

operating the processor to produce the set of bitmapped character templates indicating the characters in the glyph sample character set using the glyph samples identified by the respectively paired glyph labels.

13. The method of claim 12 wherein

the text line image source model is represented as a stochastic finite state network data structure indicating a regular grammar, hereafter referred to as a text line image source network; the text line image source network modeling the text line image source of glyph samples as a series of nodes and transitions between pairs of the nodes; the text line image source network representing the mapping data mapping a respective one of the glyphs occurring in the text line image source of glyph samples to a glyph label as a sequence of transitions from a first node to a final node, called a path, through the text line image source network; each transition having path data items accessible by the processor associated therewith; the path data items including a message string, a character template and an image displacement; the path data items indicating a pairing between substantially each one of the plurality of glyphs occurring in the text line image source of glyph samples and a glyph label indicating a character in the glyph sample character set;

the tag transcription associated with the text line image source of glyph samples is represented as a finite state network data structure, hereafter referred to as a tag transcription network, modeling a set of tag transcriptions, produced as an output of a recognition operation performed on the text line image source of glyph samples, as a series of transcription nodes and a sequence of transcription transitions between pairs of the transcription nodes; each transition having a transcription label associated therewith; one transcription transition having the tag associated therewith; at least one sequence of transcription transitions, called a transcription path, through the tag transcription network indicating the ordered arrangement of the transcription labels in one of the tag transcriptions included in the set of tag transcriptions; and

the processor, in determining the glyph samples and the respectively paired glyph labels, merges the series of nodes of the text line image source network with the series of transcription nodes of the tag transcription network to produce a transcription-image network indicating modified mapping data mapping a respective one of the transcription labels included in the tag transcription to a respective glyph sample and to a respectively paired glyph label indicating the character in the glyph sample character set; transcription labels associated with transcription transitions in the tag transcription network becoming message strings associated with transitions in the transcription-image network; the tag associated with the transcription transition in the tag transcription network becoming a message string associated with a transition included in the transcription-image network such that the transcription-image network models a relationship between the tag and at least one glyph occurring in the text line image of glyph samples; the transcription-image network representing the modified mapping data as a complete transcription-path through the transcription-image network; the complete transcription-path indicating the path data items; performs a decoding operation on the text line image source of glyph samples using the transcription-image network to produce the complete transcription-path; and determines, for each respective one of the transitions in the transcription-path having a non-null character template associated therewith, the glyph sample indicated by the transition using the image displacement associated therewith, and determines the respectively paired glyph label using a character label indicated by the non-null character template indicating the character in the glyph sample character set represented by a respective glyph sample.

14. The method of claim 13 wherein operating the processor to perform the decoding operation on the text line image source of glyph samples to produce the complete transcription-path includes

producing a plurality of complete transcription-paths through the transcription-image network; each complete transcription-path indicating a target text line ideal image;

computing a target image value for each of the plurality of target text line ideal images by comparing, for each target text line ideal image, color values indicated by pixels defining the text line image source of glyph samples with color values of respectively paired pixels defining the target text line ideal image; and

determining one of the plurality of complete transcription-paths as a best complete transcription-path using the target image values.

15. The method of claim 13 wherein operating the processor to perform a decoding operation on the text line image source of glyph samples to produce the complete transcription-path includes

performing a dynamic programming based decoding operation to compute an optimum score at each of a plurality of lattice nodes in a decoding lattice data structure representing the transcription-image network; the dynamic programming based decoding operation producing and storing an optimizing transition identification data item for each lattice node in the decoding lattice; the optimizing transition identification data item being produced as a result of computing the optimum score and indicating one of a plurality of possible transitions into a respective one of the lattice nodes that optimizes the score for the respective lattice node; and

performing a backtracing operation to retrieve a sequence of transitions indicating a decoding lattice path; the backtracing operation starting with a final lattice node and ending with a first lattice node in the decoding lattice path; the sequence of transitions being retrieved using the optimizing transition identification data item produced for each lattice node as a result of computing the optimum scores; the decoding lattice path indicating the complete transcription-path through the transcription-image network.
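
The two steps of claim 15 can be illustrated with a minimal sketch, offered under stated assumptions rather than as the patented implementation. It assumes the decoding lattice is given as a list of scored transitions between integer-numbered nodes, that every transition runs from a lower-numbered node to a higher-numbered one, and that node 0 is the first node and the highest-numbered node is the final node. The function name `decode` and the tuple layout are hypothetical.

```python
def decode(num_nodes, transitions):
    """Best-path search over a lattice with back pointers.

    transitions: list of (src, dst, label, score) with src < dst.
    Returns (optimum score at the final node, labels along the best path).
    """
    neg_inf = float("-inf")
    best = [neg_inf] * num_nodes   # optimum score at each lattice node
    best[0] = 0.0
    backptr = [None] * num_nodes   # optimizing transition into each node

    # Forward pass: because src < dst, visiting transitions in order of dst
    # guarantees best[src] is final before it is used.
    for transition in sorted(transitions, key=lambda t: t[1]):
        src, dst, _label, score = transition
        if best[src] == neg_inf:
            continue               # source node not reachable
        candidate = best[src] + score
        if candidate > best[dst]:
            best[dst] = candidate
            backptr[dst] = transition

    # Backtrace: start at the final node and follow the stored transitions.
    labels = []
    node = num_nodes - 1
    while node != 0:
        transition = backptr[node]
        if transition is None:
            raise ValueError("final node is not reachable from the first node")
        labels.append(transition[2])
        node = transition[0]
    labels.reverse()
    return best[num_nodes - 1], labels


transitions = [
    (0, 1, "a", 2.0),
    (0, 2, "b", 5.0),
    (1, 2, "c", 4.0),
    (2, 3, "d", 1.0),
    (1, 3, "e", 3.5),
]
score, path = decode(4, transitions)
```

Storing one optimizing transition per node keeps the backtrace linear in the path length, which is the role the claim assigns to the optimizing transition identification data item.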

16. The method of claim 13 wherein

the character template associated with a transition in the transcription-image network is one of a plurality of initial character templates representing a respective character in the glyph sample character set;

the decoding operation uses the plurality of initial character templates to produce the complete transcription-path; and

after producing the set of bitmapped character templates using the glyph samples and the respectively paired glyph labels thereof, performing at least one additional iteration of the steps of performing the decoding operation, determining the glyph samples and respectively paired glyph labels, and producing the set of bitmapped character templates; wherein the at least one additional iteration of the decoding operation uses the set of bitmapped character templates produced in a prior iteration as the plurality of initial character templates.
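
The iteration recited in claim 16 may be pictured with the following control-flow sketch. It is not the claimed implementation: `decode` and `build_templates` are hypothetical callables standing in for the decoding operation and the template production step, and the toy stand-ins below exist only to show which templates each decoding pass receives.

```python
def train_iteratively(line_image, initial_templates, decode, build_templates,
                      extra_iterations=1):
    """Alternate decoding and template construction.

    Each decoding pass after the first uses the templates produced by the
    prior pass as its initial character templates.
    """
    templates = initial_templates
    for _ in range(1 + extra_iterations):
        labeled_samples = decode(line_image, templates)
        templates = build_templates(labeled_samples)
    return templates


# Toy stand-ins (assumptions for illustration only).
templates_seen_by_decoder = []


def toy_decode(line_image, templates):
    templates_seen_by_decoder.append(templates)
    return [("a", line_image)]


def toy_build_templates(labeled_samples):
    return "templates-v%d" % len(templates_seen_by_decoder)


result = train_iteratively("line", "initial", toy_decode, toy_build_templates)
```

With one additional iteration, the decoder sees first the initial templates and then the templates produced by the first pass.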

17. The method of claim 12 of operating the machine to train bitmapped character templates wherein the processor, prior to determining the glyph samples occurring in the text line image source of glyph samples and respectively paired glyph labels thereof, produces the text line image source of glyph samples by performing a text-line segmentation operation on an input two-dimensional (2D) image source of glyph samples.
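
Claim 17 does not prescribe how the text-line segmentation operation is carried out. As one possible illustration, the sketch below uses a horizontal projection profile, a common technique that is swapped in here as an assumption: consecutive rows containing at least one foreground pixel are grouped into a text line. The function name `segment_text_lines` is hypothetical.

```python
def segment_text_lines(page):
    """Return (top, bottom) row ranges of text lines; bottom is exclusive.

    page: 2D list of pixels, 1 = foreground, 0 = background.
    A row belongs to a text line if it contains any foreground pixel.
    """
    lines = []
    start = None
    for y, row in enumerate(page):
        has_ink = any(row)
        if has_ink and start is None:
            start = y                    # a new text line begins
        elif not has_ink and start is not None:
            lines.append((start, y))     # the current text line ends
            start = None
    if start is not None:
        lines.append((start, len(page)))  # line runs to the page bottom
    return lines


page = [
    [0, 0, 0],
    [1, 0, 1],
    [0, 1, 0],
    [0, 0, 0],
    [1, 1, 0],
]
line_ranges = segment_text_lines(page)
line_images = [page[top:bottom] for top, bottom in line_ranges]
```

Each entry of `line_images` could then serve as a text line image source of glyph samples.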

18. The method of claim 12 of operating the machine to train bitmapped character templates wherein

each of the set of bitmapped character templates is based on a character template model having a characteristic image positioning property such that when a first rectangular bounding box entirely contains a first character image, and a second rectangular bounding box entirely contains a second character image adjacent to the first character image, the first rectangular bounding box does not substantially overlap with the second rectangular bounding box; and

the step of operating the processor to determine the glyph samples occurring in the text line image source of glyph samples includes determining, for each glyph sample, image coordinates of a glyph sample bounding box in the text line image source of glyph samples that entirely defines image dimensions of a respective glyph sample.

19. The method of claim 18 wherein the step of operating the processor to produce the set of bitmapped character templates includes producing the bitmapped character templates from the text line image source of glyph samples using the image coordinates of glyph sample bounding boxes to define the image dimensions of the respective glyph samples.
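
For the segmentation-based case of claims 18 and 19, a minimal sketch of using bounding box coordinates to define glyph sample dimensions is given below. The coordinate convention (left, top, right, bottom, with right and bottom exclusive) and the names `crop` and `boxes_overlap` are assumptions made for illustration.

```python
def crop(image, box):
    """Extract the pixels inside box = (left, top, right, bottom)."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def boxes_overlap(a, b):
    """True if two boxes share at least one pixel position."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


line_image = [
    [1, 1, 0, 0, 1, 0],
    [1, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 1, 0],
]

# Hypothetical glyph sample bounding boxes found in the line image.
first_box = (0, 0, 2, 3)
second_box = (3, 0, 6, 3)

first_sample = crop(line_image, first_box)
second_sample = crop(line_image, second_box)
disjoint = not boxes_overlap(first_box, second_box)
```

Because the two boxes do not overlap, each sample is fully defined by its own bounding box, as the segmentation-based model requires.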

20. The method of claim 18 wherein

the step of operating the processor to determine the glyph samples further includes, for each glyph sample, producing an image definition data structure defining an isolated glyph sample using the image coordinates of the glyph sample bounding box of the respective glyph sample; and

the step of operating the processor to produce the set of bitmapped character templates includes, for each respective one of the set of bitmapped character templates, identifying the image definition data structures defining the isolated glyph samples as samples of the character in the glyph sample character set indicated by the respective bitmapped character template using respectively paired glyph labels, and assigning a foreground pixel color value to selected ones of a plurality of pixel positions included in the respective bitmapped character template using pixel color values included in the isolated glyph samples identified.
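
Claim 20 leaves open how the pixel positions to receive the foreground color are selected. The sketch below assumes a simple per-pixel vote: a template pixel is set to foreground when more than a chosen fraction of the isolated samples carrying the same glyph label are foreground at that position. It further assumes that all samples sharing a label have the same dimensions. The name `build_templates` and the `threshold` parameter are hypothetical.

```python
from collections import defaultdict


def build_templates(labeled_samples, threshold=0.5):
    """Build one bitmapped template per glyph label by per-pixel voting.

    labeled_samples: list of (label, bitmap) pairs; bitmaps with the same
    label are assumed to have identical dimensions.
    """
    samples_by_label = defaultdict(list)
    for label, bitmap in labeled_samples:
        samples_by_label[label].append(bitmap)

    templates = {}
    for label, bitmaps in samples_by_label.items():
        height, width = len(bitmaps[0]), len(bitmaps[0][0])
        count = len(bitmaps)
        templates[label] = [
            [
                1 if sum(b[y][x] for b in bitmaps) / count > threshold else 0
                for x in range(width)
            ]
            for y in range(height)
        ]
    return templates


labeled_samples = [
    ("l", [[1, 0], [1, 0]]),
    ("l", [[1, 0], [1, 1]]),   # noisy sample: stray pixel at lower right
    ("l", [[1, 1], [1, 0]]),   # noisy sample: stray pixel at upper right
    ("o", [[1, 1], [1, 1]]),
]
templates = build_templates(labeled_samples)
```

The stray pixels each appear in only one of three samples, so they fall below the threshold and are left out of the template.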

21. The method of claim 12 of operating the machine to train bitmapped character templates wherein

the bitmapped character templates are based on a character template model having a characteristic image positioning property such that, when a second bitmapped character template is positioned in an image with a template image origin position thereof displaced from a template image origin position of a preceding first bitmapped character template by a character set width thereof, and when a first bounding box entirely containing the first bitmapped character template overlaps in the image with a second bounding box entirely containing the second bitmapped character template, the first and second bitmapped character templates have substantially nonoverlapping foreground pixels;

the step of operating the processor to determine the glyph samples occurring in the text line image source of glyph samples and respectively paired glyph labels thereof includes determining an image position in the text line image source of glyph samples indicating a glyph sample image origin position of each glyph sample; and

the step of operating the processor to produce the bitmapped character templates includes using the glyph sample image origin positions to determine sample image regions in the text line image source of glyph samples for use in producing the bitmapped character templates; the processor identifying a template image origin position for each bitmapped character template produced.
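
As an illustration of the last step of claim 21, and not a statement of the claimed method, the sketch below copies a fixed-size sample image region around each glyph sample image origin position. The window extents (`left`, `right`, `above`, `below`) are assumptions; the claim does not specify how the region is sized. Pixels falling outside the line image are treated as background. Regions taken around neighboring origins are allowed to overlap, consistent with the sidebearing model.

```python
def sample_region(image, origin, left=1, right=2, above=2, below=1):
    """Copy a window around origin = (x, y) from a binary image.

    Returns the region and the position of the origin inside it, which
    can serve as the template image origin position.
    """
    origin_x, origin_y = origin
    height, width = len(image), len(image[0])
    region = []
    for y in range(origin_y - above, origin_y + below):
        row = []
        for x in range(origin_x - left, origin_x + right):
            inside = 0 <= y < height and 0 <= x < width
            row.append(image[y][x] if inside else 0)
        region.append(row)
    return region, (left, above)


line_image = [
    [0, 1, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 1, 0, 1, 1],
]

# Hypothetical glyph sample image origin positions on the bottom row.
first_region, first_origin = sample_region(line_image, (1, 2))
second_region, second_origin = sample_region(line_image, (3, 2))
```

Both regions include image column 2, showing that sample regions located by origin position need not be disjoint rectangles.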
Description

CROSS REFERENCE TO OTHER APPLICATIONS

The invention of the present application is related to several other inventions that are the subject matter of copending, commonly assigned U.S. patent applications, respectively identified as Ser. No. 08/431,223, "Automatic Training of Character Templates Using a Transcription and a Two-Dimensional Image Source Model"; Ser. No. 08/431,714, "Method of Producing Character Templates Using Unsegmented Samples"; Ser. No. 08/430,635, "Unsupervised Training of Character Templates Using Unsegmented Samples"; and Ser. No. 08/460,454, "Method and System for Automatic Transcription Correction".

FIELD OF THE INVENTION

The present invention relates generally to the field of computer-implemented methods of and systems for pattern recognition, and more particularly to a method of, and machine for, training bitmapped character templates for use in computer-implemented systems for document image decoding and character recognition.

BACKGROUND

Information in the form of language symbols (i.e., characters) or other symbolic notation that is visually represented to a human in an image on a marking medium, such as a computer display screen or paper, is capable of manipulation for its semantic content by a processor included in a computer system when the information is accessible to the processor in an encoded form, such as when each of the language symbols is available to the processor as a respective character code selected from a predetermined set of character codes (e.g., ASCII code) that represent the symbols to the processor. An image is typically represented in a computer system as a two-dimensional array of image data, with each item of data in the array providing a value indicating the color (typically black or white) of a respective location of the image. An image represented in this manner is frequently referred to as a bitmapped or binary image. Each location in a binary image is conventionally referred to as a picture element, or pixel. Sources of bitmapped images include images produced by scanning a paper form of a document using an optical scanner, or by receiving image data via facsimile transmission of a paper document. When manipulation of the semantic content of the characters in an image by a processor is desirable, a process variously called "recognition," "character recognition," or "optical character recognition" must be performed on the image in order to produce, from the images of characters, a sequence of character codes that is capable of being manipulated by the processor.

Character recognition systems typically include a process in which the appearance of an isolated, input character image, or "glyph," is analyzed and, in a decision making process, classified as a distinct character in a predetermined set of characters. The term "glyph" refers to an image that represents a realized instance of a character. The classification analysis typically includes comparing characteristics of the isolated input glyph (e.g., its pixel content or other characteristics) to units of reference information about characters in the character set, each of which defines characteristics of the "ideal" visual representation of a character in its particular size, font and style, as it would appear in an image if there were no noise or distortion introduced by the image creation process. The unit of reference information for each character, typically called a "character template," "template" or "prototype," includes identification information, referred to as a "character label," that uniquely identifies the character as one of the characters in the character set. The character label may also include such information as the character's font, point size and style. A character label is output as the identification of the input glyph when the classification analysis determines that a sufficient match between the glyph and the reference information indicating the character label has been made.
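
The classification step described above can be illustrated with a minimal sketch. It assumes the isolated glyph and every template are binary bitmaps of the same dimensions, and that a "sufficient match" means the fraction of agreeing pixels reaches a chosen minimum; neither assumption comes from the text. The names `classify` and `min_score` are hypothetical.

```python
def classify(glyph, templates, min_score=0.8):
    """Return the character label of the best matching template, or None.

    templates: dict mapping a character label to a bitmap with the same
    dimensions as glyph. The score is the fraction of agreeing pixels.
    """
    best_label, best_score = None, 0.0
    for label, template in templates.items():
        total = len(template) * len(template[0])
        agreeing = sum(
            1
            for glyph_row, template_row in zip(glyph, template)
            for g, t in zip(glyph_row, template_row)
            if g == t
        )
        score = agreeing / total
        if score > best_score:
            best_label, best_score = label, score
    # Output a label only when the match is sufficient.
    return best_label if best_score >= min_score else None


templates = {
    "I": [[0, 1, 0], [0, 1, 0], [0, 1, 0]],
    "L": [[1, 0, 0], [1, 0, 0], [1, 1, 1]],
}

noisy_i = [[0, 1, 0], [0, 1, 0], [0, 1, 1]]   # one stray pixel
blob = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]      # matches nothing well

label_for_noisy_i = classify(noisy_i, templates)
label_for_blob = classify(blob, templates)
```

The noisy glyph agrees with the "I" template at eight of nine pixels and is labeled accordingly, while the solid block matches neither template well enough and receives no label.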

The representation of the reference information that comprises a character template may be referred to as its model. Character template models are broadly identifiable as being either bitmapped, or binary, images of characters, or lists of high level "features" of binary character images. "Features" are measurements of a character image that are derived from the binary image and are typically much fewer in number than the number of pixels in the character image. Examples of features include a character's height and width, and the number of closed loops in the character. Within the category of binary character template models, at least two different types of models have been defined. One model may be called the "segmentation-based" model; it describes a character template as fitting entirely within a rectangular region, referred to as a "bounding box," and describes the combining of adjacent character templates as being "disjoint"--that is, requiring nonoverlapping bounding boxes. U.S. Pat. No. 5,321,773 discloses another binary character template model that is based on the sidebearing model of letterform shape description and positioning used in the field of digital typography. The sidebearing model, described in more detail below in the discussion accompanying FIG. 1, describes the combining of templates to permit overlapping rectangular bounding boxes as long as the foreground (typically black) pixels of one template are not shared with, or common with, the foreground pixels of an adjacent template; this is described as requiring the templates to have substantially "disjoint supports."
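
The difference between disjoint bounding boxes and disjoint supports can be made concrete with a small sketch. The bitmaps, the set width value, and the helper names `column_span` and `placed_foreground` are hypothetical; the sketch assumes both templates share the same rows and that each bitmap's left edge sits at its origin unless an offset is given.

```python
def column_span(bitmap, origin_x, x_offset=0):
    """Columns covered by the bounding box of a placed template (end exclusive)."""
    left = origin_x + x_offset
    return left, left + len(bitmap[0])


def placed_foreground(bitmap, origin_x, x_offset=0):
    """Set of (x, y) image positions of the foreground pixels of a placed template."""
    return {
        (origin_x + x_offset + x, y)
        for y, row in enumerate(bitmap)
        for x, pixel in enumerate(row)
        if pixel
    }


# An "f"-like shape whose top stroke overhangs past its set width.
f_bitmap = [
    [0, 1, 1],
    [1, 1, 0],
    [0, 1, 0],
]
f_set_width = 2

# An "i"-like shape that tucks in under the overhang.
i_bitmap = [
    [0],
    [1],
    [1],
]

f_origin = 0
i_origin = f_origin + f_set_width   # next origin is displaced by the set width

f_span = column_span(f_bitmap, f_origin)
i_span = column_span(i_bitmap, i_origin)
boxes_overlap = f_span[0] < i_span[1] and i_span[0] < f_span[1]
supports_disjoint = placed_foreground(f_bitmap, f_origin).isdisjoint(
    placed_foreground(i_bitmap, i_origin)
)
```

Here the bounding boxes overlap in image column 2, which the segmentation-based model would forbid, yet no foreground pixel is shared, so the pair satisfies the disjoint supports condition of the sidebearing model.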

1. Overview of Training Character Templates For Recognition Systems

Training character templates is the process of using training data to create, produce or update the templates used for the recognition process. Training data can be broadly defined as a collection of character image samples, each with an assigned character label identifying the character in the character set that it represents, that provide the information necessary to produce templates according