WikiPatents - Community Patent Review
Information processing methodology    
United States Patent 5369508
Link to this page: http://www.wikipatents.com/5369508.html
Inventor(s): Lech; Robert (Jackson, NJ); Medina; Mitchell A. (Essex Fells, NJ); Elias; Catherine B. (Plainsboro, NJ)
Abstract: An information processing methodology gives rise to an application program interface which includes an automated digitizing unit, such as a scanner, which inputs information from a diversity of hard copy documents and stores information from the hard copy documents into a memory as stored document information. Portions of the stored document information are selected in accordance with content instructions which designate portions of the stored document information required by a particular application program. The selected stored document information is then placed into the transmission format required by a particular application program in accordance with transmission format instructions. After the information has been transmission formatted, the information is transmitted to the application program. In one operational mode, the interface interactively prompts the user to identify, on a display, portions of the hard copy documents containing information used in application programs or for storage.

 Title Information
Inventor     Lech; Robert (Jackson, NJ); Medina; Mitchell A. (Essex Fells, NJ); Elias; Catherine B. (Plainsboro, NJ)
Owner/Assignee     System X, L. P. (New York, NY)
Publication Date     * November 29, 1994
Application Number     08/143,135
Filing Date     October 29, 1993
US Classification     358/462 358/448 358/449 358/453 382/100 382/305
Int'l Classification     H04N 001/40
Examiner     Hjerpe; Richard
Assistant Examiner     Grant II; Jerome
Attorney/Law Firm     Foley & Lardner
Parent Case     This is a continuation of application Ser. No. 07/672,865, filed Mar. 20, 1991, now U.S. Pat. No. 5,258,855.
USPTO Field of Search     358/462 358/400 358/401 358/403 358/447 358/448 358/449 358/451 358/452 358/453 358/460 358/462 358/463 358/467 358/470 358/471 358/474 382/61 382/48
 References
 U.S. References
Patent No.   Inventor    US Class   Date
5258855      Lech        358/462    Nov, 1993
5153927      Yamanari    382/311    Oct, 1992
5140650      Casey       382/283    Aug, 1992
5095445      Sekiguchi   709/246    Mar, 1992
5034990      Klees                  Jul, 1991
4034343      Wilmer      382/295    Jul, 1977
4667248      Kanno       358/452    Dec, 1969
 Claims
 


What is claimed is:

1. A method of processing information from a diversity of hard copy documents, said method comprising the steps of:

(a) receiving output representing a diversity of hard copy documents from an automated digitizing unit and storing information from said diversity of hard copy documents into a memory, said information not fixed from one document to the next;

(b) identifying portions of said hard copy documents corresponding to a first variable; and

(c) storing information from said portions of said hard copy documents corresponding to said first variable into memory locations for said first variable.

2. A method as set forth in claim 1, wherein step (b) includes displaying an image of a hard copy document on a display based on the contents of said memory.

3. A method as set forth in claim 1, further comprising the steps of:

identifying portions of said hard copy documents corresponding to a second variable, said portions of said hard copy documents corresponding to said second variable being different from said portions of said hard copy documents corresponding to said first variable; and

storing information from said portions of said hard copy documents corresponding to said second variable into memory locations for said second variable.

4. A method as set forth in claim 1, wherein step (b) includes prompting identification of said portions of said hard copy documents corresponding to said first variable.

5. A method as set forth in claim 1, wherein step (c) includes storing image information from said portions of said hard copy documents corresponding to said first variable into said memory locations for said first variable.

6. A method as set forth in claim 1, wherein step (c) includes storing textual information from said portions of said hard copy documents corresponding to said first variable into said memory locations for said first variable.

7. A method as set forth in claim 1, further comprising the steps of detecting and correcting errors resulting from said inputting.

8. A method as set forth in claim 1, further comprising the step of utilizing a template to associate portions of said hard copy documents with specific variables.

9. A method as set forth in claim 1, further comprising receiving instructions identifying at least one character or symbol, located on said hard copy documents, which identifies a location on said hard copy documents containing a value of a specific variable.

10. A method as set forth in claim 1, further comprising receiving instructions identifying at least one character or symbol, located on said hard copy documents, which identifies a relative location on said hard copy documents containing a value of a specific variable.

11. A method as set forth in claim 2, further comprising the steps of:

identifying portions of said hard copy documents corresponding to a second variable, said portions of said hard copy documents corresponding to said second variable being different from said portions of said hard copy documents corresponding to said first variable; and

storing information from said portions of said hard copy documents corresponding to said second variable into memory locations for said second variable.

12. A method as set forth in claim 3, further comprising the step of storing image information from said portions of said hard copy documents corresponding to said second variable into said memory locations for said second variable.

13. A method as set forth in claim 3, further comprising the step of storing textual information from said portions of said hard copy documents corresponding to said second variable into said memory locations for said second variable.

14. A method as set forth in claim 3, further comprising the step of prompting identification of said portions of said hard copy documents corresponding to said second variable.

15. A method of processing information from a diversity of hard copy documents, said method comprising the steps of:

(a) scanning a diversity of hard copy documents and storing information from said diversity of hard copy documents into a memory, said information not fixed from one document to the next;

(b) identifying portions of said hard copy documents corresponding to a first variable; and

(c) storing information from said portions of said hard copy documents corresponding to said first variable into memory locations for said first variable.

16. A method as set forth in claim 15, wherein step (b) includes displaying an image of a hard copy document on a display based on the contents of said memory.

17. A method as set forth in claim 15, further comprising the steps of:

identifying portions of said hard copy documents corresponding to a second variable, said portions of said hard copy documents corresponding to said second variable being different from said portions of said hard copy documents corresponding to said first variable; and

storing information from said portions of said hard copy documents corresponding to said second variable into memory locations for said second variable.

18. A method as set forth in claim 15, wherein step (b) includes prompting identification of said portions of said hard copy documents corresponding to said first variable.

19. A method as set forth in claim 15, wherein step (c) includes storing image information from said portions of said hard copy documents corresponding to said first variable into said memory locations for said first variable.

20. A method as set forth in claim 15, wherein step (c) includes storing textual information from said portions of said hard copy documents corresponding to said first variable into said memory locations for said first variable.

21. A method as set forth in claim 15, further comprising the steps of detecting and correcting errors resulting from said scanning.

22. A method as set forth in claim 15, further comprising the step of utilizing a template to associate portions of said hard copy documents with specific variables.

23. A method as set forth in claim 15, further comprising receiving instructions identifying at least one character or symbol, located on said hard copy documents, which identifies a location on said hard copy documents containing a value of a specific variable.

24. A method as set forth in claim 15, further comprising receiving instructions identifying at least one character or symbol, located on said hard copy documents, which identifies a relative location on said hard copy documents containing a value of a specific variable.

25. A method as set forth in claim 16, further comprising the steps of:

identifying portions of said hard copy documents corresponding to a second variable, said portions of said hard copy documents corresponding to said second variable being different from said portions of said hard copy documents corresponding to said first variable; and

storing information from said portions of said hard copy documents corresponding to said second variable into memory locations for said second variable.

26. A method as set forth in claim 17, further comprising the step of storing image information from said portions of said hard copy documents corresponding to said second variable into said memory locations for said second variable.

27. A method as set forth in claim 17, further comprising the step of storing textual information from said portions of said hard copy documents corresponding to said second variable into said memory locations for said second variable.

28. A method as set forth in claim 17, further comprising the step of prompting identification of said portions of said hard copy documents corresponding to said second variable.

29. A method of processing data extracted from a diversity of hard copy documents, said method comprising the steps of:

(a) receiving output representing a diversity of hard copy documents from an automated digitizing unit and storing information from said diversity of hard copy documents into a memory as stored document information;

(b) selecting portions of said stored document information in accordance with content instructions defining portions of said stored document information required by an application unit;

(c) formatting selected stored document information into a transmission format used by said application unit based on transmission format instructions; and

(d) transmitting formatted selected stored document information to said application unit.

30. A method as set forth in claim 29, wherein step (a) includes storing textual information representing characters on said hard copy documents.

31. A method as set forth in claim 29, wherein step (a) includes storing digitized image information representing the actual appearance of said hard copy documents.

32. A method as set forth in claim 29, further comprising detecting and correcting errors in said stored document information resulting from said inputting.

33. A method as set forth in claim 29, wherein step (a) includes the step of utilizing a template to associate portions of said hard copy documents with specific variables.

34. A method as set forth in claim 29, wherein step (a) includes receiving instructions identifying at least one character or symbol, located on said hard copy documents, which identifies a location on said hard copy documents containing a value of a specific variable.

35. A method as set forth in claim 29, wherein step (a) includes receiving instructions identifying at least one character or symbol, located on said hard copy documents, which identifies a relative location on said hard copy documents containing a value of a specific variable.

36. A method as set forth in claim 29, further comprising the step of printing textual copies of said hard copy documents based on said stored document information.

37. A method as set forth in claim 29, wherein step (a) includes receiving output representing a diversity of hard copy documents from a scanner.

38. A method as set forth in claim 31, further comprising the step of printing copies of said hard copy documents based on said digitized image information.

39. An application program interface, comprising:

an automated digitizing unit which extracts information from a diversity of hard copy documents and stores said information from said diversity of hard copy documents in a memory as stored document information;

a processor selecting portions of said stored document information in accordance with content instructions, said content instructions designating portions of said stored document information required by an application unit;

a formatter formatting selected stored document information into a transmission format used by said application unit based on transmission format instructions; and

an output unit transmitting formatted selected stored document information to said application unit.

40. An interface as set forth in claim 39, wherein said stored document information includes textual information representing characters on said hard copy documents.

41. An interface as set forth in claim 39, wherein said stored document information includes digitized image information representing the actual appearance of said hard copy documents.

42. An interface as set forth in claim 39, further comprising an error correcting unit detecting and correcting errors resulting from extracting by said automated digitizing unit.

43. An interface as set forth in claim 39, further comprising:

a template definition unit for defining a template which associates locations on said hard copy documents with specific variables.

44. An interface as set forth in claim 39, further comprising a search unit for searching for at least one character or symbol, located on said hard copy documents, which identifies a location on said hard copy documents containing a value for a specific variable.

45. An interface as set forth in claim 39, wherein said automated digitizing unit includes a scanner.

46. An interface as set forth in claim 41, further comprising a printer which prints out copies of the actual appearance of said hard copy documents based on said digitized image information.
 Description
 


BACKGROUND OF THE INVENTION

The invention is directed to a system for efficiently processing information originating from hard copy documents. More specifically, the invention is directed to a hard copy document to application program interface which minimizes the need to manually process hard copy documents.

In the past, information contained on hard copy documents was manually entered into a computer via the input controller of a particular computer. The original document was then filed away for future reference. Automatic input of data was limited to the input of Magnetic Ink Character Recognition (MICR) data and to Optical Character Recognition (OCR) data. This fixed-position data was forwarded directly to a dedicated computer application specifically designed to accommodate the input format. In more recent years, typewritten text has been mechanically inputted into a computer via a text file. Examples of this latter type of system are word processors and photo-typesetters.

These conventional systems have limitations which decrease the efficiency of processing information from a hard copy document. For example, the systems discussed above are limited in their application to MICR, OCR, or typewritten data. Parsing and processing data is limited to the particular requirements of the particular computer application which requires the input data. In addition, in these conventional systems, the actual hard copy document must be retained for future reference at great expense.

In a sophisticated computer network, different users may require different portions of the information contained on a hard copy document. For example, if the hard copy document is an invoice returned with payment of a bill, the accounting department may need all of the monetary information contained on the bill while the mailroom may need only customer address information, to update a customer's address. Therefore, there is a need for a system in which specific information from a hard copy document can be selectively distributed to various users.

Another problem with conventional systems is that users, even within the same company, may require that the information extracted from a hard copy document be transmitted to a particular application program in a specific transmission format. For example, one department in a company may use a particular application program which must receive information using a particular character as a delimiter and other departments may require the information in a different format using different delimiters.

Another problem, particularly for small businesses, is that current systems cannot efficiently accommodate the inputting of information from a diversity of hard copy documents. A large business which receives many forms in the same format can afford a system which inputs a high volume of information in that format into memory. For example, it is cost-effective for a bank which processes hundreds of thousands of checks a month to buy a dedicated machine which can read information off of checks having a rigidly defined, or fixed, format. However, as the diversity of forms received by a business increases relative to the number of forms that must be processed, it becomes less cost-effective to design a dedicated machine for processing each type of form format. This problem is particularly significant in small businesses which may, for example, receive fifty invoices a month, all in different, non-fixed, formats. It is frequently not cost-effective for a small business to design dedicated systems for inputting information in each of these various formats. This leaves a small business with no other practical alternative than to manually input the information off of each invoice each month.

SUMMARY OF THE INVENTION

It is an object of the invention, therefore, to provide an application program interface which allows a user to select specific portions of information extracted from a diversity of hard copy documents and allows the user to direct portions of this information to several different users in accordance with the needs of the particular user.

It is also an object of the invention to provide a cost-effective system for inputting hard copy documents which can accommodate hard copy documents in a diversity of formats.

It is another object of the invention to provide an application program interface which allows a user to put information, which is to be transmitted, into a particular transmission format, based upon the needs of the receiver of the information.

It is a further object of the invention to provide an application program interface which will allow the extraction, selection, formatting, routing, and storage of information from a hard copy document in a comprehensive manner such that the hard copy document itself need not be retained.

It is another object of the invention to provide a system which reduces the amount of manual labor required to process information originating from a hard copy document.

A further object of the invention is to reduce the time required to process information originating from a hard copy document so that a higher volume of transactions involving hard copy documents can be processed.

The invention provides an application program interface which inputs a diversity of hard copy documents using an automated digitizing unit and which stores information from the hard copy documents in a memory as stored document information. Portions of the stored document information are selected in accordance with content instructions which define portions of the stored document information required by a particular application unit. Selected stored document information is then formatted into the transmission format used by the particular application program based on transmission format instructions. The transmission formatted selected stored document information is then transmitted to the particular application program. The hard copy documents may contain textual information or image information or both.
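The select, format, and transmit steps described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the dictionary used for stored document information, the sample address and amount, and the use of a list as a stand-in for an application unit are all assumptions made for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical stored document information: variable name -> extracted value.
StoredDocument = Dict[str, str]


def select_fields(stored: StoredDocument, content_instructions: List[str]) -> List[str]:
    """Pick only the variables an application unit asked for, in the order requested."""
    return [stored[name] for name in content_instructions]


def format_record(values: List[str], delimiter: str) -> str:
    """Join the selected values with the delimiter the application unit expects."""
    return delimiter.join(values)


def process(
    stored: StoredDocument,
    content_instructions: List[str],
    delimiter: str,
    transmit: Callable[[str], None],
) -> None:
    """Select, format, then transmit: the steps that follow digitizing."""
    transmit(format_record(select_fields(stored, content_instructions), delimiter))


# Made-up sample values standing in for data extracted from a bill.
stored = {
    "customer_name": "ABC Corporation",
    "customer_address": "12 Example Road",
    "amount_due": "150.00",
}

# Lists stand in for two application units with different needs.
accounting_inbox: List[str] = []
mailroom_inbox: List[str] = []

process(stored, ["customer_name", "amount_due"], "|", accounting_inbox.append)
process(stored, ["customer_name", "customer_address"], ";", mailroom_inbox.append)
```

The same stored document yields two different input files here, one per set of content instructions and delimiter, which mirrors the accounting and mailroom example given in the background.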

The interface operates in three different modes.

In a first mode, the interface extracts all of the information from hard copy documents and stores this information in memory. Parsing of various portions of the extracted information is performed in accordance with content instructions.

In a second mode, the user operates interactively with the interface by use of a display and an input device, such as a mouse. In this second mode, a hard copy document is inputted and displayed on the display. The interface then prompts the user to identify the location of various information. For example, the interface can ask the user to identify the location of address information on the hard copy document. In response, the user positions the mouse to identify address information using a cursor. The identified information is then stored as address information in memory. Subsequently, the interface again prompts the user to identify other pieces of information, which are then stored in the appropriate locations in memory. This process proceeds until all of the information which is desired to be extracted off of the hard copy document is stored in memory.
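A rough Python sketch of this prompt-and-store loop follows. It assumes the page is already available as lines of recognized text and that a region is a (line, start column, end column) triple; both are simplifications, and the `ask_user` callback is a hypothetical stand-in for the mouse and display.

```python
from typing import Callable, Dict, List, Tuple

Region = Tuple[int, int, int]  # (line index, start column, end column), hypothetical


def interactive_extract(
    page: List[str],
    variables: List[str],
    ask_user: Callable[[str], Region],
) -> Dict[str, str]:
    """Prompt for each variable in turn and store the text the user points at."""
    memory: Dict[str, str] = {}
    for name in variables:
        line, start, end = ask_user(f"Identify the location of {name}")
        memory[name] = page[line][start:end].strip()
    return memory


# Made-up page content for illustration.
page = [
    "XYZ Corporation",
    "Bill to: ABC Corporation",
    "Total: 150.00",
]

# Stand-in for mouse input: canned answers keyed by prompt text.
answers = {
    "Identify the location of customer": (1, 9, 24),
    "Identify the location of total": (2, 7, 13),
}

memory = interactive_extract(page, ["customer", "total"], answers.__getitem__)
```

In a real interface the callback would block until the user marks a region on the display; passing it in as a parameter keeps the loop itself testable without any display hardware.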

In a third mode of operation, selected portions of information are extracted off of hard copy documents in accordance with predetermined location information which has been specified by the user. For example, the user can define a template which specifies the location of information on hard copy documents. Templates can be formed in conjunction with second mode operation. Alternatively, the user can instruct the interface to search hard copy documents for a particular character or symbol, located on the hard copy documents. The information desired to be extracted off of the hard copy documents is specified relative to the location of this character or symbol.
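The two location strategies in this mode, a fixed template and a search for a marker character or symbol, might look as follows in Python. This is a sketch under assumptions: pages are lists of recognized text lines, a template maps variable names to (line, start, end) regions, and "relative location" is interpreted narrowly as "the text following the marker on the same line". The function names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Region = Tuple[int, int, int]  # (line index, start column, end column)


def extract_with_template(page: List[str], template: Dict[str, Region]) -> Dict[str, str]:
    """Read each variable from the fixed region the template assigns to it."""
    return {
        name: page[line][start:end].strip()
        for name, (line, start, end) in template.items()
    }


def extract_after_marker(page: List[str], marker: str) -> Optional[str]:
    """Find a marker string and return the text that follows it on the same line."""
    for line in page:
        pos = line.find(marker)
        if pos != -1:
            return line[pos + len(marker):].strip()
    return None


# Two made-up documents with the total in different places.
page_a = ["XYZ Corporation", "Bill to: ABC Corporation", "Total: 150.00"]
page_b = ["Invoice", "", "Amount payable   Total: 75.25"]

template = {"total": (2, 7, 13)}  # fits the layout of page_a only

from_template = extract_with_template(page_a, template)
from_marker_a = extract_after_marker(page_a, "Total:")
from_marker_b = extract_after_marker(page_b, "Total:")
```

The template only suits documents that share a layout, while the marker search tolerates the value moving around the page, which is the trade-off the paragraph above describes.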

The interface can also prompt or receive from an applications program or another information processing system, required information, content instructions, and format instructions.

Other objects, features, and advantages of the invention will be apparent from the following detailed description of the invention.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention will be described in further detail below with reference to the accompanying drawings, in which:

FIG. 1 illustrates hardware for implementing a preferred embodiment of the instant invention;

FIG. 2 illustrates an example of a hard copy document containing information to be processed by the instant invention;

FIGS. 3A and 3B are enlarged views of the computer of FIG. 1 used to explain how the invention interactively prompts a user to identify information;

FIG. 4 is an overall data flow diagram for the FIG. 1 preferred embodiment;

FIG. 5 is a detailed input data flow diagram for the FIG. 1 preferred embodiment;

FIG. 6 is a detailed information processing data flow diagram for the FIG. 1 preferred embodiment;

FIG. 7 is a more detailed information processing data flow diagram for the maintain library module of FIG. 6;

FIG. 8 is a more detailed information processing data flow diagram for the maintain definitions module of FIG. 6;

FIG. 9 is a more detailed information processing data flow diagram for the process document module of FIG. 6;

FIG. 10 is a detailed output data flow diagram for the FIG. 1 preferred embodiment;

FIG. 11 lists data corresponding to the hard copy document of FIG. 2;

FIGS. 12A, 12B, and 12C illustrate examples of data which can be selected from the extracted data of FIG. 11 in accordance with content instructions;

FIGS. 13A, 13B, and 13C illustrate examples of the data of FIGS. 12A, 12B, and 12C formatted in accordance with various transmission format instructions to form input files; and

FIG. 14 illustrates another example of a hard copy document containing information to be processed by the instant invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

Hardware

The invention provides an interface between information originating from a hard copy document and a computer application unit which uses the information. The computer application unit can be a particular computer application program or a device which is controlled in accordance with instructions or information from the hard copy document. The invention also allows storing a copy of the hard copy document in a memory and retrieving the copy of the hard copy document. By providing a comprehensive and integrated system which can accommodate almost all of the possible uses of information contained on a hard copy document, the instant invention allows for a paperless office.

The invention includes hardware and software necessary to extract, retrieve, and process information from the hard copy document. A copy of the actual image of the hard copy document is stored in memory. Textual information extracted from the hard copy document is also stored in memory. Textual information is information, such as alphanumeric characters, which is recognized on the hard copy document and which is stored in a form which corresponds to the particular recognized character. For example, the extracted characters can be stored in the ASCII format in an electronic memory.
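As a small illustration of keeping both forms side by side, the Python sketch below pairs placeholder image bytes with ASCII-encoded recognized text. The `StoredDocument` class and `store` function are hypothetical names, and the three image bytes are dummy data rather than a real scan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredDocument:
    image: bytes  # digitized appearance of the page (placeholder bytes here)
    text: bytes   # recognized characters, ASCII-encoded


def store(image: bytes, recognized_text: str) -> StoredDocument:
    """Keep both forms; a non-ASCII character raises UnicodeEncodeError."""
    return StoredDocument(image=image, text=recognized_text.encode("ascii"))


doc = store(b"\x00\x01\x02", "ABC Corporation")
```

Keeping the image alongside the text means the page can still be reprinted as it appeared, while the text remains searchable and parseable.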

The user can have all of the information extracted from the hard copy document and stored in memory. Alternatively, the interface can interactively prompt the user to identify specific pieces of information for storage. The interface can also extract specific pieces of information using a predefined template. The interface can also prompt or receive from another information processing system or an applications program desired information, content instructions, and format instructions.

The instant invention also provides for parsing information extracted from the hard copy document and for directing this parsed information to specific users or application programs as an input file.

The invention also permits the user to define the transmission format of the input file for a particular computer application unit.

FIG. 1 illustrates hardware for implementing a preferred embodiment of a hard copy document to application program interface according to the instant invention. The interface 200 processes information extracted off of hard copy document 100 and provides information to application units 270 in a form required by each particular application unit. The interface extracts information off of a hard copy document 100 utilizing a scanner 210. The scanner 210 can be any type of scanner which extracts information off of hard copy documents, for example, an Optical Reader.

The scanned information is stored in a scanner memory 220 or in main memory 250, as will be described in greater detail below. If main memory 250 or another memory is available to store the scanned information, then scanner memory 220 can be omitted.

The information from scanner memory 220 or main memory 250 is transmitted to computer 230. In the preferred embodiment, computer 230 includes a display 232, a keyboard 234, and a mouse 236. The display 232 displays an image of the hard copy document itself and/or information necessary to process the information extracted off of the hard copy document.

The computer 230 is used to select portions of the stored document information contained in memory in accordance with content instructions which define portions of the stored document information required by an application unit. These content instructions may be provided by the application program. Alternatively, the content instructions can be inputted via an input device such as a keyboard, a touch screen, a mouse, a notepad, a voice recognition device, or the like.

The computer 230 is also used to format selected stored document information into the transmission format used by an application unit based on transmission format instructions. The transmission format instructions may be provided by the application program. Alternatively, the transmission format instructions can be inputted via a keyboard, a touch screen, a mouse, a notepad, a voice recognition device, or the like.
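One way to picture transmission format instructions is as a small settings object that drives a delimited-file writer. The Python sketch below uses the standard `csv` module for this. The `TransmissionFormat` class, its two fields, and the sample row are assumptions for illustration; the patent does not specify what the instructions contain beyond examples such as delimiters.

```python
import csv
import io
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TransmissionFormat:
    """Hypothetical transmission format instructions for one application unit."""
    delimiter: str
    line_terminator: str = "\n"


def build_input_file(rows: List[List[str]], fmt: TransmissionFormat) -> str:
    """Render the selected values as delimited text in the requested format."""
    buffer = io.StringIO()
    writer = csv.writer(
        buffer, delimiter=fmt.delimiter, lineterminator=fmt.line_terminator
    )
    writer.writerows(rows)
    return buffer.getvalue()


rows = [["ABC Corporation", "150.00"]]  # made-up selected values

comma_file = build_input_file(rows, TransmissionFormat(delimiter=","))
tab_file = build_input_file(
    rows, TransmissionFormat(delimiter="\t", line_terminator="\r\n")
)
```

Using `csv.writer` rather than a plain string join means a value that happens to contain the delimiter is quoted instead of silently corrupting the record.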

Thus, the computer 230 is used to generate an input file for a particular application unit. The computer 230 is connected to scanner memory 220, main, or permanent, memory 250, a printer 260, and application units 270, via a bus 240. Although FIG. 1 illustrates use of a bus to connect components together, it is understood that any routing or connecting link, implemented in hardware or software or both, can be employed instead of, or in addition to, a bus. Instructions to or in the computer 230 control the main memory 250, the printer 260, the application units 270, and the bus 240. Instructions to or in computer 230 can also control exchanges of information with scanner memory 220.

When the computer 230 generates an input file for a particular document, the computer 230 can send this input file directly to an application unit or can store this input file in the main memory 250 until required by an application unit. The main memory 250 may also optionally store a copy of the image information for the hard copy document and the textual information for the hard copy document. Thus, the image information and textual information from the hard copy document can be retrieved and printed out on printer 260. In addition, image and textual information stored in scanner memory 220 or in main memory 250 can be used to form additional input files at the time of input or at a later time, based on content instructions and transmission format instructions. Thus, the invention can, at the discretion of the user, eliminate the need to retain copies of hard copy documents, permitting a paperless office.

The application units 270 include particular application programs and devices which are controlled in accordance with information contained on hard copy document 100.

FIG. 2 illustrates an example of a hard copy document 100 which contains information to be processed by the instant invention. The document illustrated in FIG. 2 is a bill from XYZ Corporation to customer ABC Corporation. FIG. 2 is only an example of a type of document that can be processed by the instant invention.

In a first operational mode, the scanner 210 stores all of the information extracted off of hard copy document 100 in the scanner memory 220 or, alternatively, in main memory 250. The extracted information is stored in two forms. The actual image of the hard copy document 100 is stored as image information in the scanner memory 220. In addition, the scanner memory 220 stores textual information recognized on the hard copy document 100 by, for example, employing standard character recognition software. In the preferred embodiment, the textual information is stored in ASCII format. The scanner memory 220 can be, for example, an electronic, magnetic, or optical memory.

FIG. 3A illustrates an enlarged view of the computer 230 of FIG. 1. This view will be used to describe a second mode of operation. In this second mode of operation, the hard copy document 100 is scanned and a copy of the document 100 is displayed on display 232 of computer 230, based on the contents of information temporarily stored in scanner memory 220. After the document is displayed on display 232, the computer 230 interactively prompts the user to identify the location of specific pieces of information on the hard copy document. In the FIG. 3A illustration, this prompt message is indicated as the message beginning with the arrow.

For example, the prompt message can ask the user to identify the location of account number information on the hard copy document. The user then uses an input device, such as keyboard 234 or mouse 236 or a touch screen, notepad, voice recognition device, or other input device to position a cursor on the display to identify the location of the information requested by the prompt message. For example, the cursor could be used to define a block (which could be highlighted) containing the requested information, followed by a mouse "enter" click. In this example, the user would move the mouse to identify the location of the account number information contained on the hard copy document 100. The computer 230 then stores the information which has been identified by the user as account number information in the appropriate address or subfile or as the appropriate variable or parameter in memory. The computer then prompts the user to identify the location of other information on the hard copy document, such as statement date information. The process proceeds until all of the desired information has been stored into the appropriate locations in memory.

FIG. 3B illustrates a variation of the second mode for interactively prompting the user for information. In FIG. 3B, the display is split into two portions. A left-hand portion 232L displays the image of the hard copy document and a right-hand portion 232R displays the required application program information. For example, in FIG. 3B, portion 232R displays a spreadsheet used by an application program. While observing the split display, the user can input instructions to associate specific pieces of information on the hard copy document (for example, the vendor name indicated by the mouse arrow 232A) with particular subfiles in memory (for example, the vendor field next to which the cursor 232C appears), using a mouse or other input device(s) or both. The split display also allows the user to generate content format instructions while observing the information required for a particular application program on the right-hand portion.

These second modes of operation are efficient for small businesses which receive a small number of a wide variety of invoices, since the user does not necessarily have to store all of the information that appears on the hard copy document. A further advantage is that data input is quicker, easier, and more accurate than with previous keyboard methodology. In addition, by specifying the location on the hard copy document of information, the user may optionally create a template, to be described in further detail below, for each different type of invoice. This template is stored for future use when another hard copy document in the same format is received.

More specifically, instructions from computer 230 can direct the scanner 210 and scanner memory 220, and/or main memory 250, to scan and/or store only specific portions of hard copy document 100. After the interactive prompts required to obtain information for a desired application program, the unused information stored in scanner memory 220 or main memory 250 can be erased. Further, scanning of a second identical document can be limited to only those portions of the document which contain needed information.

More specifically, in FIG. 2, the lines 10 drawn around certain portions of the document represent the areas which the user has previously identified as the portions of a document to be extracted by the scanner 210 and stored in scanner memory 220 and/or main memory 250. Since the logo 20 and the message 30 have not been identified as an area to be scanned and stored, these areas are not scanned and stored in subsequent documents. Since the user has previously associated each of the areas 10 with a specific subfile of information, e.g., the account number, the scanned information is stored in memory locations corresponding to that subfile.
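
The association between previously identified areas and subfiles can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the disclosed system: the page is modeled as a grid of recognized text lines rather than a scanned image, and the names Region and extract_with_template, the coordinates, and the sample page contents are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Region:
    """A rectangular area of the page (hypothetical stand-in for an area 10)."""
    top: int      # first row, inclusive
    left: int     # first column, inclusive
    bottom: int   # last row, exclusive
    right: int    # last column, exclusive


def extract_with_template(page: List[str], template: Dict[str, Region]) -> Dict[str, str]:
    """Read only the areas named in the template and file each one
    under its subfile name. Everything outside the areas is ignored."""
    subfiles: Dict[str, str] = {}
    for subfile, region in template.items():
        rows = [line[region.left:region.right] for line in page[region.top:region.bottom]]
        subfiles[subfile] = " ".join(row.strip() for row in rows).strip()
    return subfiles


# Sample page text; the values are made up for illustration.
page = [
    "XYZ Corporation   LOGO",
    "Account: 12345",
    "Thank you for your business",
]

# Template built once, when the user first identified the areas.
template = {
    "vendor": Region(top=0, left=0, bottom=1, right=15),
    "account_number": Region(top=1, left=9, bottom=2, right=14),
}

print(extract_with_template(page, template))
```

Because the logo and the message fall outside every region of the template, they never reach the subfiles, which mirrors how unselected areas are skipped in subsequent documents of the same format.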

Data Processing

FIGS. 4-10 illustrate the flow of data in the FIG. 1 preferred embodiment. FIG. 4 illustrates the overall data flow for the FIG. 1 preferred embodiment. The preferred embodiment includes an input process module 1.0, an information processing module 2.0, and an output processing module 3.0. The information processing module 2.0 is equipped to receive instructions from and transmit information to a user. The information processing module 2.0 can also transmit to and receive information from a remote external device through communication interface 4.0. Input process module 1.0 and output processing module 3.0 can also access communication interface 4.0. A module is implemented in hardware, software, or a combination of hardware and software. The specific implementation for a particular business application depends upon a variety of factors, for example, the relative costs of hardware and software implemented systems, the frequency with which a user will want to expand or modify the system, and the like.

FIG. 5 is a more detailed diagram of the input process module 1.0 of FIG. 4. The input process module 1.0 includes a character input module 1.1, an image input module 1.2, and, in the preferred embodiment, a character recognition device 1.3. The character input module 1.1 inputs textual information, such as alphanumeric characters, from an input device such as keyboard 234. The image input module 1.2 inputs image information, for example, a digitized image of the actual appearance of hard copy document 100. Textual information can include textual input from an input device such as keyboard 234 and textual information extracted from the document by character recognition device 1.3. Both types of information comprise an input document which is transmitted to information processing module 2.0. In the FIG. 1 preferred embodiment, the processing performed by input process module 1.0 occurs in scanner memory 220, computer 230, and main memory 250.

FIG. 6 illustrates information processing data flow for the FIG. 1 preferred embodiment, that is, FIG. 6 illustrates data flow in the information processing module 2.0.

The information processing module 2.0 includes a maintain library module 2.1, to be described in further detail below in conjunction with FIG. 7, a maintain definitions module 2.2, to be described in further detail below in conjunction with FIG. 8, and a process document module 2.3 to be described in further detail below in conjunction with FIG. 9.

The information processing module 2.0 is the module which coordinates and drives the entire system. In the preferred embodiment, the information processing module 2.0 is implemented primarily by computer 230.

FIG. 7 illustrates information processing data flow in the maintain library module 2.1. The maintain library module 2.1 maintains a library of image information, for example, a digitized image representing the actual appearance of the hard copy document, and textual information of the hard copy documents for reference during processing. This library can be incorporated within scanner memory 220, main memory 250, or another independent memory, for example, a RAM disk. The maintain library module 2.1 includes a store document module 2.1.1, a correct errors module 2.1.2, a retrieve document module 2.1.3, and a document file 2.1.4. These modules operate collectively to store, retrieve, and correct document information.

The store document module 2.1.1, prior to routing the document to the document file 2.1.4, may provide information on recognition errors which may have occurred while inputting the document. For example, the store document module 2.1.1 identifies that a character contained on hard copy document 100 was not recognized. The store document module 2.1.1 also optionally causes a copy of the document and its parsing to be displayed on the display 232 for confirmation by the user. The user may utilize this opportunity to identify any errors in the displayed document and, in conjunction with the correct errors module 2.1.2, to revise the document's parsing, if necessary, prior to storage of the document in memory. The module 2.1.1 also provides a facility for the user to name a particular hard copy document for cataloging, storage, and retrieval purposes. After the document is named, the store document module 2.1.1 stores copies of the document in the document file 2.1.4.

The correct errors module 2.1.2 processes instructions from the user to correct errors identified by the store document module 2.1.1 and errors that have been spotted by the user during the confirmation process.

The retrieve document module 2.1.3 permits the user to retrieve a copy of a document previously stored in the document file 2.1.4. As described above, long-term storage is provided by main memory 250, if necessary.
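
The cooperation of the store document, correct errors, and retrieve document modules can be sketched in Python as below. This is a minimal illustration, not the disclosed implementation. It assumes that an unrecognized character is marked in the recognized text with the Unicode replacement character, that corrections arrive as position-to-character pairs, and that documents are stored as plain text under a user-chosen name. All identifiers and sample values are hypothetical.

```python
from typing import Dict, List

# Assumed marker for a character the recognition step could not identify.
UNRECOGNIZED = "\ufffd"


def find_recognition_errors(text: str) -> List[int]:
    """Report the positions of unrecognized characters before storage."""
    return [i for i, ch in enumerate(text) if ch == UNRECOGNIZED]


def correct_errors(text: str, corrections: Dict[int, str]) -> str:
    """Apply user-supplied corrections, given as position -> character."""
    chars = list(text)
    for position, replacement in corrections.items():
        chars[position] = replacement
    return "".join(chars)


class DocumentFile:
    """In-memory stand-in for document file 2.1.4."""

    def __init__(self) -> None:
        self._documents: Dict[str, str] = {}

    def store(self, name: str, text: str) -> None:
        """File the document under the name chosen by the user."""
        self._documents[name] = text

    def retrieve(self, name: str) -> str:
        """Return a previously stored document by name."""
        return self._documents[name]


# Recognized text with one unrecognized character (made-up sample).
recognized = "Acc\ufffdunt 12345"
errors = find_recognition_errors(recognized)       # positions flagged for the user
corrected = correct_errors(recognized, {3: "o"})   # user fixes the flagged position

library = DocumentFile()
library.store("xyz-bill", corrected)
print(library.retrieve("xyz-bill"))
```

The sketch flags and corrects errors before the document reaches the file, matching the order described above, in which confirmation and correction precede storage.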

FIG. 8 illustrates a more detailed information processing data flow diagram for the maintain definitions module 2.2 of FIG. 6. The maintain definitions module 2.2 allows the user to define system and document parameters and maintains the defin