WikiPatents - Community Patent Review
Digital authentication with analog documents    
United States Patent 6243480
Link to this page: http://www.wikipatents.com/6243480.html
Inventor(s): Zhao; Jian (64 Thomas Olney Common, Providence, RI 02904); Koch; Eckhard (Darmstaedterstr. 8, 64625 Bensheim, DE)
Abstract: Techniques for protecting the security of digital representations, and of analog forms made from them. The techniques include authentication techniques that can authenticate both a digital representation and an analog form produced from the digital representation, an active watermark that contains program code that may be executed when the watermark is read, and a watermark agent that reads watermarks and sends messages with information concerning the digital representations that contain the watermarks. The authentication techniques use semantic information to produce authentication information. Both the semantic information and the authentication information survive when an analog form is produced from the digital representation. In one embodiment, the semantic information is alphanumeric characters and the authentication information is either contained in a watermark embedded in the digital representation or expressed as a bar code. With the active watermark, the watermark includes program code. When a watermark reader reads the watermark, it may cause the program code to be executed. One application of active watermarks is making documents that send messages when they are operated on. A watermark agent may be either a permanent resident of a node in a network or of a device such as a copier or it may move from one network node to another. In the device or node, the watermark agent executes code which examines digital representations residing in the node or device for watermarked digital representations that are of interest to the watermark agent. The watermark agent then sends messages which report the results of its examination of the digital representations. If the watermarks are active, the agent and the active watermark may cooperate and the agent may cause some or all of the code that an active watermark contains to be executed.



Publication Date     June 5, 2001
Application Number     09/070,524
Filing Date     April 30, 1998
US Classification     382/100
Int'l Classification     G06K 009/00
Examiner     Johns; Andrew W.
Assistant Examiner     Nakhjavan; Shervin
Attorney/Law Firm     Nelson; Gordon E.
USPTO Field of Search     382/100 382/135 382/232 380/287 380/54 705/50 705/54
U.S. References

Patent No.   Inventor     U.S. Class   Date
5982390      Stoneking    345/474      Nov. 1999
5943422      Van Wie      705/54       Aug. 1999
5862218      Steinberg    713/176      Jan. 1999
5862223      Walker       705/50       Jan. 1999
5710834      Rhoads       382/232      Jan. 1998
5680455      Linsker      380/246      Oct. 1997
5668897      Stolfo                    Sep. 1997
5659628      Tachikawa    382/135      Aug. 1997
5646997      Barton       713/176      Jul. 1997
4734856      Davis        706/62       Mar. 1988
What is claimed is:

1. Apparatus which authenticates a digital representation from which an analog form may be made, the apparatus comprising:

an authenticator which uses semantic information in the digital representation that will be present in the analog form to produce first authentication information; and

an incorporator which incorporates the first authentication information into the digital representation such that the first authentication information is preserved in the analog form and the semantic information remains useful in the analog form to produce second authentication information which is comparable to the first authentication information,

whereby the first authentication information may be retrieved from the analog form and compared with the second authentication information produced from the semantic information in the analog form to determine authenticity of the analog form.

2. The apparatus set forth in claim 1 wherein:

the incorporator incorporates the authentication information in a form which cannot be perceived by unaided observation of the analog form.

3. The apparatus set forth in claim 2 wherein:

the form which cannot be perceived is a digital watermark.

4. The apparatus set forth in claim 1 wherein:

the incorporator incorporates the first authentication information into the digital representation in a form which can be perceived by unaided observation of the analog form.

5. The apparatus set forth in claim 4 wherein:

the form which can be perceived is a bar code.

6. The apparatus set forth in any one of claims 1 through 4 wherein:

the first authentication information is a digest made from the semantic information.

7. The apparatus set forth in any one of claims 1 through 4 wherein:

the first authentication information is robust with regard to insubstantial errors in reading the semantic information from the analog form.

8. The apparatus set forth in claim 7 wherein:

the first authentication information reflects at least in part an order of the semantic information.

9. Apparatus for determining authenticity of an analog form, the analog form containing first authentication information that is produced using semantic information present in the analog form and that is incorporated into the analog form such that the semantic information remains useful in the analog form to produce second authentication information which is comparable to the first authentication information,

the apparatus comprising:

a semantic information recognizer that recognizes the semantic information in the analog form;

an authentication information reader that reads the first authentication information from the analog form; and

an authenticator that computes the second authentication information from the recognized semantic information and determines whether the analog form is authentic by comparing the first authentication information with the second authentication information.

10. The apparatus set forth in claim 9 wherein:

the authentication information is incorporated in a form which cannot be perceived by unaided observation of the analog form.

11. The apparatus set forth in claim 10 wherein:

the form is a digital watermark; and

the authentication information reader is a digital watermark reader.

12. The apparatus set forth in claim 9 wherein:

the first authentication information is incorporated in a form which can be perceived by unaided observation of the analog form.

13. The apparatus set forth in claim 12 wherein:

the form is a bar code; and

the authentication information reader is a bar code reader.

14. The apparatus set forth in any one of claims 9 through 13 wherein:

the first authentication information is a digest made from the semantic information.

15. The apparatus set forth in any of claims 9 through 13 wherein:

the authenticator computes the second authentication information in a fashion which is robust with regard to insubstantial errors made by the semantic information recognizer.

16. The apparatus set forth in claim 15 wherein:

the authenticator compares the first authentication information with the second authentication information such that a partial match within a threshold indicates that the analog form is authentic.

17. The apparatus set forth in claim 15 wherein:

the first and second authentication information reflect at least in part an order of the semantic information.

18. The apparatus set forth in any one of claims 9 through 13 wherein:

the authenticator compares the first authentication information with the second authentication information in a manner which is robust with regard to insubstantial errors made by the semantic information recognizer.

19. The apparatus set forth in claim 18 wherein:

the semantic information is subject to constraints; and

the authenticator includes an error corrector that employs the constraints to correct errors in the recognized semantic information and uses the corrected recognized semantic information to recompute the second authentication information when there is not a precise match between the first authentication information and the second authentication information.

20. The apparatus set forth in any one of claims 9 through 13 wherein:

the analog form is produced from a digital representation that includes the first authentication information.

21. The apparatus set forth in any of claims 9 through 13 wherein:

the analog form is a document;

the semantic information includes alphanumeric characters in the document; and

the semantic information recognizer is an optical character recognizer.

22. The apparatus set forth in claim 21 wherein:

the document is paper digital cash.

23. The apparatus set forth in claim 21 wherein:

the document is a paper digital check.

24. The apparatus set forth in claim 21 wherein:

the document is an identification document.

25. A scanner characterized in that:

the scanner employs the apparatus set forth in claim 21 to determine authenticity of analog forms scanned by the scanner.

26. The apparatus set forth in claim 9 wherein:

the analog form is a document;

the semantic information includes alphanumeric characters in the document;

the semantic information recognizer includes an optical character recognizer; and

the document includes a background image in addition to the alphanumeric characters, the first authentication information being incorporated into the background image in a form which cannot be perceived by unaided observation.

27. The apparatus set forth in claim 26 wherein:

the first authentication information is incorporated into the background image as a digital watermark.

28. The apparatus set forth in claim 27 wherein:

the document is paper digital cash wherein the semantic information includes a serial number for the digital cash and a money amount.

29. The apparatus set forth in claim 27 wherein:

the document is a digital check wherein the semantic information includes an identifier for the bank account, an amount to be paid, and the name of the payer.

30. The apparatus set forth in claim 29 wherein:

the first authentication information is encrypted with a private key belonging to the payer, whereby the payer signs the semantic information.

31. The apparatus set forth in claim 27 wherein:

the document is an identification document, the identification document being issued by an issuing authority and the semantic information including identification information.

32. The apparatus set forth in claim 31 wherein:

the first authentication information is encrypted with a private key belonging to the issuing authority, whereby the issuing authority signs the semantic information.

33. The apparatus set forth in claim 31 wherein:

the identification document is a bankcard and the institution that issues the bankcard is the issuing authority.

34. The apparatus set forth in any one of claims 28 through 33 wherein:

the first authentication information is a first digest made from the semantic information and the second authentication information is a second digest made from the recognized semantic information.

35. The apparatus set forth in claim 34 wherein:

the authenticator determines whether the analog form is authentic by determining whether the second digest exactly matches the first digest.

36. The apparatus set forth in claim 35 wherein:

the first and second digests are made using a one-way hash function.

37. An optical scanning device characterized in that:

the optical scanning device employs the apparatus set forth in any one of claims 26 through 33 to determine authenticity of a document scanned thereby.

38. A method of authenticating a digital representation from which an analog form may be made,

the method comprising the steps of:

producing first authentication information from semantic information in the digital representation that will be present in the analog form; and

incorporating the first authentication information into the digital representation such that the first authentication information is preserved in the analog form and the semantic information remains useful in the analog form to produce second authentication information which is comparable to the first authentication information, whereby the first authentication information may be retrieved from the analog form and compared with the second authentication information produced from the semantic information in the analog form to determine authenticity of the analog form.

39. A method of determining authenticity of an analog form, the analog form containing first authentication information that is produced using semantic information present in the analog form and that is incorporated into the analog form such that the semantic information remains useful in the analog form to produce second authentication information which is comparable to the first authentication information, the method comprising the steps of:

recognizing the semantic information in the analog form;

reading the first authentication information;

computing second authentication information from the recognized semantic information; and

determining whether the analog form is authentic by comparing the first authentication information with the second authentication information.

40. The method set forth in claim 39 wherein:

the analog form is a document;

the semantic information includes alphanumeric characters in the document;

the document includes a background image in addition to the alphanumeric characters, the first authentication information being incorporated into the background image in a form which cannot be perceived by unaided observation;

in the step of recognizing the semantic information, the semantic information is recognized by an optical character recognizer; and

in the step of reading the first authentication information, the first authentication information is read from the background image.

41. The method set forth in claim 40 wherein:

the first authentication information is incorporated into the background image as a digital watermark.

42. The apparatus set forth in claim 41 wherein:

the document is paper digital cash wherein the semantic information includes a serial number for the digital cash and a money amount.

43. The apparatus set forth in claim 41 wherein:

the document is a digital check wherein the semantic information includes an identifier for the bank account, an amount to be paid, and the name of the payer.

44. The apparatus set forth in claim 43 wherein:

the first authentication information is encrypted with a private key belonging to the payer, whereby the payer signs the semantic information.

45. The apparatus set forth in claim 41 wherein:

the document is an identification document, the identification document being issued by an issuing authority and the semantic information including identification information.

46. The apparatus set forth in claim 45 wherein:

the first authentication information is encrypted with a private key belonging to the issuing authority, whereby the issuing authority signs the semantic information.

47. The apparatus set forth in claim 45 wherein:

the identification document is a bankcard and the institution that issues the bankcard is the issuing authority.

48. The apparatus set forth in any one of claims 42 through 47 wherein:

the first authentication information is a first digest made from the semantic information; and

in the step of computing the second authentication information, the second authentication information is a second digest computed from the recognized semantic information.

49. The apparatus set forth in claim 48 wherein:

the step of determining whether the analog form is authentic determines whether the second digest exactly matches the first digest.

50. The apparatus set forth in claim 49 wherein:

the first and second digests are made using a one-way hash function.

51. An optical scanning device characterized in that:

the optical scanning device employs the method set forth in any one of claims 40 through 47 to determine authenticity of a document scanned thereby.


CROSS REFERENCE TO RELATED PATENT APPLICATIONS

This application has the same Detailed Description as Jian Zhao, Active Watermarks and Watermark Agents, assigned to Fraunhofer CRCG and filed on even date with this application.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The invention relates generally to digital representations of images and other information and more specifically to techniques for protecting the security of digital representations and of analog forms produced from them.

2. Description of the Prior Art

Nowadays, the easiest way to work with pictures or sounds is often to make digital representations of them. Once the digital representation is made, anyone with a computer can copy the digital representation without degradation, can manipulate it, and can send it virtually instantaneously to anywhere in the world. The Internet, finally, has made it possible for anyone to distribute any digital representation from anywhere in the world.

From the point of view of the owners of the digital representations, there is one problem with all of this: pirates, too, have computers, and they can use them to copy, manipulate, and distribute digital representations as easily as the legitimate owners and users can. If the owners of the original digital representations are to be properly compensated for making or publishing them, the digital representations must be protected from pirates. There are a number of different approaches that can be used:

the digital representation may be rendered unreadable except by its intended recipients; this is done with encryption techniques;

the digital representation may be marked to indicate its authenticity; this is done with digital signatures;

the digital representation may contain information from which it may be determined whether it has been tampered with in transit; this information is termed a digest and the digital signature often includes a digest;

the digital representation may contain a watermark, an invisible indication of ownership which cannot be removed from the digital representation and may even be detected in an analog copy made from the digital representation; and

the above techniques can be employed in systems that not only protect the digital representations, but also meter their use and/or detect illegal use.

For an example of a system that uses encryption to protect digital representations, see U.S. Pat. No. 5,646,999, Saito, Data Copyright Management Method, issued Jul. 8, 1997; for a general discussion of digital watermarking, see Jian Zhao, "Look, It's Not There", in: BYTE Magazine, January, 1997. Detailed discussions of particular techniques for digital watermarking may be found in E. Koch and J. Zhao, "Towards Robust and Hidden Image Copyright Labeling", in: Proc. Of 1995 IEEE Workshop on Nonlinear Signal and Image Processing, Jun. 20-22, 1995, and in U.S. Pat. No. 5,710,834, Rhoads, Method and Apparatus Responsive to a Code Signal Conveyed through a Graphic Image, issued Jan. 20, 1998. For an example of a commercial watermarking system that uses the digital watermarking techniques disclosed in the Rhoads patent, see Digimarc Watermarking Guide, Digimarc Corporation, 1997, available in March, 1998 at http://www.digimarc.com.

FIG. 1 shows a prior-art system 101 which employs the above protection techniques. A number of digital representation clients 105, of which only one, digital representation client 105(j) is shown, are connected via a network 103 such as the Internet to a digital representation server 129 which receives digital representations from clients 105 and distributes them to clients 105. Server 129 includes a data storage device 133 which contains copied digital representations 135 for distribution and a management data base 139. Server 129 further includes a program for managing the digital representations 135, a program for reading and writing watermarks 109, a program for authenticating a digital representation and confirming that a digital representation is authentic 111, and a program for encrypting and decrypting digital representations 113. Programs 109, 111, and 113 together make up security programs 107.

Client 105 has its own versions of security programs 107; it further has editor/viewer program 115 which lets the user of client 105 edit and/or view digital representations that it receives via network 103 or that are stored in storage device 117. Storage device 117 as shown contains an original digital representation 119 which was made by a user of client 105 and a copied digital representation 121 that was received from DR Server 129. Of course, the user may have made original representation 119 by modifying a copied digital representation. Editor/viewer program 115, finally, permits the user to output digital representations to analog output devices 123. Included among these devices are a display 125, upon which an analog image 124 made from a digital representation may be displayed, and a printer 127, upon which an analog image 126 made from the digital representation may be printed. A loudspeaker may also be included in analog output devices 123. The output of the analog output device will be termed herein an analog form of the digital representation. For example, if the output device is a printer, the analog form is printed sheet 126; if it is a display device, it is display 124.

When client 105(j) wishes to receive a digital representation from server 129, it sends a message requesting the digital representation to server 129. The message includes at least an identification of the desired digital representation and an identification of the user. Manager 131 responds to the request by locating the digital representation in CDRs 135, consulting management data base 139 to determine the conditions under which the digital representation may be distributed and the status of the user of client 105 as a customer. If the information in data base 139 indicates to manager 131 that the transaction should go forward, manager 131 sends client 105(j) a copy of the selected digital representation. In the course of sending the copy, manager 131 may use watermark reader/writer 109 to add a watermark to the digital representation, use authenticator/confirmer 111 to add authentication information, and encrypter/decrypter 113 to encrypt the digital representation in such a fashion that it can only be decrypted in DR client 105(j).

When client 105(j) receives the digital representation, it decrypts it using program 113, confirms that the digital representation is authentic using program 111, and editor/viewer 115 may use program 109 to display the watermark. The user of client 105(j) may save the encrypted or unencrypted digital representation in storage 117. The user of client 105(j) may finally employ editor/viewer 115 to decode the digital representation and output the results of the decoding to an analog output device 123. Analog output device 123 may be a display device 125, a printer 127, or in the case of digital representations of audio, a loudspeaker.

It should be pointed out that when the digital representation is displayed or printed in analog form, the only remaining protection against copying is watermark 128, which cannot be perceived in the analog form by the human observer, but which can be detected by scanning the analog form and using a computer to find watermark 128. Watermark 128 thus provides a backup to encryption: if a digital representation is pirated, either because someone has broken the encryption, or more likely because someone with legitimate access to the digital representation has made illegitimate copies, the watermark at least makes it possible to determine the owner of the original digital representation and given that evidence, to pursue the pirate for copyright infringement and/or violation of a confidentiality agreement.

If the user of client 105(j) wishes to send an original digital representation 119 to DR server 129 for distribution, editor/viewer 115 will send digital representation 119 to server 129. In so doing, editor/viewer 115 may use security programs 107 to watermark the digital representation, authenticate it, and encrypt it so that it can be decrypted only by DR Server 129. Manager 131 in DR server 129 will, when it receives digital representation 119, use security programs 107 to decrypt digital representation 119, confirm its authenticity, enter information about it in management data base 139, and store it in storage 133.

In the case of the Digimarc system referred to above, manager 131 also includes a World Wide Web spider, that is, a program that systematically follows World Wide Web links such as HTTP and FTP links and fetches the material pointed to by the links.

Manager program 131 uses watermark reading/writing program 109 to read any watermark, and if the watermark is known to management database 139, manager program 131 takes whatever action may be required, for example, determining whether the site from which the digital representation was obtained has the right to have it, and if not, notifying the owner of the digital representation.

While encryption, authentication, and watermarking have made it much easier for owners of digital representations to protect their property, problems still remain. One such problem is that the techniques presently used to authenticate digital documents do not work with analog forms; consequently, when the digital representation is output in analog form, the authentication is lost. Another is that present-day systems for managing digital representations are not flexible enough. A third is that watermark checking such as that done by the watermark spider described above is limited to digital representations available on the Internet. It is an object of the present invention to overcome the above problems and thereby to provide improved techniques for distributing digital representations.

SUMMARY OF THE INVENTION

The problem that digital authentication techniques are limited to digital representations is overcome by an authentication technique that is based on semantic information, that is, information that must be present in any analog form made from the digital representation. The semantic information is used to produce identification information such as a digest and the digest is added to the digital representation in a manner that does not affect the semantic information. In one embodiment, the identification information is embedded in the digital representation as a watermark; in another, the digest is expressed as a barcode. When a digital representation or analog form contains authentication information that is based on the semantic information, the representation or form is authenticated by again using the semantic information to compute authentication information and then comparing the newly-computed authentication information with the authentication information in the representation or form. If the two match, the digital representation or analog form is authentic. Depending on the semantic information and the purpose of the authentication, the match may either be precise or fuzzy. Among the uses of authentication based on semantic information are authentication of digital forms of electronic documents, authentication of paper digital cash, authentication of paper digital checks, and authentication of identification cards such as bankcards.
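The comparison of stored and recomputed authentication information described above can be sketched in Python. This is only an illustration under stated assumptions, not the embodiment of the invention: the function names, the choice of SHA-256, the use of one short digest per alphanumeric token, and the threshold-based fuzzy match are all hypothetical choices made for the example.

```python
import hashlib
import re


def semantic_tokens(text: str) -> list[str]:
    """Reduce text to the alphanumeric tokens assumed to survive printing and OCR."""
    return re.findall(r"[A-Z0-9]+", text.upper())


def token_digests(text: str) -> list[str]:
    """One short digest per token; list position preserves the token order."""
    return [hashlib.sha256(tok.encode("ascii")).hexdigest()[:8]
            for tok in semantic_tokens(text)]


def is_authentic(embedded: list[str], recognized_text: str,
                 threshold: float = 1.0) -> bool:
    """Compare embedded digests with digests recomputed from recognized text.

    threshold=1.0 demands a precise match; a lower value tolerates a few
    misread tokens (a fuzzy match).
    """
    recomputed = token_digests(recognized_text)
    if not embedded or len(embedded) != len(recomputed):
        return False
    matches = sum(a == b for a, b in zip(embedded, recomputed))
    return matches / len(embedded) >= threshold


# Authentication information made when the digital representation is produced.
original = "Pay to Alice Smith the sum of 100 dollars, check no. 48213"
embedded = token_digests(original)

# Text as recognized from the analog form, with one character misread.
misread = original.replace("Smith", "Smlth")
print(is_authentic(embedded, misread))                 # precise match
print(is_authentic(embedded, misread, threshold=0.9))  # fuzzy match
```

Note that a fuzzy threshold which tolerates one misread token equally tolerates one deliberately altered token, which is why the choice between a precise and a fuzzy match depends on the semantic information and the purpose of the authentication.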

Other objects and advantages of the invention will be apparent to those skilled in the arts to which the invention pertains upon perusing the following Detailed Description and Drawing, wherein:

BRIEF DESCRIPTION OF THE DRAWING

FIG. 1 is a block diagram of a prior-art system for securely distributing digital representations;

FIG. 2 is a diagram of a first embodiment of an analog form that can be authenticated;

FIG. 3 is a diagram of a second embodiment of an analog form that can be authenticated;

FIG. 4 is a diagram of a system for adding authentication information to an analog form;

FIG. 5 is a diagram of a system for authenticating an analog form;

FIG. 6 is a diagram of a system for making an active watermark;

FIG. 7 is an example of code from an active watermark;

FIG. 8 is a diagram of a system for executing the code in an active watermark;

FIG. 9 is a diagram of a system for producing a watermark agent;

FIG. 10 is a diagram of a system for receiving a watermark agent;

FIG. 11 is a detailed diagram of access information 603; and

FIG. 12 is an example of code executed by a watermark agent.

The reference numbers in the drawings have at least three digits. The two rightmost digits are reference numbers within a figure; the digits to the left of those digits are the number of the figure in which the item identified by the reference number first appears. For example, an item with reference number 203 first appears in FIG. 2.
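The numbering convention can be stated as simple arithmetic. The following Python fragment is an illustrative sketch only; the function name is hypothetical.

```python
def split_reference(reference_number: int) -> tuple[int, int]:
    """Split a reference number into (figure number, number within the figure).

    The two rightmost digits identify the item within a figure; the digits
    to their left give the figure in which the item first appears.
    """
    if reference_number < 100:
        raise ValueError("reference numbers have at least three digits")
    return divmod(reference_number, 100)


print(split_reference(203))   # item 203 first appears in FIG. 2
print(split_reference(603))   # access information 603 first appears in FIG. 6
```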

DETAILED DESCRIPTION

The following Detailed Description will first disclose a technique for authenticating digital representations that survives output of an analog form of the digital representation, will then disclose active watermarks, that is, watermarks that contain programs, and will finally disclose watermark agents, that is, programs which examine the digital watermarks on digital representations stored in a system and thereby locate digital representations that are being used improperly.

Authentication That is Preserved in Analog Forms: FIGS. 2-5

Digital representations are authenticated to make sure that they have not been altered in transit. Alteration can occur as a result of transmission errors that occur during the course of transmission from the source of the digital representation to its destination, as a result of errors that arise due to damage to the storage device being used to transport the digital representation, as a result of errors that arise in the course of writing the digital representation to the storage device or reading the digital representation from the storage device, or as a result of human intervention. A standard technique for authentication is to make a digest of the digital representation and send the digest to the destination together with the digital representation. At the destination, another digest is made from the digital representation as received and compared with the first. If they are the same, the digital representation has not changed. The digest is simply a value which is much shorter than the digital representation but is related to it such that any change in the digital representation will with very high probability result in a change to the digest.
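
The digest comparison described above can be illustrated with a minimal sketch. SHA-256 from the Python standard library is used here only as an assumed example of a digest function; the description does not name a particular algorithm, and the function names and sample data are hypothetical.

```python
import hashlib


def make_digest(data: bytes) -> str:
    """Return a short, fixed-length value derived from the data."""
    return hashlib.sha256(data).hexdigest()


def is_unaltered(received: bytes, sent_digest: str) -> bool:
    """Recompute the digest at the destination and compare it with the one sent."""
    return make_digest(received) == sent_digest


# The source makes a digest and sends it together with the representation.
original = b"Pay to the order of Bob: $100.00"
digest = make_digest(original)

# The destination makes a second digest from what it received.
ok = is_unaltered(original, digest)
tampered = is_unaltered(b"Pay to the order of Bob: $900.00", digest)
```

In this sketch the digest is 64 hexadecimal characters regardless of the length of the input, which reflects the property that the digest is much shorter than the digital representation it is made from.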

Where human intervention is a serious concern, the digest is made using a one-way hash function, that is, a function that produces a digest from which it is extremely difficult or impossible to learn anything about the input that produced it. The digest may additionally be encrypted so that only the recipient of the digital representation can read it. A common technique is to use the encrypted digest as the digital signature for the digital representation, that is, not only to show that the digital representation has not been altered in transit, but also to show that it comes from the party from whom it purports to come. If the sender and the recipient have exchanged public keys, the sender can make the digital signature by encrypting the digest with the sender's private key. The recipient can use the sender's public key to decrypt the digest, and having done that, the recipient compares the digest with the digest made from the received digital representation. If they are not the same, either the digital representation has been altered or the digital representation is not from the person to whom the public key used to decrypt the digest belongs. For details on authentication, see Section 3.2 of Bruce Schneier, Applied Cryptography, John Wiley and Sons, 1994.
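
The signing and verification steps can be sketched with textbook RSA. This is an illustration under stated assumptions, not a secure or prescribed implementation: the key is built from two small, publicly known Mersenne primes, no padding is applied, the SHA-256 digest is reduced modulo the key modulus so that it fits, and all names are hypothetical. A real system would use a vetted cryptographic library.

```python
import hashlib

# Toy key pair from two known Mersenne primes (assumption: for illustration only,
# far too small and too well known for real use).
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q                                  # public modulus
E = 65537                                  # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))          # private exponent (requires Python 3.8+)


def digest_as_int(data: bytes) -> int:
    """One-way hash of the data, reduced so that it is smaller than the modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign(data: bytes) -> int:
    """Sender: encrypt the digest with the private key."""
    return pow(digest_as_int(data), D, N)


def verify(data: bytes, signature: int) -> bool:
    """Recipient: decrypt with the public key and compare with a fresh digest."""
    return pow(signature, E, N) == digest_as_int(data)


document = b"Transfer 100 units to account 42"
signature = sign(document)
genuine = verify(document, signature)
altered = verify(b"Transfer 900 units to account 42", signature)
```

A failed comparison in `verify` does not by itself indicate which of the two possibilities applies: an altered representation, or a signature made with a different private key.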

The only problem with authentication is that it is based entirely on the digital representation. The information used to make the digest is lost when the digital representation is output in analog form. For example, if the digital representation is a document, there is no way of determining from a paper copy made from the digital representation whether the digital representation from which the paper copy was made is authentic or whether the paper copy is itself a true copy of the digital representation.

While digital watermarks survive and remain detectable when a digital representation is output in analog form, the authentication problem cannot be solved simply by embedding the digest or digital signature in the watermark. There are two reasons for this:

Watermarking changes the digital representation; consequently, if a digital representation is watermarked after the original digest is made, the watermarking invalidates the original digest, i.e., it is no longer comparable with the new digest that the recipient makes from the watermarked document.

More troublesome still, when a digital representation is output in analog form, so much information about the digital representation is lost that the digital representation cannot be reconstructed from the analog form. Thus, even if the original digest is still valid, there is no way of producing a comparable new digest from the analog form.

What is needed to overcome these problems is an authentication technique which uses information for authentication which is independent of the particular form of the digital representation and which will be included in the analog form when the analog form is output. As will be explained in more detail in the following, the first requirement is met by selecting semantic information from the digital representation and using only the semantic information to make the digest. The second requirement is met by incorporating the digest into the digital representation in a fashion such that it on the one hand does not affect the semantic information used to make the digest and on the other hand survives in the analog form. In the case of documents, an authentication technique which meets these requirements can be used not only to authenticate analog forms of documents that exist primarily in digital form, but also to authenticate documents that exist primarily or only in analog form, for example paper checks and identification cards.

Semantic Information

The semantic information in a digital representation is that portion of the information in the digital representation that must be present in the analog form made from the digital representation if the human who perceives the analog form is to consider it a copy of the original from which the digital representation was made. For example, the semantic information in a digital representation of an image of a document is the representations of the alphanumeric characters in the document, where alphanumeric is understood to include representations of any kind of written characters or punctuation marks, including those belonging to non-Latin alphabets, to syllabic writing systems, and to ideographic writing systems. Given the alphanumeric characters, the human recipient of the analog form can determine whether a document is a copy of the original, even though the characters may have different fonts and may have been formatted differently in the original document. There is analogous semantic information in digital representations of pictures and of audio information. In the case of pictures, it is the information that is required for the human that perceives the analog form to agree that the analog form is a copy (albeit a bad one) of the original picture, and the same is the case with audio information.

In the case of a document written in English, the semantic information in the document is the letters and punctuation of the document. If the document is in digital form, it may be represented either as a digital image or in a text representation language such as those used for word processing or printing. In the first case, optical character recognition (OCR) technology may be applied to the image to obtain the letters and punctuation; in the second case, the digital representation may be parsed for the codes that are used to represent the letters and punctuation in the text representation language. If the document is in analog form, it may be scanned to produce a digital image and the OCR technology applied to the digital image produced by scanning.
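
As a sketch of the second case, the following parses a text representation for its character data and discards the formatting codes. HTML is assumed here purely as an example of a text representation language, and collapsing runs of white space is likewise an assumption about what counts as formatting; the class and function names are hypothetical. The OCR case is not shown.

```python
from html.parser import HTMLParser


class SemanticsExtractor(HTMLParser):
    """Collects character data and ignores the markup around it."""

    def __init__(self) -> None:
        super().__init__()
        self.chunks = []

    def handle_data(self, data: str) -> None:
        self.chunks.append(data)


def extract_semantic_text(markup: str) -> str:
    """Return the letters and punctuation with formatting removed."""
    parser = SemanticsExtractor()
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    # Assumption: line breaks and repeated spaces are formatting, not semantics.
    return " ".join("".join(parser.chunks).split())


bold_version = extract_semantic_text("<p><b>Pay</b> to Bob: $100.</p>")
italic_version = extract_semantic_text("<div>Pay to <i>Bob</i>:   $100.</div>")
```

Both inputs yield the same string of letters and punctuation even though they are formatted differently.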

Using Semantic Information to Authenticate an Analog Form: FIGS. 2 and 3

Because the semantic information must be present in the analog form, it may be read from the analog form and used to compute a new digest. If the old digest was similarly made from the semantic information in the digital representation and the old digest is readable from the analog form, the new digest and the old digest can be compared as described in the discussion of authentication above to determine the authenticity of the analog form.
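
The comparison of old and new digests can be sketched as follows, assuming that the characters have already been read from the analog form (for example by scanning and OCR, not shown) and that an exact match is required. SHA-256, the white-space normalization, and all names are assumptions made for illustration.

```python
import hashlib


def semantic_digest(characters: str) -> str:
    """Make a digest from the characters only, ignoring line breaks and spacing."""
    canonical = " ".join(characters.split())
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def authenticate(characters_read: str, old_digest: str) -> bool:
    """Compute a new digest from the analog form's characters and compare it."""
    return semantic_digest(characters_read) == old_digest


# Old digest, made from the semantic information in the digital representation.
old_digest = semantic_digest("Pay to the order of Bob:\n$100.00")

# Characters read back from two analog forms: one reflowed, one altered.
reflowed_ok = authenticate("Pay to the  order of\nBob: $100.00", old_digest)
altered_ok = authenticate("Pay to the order of Bob: $900.00", old_digest)
```

Because the digest depends only on the characters, a change of layout leaves it valid, while a change to any character causes the comparison to fail.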

FIG. 2 shows one technique 201 for incorporating the old digest into an analog form 203. Analog form 203 of course includes semantic information 205; here, analog form 203 is a printed or faxed document and semantic information 205 is part or all of the alphanumeric characters on analog form 203. Sometime before analog form 203 was produced, semantic information 205 in the digital representation from which analog form 203 was produced was used to make semantic digest 207, which was incorporated into analog form 203 at a location which did not contain semantic information 205 when analog form 203 was printed. In some embodiments, semantic digest 207 may be added to the original digital representation; in others, it may be added just prior to production of the analog form. Any representation of semantic digest 207 which is detectable from analog form 203 may be employed; in technique 201, semantic digest 207 is a visible bar code. Of course, semantic digest 207 may include additional information; for example, it may be encrypted as described above and may include an identifier for the user whose public key is required to decrypt it. In such a case, semantic digest 207 is a digital signature that persists in the analog form.

With watermarking, the semantic digest can be invisibly added to the analog form. This is shown in FIG. 3. In technique 301, analog form 303 again includes semantic information 305. Prior to producing analog form 303, the semantic information in the digital representation from which analog form 303 is produced is used as described above to produce semantic digest 207; this time, however, semantic digest 207 is incorporated into watermark 307, which is added to the digital representation before the analog form is produced from the digital representation and which, like the bar code of FIG. 2, survives production of the analog form. A watermark reader can read watermark 307 from a digital image made by scanning analog form 303, and can thereby recover semantic digest 207 from watermark 307. As was the case with the visible semantic digest, the semantic digest in watermark 307 may be encrypted and may also function as a digital signature.

Adding a Semantic Digest to an Analog Form: FIG. 4

FIG. 4 shows a system 401 for adding a semantic digest to an analog form 203. The process begins with digital representation 403, whose contents include semantic information 205. Digital representation 403 is received by semantics reader 405, which reads semantic information 205 from digital representation 403. Semantics reader 405's operation will depend on the form of the semantic information. For example, if digital representation 403 represents a document, the form of the semantic information will depend on how the document is represented. If it is represented as a bit-map image, the semantic information will be images of alphanumeric characters in the bit map; if it is represented using one of the many representations of documents that express alphanumeric characters as codes, the semantic information will be the codes for the alphanumeric characters. In the first case, semantics reader 405 will be an optical character reading (OCR) device; in the second, it will simply parse the document representation looking for character codes.

In any case, at the end of the process, semantics reader 405 will have extracted some form of semantic information, for example the ASCII codes corresponding to the alphanumeric characters, from representation 403. This digital information is then provided to digest maker 409, which uses it to make semantic digest 411 in any of many known ways. Depending on the kind of document the semantic digest is made from and its intended use, the semantic digest may have a form which requires an exact match with the new digest or may have a form which permits a "fuzzy" match. Digital representation 403 and semantic digest 411 are then provided to diges