Automatic extraction of human-readable lists from structured documents
 
   
Document Number
US Patent 7558792
Issued Date
July 7, 2009
Inventors
Bier; Eric A. (Palo Alto, CA)
Abstract
One aspect of the invention extracts a human readable list from a document. It does this by accessing a file that contains data that represents a portion of the document. The data is formatted in accordance with a document formatting description. The data is parsed into tokens that include container tokens and textual tokens. From the container tokens, this aspect determines a context for some of the textual tokens. Once the context is determined, this aspect determines a separator pattern between one of the textual tokens and an adjacent textual token where both the textual token and the adjacent textual token have the same context. Once the separator pattern is determined, the textual tokens can be extracted responsive to the separator pattern. Finally, the textual tokens are presented as the human readable list (for example, displayed, returned in a database, returned in response to a function or subroutine call, etc.).
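The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes HTML as the document format, treats tags as container tokens, uses the stack of open tags as the "context" of each textual token, and picks the most frequent run of tags between same-context neighbors as the separator pattern. The names `Tokenizer` and `extract_list` are hypothetical, and the "most frequent pattern" heuristic is an assumption made for illustration only.

```python
from collections import Counter
from html.parser import HTMLParser


class Tokenizer(HTMLParser):
    """Splits markup into container tokens (tags) and textual tokens."""

    def __init__(self):
        super().__init__()
        self.stack = []   # containers that are currently open
        self.tokens = []  # (kind, value, context) triples

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)
        self.tokens.append(("open", tag, tuple(self.stack)))

    def handle_endtag(self, tag):
        self.tokens.append(("close", tag, tuple(self.stack)))
        if tag in self.stack:
            # Pop back through the matching open container.
            while self.stack and self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text:
            # The context of a textual token is its stack of open containers.
            self.tokens.append(("text", text, tuple(self.stack)))


def extract_list(markup):
    """Return textual tokens joined by the dominant separator pattern."""
    tokenizer = Tokenizer()
    tokenizer.feed(markup)
    tokenizer.close()

    texts, seps, sep = [], [], []
    for kind, value, context in tokenizer.tokens:
        if kind == "text":
            if texts:
                seps.append(tuple(sep))  # containers between two neighbors
            texts.append((value, context))
            sep = []
        elif texts:
            sep.append((kind, value))

    # Only adjacent textual tokens with the same context are candidates.
    candidates = [
        i for i in range(len(seps))
        if seps[i] and texts[i][1] == texts[i + 1][1]
    ]
    if not candidates:
        return []

    # Assumption: the most frequent separator marks the list.
    pattern = Counter(seps[i] for i in candidates).most_common(1)[0][0]
    keep = set()
    for i in candidates:
        if seps[i] == pattern:
            keep.update((i, i + 1))
    return [texts[i][0] for i in sorted(keep)]


sample = (
    "<html><body><h1>Fruits</h1>"
    "<ul><li>Apple</li><li>Banana</li><li>Cherry</li></ul>"
    "<p>Footer</p></body></html>"
)
print(extract_list(sample))  # ['Apple', 'Banana', 'Cherry']
```

In this sketch the heading and footer are dropped because their contexts differ from those of their neighbors, so only the three `li` entries, which are separated by the same close/open tag pair, are presented as the list.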
Number of Claims:
17
Published
July 7, 2009
Application Number
10/879,843
Filed
June 29, 2004
US Classification
707/6  
Int'l Classification
G06F 17/30 (20060101)
USPTO Field of Search
707/6  