Text analysis system    

United States Patent 5392428
Link to this page: http://www.wikipatents.com/5392428.html
Inventor(s): Robins; Stanford K. (Mendota Heights, MN)
Abstract: A computer-based system for analyzing textual works is achieved by operating on a model of the text as stored in a relational database. The text is divided into user-defined segments, and the system maintains a series of records, each of which characterizes a segment of the text. The system generates a one-to-one association between each record and the indicia which indicate the length of the record and correspond to the beginning and end points of text segments. The system also includes topic records which maintain a list of topics. The system generates one-to-many associations between topics and records so that a link is established between a particular topic and one or more records. Based on the model, the system manages the text and generates reports for analysis of the text.

Title Information

Publication Date     February 21, 1995
Application Number     08/217,136
Filing Date     March 23, 1994
Examiner     Heckler; Thomas M.
Attorney/Law Firm     Merchant, Gould, Smith, Edell, Welter & Schmidt
Parent Case     This is a continuation of application Ser. No. 07/732,823, filed Jul. 19, 1991, abandoned, which is a continuation-in-part of application Ser. No. 07/722,856, filed Jun. 28, 1991, now abandoned.
Patent Tags     text analysis

References

U.S. References
Reference    Inventor      U.S. Class   Date
5267155      Buchanan      715/540      Nov 1993
5157783      Anderson      707/4        Oct 1992
5148366      Buchanan      715/531      Sep 1992
5062074      Kleinberger                Oct 1991
4959769      Cooper        707/200      Sep 1990
3670310      Bharwani      707/3        Jun 1972

Claims

What is claimed is:

1. A computer-based information analysis system for creating a representation of at least a portion of at least one of first and second predetermined bodies of text, the representation comprising records which correspond to text segments each of which have a length determined by the system user, the system comprising:

a) input means for entering data, the input means comprising topic input means for inputting a set of user-defined topics;

b) organization means for structuring the representation, the organization means comprising:

i) record division means for creating at least one of first and second sets of records, wherein each set of records corresponds to a particular body of text and wherein each record comprises data which characterizes a particular text segment chosen by a system user, the record division means comprising demarcation indicia means for entering demarcation indicia representing the user selected length of each record;

ii) one-to-one association organizing means comprising demarcation indicia association means for establishing a one-to-one association between a predetermined record and the corresponding demarcation indicia for indicating the length of the record; and

iii) one-to-many association organizing means comprising topic organizing means for establishing a user-designated one-to-many association between at least one user-defined topic and at least one of the records; and

c) output means for generating a report.

2. The system of claim 1 wherein:

a) the input means further comprises annotation input means for inputting one or more user-defined annotations to the text;

b) the one-to-one association organizing means further comprises annotation organizing means for establishing a one-to-one association between each annotation and a corresponding record; and

c) the output means further comprises record retrieval means for retrieving one or more records according to one or more annotations associated with at least one record.

3. The system of claim 2 wherein the output means further comprises:

a) text search and retrieval query means for locating each occurrence of a string of characters in the annotations; and

b) text search and retrieval report means comprising:

i) occurrence report means for listing the location in the annotations of all occurrences of the string of characters; and

ii) context report means for reporting each string of characters as it occurs in a range of surrounding text of a size determined by a user.

4. The system of claim 3 wherein the text search and retrieval query means further comprises Boolean query means for enabling a user to formulate a query comprising at least one character string optionally in Boolean relation to at least one condition of conjunctive or disjunctive or negative proximity to another character string.

5. The system of claim 2 wherein:

a) the input means further comprises annotation input means for inputting user-defined annotations selected from the group consisting of text, symbols, digitized pictorial material, digitized voice, or a binary representation of analog information; and

b) the organizing means further comprises annotation organizing means for organizing annotations into at least one of first and second sets classified according to type of annotation.

6. The system of claim 1 wherein:

a) the input means further comprises synopsis input means for inputting at least one of first and second sets of user-created synopses of the text; and

b) the one-to-one association organizing means further comprises synopsis organizing means for establishing a one-to-one association between each synopsis and a corresponding record.

7. The system of claim 6 wherein the output means further comprises:

a) text search and retrieval query means for locating each occurrence of a string of characters in at least one of the first and second sets of synopses; and

b) text search and retrieval report means comprising:

i) occurrence report means for listing the location in at least one of the first and second sets of synopses of all occurrences of the string of characters; and

ii) context report means for reporting each string of characters as it occurs in a range of surrounding text of a size determined by a user.

8. The system of claim 7 wherein the text search and retrieval query means further comprises Boolean query means for enabling a user to formulate a query comprising at least one character string optionally in Boolean relation to at least one condition of conjunctive or disjunctive or negative proximity to another character string.

9. The system of claim 1 wherein:

a) the input means further comprises descriptive sentence input means for inputting one or more sentences corresponding to the synopses for describing the content of the synopsis in abstract form;

b) the one-to-one association organizing means further comprises descriptive sentence organizing means for establishing a one-to-one association between each descriptive sentence and a corresponding record; and

c) the output means further comprises record retrieval means for retrieving one or more records according to one or more of the descriptive sentences associated with at least one record.

10. The system of claim 9 wherein the output means further comprises:

a) text search and retrieval query means for locating each occurrence of a string of characters in the descriptive sentences; and

b) text search and retrieval report means comprising:

i) occurrence report means for listing the location in the descriptive sentences of all occurrences of the string of characters; and

ii) context report means for reporting each string of characters as it occurs in a range of surrounding text of a size determined by a user.

11. The system of claim 10 wherein the text search and retrieval query means further comprises Boolean query means for enabling a user to formulate a query comprising at least one character string optionally in Boolean relation to at least one condition of conjunctive or disjunctive or negative proximity to another character string.

12. The system of claim 1 wherein:

a) the input means further comprises item-identifier input means for inputting at least one of first and second sets of item-identifiers corresponding to items comprehended by the text;

b) the one-to-many association organizing means further comprises item-identifier association means for establishing a one-to-many association between each item-identifier and the corresponding records which comprehend a reference to the respective item; and

c) the output means further comprises record retrieval means for retrieving one or more records according to one or more of the item identifiers corresponding to items comprehended by the text, and associated with at least one of the records.

13. The system of claim 12 wherein:

a) the input means further comprises item depictor means for inputting item depictors describing items comprehended by the text;

b) the one-to-one association organizing means further comprises item depictor one-to-one association organizing means for establishing a one-to-one association between each item depictor and a corresponding item-identifier; and

c) the output means further comprises record retrieval means for retrieving one or more records according to one or more of the item depictors describing items comprehended by the text, and associated with at least one of the records.

14. The system of claim 12 wherein the organizing means further comprises item class organizing means for organizing the item-identifiers into at least one of first and second sets classified according to a user-defined class of item.

15. The system of claim 12 wherein the organizing means further comprises item reference organizing means for organizing item-identifiers into at least one of first and second sets classified according to a user-defined type of item.

16. The system of claim 1 wherein the output means further comprises:

a) text search and retrieval query means for locating each occurrence of a string of characters in at least one of the first and second bodies of text; and

b) text search and retrieval report means comprising:

i) occurrence report means for listing the location in at least one of the first and second bodies of text of all occurrences of the string of characters; and

ii) context report means for reporting each string of characters as it occurs in a range of surrounding text of a size determined by a user.

17. The system of claim 16 wherein the text search and retrieval query means further comprises Boolean query means for enabling a user to formulate a query comprising at least one character string optionally in Boolean relation to at least one condition of conjunctive or disjunctive or negative proximity to another character string.

18. The system of claim 16 wherein:

a) the text search and retrieval query means further comprises topic query means for enabling a user to define queries to text occurring in records previously associated with at least one topic; and

b) the text search and retrieval report means further comprises record report means for reporting each record in which the string occurs.

19. The system of claim 1 wherein:

a) the input means further comprises:

i) item-identifier input means for inputting at least one of first and second sets of item-identifiers corresponding to items referenced by the text; and

ii) item-location indicium input means for inputting at least one of first and second sets of item location indicia wherein each indicium serves to demarcate the location in the text of a reference in the text to an item;

b) the one-to-many association organizing means further comprises means for establishing a one-to-many association between each item-identifier and all item-location indicia for the same item; and

c) the output means further comprises record retrieval means for retrieving one or more records according to one or more of the item identifiers corresponding to items referenced by the text, and associated with at least one of the records.

20. The system of claim 19 wherein the one-to-one association organizing means comprises item-location indicium organizing means for establishing a one-to-one association between each item-location indicium and the corresponding record which comprehends a reference to the respective item.

21. The system of claim 1 wherein:

a) the input means further comprises:

i) item-identifier input means for inputting at least one of first and second sets of item-identifiers corresponding to items located in the text; and

ii) item-location indicium input means for inputting at least one of first and second sets of item location indicia wherein each indicium serves to demarcate the location in the text of the items located in the text;

b) the one-to-many association organizing means further comprises means for establishing a one-to-many association between each item-identifier and all item-location indicia for the same item; and

c) the output means further comprises record retrieval means for retrieving one or more records according to one or more of the item identifiers corresponding to items located in the text, and associated with at least one of the records.

22. The system of claim 21 wherein the one-to-one association organizing means comprises item-location indicium organizing means for establishing a one-to-one association between each item-location indicium and the corresponding record which comprehends a reference to the respective item.

23. The system of claim 1 wherein each record corresponds to a unique portion of the text.

24. The system of claim 1 wherein:

a) the input means further comprises date input means for inputting one or more user-determined dates associated with at least one of the records;

b) the one-to-one association organizing means further comprises date organizing means for establishing a one-to-one association between each date and a corresponding record; and

c) the output means further comprises record retrieval means for retrieving one or more records according to one or more dates associated with at least one record.

25. The system of claim 1 wherein:

(a) the input means further comprises time input means for inputting a user-determined time associated with at least one of the records; and

(b) the output means further comprises record retrieval means for retrieving one or more records according to one or more times associated with at least one record.

26. The system of claim 1 wherein:

(a) the input means further comprises time period input means for inputting a user-defined range of dates associated with at least one of the records; and

(b) the output means further comprises record retrieval means for retrieving one or more records according to one or more ranges of dates associated with at least one record.

27. The system of claim 1 wherein:

(a) the input means further comprises time period input means for inputting a user-defined range of times associated with at least one of the records; and

(b) the output means further comprises record retrieval means for retrieving one or more records according to one or more ranges of times associated with at least one record.

28. The system of claim 1 wherein:

a) the topic input means further comprises subtopic input means for inputting one or more sets of user-defined subtopics wherein each set of subtopics comprises a subset of a set of topics;

b) the organizing means further comprises subtopic one-to-many organizing means for establishing a user-designated one-to-many association between at least one of the user-defined subtopics and at least one of the records; and

c) the output means further comprises record retrieval means for retrieving one or more records according to one or more of the user-defined subtopics associated with at least one record.

29. The system of claim 1 wherein:

a) the input means further comprises priority input means for enabling a user to input indicia of user-defined priority of degree of importance of at least one of the records;

b) the organizing means further comprises priority organizing means for establishing a user-designated association between an indicium of priority and at least one of the records; and

c) the output means further comprises record retrieval means for retrieving one or more records according to one or more of the user-defined indicia of priority associated with at least one record.

30. The system of claim 1 wherein:

a) the input means further comprises reminder input means to enable a user to input user-defined reminders;

b) the organizing means further comprises reminder organizing means for establishing a user-designated association between the user-defined reminders and at least one of the records; and

c) the output means further comprises record retrieval means for retrieving one or more records according to one or more of the user-defined reminders associated with at least one record.

31. The system of claim 1 wherein:

a) the organizing means further comprises query organizing means for storing a list of one or more user-defined queries; and

b) the output means further comprises query report means for generating a report of all locations of one or more of the queries in at least one of the first and second bodies of text.

32. The system of claim 1 wherein the output means comprises data report means for reporting user-selectable portions of the data contained in the records.

33. The system of claim 1 wherein the output means comprises relational report means for graphically reporting:

a) at least one of the one-to-one associations; and

b) at least one of the one-to-many associations.

34. The system of claim 1 wherein the output means comprises sorting means for arranging in alpha-numeric and chronological order, as appropriate, data from each of the records to be included in the report.

35. The system of claim 1 wherein:

a) the input means further comprises converter input means for converting the text of at least one of the first and second predetermined bodies of text into a series of numbered lines, the converter input means comprising:

i) converter input test means for testing whether the format of text to be converted comprehends adequate location indicia so that no further indicia will be required by the user; and

ii) reference copy converter means for superimposing reference page numbers or reference page and line numbers on the converted text in the event the format of the converted text does not comprehend adequate location indicia; and

b) the organizing means further comprises reference copy organizing means for establishing a one-to-one association between the reference page numbers or reference page and line numbers of the converted text and corresponding location indicia in the text of at least one of the first or second predetermined bodies of text.

36. The system of claim 1 wherein the output means further comprises record retrieval means for retrieving one or more records according to one or more of the user-defined topics associated with at least one of the records.

Description

FIELD OF THE INVENTION

A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.

The present invention relates to a data processing system for the analysis of textual works. The present invention is particularly useful for, but not limited to, analyzing and managing transcripts of depositions and court proceedings, and items referenced in those transcripts.

BACKGROUND OF THE INVENTION

One of the primary methods of gathering evidence in litigation is through the use of depositions. A deposition is a witness' testimony under oath and is typically recorded in a transcript by a court reporter. The deposition transcript is read and analyzed by those in the legal profession to assist in the litigation process and prepare for a trial. During a trial, the testimony of multiple witnesses may be recorded in a trial transcript of the court proceedings. This trial transcript may then be read and analyzed by lawyers to prepare for post-trial motions or an appeal.

There are a number of available systems which attempt to assist lawyers, and others in the legal profession, with the analysis and management of deposition and trial transcripts. A typical system will allow full-text searching, meaning that a user may search all words in a transcript for any locations of a particular word. This is accomplished either by a brute force method of searching the transcript word by word or with the use of an index of all words in the transcript.
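
The two full-text approaches mentioned above, a word-by-word scan and a lookup in an index of all words, can be contrasted in a minimal Python sketch. The function names (`tokenize`, `brute_force_search`, `build_index`) and the sample transcript lines are hypothetical illustrations and are not taken from the patent or from any prior-art system.

```python
import re
from collections import defaultdict


def tokenize(line):
    """Lowercase a line and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", line.lower())


def brute_force_search(lines, word):
    """Scan every word of every line; return 1-based numbers of matching lines."""
    word = word.lower()
    return [n for n, line in enumerate(lines, start=1) if word in tokenize(line)]


def build_index(lines):
    """Build an index mapping each word to the sorted line numbers where it occurs."""
    index = defaultdict(set)
    for n, line in enumerate(lines, start=1):
        for token in tokenize(line):
            index[token].add(n)
    return {token: sorted(nums) for token, nums in index.items()}


# Hypothetical transcript excerpt used only for illustration.
transcript = [
    "Q. Did you see the car before the collision?",
    "A. I saw the car when it entered the intersection.",
    "Q. Was the light red?",
    "A. The light was red for the other car.",
]

index = build_index(transcript)
print(brute_force_search(transcript, "car"))  # [1, 2, 4]
print(index.get("car", []))                   # [1, 2, 4]
```

Both approaches return the same locations; the index simply trades a one-time build step for faster repeated lookups. Neither one finds anything beyond the exact word requested.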

However, full-text searching, or searching through raw text, is of limited effectiveness in retrieving relevant and useful information from a database. Therefore, when a database merely consists of the raw text of a transcript, the computer-based system interacting with the database is not using the full power of the computer. This type of system is thus of only limited use in assisting the lawyer with the analysis of a transcript or other document.

Full-text searching has at least the following limitations which may affect the analysis of text. Most words have synonyms or near-synonyms that a user may not realize are used in the text. Since full-text searching only searches for the exact character string entered by a user, it will not locate any synonyms and relevant information could be missed. This same limitation is present in the case of misspellings of a word or even correct variations in the spelling of a word. Some systems attempt to allow for this situation by providing the capability of multiple queries or searching for a partial word (wildcard search). A partial word search allows a system to search for all occurrences of words that begin with a particular character string (root). Even with partial word searches, however, there are limitations because a partial word search usually generates irrelevant occurrences of all words of which the search characters are the root. Finally, full-text searching tends to recall only a small fraction of relevant information in the text and usually also recalls much irrelevant information.
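
The synonym and partial-word limitations described above can be illustrated with a short Python sketch. The helper names and the sample testimony are hypothetical; the sketch only assumes that an exact search matches whole words and that a partial word search matches any word beginning with the root.

```python
import re


def words(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z']+", text.lower())


def exact_hits(lines, term):
    """Return 1-based numbers of lines containing the exact word `term`."""
    term = term.lower()
    return [n for n, line in enumerate(lines, 1) if term in words(line)]


def prefix_hits(lines, root):
    """Return 1-based numbers of lines containing any word that starts with `root`."""
    root = root.lower()
    return [n for n, line in enumerate(lines, 1)
            if any(w.startswith(root) for w in words(line))]


# Hypothetical testimony used only for illustration.
testimony = [
    "The car was speeding.",           # relevant: exact term
    "The automobile ran the light.",   # relevant: synonym, missed by both searches
    "He has a long career in sales.",  # irrelevant: shares the root "car"
    "She took care of the children.",  # irrelevant: shares the root "car"
]

print(exact_hits(testimony, "car"))   # [1]
print(prefix_hits(testimony, "car"))  # [1, 3, 4]
```

The exact search misses the synonym on line 2, while the partial word search still misses it and adds two irrelevant lines.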

Other systems allow the user to perform what is known in the industry as issue coding. As the user works through the transcript, the user may assign codes to certain portions of the transcript, and each unique code represents an issue of fact or law. Issue codes are typically used because a witness will typically not use the actual words of the issue in their testimony. For example, in a personal injury suit, one of the issues may be contributory negligence. The witness will never actually say "contributory negligence," but the witness' testimony may be relevant to that issue and thus provide support necessary to prove that issue.

In order to provide for this situation, a system may allow the user to assign a certain code to each portion of the transcript that relates to an issue such as contributory negligence. In such a system, the user could assign a "contributory negligence" code at every place in the transcript that the user believes is relevant to that issue. The system may then search the transcript for each of these issue codes and report the occurrences of an issue code. This type of system is also of limited effectiveness in managing a text database. There are no associations between an issue code assigned to one part of a transcript and the same issue code assigned to another part of the transcript. Changing an issue code at one location in the transcript will have no effect on the same issue code at other locations in the text. These issue codes are simply like another word in the text, and searching for issue codes is similar to full-text searching. The limitations of full-text searching are again present in issue coding.

Other systems also have limitations in the ability to manage exhibits or items referenced in the text. These systems typically handle exhibits by having the user enter words into the text, similar to issue codes, in order to identify the location of a reference to an item. This type of system does not have the ability to, for example, track exhibits across multiple transcripts. Therefore, with this type of system, there is no correlation between an exhibit referenced in one transcript and the same exhibit referenced in another transcript. The management of exhibits is also limited in full-text searching. Full-text searching does not have the ability to reference items external to the text such as exhibits, or non-text items in the text such as graphs or diagrams.

There is thus a need for a complex and sophisticated system for managing and analyzing the text contained in a document. For example, there is a need for this type of system in the legal profession in order to have a more powerful tool for analyzing transcripts and assisting in litigation.

SUMMARY OF THE INVENTION

The present invention solves these and other shortcomings of the prior art described above. The present invention also solves other shortcomings of the prior art which will become apparent to those skilled in the art upon reading and understanding the present specification.

The present invention is a computer-based system for analyzing textual works by operating on a model of the text as stored in a relational database. The text is divided into user-defined segments, and the system maintains a series of records, each of which characterizes a segment of the text. The system generates a one-to-one association between each record and indicia which indicate the length of the record and correspond to the beginning and end points of text segments.

The system also includes topic records which maintain a list of topics. The system generates one-to-many associations between topics and records so that a link is established between a particular topic and one or more records.

The system manages the text and generates reports of the data contained in the records and the relationships between the records and associated data for analysis of the text based on the model in the relational database. As further explained in the present specification, the use of a model of the text facilitates management of the text and provides for powerful and versatile reporting functions in the present invention.

In the preferred embodiment, the system may further include exhibit records which maintain a list of items referenced in the text. Such a system generates one-to-many associations between items and records so that a link is established between a particular item and one or more records. The preferred system is further enhanced by creating one-to-one associations between records and annotations to the text, synopses characterizing the text, and chronological information such as dates and times.
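
The model summarized above can be sketched as a small relational schema using Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: the table and column names (`digest_record`, `topic`, `record_topic`, and so on) are hypothetical, the demarcation indicia are assumed to be page and line numbers stored in the same row as the record, and the topic association is modeled with a link table, which also allows a record to carry more than one topic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One row per user-defined text segment. The start/end page and line
-- (the assumed demarcation indicia) live in the same row: one-to-one.
CREATE TABLE digest_record (
    id          INTEGER PRIMARY KEY,
    transcript  TEXT NOT NULL,
    start_page  INTEGER NOT NULL,
    start_line  INTEGER NOT NULL,
    end_page    INTEGER NOT NULL,
    end_line    INTEGER NOT NULL,
    synopsis    TEXT
);
-- The list of user-defined topics.
CREATE TABLE topic (
    id    INTEGER PRIMARY KEY,
    name  TEXT NOT NULL UNIQUE
);
-- Link table: one topic may be associated with many records.
CREATE TABLE record_topic (
    record_id  INTEGER NOT NULL REFERENCES digest_record(id),
    topic_id   INTEGER NOT NULL REFERENCES topic(id),
    PRIMARY KEY (record_id, topic_id)
);
""")

# Hypothetical sample data.
conn.executemany(
    "INSERT INTO digest_record VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "Smith deposition", 12, 3, 12, 20, "Witness admits no seat belt."),
        (2, "Smith deposition", 30, 1, 31, 8, "Describes road conditions."),
        (3, "Jones deposition", 5, 10, 6, 2, "Saw plaintiff drinking."),
    ],
)
conn.execute("INSERT INTO topic VALUES (1, 'Contributory negligence')")
conn.executemany("INSERT INTO record_topic VALUES (?, ?)", [(1, 1), (3, 1)])


def records_for_topic(name):
    """Return every record linked to the named topic, ordered by location."""
    rows = conn.execute(
        """
        SELECT r.transcript, r.start_page, r.start_line,
               r.end_page, r.end_line, r.synopsis
        FROM digest_record AS r
        JOIN record_topic AS rt ON rt.record_id = r.id
        JOIN topic AS t ON t.id = rt.topic_id
        WHERE t.name = ?
        ORDER BY r.transcript, r.start_page, r.start_line
        """,
        (name,),
    )
    return rows.fetchall()


report = records_for_topic("Contributory negligence")
for row in report:
    print(row)
```

The report lists segments from both transcripts that bear on the topic, even though neither segment contains the topic's words.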

BRIEF DESCRIPTION OF THE DRAWINGS

In the drawings where like numerals refer to like components throughout several views,

FIGS. 1A and 1B are a map of a preferred database structure of the present invention, showing the relationships of the various files.

FIGS. 2A and 2B are a chart of the various files contained within a preferred database compatible with the present invention.

FIG. 2C is a chart of the files contained within the preferred database compatible with the present invention for indexing documents.

FIG. 3 represents a preferred flow of data through the disclosed text management and analysis system.

FIG. 4 represents a preferred flow of data through a process of creating a record in the Digest Records File.

FIG. 5 represents a preferred flow of data through a process of relating a record in the Digest Records File to the preferred database compatible with the present invention.

FIG. 6 represents a preferred flow of data through exhibit referencing to text and records in the Digest Records File.

FIG. 7 represents a preferred flow of data through full-text searching and reporting.

FIG. 8 represents a preferred flow of data through searching and reporting of a search domain.

FIG. 9 represents a preferred flow of data through reporting selected fields of records in the Digest Records File.

FIG. 10 represents a graphical illustration of a database relationship.

FIG. 11 represents a preferred user interface for accessing various systems of the disclosed embodiment.

FIG. 12 represents a preferred user interface for entering topics into topic records.

FIG. 13 represents a preferred user interface for creating an exhibit set record.

FIG. 14 represents a preferred user interface for inputting exhibit depictors into exhibit records.

FIG. 15 represents a preferred user interface for creating a keyword list.

FIG. 16 represents a preferred user interface for creating records in the Digest Records File and associating the records in the Digest Records File with other records.

FIG. 17 represents a preferred user interface for creating an association between a record in the Digest Records File and an exhibit record.

FIG. 18 is a report which represents the records in the Digest Records File and the associations of the records in the Digest Records File with other records.

FIG. 19 represents a preferred user interface for generating a report.

FIG. 20 represents a preferred method of reporting a full-text search.

FIG. 21 represents a preferred user interface for structuring a report.

FIG. 22 is a block diagram showing a full-text indexing scheme.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

In the following detailed description of the preferred embodiment, reference is made to the accompanying drawings which form a part hereof and in which is shown by way of illustration a specific embodiment in which the invention may be practiced. This embodiment is described in sufficient detail to enable those skilled in the art to practice the invention, and it is to be understood that other embodiments may be utilized and that structural or logical changes may be made without departing from the scope of the present invention. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope of the present invention is defined by the appended claims.

I. INTRODUCTION

The preferred system utilizes a relational database to construct an analog or model of a textual work such as a deposition transcript. By using a model, the present system has much versatility and power in the analysis and management of the database. A model allows, among other things, cross-referencing of different files in the database for sorting and searching capabilities which assist in the analysis of the information stored in the database. For example, a user may track deposition exhibits across multiple transcripts because a file in the model cross-references each deposition exhibit with the location in the text where the exhibit is referenced. Another file in the preferred model cross-references topics with corresponding locations in the source text, i.e. the text to be analyzed, which discuss this topic. This allows sorting and searching by a particular topic or multiple topics.

When the source text is deposition transcripts, searching and sorting by topics is a powerful tool for a litigator in assessing the testimony of witnesses. The preferred embodiment describes a system for analyzing source text including transcripts of depositions or court proceedings. One skilled in the art will recognize that the present invention may be used for analyzing any type of textual work, which may include, for example, multivolume periodicals, books, and scientific research reports.

The preferred embodiment contains certain fundamental functions, the preferred data flow of which is shown in FIG. 3. The preferred system allows the user to enter topics and other information. The preferred system allows the user to create records by dividing the source text so that each record contains data which characterizes a segment of the source text. Each record also contains indicia which represent the beginning and the end of each record. The preferred system creates one-to-one associations between each record and the corresponding indicia so that only one record relates to each indicia. The user may relate topics or subjects to records in order to further characterize the data contained within the record. The preferred system creates one-to-many associations between each topic and the corresponding records so that each topic may relate to many different records. Finally, the preferred system has the capability to generate reports of the data contained within the records and the relationships between data fields. The result is a model of source text in database format which may comprise segments of source text represented in database records, enhanced by the capability of the user to retrieve, among other things, topic relationships, synopses, annotations, item reference relationships, and chronological relationships.

The preferred embodiment operates in a Pascal environment on a Macintosh hardware platform. One skilled in the art will recognize that other programming or hardware environments may be used without departing from the scope of the invention.

II. DATABASE STRUCTURE OF A MODEL

FIGS. 1A and 1B are a map of the preferred relational database structure which forms the basis for the model. One skilled in the art will recognize the utility of such a database map in understanding the disclosed system. The boxes on the map represent database files, each file comprising a series of records. Each record contains one or more fields, and the letters on the right side of the records shown identify the data type of the corresponding field. Preferred data types may be defined as shown in Table 1.

TABLE 1
______________________________________
DATA TYPE    MEANING
______________________________________
A            Alpha (fixed length character string).
B            Boolean.
D            Date.
H            Time.
I            Integer (2 bytes).
L            Long integer (4 bytes).
T            Text of any length.
______________________________________
© 1991 Robins Analytics, Inc.

The lines between the boxes of FIGS. 1A and 1B represent the relationships between the files. FIGS. 2A-2C show the structure of the individual files in the database map of FIGS. 1A and 1B.

The set of all the files shown in FIGS. 1A, 1B, and 2A-2C constitutes one model. The present system may create additional models utilizing another set of identical files. Therefore, the present invention may be used, for example, for analyzing deposition transcripts from many different litigation cases.

A. Case Information

The files discussed in the text following immediately below typically contain information which characterizes identifying features of the text, as opposed to the content of the text. The information contained within these files is preferably used as a reference for all other files in the database map so that the system can associate information in the other files of a model with a particular body of source text.

1. Cases File

The Cases File 100 contains the identification of one case, such as one litigation case. One case file may contain multiple transcripts. The fields of each record in the Cases File 100 may be defined as shown in Table 2.

TABLE 2
______________________________________
FIELD        MEANING
______________________________________
version      Version of the system operating on this model.
dateTime     Date and time when this version was created.
doWarning    If the user wants the system to display a warning (such as a protective order) before entering data files.
ID           Identification number for this record.
______________________________________
© 1991 Robins Analytics, Inc.

2. Matters File

The Matters File 101 contains information which identifies a body of source text for a particular witness or court proceeding. For example, the model may contain one Matters File for each witness in a litigation case. The fields of each record in the Matters File 101 may be defined as shown in Table 3.

TABLE 3
______________________________________
FIELD              MEANING
______________________________________
matterType         Number indicating witness or proceeding.
first_Judge        First name of the witness or, for court proceedings such as a trial or motion hearing, name of the judge.
last_title         Last name of the witness, or title of the proceeding.
address_fullnam    Witness' address, or full name of the proceeding.
phone_tribunal     Witness' phone number, or tribunal of the proceeding.
fax_number         Witness' fax number, or tribunal file number for proceeding.
ID                 Identification number for this record.
caseID             Identification number of the corresponding record in the Cases File 100.
topicIds           Array of identification numbers for all topics considered for this transcript.
______________________________________
© 1991 Robins Analytics, Inc.

3. Transcript Information File

The Transcript Information File 106 typically contains identification information for the transcripts of each of the witnesses or proceedings of the Matters File 101. For example, the Matters File 101 may contain one record for a particular witness who was deposed on several different days. The Transcript Information File 106 normally would have a different record for the transcript of the witness' testimony for each of those days. While there is preferably only one record in the Matters File 101 for each witness or proceeding, that record may be linked to several records in the Transcript Information File 106. The fields of each record in the Transcript Information File 106 may be defined as shown in Table 4.

TABLE 4
______________________________________
FIELD            MEANING
______________________________________
date             Date of the deposition or court proceeding.
preparedBy       Optional name of the person who is analyzing the transcript.
matterID         Identification number of the corresponding witness' or proceeding's record in the Matters File 101.
location         Address where the deposition or proceeding occurred.
crName           Name of the court reporter who recorded the testimony in the transcript.
crAddress        Court reporter's address.
crFirm           Court reporter's employer.
crPhone          Court reporter's phone number.
startPage        First page number of the testimony in the transcript.
endPage          Last page number of the testimony in the transcript.
exhibitSetID     Identification number of the corresponding exhibit set record for this transcript in the Exhibits File 103.
volumeinfo       Volume number (for multi-volume transcripts).
linesPerPage     Number of lines of testimony per page in the transcript.
referenceCopy    Whether it is a reference copy (if a reference copy was generated in order to convert the ASCII file from the court reporter).
ID               Identification number for this record.
______________________________________
© 1991 Robins Analytics, Inc.
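
By way of illustration only, the link from one Matters File record to several Transcript Information File records might be sketched as follows. The field names ID, matterID, date, and volumeinfo come from Table 4; the class name, the function name, and the oldest-first ordering are assumptions made for this sketch and are not part of the disclosed Pascal implementation.

```python
import datetime
from dataclasses import dataclass


@dataclass(frozen=True)
class TranscriptInfo:
    """Hypothetical stand-in for one Transcript Information File record."""
    ID: int
    matterID: int           # links back to a record in the Matters File
    date: datetime.date     # date of the deposition or court proceeding
    volumeinfo: int = 1     # volume number for multi-volume transcripts


def transcripts_for_matter(transcripts, matter_id):
    """Return the transcript records linked to one matter, oldest first."""
    linked = [t for t in transcripts if t.matterID == matter_id]
    return sorted(linked, key=lambda t: t.date)
```

Under these assumptions, a witness deposed on two different days would have a single matter identifier and two TranscriptInfo records carrying that identifier in matterID.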

4. Appearances File

The Appearances File 109 typically contains additional information on each of the transcripts identified in the records of the Transcript Information File 106. Appearances typically means the attorneys present at the deposition or court proceeding. The appearances information preferably is linked to the record of the corresponding transcript in the Transcript Information File 106. The fields of each record in the Appearances File 109 may be defined as shown in Table 5.

TABLE 5
______________________________________
FIELD           MEANING
______________________________________
name            Name of the attorney/representative.
otherInfo       User-defined textual data.
transcriptID    Identification number of the corresponding record in the Transcript Information File 106.
______________________________________
© 1991 Robins Analytics, Inc.

B. Digest Records File

The Digest Records File 111 typically contains one or more records for each transcript, and the information in the fields of a particular source text's records characterizes the content of the source text. The user establishes the length of each record, and each record may have a different user-defined length as desired. Each record contains data which identifies the locations in the source text which correspond to the beginning and end points of a record.

If the source text is a deposition transcript, for example, the records corresponding to this transcript will contain information entered by the user which characterizes the witness' testimony. By using the Digest Records File 111 in conjunction with other files, the system may manipulate the data in the model to assist the user in analyzing the witness' testimony. These capabilities will be explained in more detail below.

The fields of each record in the Digest Records File 111 may be defined as shown in Table 6.

TABLE 6
______________________________________
FIELD            MEANING
______________________________________
fromLine         Line number which corresponds to the beginning of this segment of text in the transcript.
toLine           Line number which corresponds to the end of this segment of text in the transcript.
transcriptID     Identification number of the corresponding record for this transcript in the Transcript Information File 106.
ID               Identification number for this record.
digest           Summary of the testimony within the beginning and end points that correspond to this record.
Comment          User-defined annotation.
TopicSentence    Sentence which characterizes the content of the testimony within the beginning and end points that correspond to this record.
______________________________________
© 1991 Robins Analytics, Inc.

In the preferred system, the fields which establish the length of the record, fromLine and toLine, are line numbers generated by the system which correspond to specific page and line numbers in the transcript or other document. The system preferably converts a page and line number in the transcript to a single line number by knowing the value of the linesPerPage field in the Transcript Information File 106. For example, if the transcript has 25 lines per page, and line number 4 on page 3 is the beginning of the record, this point will be converted in the preferred system to 54 (fromLine = (25 lines/page × 2 pages) + 4). Manipulation and management of the records is facilitated by using single numbers to define the limits of a record, as opposed to two numbers (page and line) for each limit.
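
The conversion described above may be sketched, for illustration only, as the pair of functions below. The function names are hypothetical, and the sketch assumes that page and line numbering both begin at 1; the text does not state how a transcript whose startPage is greater than 1 is handled, so no offset is applied here.

```python
from __future__ import annotations


def to_single_line(page: int, line: int, lines_per_page: int) -> int:
    """Convert a 1-based (page, line) pair to a single running line number."""
    if page < 1 or not 1 <= line <= lines_per_page:
        raise ValueError("page or line out of range")
    # All full pages before this one, plus the line offset on this page.
    return lines_per_page * (page - 1) + line


def to_page_and_line(single_line: int, lines_per_page: int) -> tuple[int, int]:
    """Invert to_single_line: recover the 1-based (page, line) pair."""
    if single_line < 1:
        raise ValueError("line numbers start at 1")
    page_index, line_index = divmod(single_line - 1, lines_per_page)
    return page_index + 1, line_index + 1
```

With 25 lines per page, to_single_line(3, 4, 25) reproduces the value 54 from the example above, and to_page_and_line(54, 25) recovers page 3, line 4 for display in a report.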

The comment field contains a user-defined annotation which may be any textual data. For example, the user may enter a note which reminds the user to take additional testimony from this witness on a particular topic.

C. Topics Files

The Topics File 104 contains a list of topics, one record for each topic. The topics may be subjects which characterize either a particular segment of the source text or items identified in the source text. In order to characterize a portion of the source text with a topic, the topics may be linked to a record in the Digest Records File 111 via the Topic Reference File 107. Topics may also be linked to an exhibit (item) via the TopicToExhibit Reference File 105.

The present system may also create subtopics for the model. A subset of master topics, or simply topics, consisting of one or more of the topics, would typically be used with any transcript for a particular case, whereas a set of subtopics would be grouped with one particular topic.

The fields of each record in the Topics File 104 may be defined as shown in Table 7.

TABLE 7
______________________________________
FIELD           MEANING
______________________________________
topic           Name of the topic.
ID              Identification number for this record.
superTopicID    If this record is a subtopic, superTopicID is the identification number of the record for the topic associated with this subtopic. Otherwise, if this record is a master topic, superTopicID is zero.
______________________________________
© 1991 Robins Analytics, Inc.
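
As a minimal sketch of how the superTopicID field of Table 7 might be used, the following groups subtopics under their master topics. The field names follow Table 7; the class name, the function name, and the sample topic names used in the note below are hypothetical, and a single level of subtopics is assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topic:
    """Hypothetical stand-in for one Topics File record."""
    ID: int
    topic: str
    superTopicID: int = 0   # zero marks a master topic, per Table 7


def group_subtopics(topics):
    """Map each master topic name to the names of its subtopics."""
    by_id = {t.ID: t for t in topics}
    outline = {t.topic: [] for t in topics if t.superTopicID == 0}
    for t in topics:
        if t.superTopicID != 0:
            parent = by_id[t.superTopicID]
            outline.setdefault(parent.topic, []).append(t.topic)
    return outline
```

For example, a master topic "Damages" with ID 1 and two records whose superTopicID is 1 would yield one outline entry for "Damages" listing both subtopic names.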

The fields of each record in the Topic Reference File 107 may be defined as shown in Table 8.

TABLE 8
______________________________________
FIELD             MEANING
______________________________________
topicID           Identification number for the record containing a topic to be associated with a record in the Digest Records File.
digestRecordID    Identification number for the corresponding record in the Digest Records File.
______________________________________
© 1991 Robins Analytics, Inc.

As shown in Table 8, the Topic Reference File 107 creates a link between a topic record and a record in the Digest Records File. Since each topic may be linked to more than one record in the Digest Records File, the Topic Reference File 107 may create a one-to-many association between a particular topic and multiple records by having a series of records with the same topic record identification number and different identification numbers of records in the Digest Records File.
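
The one-to-many association described above may be illustrated with a small in-memory SQLite database. This is a sketch only: the preferred embodiment uses its own relational database in a Pascal environment, the table names and sample rows below are invented for illustration, and only the column names are taken from Tables 6 through 8.

```python
import sqlite3

SCHEMA = """
CREATE TABLE topics (ID INTEGER PRIMARY KEY, topic TEXT, superTopicID INTEGER);
CREATE TABLE digest_records (
    ID INTEGER PRIMARY KEY, transcriptID INTEGER,
    fromLine INTEGER, toLine INTEGER, digest TEXT);
CREATE TABLE topic_reference (
    topicID INTEGER REFERENCES topics(ID),
    digestRecordID INTEGER REFERENCES digest_records(ID));
"""


def build_demo_model():
    """Create a tiny model with invented sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO topics VALUES (?, ?, ?)",
                     [(1, "Damages", 0), (2, "Liability", 0)])
    conn.executemany("INSERT INTO digest_records VALUES (?, ?, ?, ?, ?)", [
        (10, 1, 54, 80, "Witness describes lost sales."),
        (11, 1, 81, 120, "Witness describes the contract."),
        (12, 1, 121, 150, "Witness estimates repair costs."),
    ])
    # Topic 1 appears twice with different digest record IDs: one-to-many.
    conn.executemany("INSERT INTO topic_reference VALUES (?, ?)",
                     [(1, 10), (1, 12), (2, 11)])
    return conn


def records_for_topic(conn, topic_name):
    """Return (ID, fromLine, toLine) of every digest record linked to a topic."""
    rows = conn.execute(
        """SELECT d.ID, d.fromLine, d.toLine
           FROM topics t
           JOIN topic_reference r ON r.topicID = t.ID
           JOIN digest_records d ON d.ID = r.digestRecordID
           WHERE t.topic = ?
           ORDER BY d.fromLine""",
        (topic_name,))
    return rows.fetchall()
```

In this sketch, each row of topic_reference plays the role of one record in the Topic Reference File 107, so a query by topic returns every segment of the transcript that the user has linked to that topic.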

The fields of each record in the TopicToExhibit Reference File 105 may be defined as shown in Table 9.

TABLE 9
______________________________________
FIELD        MEANING
______________________________________
topicID      Identification number for the record containing a topic to be associated with an exhibit record.
exhibitID    Identification number for the corresponding exhibit record.
______________________________________
© 1991