WikiPatents - Community Patent Review
Create Free Account  |  License or Sell Your Patent  |  WikiPatents Marketplace  |  WikiPatents Blog
Username:  Password:  
    
Advanced Search
Computer system and method for data base indexing and information retrieval    
United States Patent4817036   
Link to this pagehttp://www.wikipatents.com/4817036.html
Inventor(s)Millett; Ronald P. (Orem, UT); Millett; Howard L. (Orem, UT); Allen; Dell K. (Orem, UT)
AbstractA computer system and method for data base indexing and information retrieval. A number of keywords are selected and each record of a data base is searched to determine in which records each keyword appears. The central processing unit of the system then creates a vector for each keyword which identifies each record number of the data base where the keyword appears and numerically sorts the record numbers. A special bit processor next transforms each vector into a bit string that is identified by one of the keywords. The bit strings are returned to the central processing unit and are stored in secondary storage so as to form an index for the data base. To retrieve information, one or more keywords are input to the central processing unit. The input keywords are used by the central processing unit to identify the bit string for each keyword. The keywords may be logically joined using "AND," "OR" and/or "NOT" commands. Each bit string retrieved from the index is then sent to the special purpose bit processor, which combines the bit strings according to the particular command. The resultant bit string is transformed by the bit processor into a vector which is returned to the central processing unit and which then is used to identify the individual records which contain the combined keywords.
   














 Title Information Submit all comments and votes
 
Patent Text Patent PDF Print Page Summary File History
Plain text PDF images Print Summary File History
Drawing from US Patent 4817036
Computer system and method for data base indexing and information

     retrieval - US Patent 4817036 Drawing
Computer system and method for data base indexing and information retrieval
Inventor     Millett; Ronald P. (Orem, UT); Millett; Howard L. (Orem, UT); Allen; Dell K. (Orem, UT)
Owner/Assignee     Brigham Young University (Provo, UT)
Patent assignment
All assignments
Publication Date     March 28, 1989
Application Number     06/712,546
PAIR File History     Application Data   Transaction History
Image File Wrapper   Patent Term   Fees
Litigation
Filing Date     March 15, 1985
US Classification     707/1 707/3
Int'l Classification     G06F 015/40
Examiner     Shaw; Gareth D.
Assistant Examiner     Kulik; Paul
Attorney/Law Firm     Workman, Nydegger, Jensen
Address
Parent Case    
Priority Data    
USPTO Field of Search     364/900 364/200 364/300 364/900 MS File 364/200 MS File
Patent Tags     computer data base indexing information retrieval
   
Enter a comma (,) or semicolon (;) between multiple tag words/phrases.
Describe this patent:
 Amusing   
 Clever   
 Complex   
 Efficient   
 Historic   
 Important   
 Innovative   
 Interesting   
 Practical   
 Simple   
[no votes]
Patent WIKI

Share information and news about this patent, including information and news about the technology, inventors, company, ligation and licensing.

 References Submit all comments and votes
 
*references marked with an asterisk below are user-added references
 U.S. References
 
Add a new US reference:  
ReferenceRelevancyCommentsReferenceRelevancyComments
3388381



[0 after 0 votes]
3643226



[0 after 0 votes]
4554631
Reddington
707/4
Nov,1985

[0 after 0 votes]
4468728
Wang
707/1
Aug,1984

[0 after 0 votes]
4422158
Galie
707/5
Dec,1983

[0 after 0 votes]
4417321
Chang
707/7
Nov,1983

[0 after 0 votes]
4414644
Tayler
711/162
Nov,1983

[0 after 0 votes]
4410960
Kasuya
712/300
Oct,1983

[0 after 0 votes]
4408301
Iida
345/636
Oct,1983

[0 after 0 votes]
4403300
Bavoux
710/260
Sep,1983

[0 after 0 votes]
4384342
Imura
711/5
May,1983

[0 after 0 votes]
4384343
Morganti
707/3
May,1983

[0 after 0 votes]
4318184
Millett
707/1
Mar,1982

[0 after 0 votes]
4086628
Woodrum
707/7
Apr,1978

[0 after 0 votes]
4085446
Nagamura
711/117
Apr,1978

[0 after 0 votes]
4074235
Thomas
707/1
Feb,1978

[0 after 0 votes]
4068224
Bechtle
382/235
Jan,1978

[0 after 0 votes]
4068298
Dechant
707/3
Jan,1978

[0 after 0 votes]
4068301
Ishino
711/117
Jan,1978

[0 after 0 votes]
4068303
Morita
711/207
Jan,1978

[0 after 0 votes]
4056711
Lamar
235/454
Nov,1977

[0 after 0 votes]
3916387
Woodrum
707/3
Oct,1975

[0 after 0 votes]
3737864
Werner
712/242
Jun,1973

[0 after 0 votes]
3725875
Choate
711/104
Apr,1973

[0 after 0 votes]
3716840
Masten
707/3
Feb,1973

[0 after 0 votes]
3699531
Heimann
345/25
Oct,1972

[0 after 0 votes]
3678461
Choate
706/12
Jul,1972

[0 after 0 votes]
3678470
Choate
708/1
Jul,1972

[0 after 0 votes]
3670310
Bharwani
707/3
Jun,1972

[0 after 0 votes]
 Foreign References
 Other References
 Market Review Submit all comments and votes
   
Market Size
Estimate the gross annual revenues of the relevant market sector:
> $10B
$5B - $10B
$2B - $5B
$500M - $2B
$100M - $500M
$10M - $100M
$1M - $10M
$500K - $1M
$100K - $500K
< $100K
[No votes]
$0
 
$0   $2.5B   $5B   $7.5B   $10B
Market Share
Estimate the percentage of the relevant market sector this invention will capture:
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%
Reasonable Royalty
What percentage of gross sales should the inventor or assignee be paid?
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%
Public's "Guesstimation" of Royalty Value
Market SizeN/A[No votes]
xMarket ShareN/A[No votes]
xReasonable RoyaltyN/A[No votes]

N/A

License Availablity
If you are NOT the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
License Availablity
If you ARE the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
Competitive Advantage
Does this invention have a significant competitive advantage over similar technologies?
Yes

No



[No votes]
Most helpful competitive advantage comment
[No comments]

Commercial Alternatives
Are there viable commercial alternatives for this invention?
Yes

No



[No votes]
Most helpful commercial alternative comment
[No comments]

 Technical Review Submit all comments and votes
 Claims Submit all comments and votes
 


What is claimed and desired to be secured by United States Letters Patent is:

1. In a computer system comprising a CPU, an input/output terminal connected to said CPU, a main CPU memory and a secondary storage means containing a data base, a method of indexing individual records of said data base, and rapidly searching and retrieving selected records corresponding to one or more keywords input to said CPU, said method comprising the steps of:

said CPU forming a vector for each said keyword, each said vector comprising one or more array elements which together comprise a numerically sorted list of all record numbers where the keyword for that vector is found;

said CPU transforming each said vector so as to form a data base index comprising a bit string for each said vector, said step of transforming each said vector comprising the steps of:

(a) transforming said numerically sorted list of record numbers into a binary matrix wherein each row of said matrix corresponds to a binary representation of one of said vector array elements, and wherein each column of said matrix corresponds to a level of said hierarchal tree;

(b) determining the first column of said matrix where both ones and zeros are present;

(c) grouping said ones and zeros to identify the number of bits in each such group;

(d) determining whether the first and last bit in each said group are both ones, are zero and one or both zeros, and outputting a "01," "11" or "10," respectively, so as to form one bit pair of said bit string;

(e) splitting the next column of said matrix into groups of bits based on the number of bits in each group determined in step (c);

(f) repeating steps (c) and (d) for each said group of said next column; and

(g) repeating steps (b) through (f) until each column of said matrix has been done.

said CPU storing said data base index in said secondary storage means;

inputting at said input/output terminal at least one keyword;

said CPU searching said data base index and retrieving the bit string for said keyword input at said terminal; said CPU transforming said retrieving bit string back into the vector for said input keyword; and

said CPU identifying at said input/output terminal the records of said data base identified by said list of record numbers associated with the vector for said input keyword.

2. A method as defined in claim 1 wherein the size of said data base is defined by a number of bytes and wherein said keywords comprise substantially all words contained in each record of said data base.

3. A method as defined in claim 2 wherein said data base index is not larger than approximately one-fifth the number of bytes contained in said data base.

4. A method as defined in claim 2 wherein said step of transforming each said vector comprises transforming a binary representation of said vector to a bit string in which the number of bits is reduced, on the average, to not more than approximately one-fifth the number of bits contained in said binary representation of said vector.

5. A method as defined in claim 1 wherein each said bit string comprises a header identifying one of said keywords, the number of bit pairs in the bit string, the number of array elements in the vector from which said bit string was transformed, and a level in said hierarchal tree at which said bit string begins.

6. A method as defined in claim 1 wherein each said bit string comprises a plurality of bit pairs and wherein each said bit pair represents a level and a node number of said hierarchal tree.

7. A method as defined in claim 6 wherein said step of transforming said retrieved bit string back into the vector for said keyword comprises the steps of:

beginning from left to right and front top to bottom, retrieving in sequence each bit pair for each level and node of said hierarchal tree represented by said bit string;

determining whether each retrieved bit pair represents a terminal node of said tree; and

for each retrieved bit pair representing a terminal node of said tree, outputting an array element consisting of a record number derived from the node number represented by said bit pair for said terminal node.

8. A method as defined in claim 1 wherein said computer system further comprises a bit processor and a memory for programming said bit processor.

9. A method as defined in claim 8 wherein said step of transforming each said vector comprises the steps of said CPU calling said bit processor and said bit processor thereafter receiving from said CPU each said vector and transforming each said vector and then returning to said CPU a bit string formed from said vector.

10. A method as defined in claim 8 wherein said step of transforming said retrieved bit string back into the vector for said keyword comprises the steps of said CPU calling said bit processor and said bit processor thereof thereafter receiving from said CPU said retrieved bit string and transforming said retrieved bit string back into said vector and then returning said vector to said CPU.

11. A method as defined in claim 1 further comprising the steps of:

inputting at least two keywords at said input/output terminal;

inputting at said input/output terminal a command instructing said CPU to logically join said keywords in a selected way;

said CPU searching said data base index and retrieving the bit string for each said keyword to be joined;

said CPU merging the bit strings retrieved for said keywords to form a resultant bit string for the joined keywords;

said CPU transforming said resultant bit string into a vector which identifies by record number each record of said data base containing the joined keywords; and

said CPU identifying at said input/output terminal the records identified by the vector for said joined keywords.

12. A method as defined in claim 11 wherein each said bit string comprises a plurality of bit pairs and wherein each said bit pair represents a level and a node number of said hierarchal tree.

13. A method as defined in claim 12 wherein said command comprises a logical "AND" instruction and wherein said step of merging said retrieved bit string comprises the steps of:

beginning from left to right and from top to bottom, retrieving in sequence each bit pair of said retrieved bit string at each level and each node of said hierarchal tree represented by said retrieved bit strings;

determining each level and node at which a bit pair for each retrieved bit string is present and performing a logical "AND" operation on the corresponding bits of each bit pair and each such level and outputting the results of said "AND" operation as a bit pair of said resultant bit string; and

skipping each bit pair at each level where a bit pair for only one retrieved bit string is present.

14. A method as defined in claim 12 wherein said command comprises a logical "OR" instruction and wherein said step of merging said retrieved bit strings comprises the steps of:

beginning from left to right and from top to bottom, retrieving in sequence each bit pair of said retrieved bit strings at each level and each node of said hierarchal tree represented by said retrieved bit strings;

determining each level and node at which a bit pair for each retrieved bit string is present and performing a logical "OR" operation on the corresponding bits of each bit pair at each such level, and outputting the result of said "OR" operation as a bit pair of said resultant bit string; and

reproducing each bit pair at each level and node where a bit pair for only one retrieved bit string is present, said output bit pairs and said reproduced bit pairs together forming said resultant bit string.

15. A method as defined in claim 12 wherein said command comprises a logical "NOT" instruction for a first keyword "NOT" a second keyword, and wherein said step of merging said retrieved bit strings comprises the steps of:

beginning from left to right and from top to bottom, retrieving in sequence each bit pair of said retrieved bit strings at each level and each node of said hierarchal tree represented by said retrieved bit strings;

determining each level and node where only a bit pair of the retrieved bit string for said first keyword is present, and reproducing the bit pair of said retrieved bit string for said first keyword at each such level and node; and

determining each level and node where a bit pair for the retrieved bit strings of both said keywords is present, and if said node is non-terminal reproducing the bit pair for keyword one, and if said node is terminal outputting a bit pair in which a "1" bit is present only where a bit for said first keyword is "1" and a corresponding bit for said second keyword is "0."

16. A method as defined in claim 11 wherein said computer system further comprises a bit processor and a memory for programming said bit processor.

17. A method as defined in claim 16 wherein said step of merging said bit strings comprises the steps of said CPU calling said bit processor and sending said retrieved bit strings to said bit processor, and said bit processor thereafter merging said bit strings to from said resultant bit string and returning said resultant bit string to said CPU.

18. A method as defined in claim 17 wherein said step of transforming said resultant bit string comprises the steps of said CPU calling said bit processor and sending said resultant bit string to said bit processor, and said bit processor thereafter transforming said resultant bit string into said vector for said joined keywords and returning said vector to said CPU.

19. A method as defined in claim 1 further comprising the steps of:

adding one or more new records to said data base and identifying all keywords contained by said new records;

said CPU forming a vector for each keyword of said new records and transforming each vector into a bit string for each keyword of the new records;

said CPU searching said data base index and retrieving from said index each keyword corresponding to a keyword contained in said new records, and then merging the bit strings for said corresponding keywords to form an updated bit string for that keyword; and

said CPU adding to said index the bit string for each keyword of said new records for which no corresponding keyword was found in said index.

20. A method as defined in claim 19 wherein said step of merging said bit strings comprises performing a logical "OR" operation on each pair of bit strings to be merged.

21. A method as defined in claim 1 further comprising the steps of:

deleting one or more records from said data base and identifying all keywords contained by said deleted records;

said CPU forming a vector for each keyword of said deleted records and transforming each vector into a bit string for each keyword of the deleted records; and

said CPU searching said data base index and retrieving from said index each keyword corresponding to a keyword contained in said deleted records, and then merging the bit strings for said corresponding keywords to form an updated bit string for each such keyword.

22. A method as defined in claim 21 wherein said step of merging said bit strings comprises performing a logical "NOT" operation on each pair of bit strings to be merged.

23. A method of indexing, searching and retrieving individual records contained in a computer data base, said method comprising the steps of:

inputting to a computer a plurality of records to be stored as said data base, and assigning a record number to each said record, each said record number corresponding to a terminal node of a hierarchal tree;

identifying a plurality of keywords contained in said records and listing for each keyword those record numbers where that keyword is found;

forming a vector for each said keyword, each said vector containing one or more array elements which together comprise a numerically sorted list of all said record numbers which identify the records where the keyword for that vector is found;

transforming each said vector by converting each said array element into a corresponding binary representation and thereafter deriving from said binary representations a bit string comprising a plurality of bit pairs, each bit pair representing a level and a node of a hierarchal tree, and each said bit string being identified by the keyword for the vector from which said bit string was formed;

storing each said keyword and its associated bit string so as to form an index to said data base;

inputting to said computer at least two keywords and command instructing said computer to logically join said keywords in a specified manner;

searching said index and retrieving therefrom the bit strings identified by said keywords;

merging said retrieved bit strings to form a resultant bit string for said logically joined keywords, said resultant bit string comprising a plurality of bit pairs, each bit pair representing a level and a node of said hierarchal tree;

transforming said resultant bit string into a vector containing one or more array elements which together comprise a numerically sorted list of all record numbers for those records in which said logically joined keywords appear; and

outputting from said computer an identification of all said records in which said logically joined keywords apear.

24. A method as defined in claim 23 wherein said records each comprise a plurality of words and wherein said keywords comprise substantially all words contained in each said record of said data base.

25. A method as defined in claim 24 wherein the size of said data base is defined by a number of bytes and wherein said index to said data base is not larger than approximately one-fifth the number of bytes contained in said data base.

26. A method as defined in claim 25 wherein said step of transforming each said vector into a bit string comprises arranging said binary representations in the form of a corresponding binary matrix and thereafter transforming said binary matrix into said bit string, the size of said binary matrix being defined in terms of a number of bits and the number of bits contained ins aid bit string being reduced by said transformation, on the average, to not more than approximately one-fifth the number of bits contained in said binary matrix.

27. A method as defined in claim 23 wherein said step of transforming each said vector into a bit string comprises the steps of:

(a) transforming said numerically sorted list of record numbers of said vector into a binary matrix wherein each row of said matrix corresponds to a binary representation of one of said vector array elements of said numerically sorted list, and wherein each column of said binary matrix corresponds to a level of said hierarchal tree;

(b) determining the first column of said matrix where both ones and zeros are present;

(c) grouping said ones and zeros to identify the number of bits in each such group;

(d) determining whether the first and last bit in each said group are both ones, are zero and one or both zeros, and outputting a "01," "11" or "10" respectively, so as to form one bit pair of said bit string;

(e) splitting the next column of said matrix into groups of bits based on the number of bits in each group determined in step (c);

(f) repeating steps (c) and (d) for each said group of said next column; and

(g) repeating steps (b) through (f) until each column of said matrix has been done.

28. A method as defined in claim 23 wherein said step of transforming said resultant but string into a vector containing a numerically sorted list of all record numbers which identify the records in which said logically joined keywords appear comprises the steps of:

beginning from left to right and from top to bottom, retrieving in sequence of each bit pair for each level and node of said hierarchal tree represented by said resultant bit string;

determining whether each retrieved bit pair represents a terminal node of said tree; and

for each retrieved bit pair representing a terminal node of said tree, outputting an array element derived from the node number represented by said bit pair for said terminal node.

29. A method as defined in claim 23 wherein said command instructing said computer to logically join said keywords comprises a logical "AND" instruction and wherein said step of merging said retrieved bit strings to form said resultant bit string comprises the steps of:

beginning from left to right and from top to bottom, retrieving in sequence each bit pair of said retrieved bit strings at each level and at each node of said hierarchal tree represented by said retrieved bit strings;

determining each level and node at which a bit pair for each retrieved bit string is present and performing a logical "AND" operation on the corresponding bits of each bit pair at each such level and outputting the result of such logical "AND" operation as a new bit pair at each such level; and

skipping each bit pair at each level where a bit pair for only one retrieved bit string is present.

30. A method as defined in claim 23 wherein said command instructing said computer to logically join said keywords comprises a logical "OR" instruction and wherein said step of merging said retrieved bit strings comprises the steps of:

beginning from left to right and from top to bottom, retrieving in sequence each bit pair of said retrieved bit strings at each level and each node of said hierarchal tree represented by said retrieved bit strings;

determining each level and node at which a bit pair for each retrieved bit string is present and performing a logical "OR" operation on the corresponding bits of each bit pair at each such level, and outputting the result of said logical "OR" operation as a new bit pair at each such level; and

reproducing each bit pair at each level and node where a bit pair for only one retrieved bit string is present, said new bit pairs and said reproduced bit pairs together forming said resultant bit string.

31. A method as defined in claim 30 further comprising the steps of:

inputting to said computer a plurality of new records to be added to said data base and identifying all keywords contained in said new records;

forming a vector for each keyword contained in said new records;

transforming each vector for the keywords of said new records into a bit string;

searching said index to said data base and retrieving from said index all keywords corresponding to the keywords contained in said new records;

performing said logical "OR" operation so as to merge the bit string for each keyword contained in said new records with said corresponding keyword retrieved from said index so as form a new bit string for that keyword; and

storing said new bit string in said index to said data base.

32. A method as defined in claim 23 wherein said command instructing said computer to logically join said keywords comprises a logical "NOT" instruction for selecting all records containing a first keyword and "NOT" a second keyword, and wherein said step of merging said retrieved bit strings comprises the steps of:

beginning from left to right and from top to bottom, retrieving in sequence each bit pair of said retrieved bit strings at each level and each node of said hierarchal tree represented by said retrieved bit strings;

determining each level and each node where only a bit pair of the retrieved bit string for said first keyword is present, and reproducing the bit pair of said retrieved bit string for said first keyword; and

determining each level and each node where a bit pair for the retrieved bit strings of both said keywords is present, and if said node is non-terminal reproducing the bit pair for keyword one, and if said node is terminal outputting a bit pair in which a "1" bit is present only where a bit for said first keyword is "1" and a corresponding bit for said second keyword is "0."

33. A method as defined in claim 32 further comprising the steps of:

deleting one or more records from said data base and identifying all keywords contained by said deleted records;

forming a vector for each keyword of said deleted records and transforming each said vector into a bit string identifying each keyword of said deleted records;

searching said index and retrieving from said index each keyword corresponding to a keyword contained in said deleted records; and

performing said logical "NOT" operation so as to merge the bit strings for said corresponding keyword to form an updated bit string for each such keyword and thereafter storing said updated bit string in said index.

34. A computer system for indexing and rapidly searching and retrieving individual records contained in a data base, said system comprising:

a main CPU memory for controlling a CPU;

a secondary storage means comprising:

(a) a plurality of individual data base records each containing one or more keywords from which a vector for each said keyword is formed, each said vector comprising one or more array elements which together comprise a numerically sorted list of all record numbers which identify the terminal nodes of a hierarchal tree where the records containing the keyword for that vector are found; and

(b) an index to said data base records comprising a plurality of randomly linked bit strings deriv