WikiPatents - Community Patent Review
Indexing/compression scheme for supporting graphics and data selection    
United States Patent 5301315
Link to this page: http://www.wikipatents.com/5301315.html
Inventor(s): Pellicano; Russell A. (North Bayshore, NY)
Abstract: A method for interactively selecting and displaying a distribution of all data fields in a selected database. The primary function of the methodology is to allow a user to view the data contained within a selected database in a graphical format. Additionally, the user has the capability of selecting specific sets of data for more detailed display as well as determining the effect of the selection on all remaining data fields which comprise the selected database. The first step in the process is the accessing of the particular database from the host system. Once the selected database has been accessed, a distribution matrix for each field in the selected database is constructed from the data contained within the selected database. The distribution matrix is then graphically displayed on any type of graphical display unit. From this initial graphic display, the user can make various selections for detailed graphical display and study and analyze the data structure and data content of the database.
   














Inventor     Pellicano; Russell A. (North Bayshore, NY)
Owner/Assignee     Computer Concepts Corp. (Bohemia, NY)
Publication Date     April 5, 1994
Application Number     07/766,860
Filing Date     September 27, 1991
US Classification     707/4
Int'l Classification     G06F 015/401
Examiner     Kulik; Paul V.
Attorney/Law Firm     Scully, Scott, Murphy & Presser
USPTO Field of Search     395/600 395/147 395/161 364/554 364/555 364/715.02

U.S. References
5237681   Kagan   707/104.1   Aug. 1993
4752889   Rappaport   706/11   Jun. 1988
CLAIMS
 


What is claimed is:

1. A method for selecting and displaying a distribution of data in a selected data structure, said selected data structure having a particular file structure associated therewith and comprising at least a plurality of data records, and a plurality of data fields having data therein, said method comprising the steps of:

(a) accessing said selected data structure resident on a host system and reading the data therein to determine selected parameters of the distribution of said data;

(b) building a plurality of distribution matrices from data contained within said selected data structure with each matrix having a field header and an array of data bins and data summaries associated therewith;

(c) generating a display from said plurality of distribution matrices for said selected data structure to provide a graphical representation of the distribution of data contained within said selected data structure.

2. The method for selecting and displaying according to claim 1, wherein said step of accessing said selected data structure comprises the steps of:

(a) retrieving each of said plurality of data records within said selected data structure;

(b) reading each data field in one of said plurality of data records, and converting the data in said data field to a distribution data file; and

(c) repeating step (b) for each of said plurality of data records to form a master distribution data file for said selected data structure.

3. The method for selecting and displaying according to claim 2, wherein said step of building a plurality of distribution matrices comprises the steps of:

(a) calculating the number of bins to utilize for all data within said selected data structure;

(b) calculating a minimum and a maximum field data value for said plurality of data records in said selected data structure;

(c) calculating a deltabin value for one of said plurality of data fields in said selected data structure and assigning a bin range for the calculated number of bins based upon said deltabin value;

(d) generating a field header for one of said plurality of data fields from said distribution data file and at least the results of steps (a)-(c);

(e) processing all data contained in said master distribution data file to build said plurality of distribution matrices from said field header and said master distribution data file;

(f) processing each of said plurality of data fields contained in said selected data structure by repeating steps (c) through (e).

4. The method for selecting and displaying according to claim 3, wherein said step of building a plurality of distribution matrices further comprises the step of compressing all information contained in said plurality of distribution matrices into a master distribution matrix.

5. The method of selecting and displaying according to claim 4, wherein said step of generating a display from said plurality of distribution matrices for said selected data structure comprises the steps of:

(a) opening at least said master distribution matrix to be displayed; and

(b) displaying the distribution of data contained in said master distribution matrix in a graphical format.

6. The method of selecting and displaying according to claim 5, wherein said step of generating a display from said plurality of distribution matrices for said selected data structure further comprises the step of expanding the compressed information contained in said master distribution matrix.

7. A method for selecting and displaying a distribution of data in a selected data structure, said selected data structure having a particular file structure associated therewith and comprising at least a plurality of data records, and a plurality of data fields having data therein, said method comprising the steps of:

(a) accessing said selected data structure resident on a host system and reading the data therein to determine selected parameters of the distribution of said data;

(b) building a plurality of distribution matrices from data contained within said selected data structure with each matrix having a field header and an array of data bins and data summaries associated therewith to form a master distribution matrix;

(c) generating a first display from said master distribution matrix for said selected data structure to provide a graphical representation of the distribution of data contained within said selected data structure, said first display providing a range of data values for each selected field; and

(d) modifying a specific range of data values for a selected data field in said first display, and thereafter generating a second display to determine the representative effect of said modification upon the master distribution matrix.

8. The method of selecting and displaying according to claim 7, wherein said method further comprises the step of generating a third display of a detailed distribution for a selected range of said master distribution matrix to provide a detailed graphical representation of the data contained within the selected range of said master distribution matrix.

9. The method for selecting and displaying according to claim 8, wherein said step of accessing said selected data structure comprises the steps of:

(a) retrieving each of said plurality of data records within said selected data structure;

(b) reading each data field in one of said plurality of data records, and converting the data in said data field to a distribution data file; and

(c) repeating step (b) for each of said plurality of data records to form a master distribution data file for said selected data structure.

10. The method for selecting and displaying according to claim 9, wherein said step of building a plurality of distribution matrices comprises the steps of:

(a) calculating the number of bins to utilize for all data within said selected data structure;

(b) calculating a minimum and a maximum field data value for said plurality of data records in said selected data structure;

(c) calculating a deltabin value for one of said plurality of data fields in said selected data structure and assigning a bin range for the calculated number of bins based upon said deltabin value;

(d) generating a field header for one of said plurality of data fields from said master distribution data file and at least the results of steps (a)-(c);

(e) processing all data contained in said master distribution data file to build said master distribution matrix from said field header and said master distribution data file;

(f) processing each of said plurality of data fields contained in said selected data structure by repeating steps (c) through (e).

11. The method of selecting and displaying according to claim 10, wherein said step of generating a first display from said master distribution matrix for said selected data structure comprises the steps of:

(a) opening at least said master distribution matrix to be displayed; and

(b) displaying at least a portion of the distribution of data contained in said master distribution matrix in a graphical format.

12. The method for selecting and displaying according to claim 7, wherein said step of modifying a specific range for a selected data field in said first display further comprises generating a second master distribution matrix wherein said master distribution matrix is altered by said modification of said specific range of data values to form said second master distribution matrix.

13. The method for selecting and displaying according to claim 9, wherein the step of generating a third display further comprises the steps of:

(a) identifying the data records in the master distribution data file having field values within said modified range;

(b) partitioning a listbox within said third display; and

(c) displaying the field values of the records identified in step (a) within said listbox.

14. The method for selecting and displaying according to claim 12, which further includes the step of generating a third display further comprises the steps of:

(a) identifying the distribution of data contained in said second master distribution matrix corresponding to field values within the modified range;

(b) identifying the data records in the master distribution data file having field values within said modified range;

(c) partitioning a listbox with said third display; and

(d) displaying the field values of the records identified in step (b) within said listbox.
DESCRIPTION
 


BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates to a general indexing and compression scheme for supporting graphics and data selection, and more particularly, to a method for displaying and selecting a distribution of all data fields in a selected database. The method of the present invention allows a user to choose fields in a particular database, build a distribution of the data in the fields, graphically view the data structure, and select specific sets of data to determine the effect of adding or subtracting data to the selected sets on the remaining data fields.

2. Discussion of the Prior Art

Over the past twenty years or so, advancement in the art of computerized information storage and retrieval has significantly expanded man's capability for efficiently accessing information. Today's computers with enhanced integrated circuit technology are capable of storing tremendous amounts of information, or to be more exact, data. Accordingly, there is an ever increasing need for the development of systems and processes capable of managing the data and which permit the efficient utilization of the data. The efficiency of computerized information storage and retrieval systems is directly related to how efficiently the database can be searched and how quickly records from the database can be retrieved. There are a plurality of systems and methods currently available which utilize various combinations of filing schemes, indexing schemes, and data compression schemes to enhance the efficiency of data manipulation; however, the efficient retrieval of data is really only the first step in the process. The display of the data in a way that is meaningful to the system user is a second and necessary step of the process, because no matter how fast one can access data, if the data retrieved is arbitrarily displayed on the screen or monitor, then the data is meaningless to the user. In addition to displaying the data in a meaningful way, the system user should have the capability of selectively displaying certain detailed aspects of the data structure and determining the effect of adding or deleting data to the selected sets on the remaining data fields.

The prior art references individually disclose systems and methods for the filing, indexing, compression, retrieval, and the display of data in a computer system; however, there appears to be no reference disclosing a system or method for the filing, indexing, compression, retrieval, and display of data in one single embodiment wherein the system user has the capability of selecting specific sets of data to determine the representative effect on the remaining data fields. The following references are representative examples of state of the art systems and methods capable of performing some of the above listed functions, but not all.

In U.S. Pat. No. 4,817,036, Millett et al. discloses a computer system and method for database indexing and information retrieval. A number of keywords are selected and each record of a database is searched to determine in which records each keyword appears. The central processing unit of the system then creates a vector of each keyword which identifies each record number of the database where the keyword appears and numerically sorts the record numbers. A special bit processor next transforms each vector into a bit string that is identified by one of the keywords. The bit strings are returned to the central processing unit and stored in secondary storage so as to form an index for the database. To retrieve information, one or more keywords are input to the central processing unit. The input keywords are used by the central processing unit to identify the bit string for each keyword. The keywords may be logically joined using "AND", "OR", and/or "NOT" commands. Each bit string retrieved from the index is then sent to the special purpose bit processor, which combines the bit strings according to the particular command. The resultant bit string is transformed by the bit processor into a vector which is returned to the central processing unit and which then is used to identify the individual records which contain the combined keywords.
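The keyword bit-string index summarized above can be illustrated with a minimal sketch. It is not taken from the Millett et al. reference: the function names, the sample records, and the use of a Python integer as the bit string are all assumptions made for illustration, and the special purpose bit processor is replaced by ordinary integer operations.

```python
from typing import Dict, List

def build_index(records: List[str], keywords: List[str]) -> Dict[str, int]:
    """Map each keyword to an integer used as a bit string:
    bit i is set when record number i contains the keyword."""
    index: Dict[str, int] = {}
    for kw in keywords:
        bits = 0
        for i, text in enumerate(records):
            if kw in text.lower().split():
                bits |= 1 << i
        index[kw] = bits
    return index

def to_record_numbers(bits: int) -> List[int]:
    """Turn a bit string back into a sorted vector of record numbers."""
    return [i for i in range(bits.bit_length()) if (bits >> i) & 1]

# Hypothetical sample records, numbered 0 to 3.
records = [
    "red apple pie",
    "green apple tart",
    "red cherry pie",
    "green pear tart",
]
index = build_index(records, ["red", "apple", "pie", "green"])
all_records = (1 << len(records)) - 1  # mask that keeps NOT within the record range

red_and_pie = index["red"] & index["pie"]                       # "AND"
apple_or_green = index["apple"] | index["green"]                # "OR"
pie_not_apple = index["pie"] & (all_records & ~index["apple"])  # "NOT"

print(to_record_numbers(red_and_pie))     # [0, 2]
print(to_record_numbers(apple_or_green))  # [0, 1, 3]
print(to_record_numbers(pie_not_apple))   # [2]
```

Combining keywords then costs one bitwise operation per pair of bit strings, regardless of how many records match.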

In U.S. Pat. No. 4,961,139, Hong et al. discloses a database management system for real-time applications. A real-time database provides the predictable, high speed data access required for on-line applications, while providing flexible searching capabilities. The data retrieval routines include the option to "read-through-lock" to access data in locked data tables, the capability to directly access the data using tuple identifiers, and the capability to directly access unformatted data from input areas which contain blocks of unformatted data. The data updating routines include an option to omit index updating when updating data and an option to update data in a locked data table. Multiple indexes can be defined for a data table. Thus, high speed searches can be performed based on a variety of data fields. The data storage and retrieval mechanisms are independent and there are hash index tables that connect the multiple index keys to the data tables. The data table structure includes a column defined for storing tuple identifier strings. These tuple identifiers can be used as pointers for chaining to related data stored in other data tables. The database has relatively small programmable memory. There is a common structure for user data tables, index tables, and system data tables. The database includes a minimum number of routines with certain routines providing multiple functionality.

In U.S. Pat. No. 4,232,375, Paugstat et al. discloses a data compression system and apparatus. The system compresses a binary data message generated by a digital input device. The compression process is based upon the deletion of redundant information. The data compression apparatus includes means for storing portions of a first data message generated by a terminal device as the result of a merchandising transaction performed during the time the CPU is disabled, counter means for deleting all redundant data characters of each data message, means for comparing preselected data characters of each succeeding data message with the corresponding data characters of the first data message stored in the terminal device for deleting those data characters when a comparison is found, and table look-up means for selecting a start of record character in accordance with a data character representing the type of data transaction being processed, the start of record character indicating the start of the compressed data record in addition to the transaction type where there is an absence of data in another portion of the compressed data record.

The first reference is representative of a group of references directed to systems and methods for the indexing and retrieval of data in a database. The second reference is representative of a group of references directed to the creation of databases and database management systems. The third reference is representative of a group of references directed to systems and methods for the compression of data to avoid wasting memory. None of the cited references, however, specifically discloses a system which utilizes a process wherein a distribution of data is built from the data contained in a particular database and then graphically displayed with the capability to selectively add or delete data to the distribution and determine the effect of such addition or deletion of data on the remaining data fields in the distribution built from the data contained within the particular database.

SUMMARY OF THE INVENTION

The present invention is directed to a method for displaying and selecting a distribution of all data fields in a selected database. The method for displaying and selecting of the present invention provides for the construction of a distribution matrix for each field in a given record which allows analysis of the selected database to incorporate graphical visualization of the data structure and content. The primary function of the distribution is to allow the user to view the data in a graphical format; however, the method also provides the user with the capability of selecting specific sets of data to be displayed and to determine the effect of the selection on all remaining data fields which comprise the selected database. The first step in the process includes the step of accessing the particular database from a host system having at least one database resident thereon. The databases that can be accessed are the original databases in vendor format. The database is accessed in order to build the distribution from the data in the database. The next step in the process is to construct the distribution from the data contained in the database. The construction of the distribution or distribution matrix is a multi-step process which prepares a representation of the data contained in the database for display and selection. Once the construction of the distribution is complete, the display of the distribution for the selected data fields in the selected database is performed. At this point, the user can make "what if" selections and thereby select specific sets of data to determine the effect of the selection on all of the data fields which comprise the selected database and display a detailed distribution of the data based upon the step of selecting specific sets of data.
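The "what if" selection described above can be sketched as follows. This is only an illustration under stated assumptions: the records, the field names `age` and `salary`, the fixed bin width, and the helper functions are hypothetical, and the sketch recounts the selected records directly instead of using the distribution matrix of the invention.

```python
from collections import Counter
from typing import Dict, List

Record = Dict[str, float]

def distribution(records: List[Record], field: str, bin_width: float) -> Counter:
    """Count how many records fall into each bin of one field."""
    return Counter(int(r[field] // bin_width) for r in records)

def select(records: List[Record], field: str, low: float, high: float) -> List[Record]:
    """Keep only the records whose value for `field` lies in [low, high]."""
    return [r for r in records if low <= r[field] <= high]

# Hypothetical sample database with two fields.
records = [
    {"age": 23, "salary": 31000},
    {"age": 27, "salary": 42000},
    {"age": 35, "salary": 58000},
    {"age": 41, "salary": 61000},
    {"age": 52, "salary": 75000},
]

# Distribution of the salary field before any selection.
before = distribution(records, "salary", 20000)

# "What if" only ages 30 to 45 are selected: the salary distribution changes too.
chosen = select(records, "age", 30, 45)
after = distribution(chosen, "salary", 20000)

print(sorted(before.items()))  # [(1, 1), (2, 2), (3, 2)]
print(sorted(after.items()))   # [(2, 1), (3, 1)]
```

Comparing `before` and `after` shows the effect of a selection made on one field upon the distribution of another field.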

The method of the present invention provides a PC-based tool for access and analysis of information across heterogeneous databases. It is intended to enhance costly and inefficient database management systems and to further provide a powerful graphical user interface for comprehensive viewing of the data. The method incorporates a highly efficient matrix indexing technology which allows for rapid access to the information contained within any of a number of databases, and allows for interfacing to major database management systems such as IDMS or SQL/DS for mainframe computers, Oracle or Sybase for mid-range computers, and Paradox or Dbase for micro computers. The distribution construction process utilizes processes which vary for each data type allowed in the various databases which provides the present invention with great versatility.

The present invention provides for a user friendly, completely graphical and menu driven software package which provides a method for displaying and selecting a distribution of all data fields in a selected database. A user of the software package can view the contents of the database in graphical format and make selections of specific sets of data currently displayed and then view the detailed selection and its effect on the remaining data associated with the primary selection. The software package is simple to install, and only requires approximately 250 K of memory. The software package is an order of magnitude faster than currently available query systems and is compatible with most commonly used standard databases. The software package can also be utilized with non-standard databases. For non-standard databases, an entry screen is used to define application generated record structures. The entry screen provides a series of questions enabling the system user to define the application.

BRIEF DESCRIPTION OF DRAWINGS

FIG. 1 is a high level flow chart representing the software necessary to implement the method for displaying and selecting the distribution of all data fields in a selected database of the present invention.

FIG. 2 is a detailed flow chart representing the database access software of the present invention.

FIG. 3 is a detailed flow chart representing the distribution construction software of the present invention.

FIG. 4 is a tabular representation of a sample of database records and their corresponding field values.

FIG. 5 is a tabular representation of BIN ranges for the field values illustrated in FIG. 4.

FIG. 6 is a diagrammatic representation of a sample field header data structure utilized in the present invention.

FIG. 7 is a block diagram representation of the matrix work areas of the present invention.

FIG. 8 is a diagrammatic representation of the distribution matrix for record number 1.

FIG. 9 is a diagrammatic representation of the distribution matrix for record number 2.

FIG. 10 is a diagrammatic representation of the distribution matrix for record number 3.

FIG. 11 is a diagrammatic representation of the distribution matrix for record number 4.

FIG. 12 is a diagrammatic representation of the distribution matrix for record number 5.

FIG. 13 is a diagrammatic representation of the distribution matrix for record number 6.

FIG. 14 is a diagrammatic representation of the distribution matrix for record number 7.

FIG. 15 is a diagrammatic representation of the distribution matrix for record number 8.

FIG. 16 is a diagrammatic representation of the distribution matrix for record number 9.

FIG. 17 is a diagrammatic representation of the distribution matrix for record number 10.

FIG. 18 is a diagrammatic representation of the distribution matrix for record number 11.

FIG. 19 is a tabular representation of the final data content of the BIN work area.

FIG. 20 is a tabular representation of the compressed final data content of the BIN work area.

FIG. 21 is a detailed flow chart representing the file opening software of the present invention.

FIG. 22 is a detailed flow chart representing the data expansion software of the present invention.

FIG. 23 is a sample primary graphical display output of the present invention.

FIG. 24 is a primary graphical display of the SAMPLE01 records output of the present invention.

FIG. 25 is a detailed flow chart representing the secondary display software of the present invention.

FIG. 26 is a sample secondary graphical display output of the present invention.

FIG. 27 is a first detailed flow chart representing the tertiary display software of the present invention.

FIG. 28 is a second detailed flow chart representing the tertiary display software of the present invention.

FIG. 29 is a sample tertiary graphical display and list box output of the present invention.

FIG. 30 is a sample graphical display output illustrating the selection process of the present invention.

FIG. 31 is a sample graphical display output illustrating the deselection process of the present invention.

FIG. 32 is a sample graphical display output illustrating the ripple effect of the deselection process of the present invention.

FIG. 33 is a flow chart representing the entire graphical interface software routine of the present invention.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

The present invention is directed to a method for displaying and selecting a distribution of all data fields in a selected database. A database is an organized array of files where data files are fully integrated and the online access of data from these files is possible. A given database file contains a number of fields grouped together to form a record of information. Each record is a complete set of information that can be used in the analysis of the content of the database. The present invention provides for the creation of a distribution matrix for each field in a selected database, which allows analysis of the database to incorporate graphical visualization of the data structure and content. The fields are processed to build a complete matrix array of the field information. The structure of the matrix array shall support the process to display the data content of the field. The primary function of this distribution is to allow the user to view the data content of the file in a graphical format. However, the present invention also provides the user with the capability of selecting specific sets of data to be displayed graphically and to determine the effect of the selection on all the remaining data fields which comprise the particular database if desired.
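A per-field distribution matrix of the kind described above, with a field header and an array of data bins, might be sketched as follows. The claims name a number of bins, a minimum and maximum field value, and a deltabin value, but this text does not give the formula for deltabin; the sketch assumes deltabin = (maximum - minimum) / number of bins. The class names, field names, and sample values are likewise hypothetical.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class FieldHeader:
    name: str
    minimum: float
    maximum: float
    num_bins: int
    delta_bin: float  # assumed definition: (maximum - minimum) / num_bins

@dataclass
class DistributionMatrix:
    header: FieldHeader
    bins: List[int]  # bins[i] counts the records whose value falls in bin i

def build_matrix(name: str, values: List[float], num_bins: int) -> DistributionMatrix:
    """Build the header and the bin counts for one field."""
    lo, hi = min(values), max(values)
    # Fall back to a width of 1.0 when every value is identical.
    delta = (hi - lo) / num_bins or 1.0
    header = FieldHeader(name, lo, hi, num_bins, delta)
    bins = [0] * num_bins
    for v in values:
        # The maximum value would index one past the end, so clamp it
        # into the last bin.
        i = min(int((v - lo) / delta), num_bins - 1)
        bins[i] += 1
    return DistributionMatrix(header, bins)

# Hypothetical field values for eight records.
ages = [23, 27, 35, 41, 52, 29, 33, 48]
matrix = build_matrix("age", ages, 3)
print(matrix.header)
print(matrix.bins)  # [3, 3, 2]
```

Only the header and the bin counts need to be kept to draw the distribution, which is what makes the structure compact compared with the records themselves.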

Referring to FIG. 1, there is shown a high level flow chart representing the software necessary to implement the method for displaying and selecting the distribution of all data fields in a selected database. The flow chart of FIG. 1 illustrates the overall process for displaying and selecting the distribution of data fields, and as each element or block in the flow chart is described, reference to detailed flow charts and graphic figures is made to provide detailed information for each step in the overall process, and to illustrate the process itself. The entire software package is collectively known as dbEXPRESS™.

The entry point into the software routine is represented in the flow chart by element 10, which is simply the start or initialization point of the program. Element 100, which is the element directly following element 10, represents the software to implement the process of accessing a particular database from wherever the particular database is stored on the system. The databases may also be imported from Local Area Networks (LANs), communication ports and from a variety of other areas. The databases that can be accessed are the original database files in vendor format that are resident on the particular system. The present invention, dbEXPRESS™, is a software tool for access and analysis of information across heterogeneous databases such as, but not limited to, Paradox and Oracle for micro-computers, Ingres and Sybase for mid-range computers, and DB2 and IMS for mainframe computers. Element 100 also represents the software necessary to convert the particular database file into dbEXPRESS™ operational, schema, and data files, from which the distribution is constructed.

FIG. 2 is the detailed flow chart of the database access routine illustrated in FIG. 1 as element 100. Element 102 is the point of entry into the database accessing routine. Element 104, which is the element directly following element 102, represents the software to implement the process step of retrieving the name of the particular database and the particular data tables within the particular database. The names are retrieved to access the data contained therein. The retrieval process is similar to indicating a path as utilized in DOS in the sense that the path names provide a way in which to access particular files. Element 106, which is the element directly following element 104, represents the software to implement the process step of opening the particular database so that the data can be read from the particular database and into dbEXPRESS.TM.. In addition, element 106 represents the software to implement the process step of opening the internal setup or schema of dbEXPRESS.TM. so that the data from the database may be accurately and efficiently transferred into dbEXPRESS.TM.. Element 108, which is the element directly following element 106, represents the software to implement the process step of eliminating the database schema so that only the raw data remains. This software deletes the schema records for all fields in the specified tables in the database and exposes the data so that it may be easily read into dbEXPRESS.TM.. Element 110, which is the element directly following element 108, represents the software to implement the process step of reading in the information or parameters contained in the tables which comprise the database. This information is contained in the file headers of the database, and will be utilized further along in the process. Element 112, which is the element directly following element 110, represents the software to implement the process step of creating the dbEXPRESS.TM. 
schema record for each field specified in the file headers of the database. This software routine creates a layout of what is to be done with the specific data contained in the selected tables of the database. The layout provides a method of maintaining a reference between the format of the data as it was stored in the original database and the format to which it was converted. The layout is maintained because the system user may want to export the dbEXPRESS.TM. files into an external file within the database while maintaining the dbEXPRESS.TM. format. Thus far in the database accessing process, the data contained in the particular database has been opened so that it can be read into dbEXPRESS.TM. memory, and a dbEXPRESS.TM. schema has been created based upon the information contained in the file header. Element 114, which is the element directly following element 112, represents the software to implement the process step of reading a data record from the exposed or raw data into the allocated memory area of dbEXPRESS.TM.. Element 116, which is the element directly following element 114, represents the software to implement the process step of converting the data from the database into destination dbEXPRESS.TM. field record types. No matter the type of data collected, whether it be floating point or alphanumeric, it is converted into true numeric data. For example, if the data in the database is dates of the year represented in ASCII, then the process implemented by element 116 converts the ASCII representations of the dates into dbEXPRESS.TM. representations of the dates. It is important to note that this data is stored or located in program memory space and not computer system memory where the data would be stored as binary data. The data is converted into true numeric data because true numeric data can be manipulated at a much faster rate than any other type or form of data. 
Element 118, which is the element directly following element 116, represents the software to implement the process step of writing the dbEXPRESS.TM. records into dbEXPRESS.TM. data files. This step is a simple reorganizing and compression procedure which allows for the convenient storage of the data. Element 120, which is the element directly following element 118, represents the software to implement the process step of implementing a more records decision loop. If there are more records to process, then this software transfers processing back to element 116 so that the next set of data can be converted into destination dbEXPRESS.TM. field record types. If there are no more records to process, then this software routine is exited and the software represented by element 122 is then processed. Element 122 represents the software necessary to terminate the processing of the access database software and return control to the main software routine illustrated in FIG. 1.
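
As an illustration only, the record loop of elements 114 through 120 can be sketched in Python. This is not the dbEXPRESS.TM. implementation: the function names, the assumption that ASCII dates arrive in YYYY-MM-DD form, and the choice of a day ordinal as the numeric representation of a date are all hypothetical, since the text does not specify the internal encoding.

```python
from datetime import date

def to_numeric(value, kind):
    """Convert one raw field value to a number (hypothetical encoding)."""
    if kind == "float":
        return float(value)
    if kind == "int":
        return int(value)
    if kind == "date":
        # Assumed: ASCII dates arrive as YYYY-MM-DD and are stored as day ordinals.
        return date.fromisoformat(value).toordinal()
    raise ValueError(f"unsupported field type: {kind}")

def convert_records(raw_records, field_kinds):
    """Read each raw record, convert every field, and collect the results."""
    converted = []
    for record in raw_records:  # element 114: read a record
        # element 116: convert each field to its numeric form
        row = [to_numeric(v, k) for v, k in zip(record, field_kinds)]
        converted.append(row)   # element 118: write the converted record
    return converted            # element 120: no more records to process
```

The list returned here stands in for the dbEXPRESS.TM. data files; the reorganizing and compression performed when those files are written is not modeled.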

The next block of software in the overall process which is shown in FIG. 1 is element 200 which represents the software to implement the process of building the distribution file or matrix for each table selected in the particular database. Element 200 represents a block of software that comprises a plurality of distribution algorithms which vary for each data type utilized in the particular database selected. For example, two different distribution procedures would be utilized for integer and floating point data types. However, the overall distribution building process consists of eight essential steps and can be summarized as follows:

(1) Determine the number of bins to use for the particular database.

(2) Find the minimum and the maximum values for each field.

(3) Determine the deltabin value.

(4) Fill in the field header values.

(5) Assign matrix work areas.

(6) Process the records to build the distribution matrix.

(7) Compress the matrix for this field.

(8) Process all fields in the records contained in the current and relational databases.

As stated previously, the primary function of this distribution is to allow the user to view the data content of the file in a graphical format. The underlying method provides for accessing and displaying the number of records that fall into a given bin and for finding which database records are associated with the bin values. A bin is simply a specifically allocated area of memory.

FIG. 3 is the detailed flow chart of the distribution construction routine illustrated in FIG. 1 as element 200. Element 202 is the point of entry into the distribution construction routine. Element 204, which is the element directly following element 202, represents the software to implement the process of retrieving the names of the particular database and the particular data tables within the particular database. As indicated previously, the name of the particular database and tables contained therein are retrieved in order to access the data contained in these files. However, describing this process as a simple retrieval of names is somewhat misleading. The software represented by element 204 is utilized to establish a link to the particular database rather than just simply retrieving names. Element 206, which is the element directly following element 204, represents the software to implement the process step of opening the particular database so that the data can be read from the particular database and into dbEXPRESS.TM.. In addition, element 206 represents the software to implement the process step of opening the internal setup or schema of dbEXPRESS.TM. so that the data from the database may be accurately transferred into dbEXPRESS.TM.. Element 208, which is the element directly following element 206, represents the software to implement the process step of retrieving the dbEXPRESS.TM. schema record created for each field specified in the file header of the database. The schema records are created by the software represented by element 112 in the flow chart of FIG. 2. Element 210, which is the element directly following element 208, represents the software to implement the process step of copying certain information contained in the dbEXPRESS.TM. schema records directly into the field header for the distribution matrix under construction. The information copied includes the field name and the distribution type such as floating point or alphanumeric. 
The entire field header structure is discussed in subsequent paragraphs with reference to a figure illustrating the actual layout of the header. Element 212, which is the element directly following element 210, represents the software to implement the process step of determining if there are more schema records for this table. If there are more schema records for a particular table then this software transfers processing back to element 208 so that the next schema record for the particular table can be retrieved. If there are no more schema records for the particular table then this software transfers processing to the software represented by element 214.

At this point in the processing, the data contained in the records has been readied for further processing. The data and the information from various calculations utilizing the data is now utilized to construct the distribution tables for a given table. Element 214, which is the element directly following element 212, represents the software to implement the process step of reading in a record from the dbEXPRESS.TM. data file. Recall that the dbEXPRESS.TM. data files were created by writing the dbEXPRESS.TM. records to the dbEXPRESS.TM. data files by the software represented by element 118 shown in FIG. 2. The reading in of the dbEXPRESS.TM. records is repeated for all records in a particular field as part of a decision loop. As each record is read, the field values comprising the particular record are evaluated to determine the maximum field value and the minimum field value and which record corresponds to the minimum and maximum field values respectively. This part of the distribution construction process is accomplished by elements 216, 218, and 220. Element 216, which is the element directly following element 214, represents the software to implement the process step of determining the minimum field value as each record is read from the dbEXPRESS.TM. data file. This software process is a simple procedure which saves a field value as the minimum field value if the value is less than the previously saved minimum field value. Element 218, which is the element directly following element 216, represents the software to implement the process step of determining the maximum field value as each record is read from the dbEXPRESS.TM. data file. This software process is a simple procedure which saves a field value as the maximum field value if the value is greater than the previously saved maximum field value. Element 220, which is the element directly following element 218, represents the software to implement the process step of determining if there are more records to process. 
If there are more records in the dbEXPRESS.TM. data file then this software transfers processing back to element 214 so that the next record can be retrieved. If there are no more records then this software transfers processing to the software represented by element 222.
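
The loop formed by elements 214 through 220 amounts to a single pass that keeps a running minimum and maximum together with the record each came from. The following Python is a minimal sketch of that idea, not the patented code; the function name and the use of 1-based record numbers are assumptions.

```python
def scan_min_max(values):
    """Return (min_value, min_record, max_value, max_record) in one pass.

    Record numbers are 1-based here, which is an assumption.
    """
    min_v = max_v = None
    min_rec = max_rec = None
    for rec_no, v in enumerate(values, start=1):  # element 214: read a record
        if min_v is None or v < min_v:            # element 216: running minimum
            min_v, min_rec = v, rec_no
        if max_v is None or v > max_v:            # element 218: running maximum
            max_v, max_rec = v, rec_no
    return min_v, min_rec, max_v, max_rec         # element 220: no more records
```

With the made-up list [12.0, 7.9, 23.2] (not the FIG. 4 data), the function returns (7.9, 2, 23.2, 3).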

FIG. 4 illustrates an example table or sample utilizing eleven records. The field values corresponding to the eleven records were read from the dbEXPRESS.TM. data files in the order shown in the figure. These numbers are used to illustrate how the distribution tables are to be constructed. As can be seen by simple inspection of the figure, the minimum field value, MINV, is 7.9, and the maximum field value, MAXV, is 23.2. In actuality MINV and MAXV are determined by the simple process described above, and not by inspection. The actual physical layout of the data is not as shown in FIG. 4; however, FIG. 4 provides a convenient format for illustrating the distribution construction process.

Referring back to FIG. 3, element 222, which is the element directly following element 220, represents the software to implement the process step of determining the number of bins to use for the database. This software sets the number of bins to 64, 128, 256, 512, or equal to the number of records present. The number 512 is not an upper limit; for example, if increased display resolution is desired, the number of bins may be increased to meet user requirements. In this example, the number of bins is set to 11 because there are 11 records. This example represents the simplest implementation of dbEXPRESS.TM.; however, in actuality, the number of bins can vary with the desired level of resolution as stated above. Element 224, which is the element directly following element 222, represents the software to implement the process steps of calculating the DELTABIN value, which is the increment of the value range assigned to each bin, and assigning the bin ranges. The minimum and maximum data values and the DELTABIN value are saved in the field header structure, which is described subsequently. The DELTABIN value is calculated by the software by utilizing the formula given by

(MAXV-MINV)/# of bins=DELTABIN (1)

The software has already determined the minimum and maximum values as well as the setting of the number of bins; therefore, substituting in the values for MAXV, MINV, and the # of bins results in a DELTABIN equal to

DELTABIN=(23.2-7.9)/11=1.39 (2)

Once the DELTABIN value is calculated by this block of software utilizing equation (1), the bin ranges are determined or assigned by the software by breaking up the range of values, 7.9 to 23.2, in DELTABIN value increments, and assigning them to bins in ascending order. The bin numbers and their respective ranges are shown in FIG. 5. The bin range assignments are a calculated value and are not saved as array elements. FIG. 5 is for reference purposes only. The use of these bin ranges will become apparent shortly.
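
Equation (1) and the bin range assignment can be sketched in Python as follows. The function names are hypothetical, and two details are assumptions not stated in the text: bins are indexed from 0, and a value equal to MAXV is placed in the last bin. Consistent with the description, the ranges are computed on demand rather than stored.

```python
def delta_bin(min_v, max_v, n_bins):
    """Equation (1): the increment of the value range covered by each bin."""
    return (max_v - min_v) / n_bins

def bin_ranges(min_v, max_v, n_bins):
    """Compute (low, high) for each bin in ascending order; nothing is stored."""
    d = delta_bin(min_v, max_v, n_bins)
    return [(min_v + i * d, min_v + (i + 1) * d) for i in range(n_bins)]

def bin_number(value, min_v, max_v, n_bins):
    """Map a field value to a 0-based bin index (assumed rule)."""
    d = delta_bin(min_v, max_v, n_bins)
    if d == 0:
        return 0  # every value is identical, so only one bin is meaningful
    idx = int((value - min_v) / d)
    # Assumed: the maximum value belongs to the last bin, not one past it.
    return min(idx, n_bins - 1)

d = delta_bin(7.9, 23.2, 11)
print(round(d, 2))  # 1.39, matching equation (2)
```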

The next step in the distribution construction process is the step of filling in the field header values. As the various information is determined or calculated by the process steps represented in FIG. 3, it is eventually stored in the field header data structure. FIG. 6 illustrates the basic field header structure. Area 201 is the field name and is filled in by the software process step represented by element 210 of FIG. 3. For purposes of this particular example, the field name is chosen to be SAMPLE01. Area 203 is the distribution type field and contains a two digit code indicative of the particular type of data contained in the data file, i.e. floating point type data. In this example a distribution type of 5 is utilized and shall denote that the field data are floating point values. Area 203, like area 201 is filled in by the software process step represented by element 210. Area 205 is the number of bins field and contains the actual number of bins utilized for a given field. The number of bins is represented by a hexadecimal number, and in this case since there are eleven bins, hexadecimal B is placed into area 205. The number of bins is set by the software process step represented by element 222 of FIG. 3; therefore, it is directly copied into the field header. Area 207 is the minimum value field and area 209 is the maximum value field. These two fields contain MINV and MAXV respectively, which were determined by the software process steps represented by elements 216 and 218 respectively. Accordingly, area 207 contains 7.90 and area 209 contains the value 23.20, both of which are represented by floating point numbers. Area 211 is the delta value field and contains the DELTABIN value calculated by the software process step represented by element 224 of FIG. 3; therefore the number 1.39 is placed there and is represented as a floating point number. Area 213 is the number of bin map words field and contains the number of bin map words utilized. 
Since the number of bins in this example is set at eleven, only one bin map word will be required. The value of this entry is therefore one. Area 215 is the bin map area and is utilized to show which particular bins are active. Since there are only eleven bins, a single 16 bit word is needed to show the active bins. This area is filled in as the distribution construction process continues.
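
For orientation, the field header of FIG. 6 can be modeled as a small Python data structure. This is a sketch under stated assumptions rather than the actual layout: the class and attribute names are hypothetical, the number of bin map words is assumed to be the bin count divided by 16 and rounded up, and the order of bits within a bin map word is assumed.

```python
from dataclasses import dataclass, field
from typing import List

BITS_PER_MAP_WORD = 16  # the text uses one 16 bit word for eleven bins

@dataclass
class FieldHeader:
    name: str             # area 201: field name
    dist_type: int        # area 203: distribution type code
    n_bins: int           # area 205: number of bins
    min_value: float      # area 207: MINV
    max_value: float      # area 209: MAXV
    delta: float          # area 211: DELTABIN
    n_map_words: int = 0  # area 213: number of bin map words
    bin_map: List[int] = field(default_factory=list)  # area 215: bin map

    def __post_init__(self):
        # Assumed: enough 16 bit words to give every bin one bit (ceiling division).
        self.n_map_words = -(-self.n_bins // BITS_PER_MAP_WORD)
        self.bin_map = [0] * self.n_map_words

    def mark_active(self, bin_index):
        """Set the bit for a 0-based bin index (bit order is an assumption)."""
        word, bit = divmod(bin_index, BITS_PER_MAP_WORD)
        self.bin_map[word] |= 1 << bit

header = FieldHeader("SAMPLE01", 5, 11, 7.90, 23.20, 1.39)
print(header.n_map_words, format(header.n_bins, "X"))  # 1 B
```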

At this point in the distribution construction process, the number of bins to use for the database has been determined, the minimum and maximum values of the field have been determined, the DELTABIN value has been calculated and the header values have been filled in to their respective areas. The program will now assign work areas for the distribution matrix build. As the matrix is built the work area segments will be expanded on an as needed basis. Upon completion of the distribution construction phase the matrix will be compressed in preparation for the operational phase.

Referring back to FIG. 3, the next step in the process is assigning the matrix work areas. Element 226, which is the element directly following element 224, represents the software to implement the process step of allocating a block of work space for each assigned bin. FIG. 7 graphically illustrates how the memory is divided up for each bin. For each assigned bin the program will allocate a 128 byte block of work space. For this particular example, FIG. 7 illustrates the 128 bytes of work space for each of the eleven bins. Table 1 given below contains a listing of the terms or words which comprise the structure of these matrix work areas. The meaning of the words listed in Table 1 shall be explained as the next step in the distribution construction process is explained.

TABLE 1
______________________________________
WORD NO.   DESCRIPTION                                MNEMONIC
______________________________________
0          THE ADDRESS OF THE PREVIOUS MATRIX BIN     RECBINFR
1          THE ADDRESS OF THE EXTENDED MATRIX         RECBINTO
           BUILD AREA
2          NUMBER OF RECORDS CONTAINED IN THIS BIN    RECBINNO
3          NUMBER OF RECORD BIT MAP WORDS             RECMAPNO
4          (1st) STARTING RECORD NUMBER (NEGATED)     RECSTRNO
5          (1st) BIT MAP WORDS - (POSITIVE ONLY),     RECMAPWD
           THE SIGN BIT MUST BE ZERO. THIS BIT MAP
           WORD ALLOWS THE BIT MAPPING OF THE NEXT
           15 RECORDS AFTER RECSTRNO
______________________________________
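
One way to read words 4 and 5 of Table 1 is as a sign-tagged encoding: a negative word carries a starting record number, and the positive word after it flags which of the next 15 records belong to the bin. The Python below is a sketch of that reading only. The function names, the bit order, the rule that every starting word is followed by one bit map word, and the requirement that record numbers start at 1 are all assumptions not confirmed by the text.

```python
def encode_records(record_numbers):
    """Pack record numbers into [-start, bitmap, -start, bitmap, ...].

    A negative word is a starting record number (RECSTRNO, negated). The
    positive word after it (RECMAPWD) has bit i set when record
    start + 1 + i is present, for i = 0..14, so its sign bit stays zero.
    Record numbers are assumed to be 1 or greater.
    """
    words = []
    start = None
    for rec in sorted(record_numbers):
        if start is None or rec - start > 15:
            start = rec                 # begin a new run at this record
            words.extend([-start, 0])
        elif rec != start:
            words[-1] |= 1 << (rec - start - 1)
    return words

def decode_records(words):
    """Inverse of encode_records: recover the ascending record numbers."""
    records = []
    start = None
    for w in words:
        if w < 0:
            start = -w
            records.append(start)
        else:
            records.extend(start + 1 + i for i in range(15) if (w >> i) & 1)
    return records
```

Because a bit map word never uses more than 15 bits, a reader of the packed words can tell a starting record number from a bit map word by the sign alone, which is consistent with the sign convention Table 1 describes.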

The next step in the distribution construction process is the processing of the collected records to actually build the distribution matrix. Element 228 in FIG. 3 represents the first block of software in this phase of the process. Element 228, which is the element directly following element 226, represents the software to implement the process step of reading in a record from the dbEXPRESS.TM. data file. Accordingly, the software now determines the bin number to update based upon the range for each bin as indicated in FIG. 5. Element 230, which is the element directly following element 228,