WikiPatents - Community Patent Review
Create Free Account  |  License or Sell Your Patent  |  WikiPatents Marketplace  |  WikiPatents Blog
Username:  Password:  
    
Advanced Search
Natural-language interface generating system    

Get related patents on CD
United States Patent4688195   
Link to this pagehttp://www.wikipatents.com/4688195.html
Inventor(s)Thompson; Craig W. (Plano, TX); Ross; Kenneth M. (Bedford, TX)
AbstractA system for interactively generating a natural-language input interface, without any computer-skill programming work being required. The natural-language menu interface thus generated provides a menu-selection technique whereby a totally unskilled computer user, who need not even be able to type, can access a relational or hierarchical database, without any possibility of error. That is, the user addresses commands to the database system simply by selecting words from an appropriate menu of words which could legally follow in commands, so that the user inputs commands which are phrased entirely in English, and these commands cannot be misunderstood by the database system. The present invention provides an automatic interactive system whereby such an interface is constructed. The database is itself loaded in, and the interactive interface-construction system then addresses a series of queries to the user's technical expert, in response to which the user must classify, which tables in the database are to be used, which attributes of particular tables in the database are key attributes, and, in particular, what the various connections between tables in the database are and what natural-language connecting phrases will describe those relations.
   














 Title Information Submit all comments and votes
 
Patent Text Patent PDF Print Page Summary File History
Plain text PDF images Print Summary File History Custom Search
Drawing from US Patent 4688195
Natural-language interface generating system - US Patent 4688195 Drawing
Natural-language interface generating system
Inventor     Thompson; Craig W. (Plano, TX); Ross; Kenneth M. (Bedford, TX)
Owner/Assignee     Texas Instruments Incorporated (Dallas, TX)
Patent assignment
All assignments
Company News
Publication Date     August 18, 1987
Application Number     06/461,881
PAIR File History     Application Data   Transaction History
Image File Wrapper   Patent Term   Fees
Litigation
Filing Date     January 28, 1983
US Classification     706/11
Int'l Classification     G06F 001/00
Examiner     Zache; Raulfe B.
Assistant Examiner    
Attorney/Law Firm     Hill; Kenneth C. Comfort; James T. , Sharp; Melvin ,
Address
Parent Case    
Priority Data    
USPTO Field of Search     364/200 MS File 364/900 MS File 364/300
Patent Tags     natural-language interface generating
   
Enter a comma (,) or semicolon (;) between multiple tag words/phrases.
Describe this patent:
 Amusing   
 Clever   
 Complex   
 Efficient   
 Historic   
 Important   
 Innovative   
 Interesting   
 Practical   
 Simple   
[no votes]
Patent WIKI

Share information and news about this patent, including information and news about the technology, inventors, company, ligation and licensing.

 References Submit all comments and votes
 
*references marked with an asterisk below are user-added references
 U.S. References
 
Add a new US reference:  
ReferenceRelevancyCommentsReferenceRelevancyComments
4460975
Torkelsen
715/522
Jul,1984

[0 after 0 votes]
 Foreign References
 Other References
 Market Review Submit all comments and votes
   
Market Size
Estimate the gross annual revenues of the relevant market sector:
> $10B
$5B - $10B
$2B - $5B
$500M - $2B
$100M - $500M
$10M - $100M
$1M - $10M
$500K - $1M
$100K - $500K
< $100K
[No votes]
$0
 
$0   $2.5B   $5B   $7.5B   $10B

[0 market size comments]
Market Share
Estimate the percentage of the relevant market sector this invention will capture:
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%

[0 market share comments]
Reasonable Royalty
What percentage of gross sales should the inventor or assignee be paid?
75% - 100%
50% - 74.99%
25% - 49.99%
10 - 24.99%
5 - 9.99%
2 - 4.99%
1 - 1.99%
< 1%
[No votes]
0.0%
 
0%   25%   50%   75%   100%

[0 reasonable royalty comments]
Public's "Guesstimation" of Royalty Value
Market SizeN/A[No votes]
xMarket ShareN/A[No votes]
xReasonable RoyaltyN/A[No votes]

N/A

[0 Guesstimation of Royalty Value Comments]
License Availablity
If you are NOT the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
[0 license availability comments]
License Availablity
If you ARE the owner or assignee, answer here:
Yes, license is available for purchase

No, license is not currently available



[No votes]
[0 owner/assignee comments]
Competitive Advantage
Does this invention have a significant competitive advantage over similar technologies?
Yes

No



[No votes]
Most helpful competitive advantage comment
[No comments]

[0 competitive advantage comments]
Commercial Alternatives
Are there viable commercial alternatives for this invention?
Yes

No



[No votes]
Most helpful commercial alternative comment
[No comments]

[0 commercial alternatives comments]
 Technical Review Submit all comments and votes
 Claims Submit all comments and votes
 


What is claimed is:

1. A method for generating a context-free grammar comprising the steps of:

(a) generating a domain-independent core grammar;

(b) generating a domain-independent core lexicon;

(c) generating a domain specification directed to a predetermined application; and

(d) inserting the domain specification into the domain-independent core grammar and lexicon to define a domain dependent context-free grammar.

2. The method of claim 1, wherein the domain dependent context-free grammar is suitable for use as a predictive grammar and parser in a natural language menu system.

3. The method of claim 2, wherein step (d) is done independently from and prior to execution of the natural language menu system.

4. A method for generating a grammar for use as a predictive grammar and parser in a natural language menu system, comprising the steps of:

(a) generating a domain-independent core grammar;

(b) generating a domain-independent core lexicon;

(c) generating a domain specification directed to a predetermined natural language menu application; and

(d) combining the results of steps (a), (b) and (c) into a domain dependent grammar by instantiating the unbound portions of the domain-independent core grammar and lexicon with the data contained in the domain specification.

5. The method of claim 4, wherein step (d) is done independently from and prior to execution of the natural language menu system.

6. A system for generating a context-free grammar suitable for use as an input to a natural language menu system, comprising:

an input file containing a definition for a domain-independent context-free grammar;

an input file containing a definition for a domain-independent lexicon;

a file containing a domain specification which is specific to a preselected application to be run under the natural language menu system;

a processor coupled to the grammar and lexicon input files and to the domain specification file, wherein the domain specification file is merged with the grammar and lexicon input files to create a domain dependent context free grammar.

7. The system of claim 6, further comprising an output file coupled to the processor for receiving the domain dependent context free grammar, wherein the output file is further used as an input file to the natural language menu system.

8. A system for generating a user customized natural language menu interface from a database supplied by the user and from inputs supplied interactively by a user, comprising:

display means for showing to a user expert a menu of options;

input means coupled to said display means, for moving a cursor's apparent position on said display means by the action of said user;

means for the system to access said data base;

means for displaying on said displaying means a generalized relation between two or more tables in said database, with generalized indications of corresponding natural-language connecting phrases;

means for interactively receiving from said user a plurality of said relations between tables in said database, together with natural-language connecting phrases corresponding to said relations between tables and said database; and

means for integrating said relations interactively selected by said user into a portable specification for natural-language-menu interface and for storing said portable specification.
 Description Submit all comments and votes
 


Included is a microfiche appendix of 4 microfiche and 385 frames

BACKGROUND AND SUMMARY OF THE INVENTION

The present invention relates to a system for user-customizing a natural language menu command system.

In general, the task of writing a natural language interface to some target computer system involves writing a natural language grammar and a corresponding lexicon and a set of semantic translations to the target system. This process has traditionally been a manual one requiring a grammar specialist and weeks or months or even years or work. Furthermore, the interface that results is not portable to new domains, not robust with respect to changes in the target system, not easy to debug, and may not cover the target system (a proof that it does so may be extremely difficult). The invention described here automates the process of generating an interface in the special case of relational databases and in the context of a menu-based natural language understanding system. In essence, the invention involves parameterizing a core grammar, lexicon and set of translations so that the user specifies only information that includes what database tables (relations and views) he wants covered, what table access rights (retrieved, insert, . . . ) the interface should allow, what joins he wants the grammar to explicitly support, what attributes are numeric and computable, etc.

The ADVANTAGES of this invention are that an end user can construct his own interface knowing nothing about grammars and lexicons so no specialist is required. The process of specifying the parameters to the MAKE-PORTABLE-INTERFACE routine takes minutes, not weeks or months. The technique is portable across domains. The user can control precisely what tables to include, etc and so can control the coverage of the natural language interface. The interface is easy to change if the user changes table descriptions or adds or deletes tables. Finally, a given core grammar can be constructed to probably cover any subportion of a target relational database interface language. And it can be guaranteed to be bug-free, since it is easy to validate the parameters to the MAKE-PORTABLE-INTERFACE routine and prove it correct.

The significance of this invention is that it makes it possible, for a MUCH broader class of users and applications, to use menu-based natural language interfaces to databases.

Natural language interfaces to computer systems are not in common use today for two main reasons: they are difficult to use and expensive to build and maintain. A companion patent application described a particular grammar-driven, menu-based software tool which, when used with a semantic grammar and lexicon allows a naive user to construct understandable sentences using a pointing device. That patent application solves many of the problems having to do with "ease-of-use" of natural language interfaces. The present invention solves the second problem, of making natural language interfaces easy to build and maintain, in the very important special case of relational databases and in the context of a grammar-driven, menu-based interface drive. (The technique can be generalized to non-relational database applications and probably to applications other than database applications.

Specifically, this invention addresses the following problems: Existing natural language interfaces to databases are seldom portable; almost all are application-specific. They take from man-weeks to man-years for a specialist to build. They are not robust with regard to changes in the data they are interfacing to. They are hard to debug. And there is no established way to guarantee that they cover the desired data or fit the functionality of the target computer system. So, using existing technology, natural language interfaces to databases will be built only for important database applications. These problems will haunt even NLMENU-driven, manually constructed interfaces, though, because simpler grammars can be used, a grammar specialist should be able to complete an interface in a few weeks, (but he would still be required to maintain the interface when changes occurred to the structure of the database).

The invention described here solves the problems listed above. END-USERS can build natural language interfaces to their own data; the interfaces are robust with respect to database changes; they can be customized to cover user-specified subsets of data and functionality in a precise manner; and they are provably correct.

In this section we will briefly discuss the general state of natural language interfaces to software systems and database systems in particular. Then we will briefly discuss menu-based grammar-driven interfaces. Then we will relate these to the invention and discuss known prior art as it concerns automatically generating interfaces to a database.

INTRODUCTION

Natural language interfaces to software fall into two categories: (1) those based on pseudo-English, where a user must learn a command language that is English-like. COBOL is a well-known computer language that employs some pseudo-English. (2) Those based on the premise that a user should be allowed to express himself in any way that is natural to him and that the system will make sense of his input. Keyword extraction techniques suffice only in simple applications. But for most other purposes, the user's natural language input must be better understood.

Natural language interfaces to computer systems that have been constructed employ a grammar which characterizes the set of acceptable input strings. A parser then access this grammar to produce a parse tree (parse trees for ambiguous input) for the input string. The parse tree is then translated into an expression (or set of expressions) which represent the meaning of the input string and which are interpretable by the target computer system for which the interface is being built. A wide variety of grammar formalisms and parsing strategies have been developed.

For all natural language systems, the user is required to type his question using the keyboard of a computer terminal. When the entire query has been received, the natural language interface processes the input. Processing typically results in either an answer to the user's query or a response indicating that the query was not well-understood.

Most natural language interfaces to database systems have been prototypes, built by the research community. The primary application for natural language interfaces has been to natural language database query systems. Some of these prototype systems include the LUNAR system (Woods et al. 1972), the PLANES system (Waltz and Goodman, 1977), the LADDER system (Hendrix, 1978), and the TQA system (Petrick, 1981). Some commercial systems exist. Larry Harris of Artificial Intelligence Corp, Roger Schank of Cognitive Systems Inc, and Gary Hendrix of Symantec are all marketing natural language interfaces to software systems.

Natural language systems are not in wide spread use today for two reasons: current systems are not easy to use nor are they easy to build and maintain.

Tennant's dissertation (1980) explores why current natural language systems are not easy to use. Tennant performed the first and only extensive evaluation of a natural language interface. In his evaluation of the PLANES system, he found that only simple queries of 6 or 7 words were input to the system and 1/3 of the queries were not understood by the system. The results showed clearly why natural language interfaces are not in wide-spread use today.

Some of the problems involving "ease of use" of natural language interfaces are:

users type poorly and often use ungrammatical constructions.--users had no way of knowing the grammatical coverage of the system (what constructions were allowed in its grammar) and they had no clear path to learn such limitations.

users had no way of knowing the semantic coverage of the system (which attributes, entities, and relationships the system knew about) and again they had no clear path to learn about the system's limitations (what kinds of things it di and did not know about).

Traditional techniques for handling these difficulties involved spelling correctors and very large grammars and lexicons, but Tennant's dissertation shows how unsuccessful these techniques have been.

The companion patent by Tennant, Ross, and Saenz describes an invention that overcomes the difficulties involving "ease of use" of natural language interfaces (and also formal language interfaces--see next subsection).

The second reason why natural language interfaces to databases are not in common use today is the large amount of time it has traditionally taken to construct a natural language interface. Current technology is such that each interface must be constructed on a case by case basis for each application. Efforts taking from 10 to 30 man-years per application are not uncommon. Thus, only applications that can justify such large expenditure of manpower are candidate for possible applications. However, given the quality of the system that results, the effort has not proven to be worthwhile.

This patent application describes an invention, which is dependent on a grammar-driven, menu-based interface driver, which solves the problems involving ease of building and maintaining natural language interfaces to databases.

MENU-BASED, GRAMMAR DRIVEN NATURAL LANGUAGE INTERFACES. The companion application (simultaneously-filed U.S. patent application Ser. No. 462,151, filed Jan. 28, 1983, which is hereby incorporated by reference, describes an interface driver (hereinafter called the NLMENU driver) that parses a user's input phrase by phrase, at every point offering him one or more menus from which to select sentence completions, so that only syntactically well-formed strings can be constructed. This guarantees grammatical coverage. A "semantic grammar" (a context-free grammar that encodes the semantics of the domain) (Burton, 1976) is used to guarantee semantic coverge. An all-paths, predictive parser interprets the grammar and is used to control what menu selections are available to a user at any point in the construction of sentences. A pointing device (or selection from a keyboard) is used to make the selections of completing phrases from a menu or menus. So an NLMENU-driven interface answers the set of objections to natural language interfaces that involve "ease of use" of the interface. One additional advantage of the NLMENU-driven interfaces is that only one or a few ways are needed to state a natural language command, instead of an arbitrarily large grammar that covers every construction that every user is likely to use including ungrammatical ones. So the grammar can be small.

The companion patent application actually describes a particular kind of grammar-driven, menu-based interface system in which a set of menus permanently resides on the screen of a CRT and menus that the user can choose from are hilighted in some way and constrained to contain only next legal choices. Unhilighted menus contain all menu choices for the menu category to indicate the scope of the interface to the user at all times. This interface is really a special case of grammar-driven, menu-based interfaces. Another particular and successful grammar-driven, not quite menu-based interface is DEC 2060 command language-like interface (TOPS-20, 1980), where a user types enough of a phrase to disambiguate it and asks the system for completion and at any point in stating a command, he can use "?" to ask for a menu-like list of next phrases.

The companion patent application is only a natural language interface if it is used with a natural language grammar and lexicon. It can equally well be used with formal natural languages like the database query language SQL, and in fact the DSG product to be released on Feb. 1, 1982, will feature a GUIDED-SQL interface that uses exactly the same driving software but a different grammar and lexicon. This point is significant since it widens the scope of the usefulness of that patent. It happens though that many advantages (noted above) accrue from using natural language grammars and lexicons with that NLMENU interface driver.

The above sections indicate that in the past, it has been expensive to build and maintain natural language interfaces to databases. This invention solves that problem for relational databases, but only in the context of some grammar-driven, menu-based driver. The grammars and lexicons produced by the MAKE-PORTABLE-INTERFACE function are for use with such a system and would be very inadequate in traditional systems. The principal reason is that they are purposely engineered to be simple, to be expressive, and to provide only a limited set of grammatical and lexical ways of expressing a statement. They are aimed at taking advantage of person's ability to understand a fragment of natural language written in a limited language and at guiding him to express himself in that limited language. So there is no intent to cover more natural language than a domain requires.

The MAKE-PORTABLE-INTERFACE function described in this patent is NOT the first and only work in the area of generating interfaces of some kind from descriptions of the data, but its purposes are different, its immplementation is simpler, and it works. As mentioned above, its purpose is to be used in conjunction with a menu-based grammar-driven interface driver and to interface to a relational database and to be easy enough to use so that a user who can build tables for his own applications can also easily build and maintain language interface to those tables. Its implementation is simpler than previous works since it does not involve a large grammar that attempts to cover English. It works--several automatically generated interfaces have been successfully tried.

In this section, I will be describing other's work in building transportable natural language interfaces to databases. That work is often not described precisely enough for another researcher to know exactly what techniques were used or how successful those techniques were. It is not surprising that techniques reported in the literature bear some general similarities to my work: an acquisition dialogue of some sort must provide the interface building mechanism with parameters of some kind. Some sort of method is needed to construct the grammar. Some sort of database query language must be the target. A big difference between the present invention and what others have done is that the MAKE-PORTABLE-INTERFACE function generates small grammars which are designed for use with a grammar-driven, menu-based system. All the other systems use large grammars, with all the attendant problems mentioned above. Using the present invention, people who have never seen a lisp machine before can formulate interesting queries using automatically generated natural language interfaces, as often happens in our demos. None of the other researchers have reported that their automatically generated natural language unterfaces are usable (with the exception of Larry Harris and he has not carefully documented his claims in the literature).

As early as 1978, a panel at the SIGMOD database conference in Austin listed five questions concerning natural language interfaces to databases. Even at that time, the question of portability was raised with the question "How much can we automate the process of changing database environments?" (SIGMOD, 1978).

Work at SRI: SRI has been an important center for research into natural language interfaces to databases for several years. Some work has concerned portable interfaces:

Two papers by Kurt Konolige (Konolige 1979) and (Konolige, 1980) are related to this work. The 1979 paper (a tech report) describes a framework for a portable natural language interface to a database. ((I don't have it)). The 1980 paper briefly describes a method of applying metal rules to context-free rules to generate other context-free rules. Konolige is interested in showing how passive and dative movement and other transformations in transformational grammar can be pre-compiled into a grammar, but roughly the same sort of rule instantiation is used as I have used (he uses symbol replacement and tests on whether to generate new rules and I use string replacement and specify explicitly when to build new rules). The method is not described with respect to a grammar for databases and there is no mention of semantic grammars. Konolige independently notices that his metarule system, because it was applied uniformly, "exposed gaps in a phase-structure grammar that was painstakingly developed over a five year period and was thought to be reasonably complete for a large subset of English."

In (Hendrix and Lewis, 1981), the authors describe TED, a prototype system for creating new domains for natural language systems without the need for specially trained experts. Hendrix described a dialogue in which information (like some of that found in the portable spec) is elicited from a user. That information is stored in tables and a "pragmatic grammar" is used to access the tables at execution time. The pragmatic grammar differs from a semantic grammar in that it contains slots (variable or procedures) that the information in the data structure fills. Hendrix does not describe exactly how the constraints work to guarantee that different slots in the same sentence are semantically related, but assuming that they are, he is describing something with the same descriptive power as a semantic grammar, but with the additional advantage that the pragmatic grammar stays small. His stated intent is to use Jane Robinson's large grammar DIAGRAM (Robinson, 1982) for his pragmatic grammar. The information he gathers involves lexical information gathering (synonyms, antonyms, verb conjugations, +-human, and also database structural information (like what attributes are numeric . . . ). Hendrix's approach offers a competing way to do the same sort of thing my approach does, but it differs in the ways noted above.

In (Grosz et al, 1982a) and (Grosz, 1982b), Grosz reports on a software component called TEAM. TEAM is an extension to Hendrix's TED. It is implemented using Robinson's DIAGRAM grammar and the acquisition dialogue has been extended to include domain-specific verbs. It is intended for use by a database expert who is not necessarily a natural language expert. In her section on future research in (Grosz, 1982b), Grosz states:

"The existing version of TEAM demonstrates the feasibility of constructing transportable systems. It has been used (in experimental settings) by people who are not experts in natural language processing to create natural language interfaces to databases. Current work on TEAM is concerned with extending the range of linguistic expressions it can handle (e.g., to include temporal expressions, a larger range of verbs, and quantified commands) and the kinds of databases for which it can provide interfaces (e.g. non-normalized databases and those with fields containing dates)."

Grosz offers no documentation to support a claim that usable, portable natural language interfaces result from TEAM. She does not describe the users or the databases they have built interfaces to or the usability of those interfaces.

Larris Harris is responsible for the first important commercial venture that is based on AI technology. He developed the ROBOT system, now called INTELLECT and marketed on ON-LINE ENGLISH by Cullinaine. A database administrator is needed to create and build a lexicon for a fixed grammar and the system interfaces to a single file. Much of what goes into the lexicon is synonym information or operational word definitions (e.g. define "woman" as "employee whose sex="F") and the grammar is large and complex. It is not clear how much effort is required to move to a new domain, but the ROBOT system has been tested in a variety of applications, which is saying more than can be said for the other efforts discussed in this section, which report on transportable systems but give no concrete evidence of a range of applications tested.

The IBM Rendezvous system. The Rendezvous prototype natural language interface to a database uses much the same kind of information as the SRI and TI systems. The information about database structure is stored in data structures that describe both database structural information and linguistic information. The grammar used is some variant of an augmented transition network and contains rules that are domain dependent, with no reported provision for automatically generating interfaces to new domains.

S Jerrold Kaplan (Kaplan, 1979) describes a portable interface which includes an augmented transition network. He restricts his grammar to handle only questions beginning with WH-words, like "who", "what", . . . which limits the syntax his grammar has to recognize, but he offers the user no path to learning to use the limited language that results. His system interfaces to heirarchically structured collections of files. In his system, all domain-specific information is stored in the lexicon. He describes an experience of porting his natural language interface to a new domain database in 5 hours, which is quite good by the standards of the other systems mentioned above, but the process still involves an expert. In his 1979? paper, he does not go into detail on exactly what is involved.

None of the previous work in the field claims the advantages of portability that the present system offers:

an end user can construct his own natural language using an interactive system.

he can finetune the interface by specidying which tables will be covered, the access rights to those tables, the attributes to be covered and the joins that the natural language system will support. He can change menu item phrasing to more natural phrasings.

the interface will be error free.

the interface can be treated as a database object and granted to other users who can then use the natural language to access their data.

Thus, it is object of the present invention to provide a method for rapidly instructing a natural-language menu interface system which is customized according to a particular users needs.

Thus, it is a further object of the present invention to provide a method and system for rapidly constructing a natural language menu interface system which is customized according to the attributes of a particular user-supplied database.

According to the present invention, there is provided:

A system for generating a user customized natural language menu interface from a database supplied by the user and from inputs supplied interactively by a user, comprising:

display means for showing to a user expert a menu of options;

input means, for moving a cursor's apparent position on said display means by the action of said user;

means for receiving a database;

means for displaying on said display means a generalized relation between two or more tables in said database, with generalized indications of corresponding natural-language connecting phrases; and

means for interactively receiving from said user a plurality of said relations between tables in said database, together with natural-language connecting phrases corresponding to said relations between tables and said database; and

means for integrating said relations interactively selected by said user into a portable specification for natural-language-menu interface, and forstoring said portable specification.

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention will be discussed with reference to the accompanying drawings, wherein:

FIGS. 1-11 demonstrate construction of a natural-language interface to a relational database, as performed by an unskilled user.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

The invention provides a means of generating a semantic grammar and a semantic lexicon from a condensed collection of parameters.

The basic operation of the natural language menu interface system will first be described, and then the features of the present invention, which permit very rapid construction of a user-customized implementation of such a natural language interface system, will then be described. In essence, a natural language menu input system permits the user to provide input in a natural language subset, which has a predefined correspondence to some particular desired set of constrained outputs (e.g. executable machine commands). Partial inputs from the user are successfully parsed, and the parser indicates all possible legal next entries which could follow the input received so far from the user. The set of legal next words is then displayed to the user as active portions of the menu, so that the user need only select one of the legal next words from the menu. Thus, a 0% error rate is guranteed, and training time is greatly decreased, since the user is guided to correct inputs and to correct inputs only.

The present invention provides three major innovations which make such a natural language menu system more useful to users. First, a "build-interface" function is used, which creates a portable specification, i.e. a compact statement of the parameters which are sufficient to define the desired natural language interface to a relational database, according to the inputs received from a particular user expert. It should be noted that the user expert must be familiar with types of queries which are likely to be demanded from the database, but the user expert need not be a computer expert, and need not know any programming language nor need he be familiar with grammars or lexicons. Second, the "make-portable-interface" function derives a grammar and lexicon from a portable specification, such that the grammar and lexicon are then sufficient to implement a natural language menu system. Thus, using the make-portable-interface function, a large number of portable specifications can be stored, and each portable specification can be rapidly constructed into the required grammar and lexicon when that particular natural language interface is called upon. Thus, a large number of natural menu interfaces can be kept accessable, without requiring large amounts of storage for the grammar and lexicon of each interface, which can be somewhat bulky. Third, the present invention also provides a well-formedness test, which tests a grammar-lexicon pair to ensure that they are well-formed. This automatic debugging for an automatically generated natural-language expert system is novel and provides major advantages.

The present invention will be described with primary reference to the presently preferred embodiment, wherein data base queries and updates are addressed to a relational data base. The natural language inputs are parsed using a parser which is generally similar to that disclosed in the 1981 dissertation of Ken Ross entitled "Parsing English Phrase Structure". The system is coded in LISP, and has been demonstrated on a LISP machine from LMI Incorporated. However, as will be obvious to those skilled in the art, the present invention can alternatively be realized using a wide variety of other physical systems, and the key ideas of the invention can also be translated into other programming languages than LISP. For example, a further embodiment of the present invention, which is presently under development, realizes the present invention on a Texas Instruments Professional Computer, instructed by source code which is written in "C". It should also be noted that the particular parser used in the preferred embodiment is not necessary to the invention, but other parsers can be used. Numerous parsers have been disclosed in the prior art. However, the presently-preferred embodiment uses a parser which will be described in great detail.

As discussed above, the end result which the present invention seeks to achieve is to permit the user to input well-understood sentences in a convenient language, which is preferably a subset of a natural language (such as English). According to this invention, after the user has input each successive word of a sentence, the set of all possible next words which could form the continuation of a grammatical sentence is displayed to the user, so that he needs only to select one of the possible next words. (More broadly, some of the possible inmuts can be lumped together as a category, so that one of the displayed items would be, e.g., "(specific number)", as will be discussed below.)

In general, this is accomplished by parsing partial inputs received from the user as they are received, so that the parser generates the set of possible functional descriptions which could be the next element in a legal input, and displays a set of words (corresponding to these functional descriptions of possible next elements). Thus, the end result of the operation, after the user has input a complete sentence, is a parsed command which is in accordance with the predefined grammar, and therefore can be trivially translated into an executable instruction, as will be discussed below.

The first step in the implementation of this method which will be described in detail is the operation of the parser itself. A simplified version of the parser will first be described, and then the modifications which permit word-at-a-time parsing and prediction will be described. The preferred parser can be specified in terms of nonstandard turing machine instructions, as described by Griffiths and Petrick, which operate on the upper elements of an alpha stack and a beta stack in accordance with the current state of the upper elements of the two stacks. The topmost elements of these stacks are represented as an ordered pair such as (XY, ZW), where X is a string variable corresponding to the upper elements of the alpha stack, Y is the next element of the alpha stack after X, Z represents the upper elements of the beta stack, and W is the next element of the beta stack. The parser rules can then be stated as:

1. If one of the rules of the grammar which can be applied is of the form A.fwdarw.V.sub.1 V.sub.2 . . . V.sub.n, then we map a condition (V.sub.1 Y, X) onto (Y, V.sub.2 . . . V.sub.n t A X), and we attach V.sub.1 as a node beneath A. However, this rule is applicable only if X is in the set of "non terminals", that is if X itself represents a partial parsing tree rather than a first-level input, such a word or the immediate classification of a word (e.g. noun, adjective, determiner, etc.). The symbol "t" is merely a placeholding symbol, which is used in this formalism to permit proper application of the succeeding parser rules. Thus, this first partial rule stores the possibility that V.sub.1 is the beginning of a group which can be parsed under a higher level node A (e.g., the possibility that, if V.sub.1 is a determiner, it is the beginning of a noun phrase such "the parts"). Where an element is associated beneath another element, this provides one step in building up the tree of associated nodes which will eventually form a complete parse. Thus, a sample of a completed parse, in a grammar roughly corresponding to English, might be as follows: ##STR1##

The first association is always a direct translation from a specific word (e.g. "cat") to its directly associated category ("noun"). However, it should be preliminarily noted here that the categories used need not be as broad as familiar English-language parts of speech, but, at least for database-query applications, are preferably modified to include some semantic information. Thus, in the presently preferred embodiment, categories such as "number-noun phrase", "worker-noun phrase", "operation-noun phrase", "job card-noun phrase", and others are used. (Such grammars are known in the art as "semantic grammars".)

Intuitively, parser Rule 1 means that, where a rule of grammar might be applicable to a topmost element on stack alpha (V.sub.1), that element is removed from the top of stack alpha, and stack beta is pushed down, successively, with the hypothetical parent node of V.sub.1 (together with V.sub.1 associated below it), with a place holding symbol "t", and with the remaining symbols which would be required by the possibly-applicable rule of grammar (V.sub.2 . . . V.sub.n). For convenience, we restate parser rule 1 again:

Rule 1: If one of the available rules of grammar applied can be stated as A.fwdarw.V.sub.1 V.sub.2 . . . V.sub.n ; and if a non terminal X is on top of stack beta; then (V.sub.1 Y, X) maps onto ##STR2##

2. The second rule of the parser shows how the place holding symbol t is used to indicate that a perfected partial parse can be transferred from stack beta to stack alpha:

Rule 2: If the place holding symbol t is on top of the stack beta, and is immediately followed by a nonterminal symbol A, the place holding symbol t is deleted and the nonterminal symbol A is transferred from the top of stack beta over to stack alpha: ##STR3## Intuitively, the allocation of functions between stacks alpha and beta is that stack alpha contains only parse trees or well-formed subparts of parse trees, whereas stack beta is used to hold partially formed subpart of parse trees.

3. The third rule is used to determine when a necessary portion of the elements required to complete a parse have been found. The formal statement is:

Rule 3: ##STR4##

The operation of the parser is initiated by loading the string of words to be parsed into stack alpha, followed by a numeral sign, and by loading the higher level symbol corresponding to the root node of the tree to be constructed, followed by a numeral sign, on stack beta. That is, where (as is usual) the parser is serving to organize strings of words into properly parsed sentences, the symbol which indicates a complete sentence in the grammar (in rules such as S.fwdarw.NP VP) is loaded onto stack beta. Thus, the initial condition of the procedure can be represented as (W.sub.1 W.sub.2 . . . W.sub.n #, S#).

A successful parse has been found when the final condition of the two stacks is (S#, S#). In this case, the nodes below the S on the alpha stack provide the desired complete parse. If the root node (i.e. the object of the parsing) is not a sentence parse, but, for example, a noun phrase parse, then the condition indicating a successful parse would be correspondingly different, e.g. (NP#, NP#).

When no rule is applicable, the instant parsing path represents a bad parse, and the parser must retrace and follow a different parsing path.

When more than one parser rule can be applied, and/or when more than one rule of the grammar can be applied through parser rule 1, alternative parsing paths must be followed. Eventually, all of the alternative parsing paths will result in either successful or unseccessful parses. The set of seccessful parses represents the output of the parser, after completion of the parsing process. This has substantial advantages in the present invention, as described below.

An example of the operation of this simplified version of a parsing procedure will now be given. Suppose, for example, that the grammar to be applied is as follows:

(3.4) Grammar

S.fwdarw.A B

S.fwdarw.A B C

A.fwdarw.A C

A.fwdarw.x

B.fwdarw.C B

B.fwdarw.z

C.fwdarw.y

In this example, the input string is "xyz", which is to be parsed as an "S".

TABLE 1 ______________________________________ Parser Grammar Rule Stack Alpha Stack Beta Rule ______________________________________ x y z # S # Rule (1) y z # ##STR5## ##STR6## Rule (2) ##STR7## S # Rule (1) applicable 3 Ways: See continuations 1.1, 1.2 and 1.3 Cont. 1.1 Rule (1) y z # ##STR8## ##STR9## Rule (1) z # ##STR10## ##STR11## Rule (2) ##STR12## ##STR13## Rule (1) z # ##STR14## ##STR15## Rule (1) # ##STR16## ##STR17## Rule (2) ##STR18## ##STR19## Rule (3) # ##STR20## Rule (2) ##STR21## ##STR22## Rule (3) # ##STR23## Rule (2) ##STR24## S # PARSE FOUND Cont. 1.2 Rule (1) y z # ##STR25## ##STR26## Rule (1) z # ##STR27## ##STR28## Rule (2) ##STR29## ##STR30## Rule (1) z # ##STR31## ##STR32## Rule (1) # ##STR33## ##STR34## Rule (2) ##STR35## ##STR36## Rule (3) # ##STR37## Rule (2) ##STR38## ##STR39## Rule (3) # ##STR40## BAD PATH - NO INSTRUCTIONS APPLY Cont. 1.3 Rule (1) y z # ##STR41## A AC Rule (1) z # ##STR42## ##STR43## Rule (2) ##STR44## ##STR45## Rule (1) and Rule (3) both applicable - See continuations 2.1 and 2.2 Cont. 2.1 Rule (1), $ z # ##STR46## ##STR47## Rule (1) # ##STR48## ##STR49## Rule (2) ##STR50## ##STR51## Rule (3) # ##STR52## Rule (2) ##STR53## ##STR54## BAD PATH - NO INSTRUCTIONS APPLY Cont. 2.2- Rule (3) z # ##STR55## Rule (2) ##STR56## S # Rule (1) applicable 3 ways - See continuations 3.1, 3.2 and 3.3 Cont. 3.1 Rule (1) z # ##STR57## ##STR58## Rule (1) # ##STR59## ##STR60## Rule (2) ##STR61## ##STR62## Rule (3) # ##STR63## Rule (2) ##STR64## S # PARSE FOUND Cont. 3.2 Rule (1) z # ##STR65## ##STR66## Rule (1) # ##STR67## ##STR68## Rule (2) ##STR69## ##STR70## Rule (3) # ##STR71## BAD PATH - NO INSTRUCTIONS APPLY Cont. 3.3 Rule (1) z # ##STR72## ##STR73## Rule (1) $ # ##STR74## ##STR75## Rule (2) ##STR76## ##STR77## BAD PATH - NO INSTRUCTIONS APPLY DONE ______________________________________

The foregoing carefully-worked-out example shows in detail the operation of a parser based on a context-free grammar. It can easily be seen from this example that the formal manipulation rules as set forth above translate very simply into programmable operations. For example, the final status of stack beta, in the very last step of the preceding worked-out example, could be easily represented in Lisp by a list formatted, e.g., as follows:

(C t (A (A (A x) (C y))) S).

Note that the # is not required in the Lisp implementation, since an empty list is easily tested for.

As noted, the parser set forth above must be modified for use in the present invention. The parser used in the present invention must be able to produce the set of items which can come next, in addition to creating a