What is claimed is:
1. A knowledge based document retrieval system, comprising:
user input means for inputting, in response to at least one of a user's
key-typing and mouse operations, a series of words;
display means for displaying responses from said system and retrieved
documents;
user input analysis means for analyzing said series of words inputted by
said user through said user input means, and converting said series of
words into an internal query condition based on information related to
various concepts and permitting said user to edit said internal query
condition;
a knowledge base for storing knowledge including said concepts and
relations among said concepts;
wherein said stored knowledge is represented by concept nodes and relation
links forming a network of concepts, wherein each of said nodes represents
a concept, and wherein each of said links represents a relation among said
concepts;
information search means for identifying concept nodes that match said
internal query condition semantically; and
information retrieval means for retrieving at least one relevant document
associated with said identified concept nodes;
wherein said user input analysis means includes:
lexicon storing means having contents which are an edited version of said
concept network,
lexical analysis means for identifying concept nodes from said series of
words inputted by said user input means by consulting the contents of said
lexicon storing means,
syntactic analysis means for identifying a nominal compound in said series
of words based on said concept nodes identified by said lexical analysis
means, and
nominal compound interpretation means for mapping said identified nominal
compound relative to said concepts and relations of said knowledge base,
inferring meaning of said identified nominal compound based on said
concepts and relations of said knowledge base and generating said internal
query condition based on said meaning of said identified nominal compound.
2. A knowledge based file retrieval system according to claim 1, wherein
said nominal compound interpretation means generates all possible
interpretations, that are represented by interrelationships among said
series of nouns, such that interrelationships are relations existing in
said concept network, and evaluates likelihood of interpretations
depending on the number of instance relations existing in said concept
network, wherein instance relations represent concrete facts.
3. A knowledge based file retrieval system according to claim 1
wherein said information storage means stores files representing documents;
wherein said concept nodes includes nodes that represent said documents
stored in said information storage means; and
wherein said information retrieval means retrieves documents that are
associated with said identified concept nodes.
4. A knowledge-based document retrieval system, comprising:
user input means for inputting, in response to at least one of a user's
key-typing and mouse operations, a series of words;
user interaction means for creating a query expression and an internal
query condition from a dialogue, including said inputted series of words,
between said user and said system, wherein said query expression is a
displayed version of said internal query condition for said user to permit
editing by said user, said query expression and said internal query
condition being created based on information related to various concepts;
a knowledge base for storing knowledge including said concepts and
relations among said concepts, wherein said stored knowledge is
represented by concept nodes and relation links forming a concept network,
wherein each of said nodes represents a concept, and wherein each of said
links represents a relation among said concepts;
information search means for identifying concept nodes that match said
internal query condition semantically;
information retrieval means for retrieving at least one relevant document
associated with said identified concept nodes; and
display means for displaying responses from said system and retrieved
documents;
wherein said user interaction means includes:
query editing means for adding to said internal query condition a new
condition phrase having concepts and relations defined in said concept
network, deleting one of existing condition phrases, and changing one of
concepts in said condition phrases to a different concept in response to
said series of words inputted by said user,
query expression display means for displaying on said display means said
query expression corresponding to said internal query condition being
created and edited, and
concept tree display means for displaying on said display means part of
said concept network in a hierarchical tree, wherein said tree includes
one of the concepts appearing in said query expression.
5. A knowledge based document retrieval system according to claim 4,
wherein said user interaction means further comprises:
means for searching concepts connected by a generic relationship added to a
current concept in the query expression among the concepts connected with
said current concept through subsumption relationships, displaying on said
display means the concept belonging to the lowest rank in the subsumption
relations among these concepts and moving said current concept to the
concept being displayed.
6. A knowledge based document retrieval system according to claim 4,
wherein said user interaction means further comprises:
means for imposing conditions one after another to a concept in the query
expression, a current concept moving freely among the concepts in the
query expression.
7. A knowledge based document retrieval system according to claim 4,
wherein said user interaction means further comprises:
means for adding a root to a concept, which is to be queried in the query
expression and modifying the concept which is to be queried.
8. A knowledge based document retrieval system according to claim 4,
wherein said user interaction means further comprises:
means for making the query of the concepts, to which the condition is
imposed, in the query expression, possible one after another.
9. A document retrieval system using a conceptual network, comprising:
a knowledge base for storing a plurality of words representing concepts and
a plurality of predicates representing relations between the plurality of
words;
input means for inputting words;
display means for displaying words;
processing means, responsive to inputting of a word representing a concept
which serves as a query key from said input means, for retrieving a word
representing a concept in association with the inputted word from said
knowledge base, and for displaying on said display means at least one word
having a relation with said inputted word and at least one predicate
representing said relation with said inputted word;
editing means for selecting, in response to said user, a set of a desired
word and a predicate from those displayed on said display means and
inputting, by said user, a new word subsumed by the selected word to
produce a query condition in which said inputted word and the newly
inputted word are associated with each other by said selected predicate;
and
means for retrieving at least one relevant document related to said query
condition.
10. A document retrieving method in an intellectual retrieval system,
comprising the steps, performed by said intellectual retrieval system, of:
storing a plurality of words representing concepts and relations between
the plurality of words as a knowledge base;
displaying on a screen a query condition defined by selected words and
predicates representing relations between the selected words, and
displaying a plurality of other words and relations between the words,
said plurality of other words having a relation with a certain word in
said query condition and being stored in said knowledge base;
permitting a user of said intellectual retrieval system to select one word
and one relation from the displayed plurality of other words and
relations;
rewriting and displaying said query condition in accordance with the
selected word and the relation; and
retrieving at least one relevant document related to said query condition.
11. A method according to claim 10, wherein a certain word in said query
condition is designated on the screen, and a plurality of other words
having a relation with said designated word and stored in said knowledge
base, relations between the plurality of other words are displayed on said
screen in response to the designation.
12. A knowledge-based document retrieval system according to claim 4,
further comprising:
means for rewriting the display of query expressions by substituting a
current concept for a certain concept in said dialogue window representing
the network by the current concept.
13. A knowledge-based document retrieval system according to claim 4,
wherein said query editing means controls said query expression display
means and said concept tree display means, such that, in response to a
user selection of a concept in a query expression by means of a mouse
operation, said concept tree display means identifies a subset of said
concepts corresponding to said selected concept, and displays said subset
of concepts in a hierarchical tree.
14. A knowledge based document retrieval system according to claim 13,
wherein said query editing means control said query expression display
means and said concept tree display means, such that, in response to user
selection of a concept in said concept tree display, said query editing
means substitutes a corresponding concept in a query expression by said
selected concept, and said query expression display means displays the
updated version of said query expression.
15. A knowledge based document retrieval system according to claim 4,
wherein said query editing means controls said query expression display
means to display possible relations that can be attached to a concept in
response to a user's request to add new condition phrase to a concept in
an existing query expression being edited.
16. A knowledge-based document retrieval system, comprising:
user input means for inputting, in response to at least one of a user's
key-typing and mouse operations, a series of words;
a knowledge base for storing knowledge including concepts and relations
among said concepts, wherein said stored knowledge is represented by
concept nodes and relation links forming a concept network, wherein each
of said nodes represents a concept, and wherein each of said links
represent a relation among said concepts;
information storage means for storing said knowledge base and documents,
wherein each document has a corresponding concept node defined in said
knowledge base;
information search means for identifying concept nodes that match an
internal query condition represented in terms of concepts and relations
defined in said knowledge base;
information retrieval means for retrieving documents from said information
storage means, wherein said documents are associated with said identified
concept nodes;
display means for displaying responses from said system and retrieved
documents; and
user interaction means for creating said internal query condition from a
dialogue, including said inputted series of words, between said user and
said system, wherein said internal query condition is displayed in a query
expression to permit said user to edit;
wherein said user interaction means includes:
query editing means for adding to said internal query condition a new
condition phrase having concepts and relations defined in said knowledge
base, deleting one of existing condition phrases, and changing one of
concepts in said condition phrases to a different concept in said
knowledge base in response to said series of words inputted by said user,
query expression display means for displaying on said display means said
query expression which is a display version of said internal query
condition being created and edited, and
concept tree display means for displaying on said display means a subset of
concepts defined in said knowledge base in a hierarchical tree, wherein
said tree includes one of the concepts appearing in said query internal
condition being edited.
BACKGROUND OF THE INVENTION
This invention relates to a knowledge-based information retrieval system
and in particular to a human interface of an intellectual query system
permitting the end user to efficiently query information stored in a
network structure in an electronic file.
This interface can be divided into a natural language interface and a
visual interface. The natural language interface is suitable for a global
search, in which the search is effected by deduction from natural
language input, and the visual interface is suitable for a local search,
i.e. a search within a domain that is visible to the user.
Heretofore, as a human interface using natural language, a natural
language interface for a database has been known. See, for example, G. G.
Hendrix et al., "Developing a Natural Language Interface to Complex Data",
ACM Trans. Database Systems, Vol. 3, 1978, pp. 105-147. In these systems,
the data model for the database (the method by which the relations between
the data items to be stored are expressed) and the grammar and dictionary
for interpreting the natural language are set independently. That is, when
a natural language interface is added to an existing database, a grammar
and a dictionary must be newly constructed. There is also the problem that
the grammar and the dictionary for the natural language interface must be
modified when the target database is changed.
Further, the databases to which a natural language interface has
heretofore been given are relational databases, and the formal search
language used for them, i.e. the quasi-standardized SQL (Structured Query
Language), is weak in its capability for describing high-order knowledge.
Since a query expression in natural language is usually translated into
such a formal language as an intermediate language, there is the problem
that the function of the whole system is restricted by the expressive
capacity of this formal language. In particular, although the relational
database is useful for uniform data, it is not well suited to a
heterogeneous database, which deals with various kinds of matters, or to an
object-oriented database. For example, it is not suitable for describing a
matter on the basis of a user's ambiguous memory and querying information
concerning it on the basis of that description.
Further, in these systems, means other than natural language are used for
inputting data (new knowledge and information), and the data input is
carried out by a specialist. Consequently there is a problem that it is
difficult for an end user to input and register data directly.
Furthermore, as small, large-capacity memory devices such as optical disk
storage units have been realized, document filing devices for office use
have appeared that are capable of storing and querying a large amount of
information, and with which the end user directly carries out the
database management and search that have heretofore been performed by a
specialist.
As a method for facilitating the storage and search of information in such
a filing device, reference can be made to e.g. JP-A-61-220027. This
publication discloses an information querying method enabling the end user
to easily query desired documents etc. from ambiguous and fragmentary
information and at the same time facilitating their registration. However,
with this method it is very difficult to form query conditions for
querying the knowledge base under which the information required by the
user can be appropriately retrieved.
SUMMARY OF THE INVENTION
A first object of this invention is to solve the problems described above
and to enable the end user to query desired information from a description
in natural language, even on the basis of fragmentary memory. A further
aim is to enable users themselves to register new information and
knowledge similarly by using natural language.
A second object of this invention is to provide a system by which the user
finds the concept he seeks without any feeling that he is querying, by
facilitating modification of concepts in query expressions and enabling
him to modify the object to be queried, to query repeatedly even while
query expressions are being formed, and to query the query expressions
locally.
In order to achieve the above first object, this invention is characterized
in that a common knowledge expression base is given to the knowledge base
and the natural language interface so that the query and the registration
of knowledge and information can be effected by using the natural
language.
Concretely speaking, this invention gives a knowledge representation
method (corresponding to a data model in a database) called the "concept
relation model", which expresses a system of matters and facts with
"concepts" and "relations", as a method for constructing the knowledge
base, and further provides a method by which knowledge of language can
also be stored in the knowledge base. Here a "concept" means a "data item"
in a computer representing a matter, an event or an abstract concept, and a
"relation" means a "data item" defined between different concepts. A
concept can be represented by a node (vertex) and a relation by a link
(edge). Knowledge represented by the concept relation model therefore
constitutes a network of concepts. Here this is called a conceptual
network.
That is, the knowledge base according to this invention is characterized in
that the knowledge, which is originally desired to be stored, is stored in
the conceptual network as one body together with the knowledge for
expressing it by using a language and that the natural language interface
uses the same knowledge in common. Consequently, in principle it is not
necessary to construct newly a dictionary, etc. for the natural language
interface.
Further this invention provides a natural language understanding method, by
which the meaning of a query expression expressed in natural language is
interpreted by effecting deduction from the matters, etc. stored in the
knowledge base. In particular, it gives a method for interpreting the
meaning of nominal compounds, i.e. the frequently used noun phrases
consisting of a series of nouns. In order to interpret the meaning of a
nominal compound, the system must deduce the relations between the
different nouns, and this invention gives a method by which only
significant relations are deduced from the concepts and the relations
stored in the knowledge base.
Furthermore, the knowledge representation method according to this
invention restricts the part depending on the language so as to facilitate
the application to a plurality of languages. In addition, it makes the
coexistence of expressions by different languages possible. Consequently
this invention provides a method, by which it is possible to query and
register information in e.g. both English and Japanese in the same
knowledge base.
In order to achieve the above second object, according to this invention,
multiple-window functions are used to display, together with the query
expression, a superconcept of the current concept in the query expression
and a network extending from the current concept and its changeable
superconcepts down to the subconcepts satisfying the conditions added to
the current concept. The object is further achieved in that the following
operations can be executed at any time: moving the current concept by
shifting among the concepts in the query expression, changing the current
concept within the concepts satisfying the added conditions, adding
restrictions to the current concept, adding a root, and querying the
concepts that satisfy the added conditions.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a functional block diagram representing a natural language
expression interpreting program by the method according to this invention;
FIG. 2 is a scheme of a concept network for explaining a knowledge
expressing method according to the concept relation model;
FIG. 3 is a scheme indicating names of concepts, FIG. 3 and the following
figures being schemes illustrating the knowledge memory according to the
same model;
FIG. 4 is a scheme indicating subsumption relations;
FIG. 5 is a scheme indicating the generic relationship definition;
FIG. 6 is a scheme indicating the relations;
FIG. 7 is a scheme for explaining the principle of the method for
interpreting the meaning of nominal compounds;
FIG. 8 is a scheme for explaining a method for analyzing the structure of
sentences;
FIGS. 9, 10 and 11 show examples of the analysis of the structure of
sentences;
FIG. 12 is a table indicating prepositions;
FIG. 13 is a table indicating relational descriptors;
FIG. 14 is a block diagram illustrating the construction of the hardware
for a system, which is an embodiment of this invention;
FIG. 15 is a scheme expressing concepts and relation knowledge stored in
the data base;
FIGS. 16 and 17 show images on a screen when concept matching is effected;
FIG. 18 shows an example of the table added, when conditions are added;
FIG. 19 shows an example of the table added, when a root is added;
FIG. 20 is a scheme illustrating the construction of a system according to
this invention;
FIGS. 21 to 32 show images on the screen appearing in the process of a
query of information; and
FIGS. 33 to 36 are flow charts for the processing according to this
invention.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
Hereinbelow a concrete embodiment of this invention will be explained.
First, an embodiment concerning the natural language interface will be
described.
The fundamental principle of this embodiment will be explained first,
starting with the knowledge expressing method using the concept relation
model, which serves as the base of this invention. FIG. 2 indicates
a part of a conceptual network. In the figure ellipses represent concepts
(nodes) and arrows represent relations (links). A node 201 "UNIVERSAL" is
a root node representing everything within the knowledge base. For each of
the nodes it is possible to define more than one character string as the
name of the relevant concept. For example, a synonym or a corresponding
word in a foreign language can be added thereto.
On the other hand, in the links connecting different nodes there are
subsumption relations (IS-A link 202) defined between two nodes, between
which a property is inherited, "generic relationships" (link 203) defined
generally between different concepts, and "instance relations" (link 204)
as concrete examples of the same generic relationships. The subsumption
relation represents the class of matters. Consequently the conceptual
network consisting of the concepts and the subsumption relations
constitutes a conceptual tree representing a taxonomic hierarchy.
For example the conceptual tree within the conceptual network indicated in
FIG. 2 represents the following knowledge:
(PAPER-MATERIAL (is-a THING))
(BOOK (is-a PAPER-MATERIAL))
(BOOK#0051 (is-a BOOK))
(LIVING-THING (is-a THING))
(PERSON (is-a LIVING-THING))
(NEWTON (is-a PERSON))
These are called frame expressions and are written in the symbolic
expression (S-expression) notation of the LISP language. They can also be
expressed in ordinary English as follows;
______________________________________
Paper material is a thing.
Book is a paper material.
etc.
______________________________________
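As an illustration only, the is-a frames listed above can be held in a small lookup table and walked upward. The sketch below is not the data structure of the embodiment (which is described later with tables); the names IS_A, frame and superconcepts are hypothetical, and a single superconcept per concept is assumed for brevity.

```python
# Subsumption (is-a) frames from the example above, stored as child -> parent.
# Assumption: one superconcept per concept, which is enough for this example.
IS_A = {
    "PAPER-MATERIAL": "THING",
    "BOOK": "PAPER-MATERIAL",
    "BOOK#0051": "BOOK",
    "LIVING-THING": "THING",
    "PERSON": "LIVING-THING",
    "NEWTON": "PERSON",
}

def frame(concept):
    """Render one is-a frame in the symbolic form used in the text."""
    return f"({concept} (is-a {IS_A[concept]}))"

def superconcepts(concept):
    """Return the chain of superconcepts, nearest first."""
    chain = []
    while concept in IS_A:
        concept = IS_A[concept]
        chain.append(concept)
    return chain

print(frame("BOOK"))               # (BOOK (is-a PAPER-MATERIAL))
print(superconcepts("BOOK#0051"))  # ['BOOK', 'PAPER-MATERIAL', 'THING']
```

Following the chain upward in this way is what allows properties defined for a higher-rank concept to apply to a lower-rank one.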
In FIG. 2, an example of the generic relationship is the relation 203
defined between the concept 211 (BOOK) and the concept 212 (PERSON). This
means that there can be a relation "author" or "work" between "book" and
"person". This generic relationship can be read in both the directions,
i.e. towards either right or left, as follows;
(BOOK (is-written-by PERSON))
(PERSON (has-written BOOK))
The instance relation is a relation as indicated by a link 204 indicated by
a broken line in FIG. 2, which represents a concrete example (called also
instance value for the data base) of a certain generic relationship. For
example, in FIG. 2, an instance relation 204 is defined between a concept
213 and a concept 214 as a concrete example of the generic relationship
203, which is "author". With the frame expression it can be written as
follows;
______________________________________
(BOOK#0051 (is-a BOOK)
           (is-written-by NEWTON))
(NEWTON (is-a PERSON)
        (has-written BOOK#0051))
______________________________________
With the natural language they can be written;
______________________________________
BOOK#0051 is a book
which is written by NEWTON.
NEWTON is a person
who has written BOOK#0051.
______________________________________
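The two-way reading of a generic relationship and of its instance relation can be sketched as follows. This is a minimal illustration under assumptions: the dictionary layout and the function name instance_frames are hypothetical, and only the relation 203 and the instance relation 204 of FIG. 2 are modeled.

```python
# Generic relationship 203 of FIG. 2 with its two readings.
# The dictionary keys are hypothetical names chosen for this sketch.
AUTHORSHIP = {
    "left": "BOOK",
    "right": "PERSON",
    "lr": "is-written-by",  # reading from the left concept to the right one
    "rl": "has-written",    # reading from the right concept to the left one
}

def instance_frames(generic, left_instance, right_instance):
    """Return the frame expression of an instance relation seen from each side."""
    return (
        f"({left_instance} (is-a {generic['left']}) "
        f"({generic['lr']} {right_instance}))",
        f"({right_instance} (is-a {generic['right']}) "
        f"({generic['rl']} {left_instance}))",
    )

for line in instance_frames(AUTHORSHIP, "BOOK#0051", "NEWTON"):
    print(line)
# (BOOK#0051 (is-a BOOK) (is-written-by NEWTON))
# (NEWTON (is-a PERSON) (has-written BOOK#0051))
```

A single stored link therefore yields both frame expressions; nothing is stored twice.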
The knowledge described above is memorized, according to this invention, in
a data structure, as stated below. At first the concept and the name
thereof are memorized in a concept name table 221 (FIG. 3). The same table
221 consists of three columns 222, 223 and 224. The column 222 indicates
the unique number C# of the concept and the name CNAME thereof is defined
in the column 223. The language LANG of the name is prescribed in the
column 224. For example, when the value of LANG is "J", the name is
written in Japanese and when it is "E", the name is written in English.
Further, a plurality of concept names can be defined for one language. For
this reason, for the data structure in the column 223 it is allowed to
repeat the data. For example the name of the concept C#0004 is "book" (in
Japanese) and "BOOK" and "printed matter" (in Japanese) can be added also
thereto.
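A possible in-memory form of the concept name table is sketched below. It is an assumption-laden illustration: the repeated data of the column 223 is flattened into one row per name, the Japanese strings are assumed renderings of "book" and "printed matter" (the text does not give them), and the function names are hypothetical.

```python
# Rows of (C#, CNAME, LANG) for the concept C#0004 described above.
# The Japanese strings are assumed renderings used only for illustration.
CONCEPT_NAMES = [
    ("C#0004", "本", "J"),      # "book" in Japanese (assumed rendering)
    ("C#0004", "BOOK", "E"),
    ("C#0004", "印刷物", "J"),  # "printed matter" in Japanese (assumed rendering)
]

def names_of(concept_id, lang=None):
    """Return the names of a concept, optionally restricted to one language."""
    return [
        name
        for cid, name, name_lang in CONCEPT_NAMES
        if cid == concept_id and (lang is None or name_lang == lang)
    ]

def concepts_named(name):
    """Return the concept numbers whose name matches, ignoring case."""
    return sorted(
        {cid for cid, cname, _ in CONCEPT_NAMES if cname.lower() == name.lower()}
    )

print(names_of("C#0004", "E"))  # ['BOOK']
print(concepts_named("book"))   # ['C#0004']
```

Because the language is a column rather than a separate table, names in several languages resolve to the same concept number.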
Now, the subsumption relation of the concept is represented in a
subsumption relation table 231 indicated in FIG. 4. Columns 232 and 233
indicate unique numbers C# and SC# for respective concepts and represent
that the superconcept of the concept C# is the concept SC#. For example,
the second record in the table 231 indicates that the superconcept of the
concept C#0002 ("paper material") is the concept C#0001 ("thing"). The
"relation" such as a property, which is defined for each concept, is
inherited from a higher rank to a lower rank through a link of the
subsumption relation. In this case, it is possible to define a plurality
of superconcepts for one concept. Consequently multiple property
inheritance is realized.
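Multiple property inheritance over the subsumption relation table can be sketched as below. The rows for C#0002 and C#0004 follow the text and FIG. 2; the concept C#0900 and the property names are hypothetical, added only to show a concept with two superconcepts.

```python
# Rows of (C#, SC#): the superconcept of the concept C# is the concept SC#.
SUBSUMPTION = [
    ("C#0002", "C#0001"),  # paper material -> thing (from the text)
    ("C#0004", "C#0002"),  # book -> paper material (as in FIG. 2)
    ("C#0004", "C#0900"),  # hypothetical second superconcept of book
]

# Hypothetical properties defined directly on individual concepts.
PROPERTIES = {
    "C#0001": {"has-name"},
    "C#0900": {"has-publisher"},
}

def all_superconcepts(concept_id):
    """Collect every superconcept reachable through the subsumption table."""
    found, stack = set(), [concept_id]
    while stack:
        current = stack.pop()
        for child, parent in SUBSUMPTION:
            if child == current and parent not in found:
                found.add(parent)
                stack.append(parent)
    return found

def inherited_properties(concept_id):
    """Union of the concept's own properties and those of all superconcepts."""
    props = set(PROPERTIES.get(concept_id, set()))
    for superconcept in all_superconcepts(concept_id):
        props |= PROPERTIES.get(superconcept, set())
    return props

print(sorted(inherited_properties("C#0004")))  # ['has-name', 'has-publisher']
```

The concept C#0004 receives properties along both of its superconcept paths, which is what the text calls multiple property inheritance.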
Various kinds of relations between different concepts other than the
subsumption relation can be defined in a generic relationship defining
table 241 indicated in FIG. 5. Each of the generic relationships
represents the kind of the relation. Basically there is no limit for the
number of such kinds of relations and it is possible to define an
arbitrary number of generic relationships.
The generic relationship defining table 241 defines principally "reading",
when the "relation" is expressed by a natural language. A column 244
indicates "reading" LR, when the relevant relation is read from left to
right, and a column 245 defines "reading" RL read from right to left
contrarily thereto. In the data structure of these columns it is allowed
to repeat data so that it is possible to define a plurality of readings.
Further, just as for the concept name table, it is possible to specify the
language for the reading. Consequently it is possible to express the same
data in a plurality of languages.
In the example indicated in FIG. 5 the relation "AUTHORSHIP" can be
expressed by a natural language (English in this case) as follows;
PERSON who is author of BOOK
PERSON who is the author of BOOK
PERSON who wrote BOOK
PERSON who has written BOOK
or
BOOK whose author is PERSON
BOOK by PERSON
BOOK from PERSON
BOOK of PERSON
This is the same also for the Japanese expression (not described).
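The way one relation yields several natural language phrases can be sketched as follows. The readings are those listed above; which concept sits on the left side in FIG. 5 is not stated here, so placing PERSON on the left is an assumption, as are the dictionary layout and the function name phrases.

```python
# One entry of a generic relationship defining table, with repeated readings.
# Assumption: PERSON is the left concept and BOOK the right concept.
READINGS = {
    "AUTHORSHIP": {
        "left": "PERSON",
        "right": "BOOK",
        "LR": ["who is author of", "who is the author of",
               "who wrote", "who has written"],
        "RL": ["whose author is", "by", "from", "of"],
    }
}

def phrases(relation, direction):
    """Generate every phrase for a relation read in the given direction."""
    entry = READINGS[relation]
    if direction == "LR":
        first, second = entry["left"], entry["right"]
    else:
        first, second = entry["right"], entry["left"]
    return [f"{first} {reading} {second}" for reading in entry[direction]]

print(phrases("AUTHORSHIP", "LR")[2])  # PERSON who wrote BOOK
print(phrases("AUTHORSHIP", "RL")[1])  # BOOK by PERSON
```

Adding a language tag to each reading, as the text allows, would let the same entry produce Japanese phrases as well.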
The existence of relations between different concepts is memorized
according to a relation table 251 indicated in FIG. 6. As explained above,
in the relation links, there are generic relationships and instance
relations. These are distinguished by a column 256 in the table 251. When
the value in a column CLASS is GR, it is a generic relationship and when
the value is INST, it is an instance relation. In the example indicated in
FIG. 6, the first record represents the generic relationship 203 in FIG. 2
and the second record indicates the instance relation 204 in FIG. 2.
Further a column C#L defines the concept on the left side and a column C#R
the concept on the right side. In this case, on which side a certain
concept is located, right or left, depends on the definition and as far as
there is no contradiction for the tables 241 and 251, it may be defined on
either side.
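A relation table that distinguishes generic relationships from instance relations by a CLASS column might look like the sketch below. Concept names stand in for the concept numbers, the column order is hypothetical, and the choice of PERSON as the left side is an assumption that merely has to agree with the defining table.

```python
# Rows of (relation, C#L, C#R, CLASS). Names replace concept numbers here.
RELATION_TABLE = [
    ("AUTHORSHIP", "PERSON", "BOOK", "GR"),         # generic relationship 203
    ("AUTHORSHIP", "NEWTON", "BOOK#0051", "INST"),  # instance relation 204
]

def rows_of_class(relation_class):
    """Return all rows whose CLASS column equals GR or INST."""
    return [row for row in RELATION_TABLE if row[3] == relation_class]

def instances_of(relation):
    """Return the (left, right) concept pairs registered as instances."""
    return [
        (left, right)
        for name, left, right, relation_class in RELATION_TABLE
        if name == relation and relation_class == "INST"
    ]

print(rows_of_class("GR"))         # [('AUTHORSHIP', 'PERSON', 'BOOK', 'GR')]
print(instances_of("AUTHORSHIP"))  # [('NEWTON', 'BOOK#0051')]
```

Keeping both kinds of link in one table makes it simple to count the concrete examples of a given generic relationship.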
Now the principle of the natural language understanding method based on the
knowledge expressing method described above will be explained.
First, the method for understanding nominal compounds, which is the most
important in the object-oriented knowledge base, will be explained. Here a
nominal compound means a noun phrase consisting of a series of nouns,
possibly including adjectives. For example, the following are examples of
nominal compounds;
______________________________________
supercomputer article                                  (1)
ElectronicsWeek article                                (2)
Japanese personal computer company                     (3)
American personal computer software packages           (4)
______________________________________
In this case understanding the significance means explicitly determining
the relations among these adjectives and nouns.
For example, although the nominal compounds (1) and (2) have the same
structure, they have different significances. They should be interpreted
as follows: (1) means "article whose subject is supercomputer" and (2)
means "article which is part of ElectronicsWeek". That is, it is
necessary to deduce that in (1) "article" and "supercomputer" are combined
through a relation "subject-is" and that in (2) "article" and
"ElectronicsWeek" are combined through a relation "is-part-of".
Understanding the significance is to extract automatically the following
structures, described in the frame form;
(ARTICLE (subject-is SUPERCOMPUTER)) (5)
(ARTICLE (is-part-of ElectronicsWeek)) (6)
By the method for natural language understanding according to this
invention, the significance is interpreted, as follows, on the basis of
the knowledge indicated in FIG. 7. At first, as knowledge making this
deduction possible, relations RS#0011 as generic relationships;
(ARTICLE (subject-is UNIVERSAL)) (7a)
(UNIVERSAL (is-subject-of ARTICLE)) (7b)
and relations RS#0012
(ARTICLE (is-part-of JOURNAL)) (8a)
(JOURNAL (has-part-of ARTICLE)) (8b)
should be defined. That is, it is necessary that "anything can be a subject
of an article" and "the article is a part of a journal (the article is
published in a part of a journal)" are memorized as knowledge.
Further, as a subsumption,
(SUPERCOMPUTER (is-a THING)) (9)
(THING (is-a UNIVERSAL)) (10)
(ElectronicsWeek (is-a JOURNAL)) (11)
should be memorized.
By using these memories it is possible to interpret "supercomputer
article". At first, following the subsumption relation towards the higher
rank from SUPERCOMPUTER, it is possible to understand;
(SUPERCOMPUTER (is-a UNIVERSAL))
(UNIVERSAL (is-subject-of ARTICLE)).
As the result, by the property inheritance, it is deduced that there can be
relations;
(SUPERCOMPUTER (is-subject-of ARTICLE))
or
(ARTICLE (subject-is SUPERCOMPUTER)).
That is, it is deduced that "a supercomputer can be a subject of an
article". In this case, since there is no other interpretation, the
interpretation;
"article whose subject is supercomputer"
is adopted.
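The deduction just described, in which a pair of nouns inherits candidate relations through subsumption, can be sketched as follows. The data are the frames (7a), (8a), (9), (10) and (11); the table layouts and the names lineage and candidate_relations are hypothetical, and every concept is assumed to fall under the root UNIVERSAL, as stated for the node 201.

```python
# Subsumption frames (9), (10) and (11), stored as child -> parent.
IS_A = {
    "SUPERCOMPUTER": "THING",
    "THING": "UNIVERSAL",
    "ElectronicsWeek": "JOURNAL",
}

# Generic relationships (7a) and (8a): (number, head, reading, modifier).
GENERIC = [
    ("RS#0011", "ARTICLE", "subject-is", "UNIVERSAL"),
    ("RS#0012", "ARTICLE", "is-part-of", "JOURNAL"),
]

def lineage(concept):
    """The concept itself followed by its superconcepts up to UNIVERSAL."""
    chain = [concept]
    while chain[-1] in IS_A:
        chain.append(IS_A[chain[-1]])
    if chain[-1] != "UNIVERSAL":
        chain.append("UNIVERSAL")  # the root node subsumes every concept
    return chain

def candidate_relations(modifier, head):
    """Generic relationships the noun pair inherits through subsumption."""
    heads, modifiers = lineage(head), lineage(modifier)
    return [
        (number, reading)
        for number, rel_head, reading, rel_modifier in GENERIC
        if rel_head in heads and rel_modifier in modifiers
    ]

print(candidate_relations("SUPERCOMPUTER", "ARTICLE"))
# [('RS#0011', 'subject-is')]
```

For "supercomputer article" only one candidate survives, so no further selection is needed; a modifier that is also a journal would match both relations.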
The interpretation of the significance of the nominal compound (2) is a
little more complicated.
In this case, since
(ElectronicsWeek (is-a JOURNAL))
(JOURNAL (has-part-of ARTICLE))
and at the same time
(ElectronicsWeek (is-a UNIVERSAL))
(UNIVERSAL (is-subject-of ARTICLE))
both hold, as can be clearly seen from FIG. 7, it is deduced that there
can be two relations:
(ElectronicsWeek (has-part-of ARTICLE))
and
(ElectronicsWeek (is-subject-of ARTICLE)).
That is, it is understood that there can be two interpretations:
"article which is part of ElectronicsWeek"
and
"article whose subject is ElectronicsWeek".
Where there are a plurality of candidate interpretations, the method
according to this invention uses a heuristic by which the likelihood of
each interpretation is evaluated according to which interpretation has
more concrete examples.
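A minimal sketch of this heuristic follows, assuming a knowledge base in which the only registered instance relation is the one shown in FIG. 7 (ARTICLE #0101, published in ElectronicsWeek). The names `INSTANCES`, `is_under`, `count_instances` and `choose` are hypothetical, and the way ties are broken (the first candidate wins) is an assumption, since the text does not specify it.

```python
# Subsumption links needed for the example; ARTICLE #0101 is an instance of ARTICLE.
IS_A = {"ARTICLE #0101": "ARTICLE", "ElectronicsWeek": "JOURNAL"}

# Registered instance relations, each tagged with the generic relationship
# it instantiates: (generic id, head concept, relation, modifier concept).
INSTANCES = [
    ("RS#0012", "ARTICLE #0101", "is-part-of", "ElectronicsWeek"),
]


def is_under(concept, ancestor):
    """True if `concept` is `ancestor` itself or one of its subconcepts."""
    while concept != ancestor and concept in IS_A:
        concept = IS_A[concept]
    return concept == ancestor


def count_instances(generic_id, head, modifier):
    """Count instance relations of one generic relationship between the
    subconcepts of `head` and of `modifier`, including the concepts themselves."""
    return sum(1 for gid, h, _rel, m in INSTANCES
               if gid == generic_id and is_under(h, head) and is_under(m, modifier))


def choose(candidates, head, modifier):
    """Pick the candidate generic relationship with the most concrete examples."""
    return max(candidates, key=lambda gid: count_instances(gid, head, modifier))


print(choose(["RS#0011", "RS#0012"], "ARTICLE", "ElectronicsWeek"))  # RS#0012
```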
Concretely speaking, in the preceding example, the numbers of registered
instance relations for the relation RS#0011 and for the relation RS#0012
are counted, respectively, while querying the subconcepts of the concept
"ARTICLE" and the concept "ElectronicsWeek", including the concepts
themselves. In the example indicated in FIG. 7, no concrete relation is
registered for the former and one for the latter. That is, there is no
article whose subject is "ElectronicsWeek", but there is one article,
ARTICLE #0101, which is published in "ElectronicsWeek". Consequently the
relation RS#0012 (is-part-of) is selected as the more suitable
interpretation. That is, the compound is interpreted as
"article which is part of ElectronicsWeek".
As explained above, the interpretation of nominal compounds is based on
deducing the relation between 2 nouns. That is, the basic processing for
interpreting a nominal compound consisting of more than 3 words consists
of extracting the relation between 2 words as described above. This will
be explained below, taking the nominal compound (3) as an example.
First, the concept corresponding to each of the words is selected, while
examining whether or not the words constituting the nominal compound
include concept names consisting of a composite word. That is, the
words are cut off one after another from the beginning, and each resulting
series is checked against the concept name table to see whether it is
registered. In the case of the nominal compound (3), partial series of
words such as
______________________________________
"Japanese"
"Japanese personal"
"Japanese personal computer"
"Japanese personal computer company"
"personal"
"personal computer"
"personal computer company"
"computer company"
"company"
______________________________________
are cut off and it is examined whether or not each of them is a concept
name.
Here, the method according to this invention is characterized in that an
adjective is registered as a synonym of the concept whose name is the
corresponding noun form, and the adjective is treated as the same concept
as the noun. For example, the adjective "Japanese" is registered as a
synonym of the concept "JAPAN" or the concept "Japanese people" and is
treated as the same concept.
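The cutting-off and lookup steps can be sketched as follows. The contents of `CONCEPT_NAMES` are an assumed, hypothetical concept name table: it contains only the entries the text mentions (the adjective "Japanese" as a synonym of two concepts, plus "personal computer" and "company" as assumed concept names), and the function names are illustrative.

```python
# Hypothetical concept name table: lower-cased surface form -> concept names.
# The adjective "Japanese" is registered as a synonym of noun concepts.
CONCEPT_NAMES = {
    "japanese": ["JAPAN", "JAPANESE-PEOPLE"],
    "personal computer": ["PERSONAL-COMPUTER"],
    "company": ["COMPANY"],
}


def partial_series(words):
    """All contiguous word series, cut off starting from each position in turn."""
    return [" ".join(words[i:j])
            for i in range(len(words))
            for j in range(i + 1, len(words) + 1)]


def registered_concepts(phrase):
    """Map each partial series that is a registered name to its concepts."""
    found = {}
    for series in partial_series(phrase.split()):
        concepts = CONCEPT_NAMES.get(series.lower())
        if concepts:
            found[series] = concepts
    return found


for series, concepts in registered_concepts(
        "Japanese personal computer company").items():
    print(series, "->", concepts)
```

With this assumed table, only "Japanese", "personal computer" and "company" are found to be registered, so the compound reduces to three concepts.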
Consequently, supposing that "personal computer" is defined as the concept
name "PERSONAL-COMPUTER", the nominal compound (3) is first recognized
as
(JAPAN PERSONAL-COMPUTER COMPANY)
(JAPANESE-PEOPLE PERSONAL-COMPUTER COMPANY).
However, to simplify the following explanation, the latter is omitted,
since it is ultimately found to be a meaningless interpretation.
That is, at this step, it is understood that the nominal compound is a
combination of substantially three concepts. This can be expressed using
parentheses as follows:
(Japanese (personal computer) company) (12)
The next processing step is therefore to examine how these three concepts
are related to each other. In this case it can be seen that there are
the following two possibilities:
(Japanese ((personal computer) company)) (13)
((Japanese (personal computer)) company) (14)
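The two possibilities (13) and (14) are the two ways of grouping three concepts pairwise. As an illustration only, assuming that candidate structures are obtained by binary bracketing (the text does not state how they are enumerated), they can be generated as follows; `bracketings` is a hypothetical helper.

```python
def bracketings(items):
    """Enumerate every way of grouping a sequence into nested pairs."""
    if len(items) == 1:
        return [items[0]]
    result = []
    for split in range(1, len(items)):
        for left in bracketings(items[:split]):
            for right in bracketings(items[split:]):
                result.append((left, right))
    return result


for grouping in bracketings(["Japanese", "personal computer", "company"]):
    print(grouping)
# ('Japanese', ('personal computer', 'company'))    corresponds to (13)
# (('Japanese', 'personal computer'), 'company')    corresponds to (14)
```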
First, in the case of (13), it is necessary to deduce two relations,
one connecting COMPANY and PERSONAL-COMPUTER and one connecting
COMPANY and JAPAN. In this case, by the method described previously for
deducing the relation, the following relations
______________________________________
(COMPANY
  (produces PERSONAL-COMPUTER)
  (is-located-in JAPAN)) (15a)
(COMPANY
  (has-developed PERSONAL-COMPUTER)
  (is-located-in JAPAN)) (15b)
______________________________________
are extracted. Here, in order to evaluate the priority (likelihood) of a
plurality of interpretations, the total numbers of concrete examples of
the two relations between COMPANY and PERSONAL-COMPUTER and between
COMPANY and JAPAN (concrete relations defined in the subconcepts) are
counted for (15a) and (15b), respectively, so as to obtain weights for
these relations. To obtain an evaluation covering all the relations of an
interpretation, the weight is normalized by dividing the number of
instance relations (the weight of the relations) by the number of generic
relationships. For the examples (15a) and (15b), the number of generic
relationships is 2.
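The normalization can be illustrated with a short sketch. The instance counts below are invented purely for illustration and are not taken from FIG. 7 or from the text; `normalized_weight` is a hypothetical name, and the sketch assumes the reading that the total count is divided by the number of generic relationships in the interpretation.

```python
def normalized_weight(instance_counts):
    """Total number of instance relations divided by the number of
    generic relationships that make up the interpretation."""
    return sum(instance_counts.values()) / len(instance_counts)


# Hypothetical counts of registered instance relations per generic relationship.
counts_15a = {"produces": 3, "is-located-in": 5}
counts_15b = {"has-developed": 1, "is-located-in": 5}

print(normalized_weight(counts_15a))  # 4.0
print(normalized_weight(counts_15b))  # 3.0
```

Under these assumed counts, interpretation (15a) would be preferred over (15b). Dividing by the number of generic relationships makes interpretations built from different numbers of relations comparable.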
Then, the relations are extracted for the second possibility (14). In this
case two relations, one between COMPANY and PERSONAL-COMPUTER and one
between PERSONAL-COMPUTER and JAPAN, should be obtained. For the former,
two relations,
(COMPANY (produces PERSONAL-COMPUTER)) (16a)
and (COMPANY (has-developed PERSONAL-COMPUTER)) (16b)
can be found (on the presumed knowledge base). In the same way, for the
latter, two relations
______________________________________
(PERSONAL-COMPUTER
  (is-produced-by
    (COMPANY (is-located-in JAPAN)))) (17a)
and (PERSONAL-COMPUTER
  (was-developed-by
    (COMPANY (is-located-in JAPAN)))) (17b)
______________________________________
are found. Here, since there is no relation directly connecting
PERSONAL-COMPUTER and JAPAN, the concept COMPANY, which relates these
two indirectly, is found automatically.
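Finding such an intermediate concept can be sketched as a search over the generic relationships. This is an assumption-laden illustration: `RELATIONS` lists only the relations implied by (17a) and (17b), `connect` is a hypothetical function, and limiting the search to a single intermediate concept is a simplification not stated in the text.

```python
# Generic relationships implied by (17a) and (17b): (source, relation, target).
RELATIONS = [
    ("PERSONAL-COMPUTER", "is-produced-by", "COMPANY"),
    ("PERSONAL-COMPUTER", "was-developed-by", "COMPANY"),
    ("COMPANY", "is-located-in", "JAPAN"),
]


def connect(source, target):
    """Return relation paths from source to target: direct relations if any
    exist, otherwise paths through exactly one intermediate concept."""
    direct = [[(s, r, t)] for s, r, t in RELATIONS if s == source and t == target]
    if direct:
        return direct
    paths = []
    for s1, r1, middle in RELATIONS:
        if s1 != source:
            continue
        for s2, r2, t2 in RELATIONS:
            if s2 == middle and t2 == target:
                paths.append([(s1, r1, middle), (s2, r2, t2)])
    return paths


for path in connect("PERSONAL-COMPUTER", "JAPAN"):
    print(path)
```

Both paths found pass through COMPANY, matching the structures (17a) and (17b).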
By the method according to this invention, when no relation directly
relating two concepts is found, as stated abo