Claims
We claim:
1. Directory assistance apparatus for a telephone system comprising: a
voice processing unit having at least one lexicon of lexemes potentially
recognizable by the unit and data predetermined for each of said lexemes,
means for issuing messages to a caller making a directory assistance call
to prompt the caller to utter one of said lexemes; means for detecting an
identifier for the call source from whence a directory assistance call was
received; means responsive to the identifier detected, and to said data,
for computing a probability index for each lexeme representing the
likelihood of that particular one of said lexemes being that uttered by
the caller; and speech recognition means for recognizing, on the basis of
the acoustics of the caller's utterance and the probability indexes, a
lexeme corresponding to that uttered by the caller.
2. Apparatus as claimed in claim 1, wherein the detecting means serves to
detect, as said identifier, at least a portion of a calling number from
whence the directory assistance call was made.
3. Apparatus as claimed in claim 1, further comprising means for
transmitting to the caller a message giving a directory number determined
using the recognized lexeme.
4. Apparatus as claimed in claim 1, wherein the lexemes comprise names of
localities within said predetermined area; the data comprise the size of
each locality and the distance between each pair of localities; and the
means for computing the probability index computes for each locality, the
likelihood of the caller requesting that locality based upon the distance
between that locality and the caller's locality and upon the size of that
locality for which the probability index is being computed.
5. Apparatus as claimed in claim 4, wherein the size of each locality is
determined by the number of local active directory numbers in the
locality.
6. Apparatus as claimed in claim 1, wherein the voice processing unit has
one or more additional lexicons, each lexicon comprising a group of
lexemes having a common characteristic, and the speech recognition means
accesses the lexicons selectively in dependence upon one or more messages
previously issued to the caller.
7. Apparatus as claimed in claim 1, wherein the voice processing unit has
one or more additional lexicons, each lexicon comprising a group of
lexemes having a common characteristic, the computing means computes said
index for the lexicons selectively depending upon one or more messages
previously played to the caller and the speech recognition means accesses
the lexicons selectively in dependence upon one or more messages
previously issued to the caller.
8. Apparatus as claimed in claim 1, wherein the lexemes comprise names of
businesses and the data comprise the nature of the businesses.
9. Directory assistance apparatus for a telephone system, including a voice
processing unit having a lexicon of lexemes potentially recognizable by
the unit, said lexemes including lexemes corresponding to localities in a
predetermined area served by the directory assistance apparatus and
lexemes corresponding to localities not in the predetermined area, the
unit including:
means for issuing to a directory assistance caller a message inviting the
caller to utter the name of a locality;
means for recognizing one of said lexemes from the utterance;
means for determining whether or not the recognized lexeme is one of said
lexemes corresponding to localities not in the predetermined area served
by the directory assistance apparatus; and
means for playing a message to the caller inviting the caller to direct the
directory assistance request to a directory assistance apparatus for an
alternative area including the locality corresponding to the recognized
lexeme in the event that the recognized lexeme is not in the predetermined
area.
10. A method of at least partially automating directory assistance in a
telephone system in which directory assistance apparatus comprises a voice
processing unit having a lexicon of lexemes potentially recognizable by
the unit and data predetermined for each lexeme, the method comprising the
steps of:
issuing messages to a caller making a directory assistance call to prompt
the caller to utter one of said lexemes;
detecting an identifier for a call source from whence the directory
assistance call was received;
computing, in response to the identifier and said data, a probability index
for each lexeme representing the likelihood that the lexeme will be that
uttered by the caller, and employing speech recognition means to
recognize, on the basis of the acoustics of the caller's utterance and the
probability index, a lexeme corresponding to that uttered by the caller.
11. A method as claimed in claim 10, wherein the identifier comprises at
least a portion of a calling number of the call source.
12. A method as claimed in claim 10, further comprising the step of
transmitting a message to the caller giving a directory number determined
using the recognized lexeme.
13. A method as claimed in claim 10, wherein the lexemes comprise names of
localities within said area; the data comprise the size of each locality
and the distance between each pair of localities; and the step of
computing the probability index computes, for each locality, the
likelihood of the caller requesting that locality based upon the distance
between that locality and the caller's locality and upon the size of that
locality for which the probability index is being computed.
14. A method as claimed in claim 13, wherein the size of a locality is
determined by the number of active directory numbers in the locality.
15. A method as claimed in claim 10, wherein the voice processing unit has
one or more additional lexicons, each lexicon comprising a group of
lexemes having a common characteristic and the speech recognition means is
employed to access the plurality of lexicons selectively in dependence
upon one or more messages previously issued to the caller.
16. A method as claimed in claim 10, wherein the voice processing unit has
one or more additional lexicons, each lexicon comprising a group of
lexemes having a common characteristic, the computing means computes said
index for lexemes in the different lexicons selectively, in dependence
upon one or more messages previously issued to the caller and the speech
recognition means is employed to access the plurality of lexicons
selectively in dependence upon one or more messages previously issued to
the caller.
17. A method as claimed in claim 10, wherein the lexemes comprise names of
businesses and the data comprise the nature of the business.
18. A method of at least partially automating directory assistance in a
telephone system having directory assistance apparatus serving a
predetermined area and comprising a voice processing unit having a lexicon
of lexemes potentially recognizable by the unit, said lexemes including
lexemes corresponding to localities in a predetermined area served by the
directory assistance apparatus and lexemes corresponding to localities not
in the predetermined area, the method including the steps of:
using the voice processing unit to issue to a directory assistance caller a
message inviting the caller to utter a name of a locality;
recognizing one of said lexemes in the utterance;
determining whether or not the recognized lexeme is one of said lexemes
corresponding to localities not in said predetermined area served by the
apparatus; and
playing a message to the caller inviting the caller to direct the directory
assistance request to a different directory assistance area in the event
the recognized lexeme is not in the predetermined area.
19. Directory assistance apparatus, for a telephone system, comprising: a
voice processing unit having at least one lexicon of lexemes potentially
recognizable by the unit and data grouping the lexemes into predetermined
subsets, each subset comprising lexemes preselected to give greater
recognition accuracy for calls from a particular source; means for issuing
messages to a caller making a directory assistance call to prompt the
caller to utter one of said lexemes; means for detecting an identifier for
the call source from whence the directory assistance call was received;
means responsive to the detected identifier for selecting one of said
predetermined subsets; and speech recognition means limited to the
selected subset for recognizing, on the basis of the acoustics of the
caller's utterance, a lexeme from said subset corresponding to that
uttered by the caller.
20. A method of at least partially automating directory assistance in a
telephone system in which directory assistance apparatus comprises a voice
processing unit having a lexicon of lexemes potentially recognizable by
the unit, and data grouping the lexemes into predetermined subsets, each
subset preselected as giving greater recognition accuracy for calls from a
particular source, the method comprising the steps of:
issuing messages to a caller making a directory assistance call to
prompt the caller to utter one or more utterances;
detecting an identifier for a call source from whence the directory
assistance call was received;
selecting on the basis of the identifier one of said predetermined subsets;
and
employing speech recognition means to recognize, from the selected subset
and on the basis of the acoustics of the caller's utterance, a lexeme
corresponding to that uttered by the caller.
Description
BACKGROUND OF THE INVENTION
1. Technical Field
The invention relates to a method and apparatus for providing directory
assistance, at least partially automatically, to telephone subscribers.
2. Background Art
In known telephone systems, a telephone subscriber requiring directory
assistance will dial a predetermined telephone number. In North America,
the number will typically be 411 or 555 1212. When a customer makes such a
directory assistance call, the switch routes the call to the first
available Directory Assistance (DA) operator. When the call arrives at the
operator's position, an initial search screen at the operator's terminal
will be updated with information supplied by the switch, Directory
Assistance Software (DAS), and the Operator Position Controller (TPC). The
switch supplies the calling number, the DAS supplies the default locality
and zone, and the TPC supplies the default language indicator. While the
initial search screen is being updated, the switch will connect the
subscriber to the operator.
When the operator hears the "customer-connected" tone, the operator will
proceed to complete the call. The operator will prompt for locality and
listing name before searching the database. When a unique listing name is
found, the operator will release the customer to the Audio Response Unit
(ARU), which will play the number to the subscriber.
Telephone companies handle billions of directory assistance calls per year,
so it is desirable to reduce labour costs by minimizing the time for which
a directory assistance operator is involved. As described in U.S. Pat. No.
5,014,303 (Velius) issued May 7, 1991, the entire disclosure of which is
incorporated herein by reference, a reduction can be achieved by directing
each directory assistance call initially to one of a plurality of speech
processing systems which would elicit the initial directory assistance
request from the subscriber. The speech processing system would compress
the subscriber's spoken request and store it until an operator position
became available, whereupon the speech processing system would replay the
request to the operator. The compression would allow the request to be
replayed to the operator in less time than the subscriber took to utter
it. Velius mentions that automatic speech recognition also could be
employed to reduce the operator work time. In a paper entitled
"Multiple-Level Evaluation of Speech Recognition Systems", the entire
disclosure of which is incorporated herein by reference, John F. Pitrelli
et al. disclose a partially automated directory assistance system in which
speech recognition is used to extract a target word, for example a
locality name, from a longer utterance. The system strips off everything
around the target word so that only the target word is played back to the
operator. The operator initiates further action.
U.S. Pat. No. 4,797,910 (Daudelin) issued Jan. 10, 1989, the entire
disclosure of which is incorporated herein by reference, discloses a
method and apparatus in which operator involvement is reduced by means of
a speech recognition system which recognizes spoken commands to determine
the class of call and hence the operator to which it should be directed.
The savings to be achieved by use of Daudelin's speech recognition system
are relatively limited, however, since it is not capable of recognizing
anything more than a few commands, such as "collect", "calling card",
"operator", and so on.
These known systems can reduce the time spent by a directory assistance
operator in dealing with a directory assistance call, but only to a very
limited extent.
SUMMARY OF THE INVENTION
The present invention seeks to eliminate, or at least mitigate, the
disadvantages of the prior art and has for its object to provide a new
and improved directory assistance apparatus and method capable of
reducing, or even eliminating, operator involvement in directory
assistance calls.
To this end, according to one aspect of the present invention, there is
provided directory assistance apparatus for use in a telephone system,
comprising a voice processing unit having at least one lexicon of lexemes
potentially recognizable by the unit and data representing a predetermined
relationship between each of said lexemes and each of a plurality of call
sources in an area served by the directory assistance apparatus. The unit
also has means for issuing messages to a caller making a directory
assistance call to prompt the caller to utter a required one of the
lexemes, and means for detecting an identifier, for example a portion of a
calling number, for the call source from whence the directory assistance
call was received, means responsive to the identifier detected and to the
data for computing a probability index for each lexeme representing the
likelihood of that particular one of said lexemes being that uttered by
the caller, and speech recognition means for selecting from the lexicon,
on the basis of the acoustics of the caller's utterance and the
probability index, a lexeme corresponding to that uttered by the caller.
A lexeme is a basic lexical unit of a language and comprises one or several
words, the elements of which do not separately convey the meaning of the
whole.
Preferably, the voice processing unit has several lexicons, each comprising
a group of lexemes having a common characteristic, for example name,
language, geographical area, and the speech recognition means accesses the
lexicons selectively in dependence upon a previous user prompt.
Computation of the probability index may take account of a priori call
distribution. A priori call distribution weights the speech recognition
decision to take account of a predetermined likelihood of a particular
locality containing a particular destination being sought by a caller. The
apparatus may use the caller's number to identify the locality from which
the caller is making the call.
The probability index might bias the selection in favour of, for example,
addresses in the same geographical area, such as the same locality.
In preferred embodiments of the present invention the voice processing unit
elicits a series of utterances by a subscriber and, in dependence upon a
listing name being recognized, initiates automatic accessing of a database
to determine a corresponding telephone number.
The apparatus may be arranged to transfer or "deflect" a directory
assistance call to another directory assistance apparatus when it
recognizes that the subscriber has uttered the name of a locality which is
outside its own directory area. In such a situation, the above-mentioned
predetermined relationship between the corresponding lexeme and the call
source is that the lexeme relates to a locality which is not served by the
apparatus.
Thus, embodiments of the invention may comprise means for prompting a
subscriber to specify a locality, means for detecting a place name uttered
in response, means for comparing the uttered place name with the lexicon
and in dependence upon the results of the comparison selecting a message
and playing the message to the subscriber. If the place name has been
identified precisely as a locality name served by the apparatus, the
message may prompt the caller for more information. Alternatively, if the
locality name is not in the area served by the apparatus, the message
could be to the effect that the locality name spoken is in a different
calling or directory area and include an offer to give the subscriber the
directory assistance number to call. In that case, the speech recognition
system would be capable of detecting a positive answer and supplying the
appropriate area code. Another variation is that the customer could be
asked if the call should be transferred to the directory assistance
operator in the appropriate area. If the subscriber answered in the
affirmative, the system would initiate the call transfer.
The caller's responses to the speech recognition system may be recorded. If
the system disposed of the call entirely without the assistance of the
operator, the recording could be erased immediately. On the other hand, if
the call cannot be handled entirely automatically, at the point at which
the call is handed over to the operator, the recording of selected
segments of the subscriber's utterances could be played back to the
operator. Of course, the recording could be compressed using the prior art
techniques mentioned above.
According to a second aspect of the invention, a method of at least
partially automating directory assistance in a telephone system using
directory assistance apparatus comprising a voice processing unit having a
lexicon of lexemes potentially recognizable by the unit and data
representing a predetermined relationship between each of the lexemes and
a calling number in an area served by the automated directory assistance
apparatus, comprises the steps of:
issuing messages to a caller making a directory assistance call to prompt
the caller to utter one or more utterances, detecting an identifier, such
as a calling number originating a directory assistance call, computing, in
response to the identifier and said data, a probability index for each
lexeme representing the likelihood that the lexeme will be selected, and
employing speech recognition means to recognize, on the basis of the
acoustics of the caller's utterance and the probability index, a lexeme
corresponding to that uttered by the caller.
Preferably, the voice processing unit has several lexicons, each having
lexemes grouped according to certain characteristics, e.g. names,
localities or languages, and the method includes the steps of issuing a
series of messages and limiting the recognition process to a different one
of the lexicons according to the most recent message.
The various objects, features, aspects and advantages of the present
invention will become more apparent from the following detailed
description, in conjunction with the accompanying drawings, of a preferred
embodiment of the invention.
BRIEF DESCRIPTION OF DRAWINGS
FIG. 1 is a general block diagram of a known telecommunications system;
FIG. 2 is a simplified block diagram of parts of a telecommunications
system employing an embodiment of the present invention;
FIGS. 3A and 3B are a general flow chart illustrating the processing of a
directory assistance call in the system of FIG. 2;
FIG. 4 is a chart illustrating the frequency with which certain localities
are requested by callers in the same or other localities; and
FIG. 5 is a graph of call distribution according to distance and normalized
for population of the called locality.
BEST MODE FOR CARRYING OUT THE INVENTION
FIG. 1 is a block diagram of a telecommunications system as described in
U.S. Pat. No. 4,797,910. As described therein, block 1 is a
telecommunications switch operating under stored program control. Control
10 is a distributed control system operating under the control of a group
of data and call processing programs to control various parts of the
switch. Control 10 communicates via link 11 with voice and data switching
network 12 capable of switching voice and/or data between inputs connected
to the switching network. One such input is trunk 31 which connects a
calling terminal 40 to the network 12 by way of interconnecting network
30. Another calling terminal 46 is shown connected in like manner by
interconnecting network 32. A third calling terminal is connected to the
network 12 by customer line 44. An automated voice processing unit 14 is
connected to the switching network 12 and controlled by control 10. The
automated voice processing unit receives input signals which may be either
voice or dual tone multifrequency (DTMF) signals and is capable of
determining whether or not the DTMF signals are allowable DTMF signals and
initiating action appropriately. In the system described in U.S. Pat. No.
4,797,910, the voice processing unit is capable of distinguishing between
the various elements of a predetermined list of spoken responses. The
voice processing unit 14 also is capable of generating tones and voice
messages to prompt a customer to speak or key information into the system
for subsequent recognition by the speech recognition system. In addition,
the voice processing unit 14 is capable of recording a short customer
response for subsequent playback to a human operator. The voice processing
unit 14 generates an output data signal, representing the result of the
voice processing. This output data signal is sent to control 10 and used
as an input to the program for controlling establishment of connections in
switching network 12 and for generating displays for operator position 24
coupled to the network 12 via line 26. In order to set up operator
assistance calls, switch 1 uses two types of database system. Local
database 16 is directly accessible by control 10 via switching network 12.
Remote database system 20 is accessible to control 10 via switching
network 12 and interconnecting data network 18. A remote database system
is typically used for storing data that is shared by many switches. For
example, a remote database system might store data pertaining to customers
for a region. The particular remote database system 20 that is accessed
via data network 18 would be selected to be the remote database associated
with the region of the called terminal. Interconnecting data network 18
can be any well known data network and specifically could be a common
channel signalling system such as the international standard
telecommunications signalling system CCS 7.
A transaction recorder 22, connected to control 10, is used for recording
data about calls for subsequent processing. Typically, such data is
billing data. The transaction recorder 22 is also used for recording
traffic data in order to engineer additions properly and in order to
control traffic dynamically.
The present invention will be employed in a telecommunications system which
is generally similar to that described in U.S. Pat. No. 4,797,910. FIG. 2
is a simplified block diagram of parts of the system involved in a
directory assistance call, corresponding parts having the same reference
numbers in both FIG. 1 and FIG. 2. As shown in FIG. 2, block 1 represents
a telecommunications switch operating under stored program control
provided by a distributed control system operating under the control of a
group of data and call processing programs to control various parts of the
switch. The switch 1 comprises a voice and data switching network 12
capable of switching voice and/or data between inputs and outputs of the
switching network. As an example, FIG. 1 shows a trunk circuit 31
connected to an input of the network 12. A caller's station apparatus or
terminal 40 is connected to the trunk circuit 31 by way of network
routing/switching circuitry 30 and an end office 33. The directory number
of the calling terminal, identified, for example, by automatic number
identification, is transmitted from the end office switch 33 connecting
the calling terminal 40 to switch 1.
An operator position controller 23 connects a plurality of operator
positions 24 to the switch network 12. Each operator position 24 comprises
a terminal which is used by an operator to control operator assistance
calls. Data displays for the terminal are generated by operator position
controller 23. A data/voice link 27 connects an automated voice processing
unit 14A to the switching network 12. The automated voice processing unit
14A will be similar to that described in U.S. Pat. No. 4,797,910 in that
it is capable of generating tones and voice messages to prompt a customer
to speak or key dual tone multifrequency (DTMF) information into the
system, determining whether or not the DTMF signals are allowable DTMF
signals, initiating action appropriately and applying speech recognition
to spoken inputs. In addition, the voice processing unit 14A is capable of
recording a short customer response for subsequent playback to a human
operator.
In U.S. Pat. No. 4,797,910, the voice processing unit 14 is capable merely
of distinguishing between various elements of a very limited list of
spoken responses to determine the class of the call and the operator to
which it should be directed. By contrast, the voice processing unit 14A of
FIG. 2 is augmented with software enabling it to handle a major part, and
in some cases all, of a directory assistance call.
In order to provide the enhanced capabilities needed to automate directory
assistance calls, at least partially, the voice processing unit 14A will
employ flexible vocabulary recognition technology and a priori
probabilities. For details of a suitable flexible vocabulary recognition
system the reader is directed to Canadian patent application number
2,069,675 filed May 27, 1992 and laid open to the public Apr. 9, 1993, the
entire disclosure of which is incorporated herein by reference.
A priori probability uses the calling number to determine a probability
index which will be used to weight the speech recognition result. The
manner in which the a priori probabilities are determined will be
described in more detail later with reference to FIGS. 4 and 5.
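By way of illustration only, the weighting of a recognition result by a
probability index can be pictured as a rescoring step in the log domain. The
following minimal sketch assumes that the recognizer supplies an acoustic
log-likelihood for each lexeme and that the probability index is a normalized
prior. The function name `rescore`, the `prior_weight` parameter and the
sample scores are hypothetical and are not taken from this disclosure or from
the cited Canadian application.

```python
import math

def rescore(acoustic_loglik, prior, prior_weight=1.0, floor=1e-9):
    """Combine acoustic evidence with an a priori probability index.

    acoustic_loglik: dict lexeme -> acoustic log-likelihood (assumed input)
    prior:           dict lexeme -> a priori probability for this caller
    prior_weight:    assumed tuning exponent; 0.0 ignores the prior
    """
    scores = {}
    for lexeme, loglik in acoustic_loglik.items():
        p = max(prior.get(lexeme, 0.0), floor)  # floor avoids log(0)
        scores[lexeme] = loglik + prior_weight * math.log(p)
    best = max(scores, key=scores.get)
    return best, scores

# Hypothetical example: two acoustically similar locality names.
acoustic = {"Kingston": -10.2, "Kingsville": -10.0}
prior = {"Kingston": 0.30, "Kingsville": 0.02}
best, _ = rescore(acoustic, prior)
```

In this example the acoustics alone marginally favour "Kingsville", but the
prior for the assumed calling number shifts the decision to "Kingston".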
While it would be possible to use a single lexicon to hold all of the
lexemes which it is capable of recognizing, the voice processing unit 14A
has several lexicons, for example, a language lexicon, a locality lexicon,
a YES/NO lexicon and a business name lexicon. Hence, each lexicon
comprises a group of lexemes having common characteristics and the voice
processing unit 14A will use a different lexicon depending upon the state
of progress of the call, particularly the prompt it has just issued to the
caller.
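The selection of a lexicon according to the prompt just issued might be
sketched as a simple lookup. In the following sketch the prompt names, the
lexicon contents and the exact-match stand-in for the recognizer are all
hypothetical; a real recognizer would score acoustics rather than compare
strings.

```python
from enum import Enum

class Prompt(Enum):
    LANGUAGE = "language"
    LOCALITY = "locality"
    BUSINESS_YES_NO = "yes_no"
    BUSINESS_NAME = "business_name"

# Hypothetical lexicon contents, one lexicon per group of related lexemes.
LEXICONS = {
    "language": {"english", "french"},
    "locality": {"montreal", "laval", "ottawa"},
    "yes_no": {"yes", "no"},
    "business_name": {"acme hardware"},
}

def active_lexicon(last_prompt):
    """Return the only lexicon consulted after the given prompt."""
    return LEXICONS[last_prompt.value]

def recognize(utterance_text, last_prompt):
    """Stand-in recognizer: exact match restricted to the active lexicon."""
    word = utterance_text.strip().lower()
    return word if word in active_lexicon(last_prompt) else None
```

Restricting the search in this way means that "yes" is a valid result after
the business-listing prompt but is never considered after the locality prompt.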
As shown in FIGS. 3A and 3B, in embodiments of the present invention, when
the voice processing unit 14A receives a directory assistance call, it
determines in step 301 whether or not the number of the calling party is
known. If it is not, the voice processing unit immediately redirects the
call for handling by a human operator in step 302. If the calling number
is known, in step 303 the voice processing unit 14A issues a bilingual
greeting message to prompt the caller for the preferred language and
compares the reply with a lexicon of languages. At the same time, the
message may let the caller know that the service is automated, which may
help to set the caller in the right frame of mind. Identification of
language choice at the outset determines the language to be used
throughout the subsequent process, eliminating the need for bilingual
prompts throughout the discourse and allowing the use of a less complex
speech recognition system. If no supported language name is uttered, or
the answer is unrecognizable, the voice processing unit 14A hands off the
call to a human operator in step 304 and plays back to the operator
whatever response the caller made in answer to the prompt for language
selection. It will be appreciated that the voice processing unit 14A
records the caller's utterances for subsequent playback to the operator,
as required.
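Steps 301 to 304 described above amount to two early exits before automated
handling proceeds. A minimal sketch follows, assuming that the calling number
is presented as a string or `None` and that language recognition has already
produced a language name or `None`; the function name and the return values
are hypothetical.

```python
SUPPORTED_LANGUAGES = {"english", "french"}  # languages of the embodiment

def handle_call_start(calling_number, recognized_language):
    """Return (action, detail) for the opening of a directory assistance call.

    calling_number:      str, or None when the number is not known
    recognized_language: str, or None when the reply was unrecognizable
    """
    if calling_number is None:
        # Step 302: no calling number, so redirect to a human operator.
        return ("operator", "calling number unknown")
    if recognized_language not in SUPPORTED_LANGUAGES:
        # Step 304: hand off and play the recorded reply to the operator.
        return ("operator", "language not recognized")
    # Step 305 onward: continue automated handling in the chosen language.
    return ("continue", recognized_language)
```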
If the caller selects French or English, in step 305 the voice processing
unit 14A uses the calling number to set a priori probabilities to
determine the likelihood of the locality names in the voice processing
unit's locality lexicon being requested. The locality lexicon comprises
the names of localities it can recognize, as well as a listing of
latitudes and longitudes for determining geographical distances between
localities and calling numbers. In step 305, the voice processing unit 14A
computes a priori probabilities for each lexeme in the locality lexicon
based upon (i) the population of the locality corresponding to the lexeme;
(ii) the distance between that locality and the calling number; and (iii)
whether or not the calling number is within that locality. The manner in
which these a priori probabilities can be determined will be described
more fully later.
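Purely as an illustration of how the three factors of step 305 could be
combined, the following sketch derives distances from latitudes and
longitudes using the haversine formula and forms a normalized weight for each
locality. The exponential distance decay, the `scale_km` and
`same_locality_boost` parameters, and the use of the caller's locality as a
proxy for the calling number are assumptions made for the sketch; they are
not the formulation of this disclosure.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def locality_priors(caller_locality, localities,
                    same_locality_boost=5.0, scale_km=50.0):
    """Return dict locality -> a priori probability, summing to 1.

    localities: dict name -> (latitude, longitude, population)
    The decay and boost below are assumed forms, chosen only for illustration.
    """
    lat0, lon0, _ = localities[caller_locality]
    weights = {}
    for name, (lat, lon, population) in localities.items():
        distance = haversine_km(lat0, lon0, lat, lon)
        weight = population * math.exp(-distance / scale_km)  # (i) and (ii)
        if name == caller_locality:
            weight *= same_locality_boost                     # (iii)
        weights[name] = weight
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}
```

With equal populations, such a scheme ranks the caller's own locality first
and ranks the remaining localities by increasing distance.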
In step 306, the voice processing unit 14A issues the message "For what
city?" to prompt the caller to state the name of a locality, and tries to
recognize the name from its locality lexicon using speech recognition
based upon the acoustics, as described in the afore-mentioned Canadian
patent application number 2,069,675. The voice processing unit will also
use the a priori probabilities to influence or weight the recognition
process. If the locality name cannot be recognized, decision steps 307 and
308 cause a message to be played, in step 309, to prompt the caller for
clarification. The actual message will depend upon the reason for the lack
of recognition. For example, the caller might be asked to speak more
clearly. Decision step 308 permits a limited number of such attempts at
clarification before handing the call off to a human operator in step 310.
The number of attempts will be determined so as to avoid exhausting the
caller's patience.
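The clarification loop of steps 306 to 310 can be sketched as a bounded
retry. In the sketch below, `recognize_once` is a hypothetical callable that
plays a prompt and returns a recognized locality name or `None`; the limit of
three attempts and the wording of the clarification prompt are assumptions,
since the number of attempts is left to be chosen so as not to exhaust the
caller's patience.

```python
def prompt_for_locality(recognize_once, max_attempts=3):
    """Ask for a locality, retrying a limited number of times.

    recognize_once: callable(prompt) -> recognized name, or None on failure
    Returns ("recognized", name) or ("operator", None).
    """
    prompt = "For what city?"  # step 306
    for _ in range(max_attempts):
        name = recognize_once(prompt)
        if name is not None:
            return ("recognized", name)
        # Step 309: ask for clarification before the next attempt.
        prompt = "Please repeat the city name, speaking clearly."
    # Step 310: attempts exhausted, hand the call to a human operator.
    return ("operator", None)
```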
If the locality name is recognized, the voice processing unit 14A
determines in step 311 whether or not the locality is served by the
directory assistance office handling the call. If it is not, the voice
processing unit will play a "deflection" message in step 312 inviting the
caller to call directory assistance for that area. It is envisaged that,
in some embodiments of the invention, the deflection message might also
give the area code for that locality and even ask the caller if the call
should be transferred. It should be appreciated that, although some
localities for other areas are in the lexicon, and hence recognizable,
there is no corresponding data relating them to the calling numbers served
by the apparatus since the apparatus cannot connect to them. The
"predetermined relationship" between the localities for other areas and
the calling numbers is simply that they are not available through the
automated directory assistance apparatus which serves the calling numbers.
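The decision of step 311 and the deflection message of step 312 might be
sketched as follows. The locality sets, the area code table and the message
wording are illustrative assumptions; the sketch shows only that a lexeme
can be recognizable without being served by the apparatus.

```python
# Illustrative data only: not values from this disclosure.
SERVED_LOCALITIES = {"montreal", "laval"}
OTHER_AREA_LOCALITIES = {"toronto": "416"}  # recognizable but not served

def after_locality_recognized(locality, offer_area_code=False):
    """Return (action, message) once a locality lexeme has been recognized."""
    name = locality.lower()
    if name in SERVED_LOCALITIES:
        # Step 313: proceed with the next prompt.
        return ("continue", "Is this a business listing?")
    if name in OTHER_AREA_LOCALITIES:
        # Step 312: deflection message for a locality outside the area.
        message = (f"{locality} is in a different directory area. "
                   "Please call directory assistance for that area.")
        if offer_area_code:
            message += f" The area code is {OTHER_AREA_LOCALITIES[name]}."
        return ("deflect", message)
    raise ValueError(f"{locality!r} is not in the locality lexicon")
```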
If the requested locality is served by the directory assistance office
handling the call, in step 313 the voice processing unit will transmit a
message asking the caller to state whether or not the desired listing is a
business listing and employ speech recognition and a YES/NO lexicon to
recognize the caller's response. If the response cannot be recognized,
decision steps 314 and 315 and step 316 will cause a message to be played
to seek clarification. If a predetermined number of attempts at
clarification have failed to elicit a recognizable response, decision step
315 and step 317 hand the call off to a human operator. If a respons