| United States Patent | 5638425 |
| Link to this page | http://www.wikipatents.com/5638425.html |
| Inventor(s) | Meador, III; Frank E. (Eldersburg, MD);
Casey; Kathleen M. (Rockville, MD);
Curry; James E. (Herndon, VA);
McAllister; Alexander I. (Wheaton, MD);
Tressler; Robert C. (Dunkirk, MD);
Hayden, III; James B. (Burke, VA);
Hanle; John P. (Silver Spring, MD) |
| Abstract | A mechanized directory assistance system for use in a telecommunications
network includes multiple speech recognition devices comprising a word
recognition device, a phoneme recognition device, and an alphabet
recognition device. Also provided are a voice processing unit and a
computer operating under stored program control. The system uses a database,
which may be the same database used for operator directory
assistance. The system operates as follows: A directory assistance caller
is prompted to speak the city or location desired. The response is
digitized and simultaneously input to the word and phoneme recognition
devices, each of which outputs a translation signal plus a probability level
signal. These are compared and the translation with the highest probability
level is selected. The caller is then prompted to speak the name of the sought party.
The response is processed in the same manner as the location word. If
the probability level fails to meet a predetermined standard,
the caller is prompted to spell all or part of the location and/or name.
The resulting signal is input to the alphabet recognition device. When translations
having a satisfactory probability level are obtained, the database is
accessed. If plural listings are located, these are articulated and the
caller is prompted to respond affirmatively or negatively as to each. When
a single directory number has been located, a signal is transmitted to
articulate this number to the caller. The system also includes provision for
DTMF keyboard input in aid of the spelling procedure. |
| Title | Automated directory assistance system using word recognition and phoneme
processing method |
| Publication Date | June 10, 1997 |
| Filing Date | November 2, 1994 |
| Parent Case | This is a Continuation-In-Part application of Ser. No. 07/992,207 filed
Dec. 17, 1992, now abandoned. |
| Claims | We claim:
1. A method of providing subscriber telephone numbers to telephone users in
an automated fashion comprising the steps of:
(a) connecting a telephone user to an automated directory assistance
station upon a user dialing a predetermined number on a telephone;
(b) instructing the user with a stored message upon said connection to
respond by speaking a name of a location of a sought subscriber;
(c) encoding a first response from the user into a first digital signal
compatible with means for speech recognition;
(d) transmitting said first digital signal to first means for speech
recognition by word recognition and second means for speech recognition by
phoneme recognition;
(e) decoding first output signals from said first and second means for
speech recognition to produce first and second decoded signals and a
probability level signal associated with each decoded signal, said
probability level signals being indicative of a probability that a
respective decoded signal is correct with respect to the first response;
(f) combining said probability level signals associated with said first and
second decoded signals according to a predetermined function to derive a
plurality of combined probability level signals;
(g) comparing each said combined probability level signal to a
predetermined threshold;
(h) selecting one of said first and second decoded signals associated with
the highest combined probability level signal to provide a first selected
signal;
(i) addressing a database for a location indicated by said first selected
signal;
(j) instructing the user with a stored message to respond by speaking the
last name of the sought subscriber;
(k) encoding a second response comprising the last name from the user into
a second digital signal compatible with said first and second means for
speech recognition;
(l) transmitting said second digital signal to the first and second means
for speech recognition;
(m) decoding second output signals from said first and second means for
speech recognition to produce third and fourth decoded signals and a
probability level signal associated with each of said third and fourth
decoded signals with respect to said second response;
(n) selecting one of said third and fourth decoded signals having the
highest probability level to provide a second selected signal;
(o) searching said database using said second selected signal to obtain a
directory number corresponding thereto; and
(p) transmitting a message to the user articulating said directory number.
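Steps (e) through (h) of claim 1 can be illustrated with a minimal Python sketch. The claim does not specify the "predetermined function", so the weighted average below is an assumption, as are the names `Decoded`, `combine`, and `select_decoded` and the rule that a recognizer contributes its probability only to a decoded signal it agrees with.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Decoded:
    text: str           # decoded signal (the recognized word)
    probability: float  # probability level signal, 0.0 to 1.0


def combine(word_p: float, phoneme_p: float, word_weight: float = 0.5) -> float:
    """Assumed 'predetermined function': a weighted average of the two levels."""
    return word_weight * word_p + (1.0 - word_weight) * phoneme_p


def select_decoded(word: Decoded, phoneme: Decoded,
                   threshold: float) -> Tuple[str, float, bool]:
    """Return (selected text, combined level, whether it meets the threshold)."""
    combined = {}
    for candidate in (word, phoneme):
        # Assumption: a recognizer supports a candidate only if it decoded
        # the same text; otherwise it contributes zero.
        w = word.probability if word.text == candidate.text else 0.0
        p = phoneme.probability if phoneme.text == candidate.text else 0.0
        combined[candidate.text] = combine(w, p)
    best_text = max(combined, key=combined.get)
    best_level = combined[best_text]
    return best_text, best_level, best_level >= threshold
```

When both recognizers decode the same word, the sketch yields a single combined level; when they disagree, each decoded signal gets its own level and the higher one is selected.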
2. A method according to claim 1 including the steps of:
responding to locating multiple matches in said database as a result of
steps (i) or (o) by transmitting to the user a voiced selection of words
with a request for an affirmative or negative response.
3. A method according to claim 2 wherein said user is requested to indicate
said affirmative response or said negative response by voicing "YES" or
"NO".
4. A method according to claim 2 wherein said user is requested to indicate
said affirmative response or said negative response by striking a DTMF key
designated in said request.
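The confirmation dialogue of claims 2 through 4 might be sketched as follows. This is an illustration only: `confirm_listing` and the `ask` callback are hypothetical names, and treating DTMF key "1" as the designated affirmative key is an assumption, since the claims leave the key unspecified.

```python
from typing import Callable, Optional, Sequence

# Replies treated as affirmative: the spoken word "YES" or an assumed DTMF key "1".
AFFIRMATIVE = {"YES", "1"}


def confirm_listing(listings: Sequence[str],
                    ask: Callable[[str], str]) -> Optional[str]:
    """Offer each matching listing in turn; return the first one accepted.

    `ask` stands in for the voice processing unit: it plays a prompt and
    returns the caller's recognized reply as text.
    """
    for listing in listings:
        reply = ask(f"Did you want {listing}? Say YES or NO.")
        if reply.strip().upper() in AFFIRMATIVE:
            return listing
    return None  # caller rejected every listing
```

Returning `None` when every listing is rejected leaves the next step (for example, requesting an additional indicia as in claim 5) to the caller of the function.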
5. A method according to claim 1 including the steps of:
responding to locating multiple matches in said database as a result of
steps (i) or (o) by requesting the user to speak an additional indicia of
the sought subscriber;
encoding a third response from the user into a third digital signal
compatible with said first and second means for speech recognition;
transmitting said third digital signal to said first and second means for
speech recognition;
decoding and translating a fifth decoded signal responsive to the
transmitting of said third digital signal; and
wherein the step (o) further comprises searching said database using said
fifth decoded and translated signal.
6. A method of providing subscriber telephone numbers to telephone users in
an automated fashion comprising the steps of:
(a) connecting a telephone user dialing a predetermined number on a
telephone to an automated directory assistance means;
(b) instructing the user with a stored message upon said connection to
respond by speaking an indicia of the location of a sought subscriber;
(c) encoding a first response from the user into a first digital signal
compatible with means for speech recognition;
(d) transmitting said first digital signal to first means for speech
recognition by phoneme recognition and second means for speech recognition
by word recognition;
(e) generating a plurality of candidate words from said second speech
recognition means, said candidate words being expressed as encoded signals
output by said second speech recognition means related to said first
response;
(f) decoding signals output by said first and second speech recognition
means responsive to said first digital signal to produce first and a
plurality of second decoded signals and a probability level signal
associated with each decoded signal, said probability level signals being
indicative of a probability that a respective decoded signal is correct
with respect to the first response;
(g) comparing said probability level signals from said second speech
recognition means to a predetermined threshold level and, if one of said
probability level signals is equal to or greater than said threshold
level, addressing a database of directory number data for a first location
indicated by the decoded signal associated with the one probability level
signal greater than or equal to said threshold level;
(h) if all said probability level signals associated with said second
plurality of decoded signals are less than said threshold level, combining
said probability level signals from said first and second speech
recognition means to obtain a plurality of combined probability level
signals;
(i) selecting one of said first and second decoded signals having the
highest combined probability level signal;
(j) addressing a database of directory number data for a second location
indicated by the one selected signal;
(k) instructing the user with a stored message to respond by speaking an
indicia of a name of the sought subscriber;
(l) encoding a second response spoken by the user into a second digital
signal compatible with said first speech recognition means;
(m) responsive to input of said second digital signal decoding an output of
said first speech recognition means to produce a third decoded signal;
(n) using said third decoded signal searching said database for said second
location to obtain a directory number corresponding to the second
response; and
(o) transmitting a message to the user articulating said directory number.
7. A method according to claim 6 including the steps of:
responding to locating multiple matches in said database as a result of
steps (j) or (n) by transmitting to the user an audio selection of words
with a request for an affirmative or negative response.
8. A method according to claim 7 wherein said user is requested to indicate
said affirmative response or said negative response by voicing "YES" or
"NO".
9. A method according to claim 7 wherein said user is requested to indicate
said affirmative response or said negative response by striking a key
designated in said request.
10. An automated directory assistance system for use in a
telecommunications network connected to a plurality of telephone stations
comprising:
means for speech recognition, means for voice processing, and computer
means for controlling said voice processing means and said speech
recognition means;
a database associated with said computer means, said speech recognition
means and said voice processing means;
said speech recognition means including means for word recognition and
means for phoneme recognition, each encoding voice signals inputted
thereto and providing an output candidate signal indicative of a sought
word signified by said voice signals and an associated probability level
signal indicative of a probability level in the accuracy of the word
signified by the output candidate signal, said speech recognition means
further including means for calculating a combined probability value
signal by combining the probability level signals associated with the
output candidate signals from the respective word recognition means and
phoneme recognition means to obtain a combined probability level signal
for each output candidate signal, the combined probability level signal
indicating a recognition accuracy between said output candidate signal of
the word recognition means and an associated output candidate signal from
said phoneme recognition means;
said database including stored word and phoneme data;
said voice processing means including stored voice messages;
comparator and selector means associated with said word recognition means
and said phoneme recognition means for comparing the probability level
signals associated with the respective output candidate signals of said
word recognition means and said phoneme recognition means responsive to
said voice signals and selecting one of said output candidate signals
having an associated combined probability level signal with the highest
value;
said computer means comprising means for:
(a) causing said voice processing means to transmit, to a user having
dialed directory assistance, instructions to speak a word or words
indicative of an identity of a subscriber whose directory number is
sought;
(b) causing said voice signals resulting from a response from the user to
be inputted to said word recognition means and said phoneme recognition
means;
(c) causing said word recognition means and said phoneme recognition means
to encode said voice signals to produce said respective output candidate
signals and said corresponding probability level signals;
(d) causing said comparator and selector means to compare said probability
level signals, select the combined probability level signal with the highest
value, and access said database using the corresponding selected one
output candidate signal to identify a directory number in said database;
and
(e) causing said means for voice processing to direct to said user a signal
indicative of said directory number.
11. An automated caller assistance system for use in a telecommunications
network connected to a plurality of telephone stations comprising:
means for speech recognition, means for voice processing, and computer
means for controlling said voice processing means and said speech
recognition means;
a database associated with said computer means, said speech recognition
means and said voice processing means;
said speech recognition means including means for word recognition and
means for phoneme recognition, said word recognition means comprising
means for generating a plurality of candidate words for each word spoken
by a user, each said candidate word represented by an output candidate
signal with an associated candidate probability level signal, said phoneme
recognition means comprising means for generating a plurality of phonemes
associated with each of said candidate words, each said phoneme being
represented by an output phoneme signal and an associated phoneme
probability level signal, said speech recognition means further comprising
means for combining the candidate probability level signal for each said
word candidate with the phoneme probability level of each of said phonemes
associated with a respective candidate word according to a predetermined
function to obtain a plurality of combined probability level signals for
each said candidate word;
said database including stored word and phoneme data;
said voice processing means including stored voice messages;
comparator and selector means, associated with said word recognition means
and said phoneme recognition means, for comparing the candidate and
phoneme output signals responsive to each word spoken and selecting the
candidate word having the corresponding combined probability level signal
with the highest value;
said computer means comprising means for:
(a) causing said voice processing means to transmit instructions prompting
the user to speak a word or words indicative of the nature of the
assistance which is sought;
(b) causing said each word resulting from a response from the user to be
inputted to said word recognition means and said phoneme recognition
means;
(c) causing said word recognition means and said phoneme recognition means
to encode said each word to produce the respective candidate and phoneme
output signals and candidate and phoneme probability level signals;
(d) causing said comparator and selector means to compare said probability
level signals, identify the combined probability level signal with the
highest indicia of probability, and associate the candidate word
corresponding thereto with said database to identify an assistance
associated therewith in said database; and
(e) causing said means for voice processing to direct to said user an audio
signal indicative of said assistance.
12. A method for automatically providing subscriber telephone numbers to
telephone users over a telephone line, comprising the steps of:
(a) connecting a telephone user dialing a predetermined number to an
automated directory assistance station;
(b) transmitting a first response from said user to a speech recognition
device comprising means for recognizing a word from among a plurality of
words and means for recognizing a phoneme string for association with a
respective word;
(c) obtaining a plurality of word candidates related to said first response
from the word recognizing means, each of said word candidates having a
probability value indicating a probability that said each word candidate
is correct with respect to said first response;
(d) selecting one of the word candidates having a highest probability
value;
(e) comparing said highest probability value of the selected word candidate
to a first threshold value;
(f) if said highest probability value is equal to or greater than said
first threshold value, accessing a first database to obtain first
information corresponding to said selected word candidate;
(g) transmitting said first information to said user;
(h) if said highest probability value is less than said first threshold
value, obtaining at least one phoneme string associated with each said
word candidate, each said phoneme string having a probability value
indicating a probability that said each phoneme string is correct with
respect to said first response;
(i) combining said probability value for each said word candidate and the
probability value of the corresponding at least one phoneme string
according to a first predetermined function to obtain at least one
combined probability value for each said word candidate;
(j) selecting one of said word candidates having the highest combined
probability value;
(k) comparing said highest combined probability value to a second threshold
value to determine a satisfactory level;
(l) if said highest combined probability value is equal to or greater than
said second threshold value, accessing said first database to obtain said
first information corresponding to the selected word candidate having the
highest combined probability value; and
(m) transmitting said first information to said user.
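The two-threshold cascade of claim 12 can be sketched in Python as shown below. The claim does not define the "first predetermined function"; averaging the word probability with each phoneme string probability is an assumption, and `WordCandidate`, `combined_score`, and `recognize` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class WordCandidate:
    word: str
    probability: float  # from the word recognizing means
    phoneme_probabilities: List[float] = field(default_factory=list)


def combined_score(c: WordCandidate) -> float:
    """Best combined value for a candidate (assumed function: simple average)."""
    return max(((c.probability + p) / 2.0 for p in c.phoneme_probabilities),
               default=0.0)


def recognize(candidates: List[WordCandidate], first_threshold: float,
              second_threshold: float) -> Tuple[Optional[str], str]:
    # Steps (d)-(f): accept the top word candidate if it clears the first threshold.
    best = max(candidates, key=lambda c: c.probability)
    if best.probability >= first_threshold:
        return best.word, "word"
    # Steps (h)-(l): otherwise rank by combined word/phoneme values.
    best = max(candidates, key=combined_score)
    if combined_score(best) >= second_threshold:
        return best.word, "combined"
    # Neither threshold met: a further fallback would be needed.
    return None, "unresolved"
```

Note that the fallback can select a different candidate than the word recognizer alone would have chosen, because phoneme evidence reorders the candidates.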
13. The method of claim 12, further comprising the steps of:
(n) if said highest combined probability value is less than said second
threshold value, prompting said user to provide at least a partial
spelling of said first response;
(o) transmitting a response including said spelling to means for
recognizing individual letters of the alphabet to output word candidate
data related to said spelling, said word candidate data having a
probability value indicating a probability that said word candidate data
is correct with respect to said spelling;
(p) comparing said probability value of said word candidate data output
from the recognizing letters means to a third threshold value;
(q) if the probability value associated with said word candidate data from
said recognizing letters means is equal to or greater than said third
threshold value, accessing said first database to obtain the first
information corresponding to said word candidate data.
14. The method of claim 13, further comprising the steps of:
(r) if said probability value associated with said word candidate data
output from said recognizing letters means is less than said third
threshold value, prompting said user to input second alphabetic data via a
keypad for transmission to a third database;
(s) accessing said first database to obtain second information
corresponding to said second alphabetic data; and
(t) transmitting said second information to said user.
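Keypad entry of alphabetic data, as in claim 14, is ambiguous because each DTMF key carries several letters. One way to resolve it, sketched here as an assumption rather than the patent's method, is to convert each directory name to its digit sequence and match the keyed digits as a prefix. The letter layout below is the common modern telephone keypad; keypads of the period often omitted Q and Z. The names `to_digits` and `match_keyed` are hypothetical.

```python
from typing import Iterable, List

# Common telephone keypad letter assignment (assumed layout).
KEYPAD = {"2": "ABC", "3": "DEF", "4": "GHI", "5": "JKL",
          "6": "MNO", "7": "PQRS", "8": "TUV", "9": "WXYZ"}
LETTER_TO_DIGIT = {letter: digit
                   for digit, letters in KEYPAD.items()
                   for letter in letters}


def to_digits(name: str) -> str:
    """Digit sequence a caller would key to spell `name` (non-letters skipped)."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in name.upper()
                   if ch in LETTER_TO_DIGIT)


def match_keyed(digits: str, names: Iterable[str]) -> List[str]:
    """Names whose keyed spelling starts with the digits entered so far."""
    if not digits:
        return []  # nothing keyed yet, so nothing to match
    return [name for name in names if to_digits(name).startswith(digits)]
```

Because several names can share a digit prefix, the result is a list; a partial spelling narrows the candidates rather than always identifying one.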
15. The method of claim 12, wherein each said word candidate has a single
corresponding phoneme string associated therewith, and said phoneme string
is derived based upon phonetic spelling of said associated word candidate.
16. The method of claim 15, wherein each said phoneme string is derived
based upon a plurality of pronunciations of said associated word
candidate.
17. A method for automatically providing subscriber telephone numbers to
telephone users over a telephone line, comprising the steps of:
(a) connecting a telephone user dialing a predetermined number to an
automated directory assistance station;
(b) transmitting a first response from said user to a speech recognition
device comprising means for recognizing a word from among a plurality of
words and means for recognizing a phoneme string for association with a
respective word;
(c) obtaining a first plurality of word candidates related to said first
response from the word recognizing means and a first plurality of phoneme
strings for each said word candidate from the phoneme recognizing means,
each of said word candidates and each of said phoneme strings related
thereto having probability values indicating a probability that said each
word candidate and said each phoneme string are correct with respect to
said first response, respectively;
(d) combining said probability value for said each word candidate with said
each of said phoneme strings related thereto according to a first
predetermined function to obtain a first plurality of combined probability
values for said each word candidate;
(e) selecting one of said word candidates having the highest combined
probability value;
(f) accessing a first database to obtain first information corresponding to
said selected word candidate;
(g) instructing said user to provide a second response related to said
first response;
(h) transmitting said second response to said speech recognition device;
(i) accessing said first database to obtain a plurality of second word
candidates related to said second response, each of said second word
candidates having a plurality of second phoneme strings related thereto,
each of said second word candidates and said second phoneme strings having
probability values indicating a probability that said each second word
candidate and said each second phoneme string are correct with respect to
said second response, respectively;
(j) combining said probability value for said each second word candidate
with said each of said second phoneme strings related thereto according to
said first predetermined function to obtain a second plurality of combined
probability values for each said second word candidate;
(k) selecting one of said second word candidates having the highest
combined probability value of said second plurality;
(l) accessing said first database to obtain second information
corresponding to the selected one of said second word candidates; and
(m) transmitting said second information to said user.
18. The method of claim 17, wherein said first response is related to a
location and said second response is related to the identity of a
subscriber.
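The two-stage lookup of claims 17 and 18, where the location is resolved first and the name candidates are then drawn from the database for that location, might look like the following sketch. The nested dictionary, the place and subscriber names, and the 555-01xx numbers are all fictional, and the score dictionaries stand in for combined probability values that the recognizers would supply.

```python
from typing import Dict, Iterable, Optional

# Fictional directory: location -> last name -> directory number.
DIRECTORY: Dict[str, Dict[str, str]] = {
    "Springfield": {"Smith": "555-0101", "Jones": "555-0102"},
    "Riverton": {"Garcia": "555-0103"},
}


def best(scores: Dict[str, float], allowed: Iterable[str]) -> Optional[str]:
    """Highest-scoring entry among those that appear in `allowed`."""
    allowed_set = set(allowed)
    usable = {w: s for w, s in scores.items() if w in allowed_set}
    return max(usable, key=usable.get) if usable else None


def directory_number(location_scores: Dict[str, float],
                     name_scores: Dict[str, float]) -> Optional[str]:
    # Stage 1: pick the location with the highest combined value.
    location = best(location_scores, DIRECTORY)
    if location is None:
        return None
    # Stage 2: only names listed under that location are considered.
    name = best(name_scores, DIRECTORY[location])
    return DIRECTORY[location][name] if name is not None else None
```

Restricting the second stage to names listed under the selected location means a high-scoring name from another location cannot be chosen.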
19. A method for automatically providing directory assistance information
including a subscriber telephone number to a telephone user over a
telephone line, comprising the steps of:
(a) connecting the telephone user to an automated directory assistance
station;
(b) transmitting a first response from said user to means for recognizing
speech, said speech recognizing means comprising means for recognizing a
word from among a plurality of words;
(c) obtaining a plurality of word candidates related to said first
response, each of said word candidates having a probability value
indicating a probability that said each word candidate is correct with
respect to said first response;
(d) comparing the probability values of said respective word candidates to
a first threshold value;
(e) accessing a first database in response to identifying one of the word
candidates having a probability value greater than or equal to a first
threshold value, and obtaining first directory assistance information
corresponding to the one identified word candidate;
(f) transmitting said first information to said user;
and wherein if none of said probability values is greater than or equal to
said first threshold:
(g) obtaining at least one phoneme string associated with each said word
candidate, said at least one phoneme string having a probability value
indicating a probability that said at least one phoneme string is correct
with respect to said first response;
(h) combining said probability value for said each word candidate with the
probability value of the corresponding at least one phoneme string
according to a first predetermined function to obtain at least one
combined probability value for said each word candidate;
(i) selecting one of said word candidates having the highest combined
probability value;
(j) comparing said highest combined probability value to a second threshold
to determine a satisfactory level;
(k) if said highest combined probability value is greater than or equal to
said second threshold, accessing said first database to obtain second
directory assistance information corresponding to said selected word
candidate; and
(l) transmitting said second information to said user.
20. A method for automatically providing subscriber telephone numbers to
telephone users over a telephone line, comprising the steps of:
(a) connecting a telephone user to an automated directory assistance
station;
(b) transmitting a first response from said user to means for recognizing
speech, said means for recognizing speech comprising means for recognizing
a word from among a plurality of words and means for recognizing a phoneme
string for association with a respective word;
(c) obtaining a plurality of word candidates related to said first
response, each of said word candidates having a probability value
indicating a probability that said each word candidate is correct with
respect to said first response;
(d) comparing the probability value of a first of the word candidates with
a first threshold;
(e) if said probability value of said first of the word candidates is equal
to or greater than said first threshold value, accessing a database to
obtain first information corresponding to said first of the word
candidates;
(f) instructing the user to provide a second response related to said first
response;
(g) transmitting said second response to said means for recognizing speech;
(h) obtaining a plurality of secondary word candidates related to said
second response, each of said secondary word candidates having a
probability value indicating a probability that said each secondary word
candidate is correct with respect to said second response, and obtaining
at least one phoneme string associated with said each secondary word
candidate, each said phoneme string having a probability value indicating
a probability that said each phoneme string is correct with respect to said
second response;
(i) combining said probability value for said each secondary word candidate
with the probability value of the corresponding at least one phoneme
string according to a first predetermined function to obtain at least one
combined probability value for said each word candidate;
(j) selecting one of said secondary word candidates having a highest
combined probability value;
(k) comparing said highest combined probability value to a second threshold
to determine a satisfactory level;
(l) if said highest combined probability value is equal to or greater than
said second threshold value, accessing said database to obtain second
information corresponding to the one secondary word candidate; and
(m) transmitting said second information to said user.
21. An automated directory system for use in a telecommunications network
connected to a plurality of telephone stations, comprising:
(a) means for recognizing a word spoken by a user from among a plurality of
words, said recognizing means comprising means for generating a plurality
of candidate words responsive to the spoken word, and means for assigning
a probability value to each of the generated candidate words indicative of
a probability that said each candidate word is correct with respect to
said spoken word;
(b) means for recognizing a phoneme string comprising means for generating a
plurality of phoneme strings related to each of said candidate words, and
means for generating a probability value for each said phoneme string
indicative of a probability that said each phoneme string is correct with
respect to said spoken word;
(c) means for combining the probability values of the phoneme strings with
probability values of the respective candidate words to obtain a combined
probability value for each said candidate word;
(d) comparison means for determining when one of the combined probability
values is equal to or greater than a predetermined threshold value;
(e) a database containing information related to the plurality of word
candidates; and
(f) means for applying the candidate word corresponding to the one combined
probability value to access said first database to obtain information to
be sent to said user.
22. The system of claim 21, wherein said means for combining the
probability values comprises means for correlating the phoneme strings
corresponding to said each candidate word to combine the probability level
of said each word candidate with the probability level of each of said
corresponding phoneme strings to provide a plurality of combined
probability values for said each word candidate.
23. A method for automatically providing directory assistance information
including a subscriber telephone number to a telephone user, comprising:
connecting a telephone operated by the telephone user to a directory
assistance station;
transmitting a first spoken input from the telephone user to a word
recognition device and a phoneme recognition device;
in response to the first spoken input, outputting a plurality of word
candidates from the word recognition device and a plurality of phoneme
strings for each of the word candidates from the phoneme recognition
device, each of the word candidates having a candidate probability value
and each of the phoneme strings having a string probability value, each
candidate probability value and string probability value indicating a
probability that the corresponding word candidate and phoneme string is
correct with respect to the first spoken input, respectively;
combining each candidate probability value, in accordance with a
predetermined function, with the corresponding plurality of string
probability values of the respective plurality of phoneme strings for said
each word candidate to obtain a plurality of combined probability values
for said each word candidate;
selecting one of the word candidates having a highest combined probability
value; and
accessing a database in response to the one word candidate to obtain the
directory assistance information corresponding to the first spoken input.
24. The method of claim 23, further comprising determining if the highest
combined probability value exceeds a threshold.
25. The method of claim 24, further comprising transmitting the directory
assistance information if the highest combined probability value exceeds
the threshold. |