Claims
What is claimed:
1. A telephony system for establishing voice communication over a
communication network between a caller and a call recipient, comprising:
means for initiating a call from a caller's location to a call destination
based on a voice utterance made by said caller; and
means for reproducing, at said call destination, an audible likeness of
said voice utterance made by said caller subsequent to the call being
initiated by said initiating means, so as to enable said call recipient to
identify said caller on the basis of voice characteristics of said caller.
2. A system according to claim 1, wherein said initiating means comprises
an automatic speech recognition system for detecting and recognizing
spoken voice utterances.
3. A system according to claim 2, wherein said initiating means comprises
means for determining said call destination based on a detection and
recognition of said caller's voice utterance by said automatic speech
recognition system.
4. A system according to claim 3, wherein said determining means includes a
list stored in a memory device for determining said call destination based
on the detection and recognition of said caller's voice utterance by said
automatic speech recognition system.
5. A system according to claim 4, wherein said voice utterance comprises a
recognizable name of said call recipient.
6. A system according to claim 3, wherein said reproducing means comprises
means for converting said voice utterance from an acoustic form into an
electrical form, and means for forwarding said voice utterance in said
electrical form to said call destination which is determined by said
determining means.
7. A system according to claim 6, wherein said means for converting said
voice utterance from said acoustic form into said electrical form includes
a microphone provided at said caller's location.
8. A system according to claim 6, wherein said reproducing means further
comprises means for converting said voice utterance forwarded by said
forwarding means in said electrical form into an acoustic form.
9. A system according to claim 8, wherein said means for converting said
voice utterance in said electrical form into said acoustic form includes a
speaker provided at said call destination.
10. A system according to claim 1, further comprising means for processing
the call based upon a voice utterance made by a call recipient, said
processing means enabling said call recipient to optionally accept the
call and establish two-way voice communication subsequent to said voice
utterance made by said caller being reproduced by said reproducing means.
11. A system according to claim 10, further comprising means for
reproducing an audible likeness of said voice utterance made by said call
recipient at said caller's location based upon the acceptance of the call.
12. A system according to claim 1, wherein said call destination comprises
one of a call recipient's location and network address.
13. A telephony system for establishing voice communication over a
communication network between a caller and a call recipient, comprising:
means for initiating a call from a caller's location to a call destination
based on a voice utterance made by said caller, said initiating means
establishing one-way voice communication between said caller's location
and said call destination;
means for reproducing, at said call destination, said voice utterance made
by said caller subsequent to the call being initiated by said initiating
means, so that said call recipient can identify said caller; and
means for processing the initiated call based upon a voice utterance made
by said call recipient, said processing means enabling said call recipient
to optionally accept the call and establish two-way communication with
said caller.
14. A system according to claim 13, wherein said reproducing means
comprises means for converting said voice utterance made by said caller in
an acoustic form into an electrical form, and means for forwarding said
voice utterance in said electrical form to said call destination.
15. A system according to claim 14, wherein said means for converting said
voice utterance in said acoustic form into said electrical form includes a
microphone provided at said caller's location.
16. A system according to claim 14, wherein said reproducing means further
comprises means for converting said voice utterance forwarded by said
forwarding means in said electrical form into an acoustic form.
17. A system according to claim 16, wherein said means for converting said
voice utterance in said electrical form into said acoustic form includes a
speaker provided at said call destination.
18. A system according to claim 13, wherein said processing means comprises
an automatic speech recognition system for detecting and recognizing
spoken voice utterances.
19. A system according to claim 18, wherein said processing means comprises
means for enabling said call recipient to optionally select among a
plurality of call processing operations and means for performing a
selected call processing operation when a predetermined voice command made
by said call recipient is detected and recognized by said automatic speech
recognition system.
20. A system according to claim 19, wherein said call processing operations
include a delay call operation, whereby acceptance of the initiated call
is delayed by a predetermined amount of time when said delay call
operation is selected by said call recipient by voice command.
21. A system according to claim 19, further comprising a device for
recording a voice message and a device for reproducing a recorded voice
message.
22. A system according to claim 21, wherein said call processing operations
include a voice message playback operation, whereby a recorded voice
message is reproduced by said reproducing device and forwarded to said
caller's location when said voice message playback operation is selected
by said call recipient by voice command.
23. A system according to claim 21, wherein said call processing operations
include a voice message record operation, whereby a voice message made by
said caller is recorded by said recording device.
24. A system according to claim 13, further comprising means for screening
the initiated call, said screening means including a speaker identity
recognition system for determining the identity of said caller by
detecting and recognizing said voice utterance made by said caller, and
screening the call based on the determined identity of the caller.
25. A system according to claim 13, further comprising means for
disconnecting a call based upon a voice utterance made by said caller or
said call recipient, so that said caller or said call recipient can
disconnect the call after the call has been accepted by said call
recipient.
26. A system according to claim 13, wherein said call destination comprises
one of a call recipient's location and network address.
27. A system according to claim 13, further comprising means for
reproducing an audible likeness of said voice utterance made by said call
recipient at said caller's location based upon the acceptance of the call.
28. A method of selectively establishing voice communication in a telephony
system, comprising the steps of:
initiating a call from a caller's location to a call destination over a
communications network based upon a voice utterance made by a caller;
reproducing, at said call destination, said voice utterance made by said
caller after the call has been initiated;
thereafter processing the call in response to detection of a voice
utterance made by a call recipient, so that said call recipient can
optionally accept the call and establish two-way voice communication with
said caller.
29. A method according to claim 28, wherein said call destination comprises
one of a call recipient's location and network address.
30. A telephony system for establishing two-way voice communication over a
communications network between a caller and a call recipient, comprising:
means for initiating a call and establishing one-way voice communication
from a caller's location to a call destination in accordance with a voice
utterance made by said caller;
means for reproducing, at said call destination, said caller's voice
utterance subsequent to the establishment of one-way voice communication
by said initiating means, so that said call recipient can identify said
caller; and
means for processing the initiated call in response to detection of a voice
utterance made by said call recipient, said processing means enabling said
call recipient to optionally accept the call and establish two-way
communication with said caller.
31. A system according to claim 30, wherein said initiating means comprises
an automatic speech recognition system for detecting and recognizing
spoken voice utterances.
32. A system according to claim 31, wherein said initiating means further
comprises means for determining said call destination based on a detection
and recognition of said caller's voice utterance by said automatic speech
recognition system.
33. A system according to claim 32, further comprising means for recording
said caller's voice utterance, said reproducing means reproducing said
voice utterance recorded by said recording means at said call destination
subsequent to the determination of said call destination by said
determining means.
34. A system according to claim 32, wherein said initiating means further
comprises connection means for establishing a one-way voice communication
between said caller and said call recipient based upon the determination
of said call destination by said determining means.
35. A system according to claim 34, further comprising means for recording
said caller's voice utterance, said reproducing means reproducing said
voice utterance recorded by said recording means at said call destination
subsequent to the establishment of one-way voice communication by said
connection means.
36. A system according to claim 30, further comprising means for
reproducing a synthesized voice message at said call destination
subsequent to the call being initiated by said initiating means.
37. A system according to claim 30, further comprising means for
reproducing a recorded audio message at said call destination subsequent
to the call being initiated by said initiating means.
38. A system according to claim 30, wherein said call destination comprises
one of a call recipient's location and network address.
39. A system according to claim 30, further comprising means for
reproducing an audible likeness of said voice utterance made by said call
recipient at said caller's location based upon the acceptance of the call.
Description
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention is related to an apparatus and method for hands-free
telephony. More particularly, the present invention relates to an
apparatus and method for transparent telephony that utilizes, amongst
other things, speech-based signaling for initiating and handling calls.
2. Background Information
In conventional telephone communication systems, a protocol consisting of a
series of known tasks or operations must be followed by users in order to
initiate and establish two-way voice communication. This protocol has
traditionally imposed a burden on users of telephony, and especially on
those placing or originating an outgoing call.
Typically, when a caller desires to place a telephone call, the caller
first must activate the system by lifting a handset at his or her
location. After determining the system status by detecting a dial tone,
the caller then inputs an arbitrary code (e.g., an individual's telephone
number) to specify the desired call recipient. Thereafter, the system
status is again monitored by the caller by listening for any one of a
number of predetermined tone signals indicating, e.g., ringing, line busy,
system busy, or network intercept. Upon receipt and acceptance of the call
by the call recipient, the caller then normally introduces him or herself.
Traditional telephony protocol has also imposed significant burdens on the
call recipient, albeit to a lesser extent. After a call has been initiated
by the caller, the call recipient at the other end hears an anonymous
ringing signal. If the call recipient is present and decides to accept the
initiated telephone call, the call recipient may establish two-way voice
communication over the system by lifting a handset at his or her location
and acknowledging receipt by saying something like "hello". Normally, once
the calling party has identified him or herself, the call recipient may
identify the caller and determine how to further process the call (e.g.,
converse, take message, terminate call, etc.).
Recently, there have been attempts to simplify the process for initiating
and processing telephone calls and, hence, lessen the burdens of
traditional telephony protocol. For example, in order to minimize the
burden of memorizing, and the time required for inputting, a code for specifying a
particular call destination, telephone sets have been designed with
one-button dialing wherein stored telephone code sequences are recalled
and automatically dialed. Further, some telephone companies have begun to
offer speed calling services, wherein a call can be initiated by dialing a
"shortened" code sequence (comprising, for example, two or three digits)
that represents a longer code sequence (comprising, for example, seven to
twelve digits) defined by the customer. Still, others have attempted to
simplify various tasks of telephony protocol by designing telephone
equipment that utilizes speech recognition to interpret voice commands.
For example, U.S. Pat. No. 4,870,686 to GERSON et al. and U.S. Pat. No.
4,731,811 to DUBUS disclose voice dialing systems for mobile radio
telephones in vehicles, and U.S. Pat. No. 4,945,570 to GERSON et al.
discloses a method for terminating a telephone call by voice command.
In addition, answering machines have been introduced for automatically
answering call attempts and taking messages when the intended call
recipient is unavailable. Further, caller identification services and
equipment are available for displaying the caller's telephone number at
the call recipient's location, so that the caller may be identified prior
to acceptance of the call.
However, despite these advances, telephony users are for the most part
still burdened by the existing constraints of telephony protocol.
Telephony users, in making or receiving a call, must still not only
interact tactilely with the telephone system, but must also spend time
interfacing with the system. Recent and past attempts to address these
problems still fail to provide totally hands-free communication wherein
users can converse with one another as if they were in the same room, by
the use of simple verbal exchanges rather than conventional push-buttons,
numbers, beeps, tones and/or rings. Such features would be highly
desirable, for example, in an office environment or situations where
frequent communications are required.
SUMMARY OF THE INVENTION
In view of the foregoing, the present invention, through one or more of its
various aspects, embodiments and/or specific features or subcomponents
thereof, is thus intended to bring about one or more of the objects and
advantages as specifically noted below.
A general object of the present invention is to provide an apparatus and
method for transparent telephony that overcomes the traditional burdens of
telephony protocol (e.g., lifting a handset, detecting a dial tone,
inputting an arbitrary code, etc.).
More particularly, an object of the present invention is to provide an
apparatus and method for transparent telephony that utilizes speech-based
signaling for initiating and processing calls, and that provides totally
hands-free communication for both the caller and the call recipient.
Another object of the present invention is to provide a transparent
telephony system in which a caller's voice is used to initiate a call, and
the caller's utterance is forwarded and reproduced at the call recipient's
location in order to serve as a form of caller identification to the call
recipient.
Still another object of the present invention is to provide a transparent
telephony system that creates the perception for users that communication
is being carried out as if they are closely situated with respect to one
another, e.g., as if they were in the same room or location, and that
provides a "transparent" quality to the communications network of the
system.
Yet another object of the present invention is to provide a transparent
telephony system that eliminates the use of dial tones and ringing, and
that indicates the presence of an incoming call to a call recipient by
reproducing an audible likeness of the caller's voice at the call
recipient's location. Further, an object of the present invention is to
enable the call recipient to identify the caller and to optionally accept
or refuse the incoming call by voice command, subsequent to the caller's
voice being reproduced at the call recipient's location, and before the
caller knows whether the call recipient is present.
Another object of the present invention is to provide a transparent
telephony system that automatically detects and recognizes voice
utterances, and that enables a call to be initiated and/or processed
(e.g., accepted or refused) by voice command, including identifying the
destination of the call in response to the caller's voice utterance.
According to one embodiment of the present invention, a transparent
telephony system is provided for establishing voice communication over a
communication network between a caller and a call recipient. The
transparent telephony system comprises means for initiating a call from a
caller's location to a call destination based on a voice utterance made
by the caller, and means for reproducing an audible likeness of the voice
utterance made by the caller at the call destination subsequent to the
call being initiated by the initiating means, so that the call recipient
may identify the caller on the basis of the caller's voice
characteristics.
The initiating means may include an automatic speech recognition system for
detecting and recognizing spoken voice utterances made by the caller.
Further, the initiating means may include means for determining the call
destination based on the detection and recognition of the caller's voice
utterance by the automatic speech recognition system.
In addition, the determining means may include a dialing list stored in a
memory device for determining the call destination based on the voice
utterance detected and recognized by the automatic speech recognition
system.
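By way of non-limiting illustration only, the dialing list may be pictured as a table that maps a recognized name to a call destination. The following is a minimal sketch under stated assumptions: the names `DIALING_LIST` and `determine_destination` and the example entries are hypothetical, and the recognized text is assumed to have been produced already by the automatic speech recognition system.

```python
# Hypothetical dialing list: recognized name -> call destination
# (a network address or location). Entries are illustrative only.
DIALING_LIST = {
    "john smith": "station-2",
    "mary jones": "station-3",
}


def determine_destination(recognized_utterance, dialing_list=DIALING_LIST):
    """Return the call destination for a recognized utterance, or None.

    The utterance is assumed to have been detected and recognized
    already; this sketch performs only the list lookup.
    """
    # Normalize case and whitespace so the lookup tolerates minor variation.
    key = " ".join(recognized_utterance.lower().split())
    return dialing_list.get(key)
```

An utterance that is absent from the list yields `None`, in which case no call destination is determined and no call would be initiated in this sketch.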
Further, in accordance with an aspect of the present invention, the call
destination may be the call recipient's network address or location.
According to another embodiment of the present invention, a transparent
telephony system is provided for establishing voice communications over a
communications network between a caller and a call recipient. The
transparent telephony system includes means for initiating a call from a
caller's location to a call destination, means for reproducing a voice
utterance made by the caller at the call destination subsequent to the
call being initiated by the initiating means, so that the call recipient
may identify the caller, and means for processing the initiated call based
upon a voice utterance made by the call recipient. The processing means
enables the call recipient to optionally accept the call and establish
two-way voice communication.
The reproducing means may include means for converting the voice utterance
made by the caller in an acoustic form into an electrical form and means
for forwarding the voice utterance in the electrical form to the call
destination. The reproducing means may further include means for
converting the voice utterance forwarded by the forwarding means in the
electrical form into an acoustic form.
In addition, the processing means may include an automatic speech
recognition system for detecting and recognizing spoken voice utterances
made by the call recipient. The processing means may further include means
for enabling the call recipient to optionally select among a plurality of
call processing operations, each of the call processing operations being
initiated based on a predetermined voice command made by the call
recipient, and detected and recognized by the automatic speech recognition
system.
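By way of non-limiting illustration only, the selection among call processing operations may be sketched as a lookup from a recognized voice command to an operation. The operations listed follow those named in this document (accepting, refusing, delaying, message playback and message recording); the command phrases, the `CallOperation` enumeration and the `select_operation` function are hypothetical, and recognition of the command by the automatic speech recognition system is assumed to have occurred already.

```python
from enum import Enum


class CallOperation(Enum):
    ACCEPT = "accept"
    REFUSE = "refuse"
    DELAY = "delay"
    PLAY_MESSAGE = "play_message"
    RECORD_MESSAGE = "record_message"


# Hypothetical command vocabulary; the phrases are assumptions made
# for illustration and are not taken from the disclosure.
COMMANDS = {
    "hello": CallOperation.ACCEPT,
    "not now": CallOperation.REFUSE,
    "just a minute": CallOperation.DELAY,
    "play message": CallOperation.PLAY_MESSAGE,
    "take a message": CallOperation.RECORD_MESSAGE,
}


def select_operation(recognized_command):
    """Map a recognized voice command to an operation.

    Returns None when the command is not in the vocabulary.
    """
    return COMMANDS.get(recognized_command.strip().lower())
```

Keeping the vocabulary in a single table means that each predetermined voice command corresponds to exactly one call processing operation.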
In accordance with another aspect of the present invention, the transparent
telephony system may further include means for screening the initiated
call, wherein the screening means includes a speaker identity recognition
system for determining the identity of the caller by detecting and
recognizing the voice utterance made by the caller, and screening the call
based on the determined identity of the caller.
Further, the transparent telephony system may be provided with means for
disconnecting the call based upon a voice utterance made by the caller or
the call recipient, whereby the caller or the call recipient may
disconnect the call after the call has been accepted by the call
recipient.
In addition, in accordance with an aspect of the present invention, the
call destination may be the call recipient's network address or location.
According to still another aspect of the present invention, a method of
transparent telephony is provided. A call is initiated from a caller's
location to a call destination based on a voice utterance made by a
caller. The voice utterance made by the caller is reproduced at the call
destination after the call has been initiated, and thereafter the call is
processed in response to detection of a voice utterance made by a call
recipient so that the call recipient may optionally accept the call and
establish two-way voice communication with the caller.
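By way of non-limiting illustration only, the ordering of the method steps may be sketched as a small state machine: the call is initiated, the caller's utterance is reproduced at the call destination, and only thereafter may the call recipient's response establish two-way communication. The class name `TransparentCall` and the state labels are hypothetical, and the recipient's decision is assumed to have been derived already from a detected voice utterance.

```python
class TransparentCall:
    """Hypothetical model of the initiate / reproduce / process sequence."""

    def __init__(self, caller, destination):
        self.caller = caller
        self.destination = destination
        self.state = "initiated"   # call initiated by the caller's utterance
        self.reproduced = []       # utterances reproduced at the destination

    def reproduce(self, utterance):
        """Reproduce the caller's utterance at the call destination."""
        if self.state != "initiated":
            raise RuntimeError("utterance is reproduced only after initiation")
        self.reproduced.append(utterance)
        self.state = "announced"

    def process(self, recipient_accepts):
        """Process the call according to the recipient's response."""
        if self.state != "announced":
            raise RuntimeError("call is processed only after reproduction")
        self.state = "two_way" if recipient_accepts else "ended"
        return self.state
```

In this sketch, attempting to process the call before the utterance has been reproduced raises an error, which reflects the "thereafter" ordering of the method.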
In accordance with yet another aspect of the present invention, a
transparent telephony system is provided for establishing two-way voice
communication over a communication network between a caller and a call
recipient. The transparent telephony system includes means for initiating
a call from a caller's location to a call destination in accordance with a
voice utterance made by the caller, and means for processing the initiated
call in response to detection of a voice utterance made by the call
recipient. The processing means enables the call recipient to optionally
accept the call and establish two-way voice communication with the caller.
The initiating means may include an automatic speech recognition system for
detecting and recognizing spoken voice utterances, and means for
determining the call destination based on a detection and recognition of
the caller's voice utterance by the automatic speech recognition system.
The initiating means may further include connection means for establishing
a one-way voice communication between the caller and the call recipient
based upon the determination of the call destination by the determining
means.
In addition, the transparent telephony system may include means for
recording the caller's voice utterance and means for reproducing the voice
utterance recorded by the recording means at the call destination
subsequent to the determination of the call destination by the determining
means.
Further, the transparent telephony system may include means for reproducing
a synthesized voice message or a recorded audio message at the call
destination subsequent to the call being initiated by the initiating
means.
The above-listed and other objects, features and advantages of the present
invention will be more fully set forth hereinafter.
BRIEF DESCRIPTION OF THE DRAWINGS
The present invention is further described in the detailed description
which follows, by reference to the noted plurality of drawings by way of
non-limiting examples of preferred embodiments of the present invention,
in which like reference numerals represent similar parts throughout the
several views of the drawings, and wherein:
FIGS. 1A and 1B illustrate a transparent telephony system in accordance
with one embodiment of the present invention with network-based and
customer premise equipment (CPE)-based speech processing, respectively;
FIG. 2 illustrates a second embodiment of a transparent telephony system
according to the present invention;
FIG. 3 illustrates a third embodiment of a transparent telephony system in
accordance with the present invention, utilizing CPE-based speech
processing;
FIG. 4 illustrates the high-level software architecture of an
implementation of the transparent telephony system of the present
invention shown in FIG. 3;
FIG. 5 is a logical flow diagram of the initialization procedure of the
transparent telephony system of the present invention shown in FIG. 3;
FIG. 6 illustrates exemplary activation events for the command vocabularies
of the transparent telephony system of the present invention depicted in
FIG. 3; and
FIG. 7 illustrates a logical flow diagram of the transparent telephony
system following the initialization procedure illustrated in FIG. 5.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
Referring now to the accompanying drawings, FIGS. 1A and 1B illustrate a
general block diagram of the transparent telephony system in accordance
with a first embodiment of the present invention.
A communications network 32 is provided for interconnecting and
communicating voice signals between a plurality of customers at N
locations (where N is an integer greater than 1). At each of the customer
locations, a specialized station set 12 is provided for inputting and
outputting audio signals, including voice commands and utterances. Each
station set 12 is coupled to communications network 32 through a speech
processing system 22. As indicated in FIGS. 1A and 1B, respectively,
speech processing system 22 of the present invention may be
network-based, customer premise equipment (CPE)-based, or a hybrid
combination of the two, depending on where the speech processing
system is located. However, it should be noted that where the complete
transparent telephony system is implemented within a local environment,
e.g., within an office building, the CPE/network distinction may become
less significant in terms of implementation.
Station set 12 includes, at each location, a microphone 16 for converting
voice utteran