Description
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates generally to a spoken-instruction controlled
system for an automotive vehicle, and more specifically to a system for an
automotive vehicle which can immediately stop or reliably return, to the
original state, a vehicle device, for instance, such as a door window
opening/closing device which is being operated by a motor erroneously and
dangerously in response to a driver's spoken instruction.
2. Description of the Prior Art
Conventionally, there is a well-known speech recognizer which can activate
various actuators in response to human spoken instructions. When this
speech recognizer is mounted on an automotive vehicle, the headlight, for
instance, can be turned on or off in response to a driver's spoken
instruction such as "Headlight on" or "Headlight off". The speech
recognizer is very convenient because various spoken instructions can be
recognized in order to control various actuators, without depressing
switches; however, there are some problems involved in applying this
system to an automotive vehicle.
One of the problems is as follows: in the speech recognizer, provided that
a predetermined spoken instruction is uttered clearly and correctly, the
system can recognize the spoken instruction accurately; however, when a
spoken instruction is uttered unclearly or incorrectly, or when an intense
noise is included within the spoken instruction, the instruction may be
erroneously recognized. For instance, even though a spoken instruction to
actuate a car radio is uttered, a door window may be actuated instead, and
some passenger's fingers may be pinched between
a moving door window and a window frame. In such a dangerous state as
described above, the driver must first depress a recognition switch, next
utter a spoken instruction to stop the moving door window, thirdly utter a
spoken instruction to move the door window reversely, and fourthly utter a
spoken instruction to stop the door window moving reversely. Additionally,
after the door window stops moving reversely, the driver must utter the
same spoken instruction to actuate a car radio again correctly while depressing the
recognition switch. That is to say, in the prior-art speech recognizer,
there exists a problem in that in case an erroneous recognition occurs,
the operation is complicated and takes much time.
A more detailed description of a typical speech recognizer or a prior-art
spoken instruction controlled system will be made with reference to the
attached drawing in conjunction with the present invention under DETAILED
DESCRIPTION OF THE PREFERRED EMBODIMENTS.
SUMMARY OF THE INVENTION
With these problems in mind, therefore, it is the primary object of the
present invention to provide a spoken-instruction controlled system for an
automotive vehicle which can easily and immediately stop a device, for
instance, such as a door window opening/closing device which is being
operated by a motor erroneously in a dangerous state due to an erroneous
spoken instruction recognition; in more detail, which can stop the moving
device immediately when the recognition switch is depressed again within a
predetermined time period after the device has been actuated. Further, in
the system according to the present invention, the system can recognize a
newly-inputted spoken instruction again if a correct spoken phrase is
inputted within a predetermined time period after the device moving
erroneously has been stopped.
It is another object of the present invention to provide a
spoken-instruction controlled system for an automotive vehicle which can
automatically return, to the original position or original conditions, a
device, for instance, such as a door window opening/closing device which
is being operated by a motor erroneously in a dangerous state due to
erroneous spoken instruction recognition; in more detail, which can first
return the erroneously-moving device automatically to its original
position when a reset switch is depressed within a predetermined time
period after the device has been actuated.
In the spoken-instruction controlled system for an automotive vehicle
according to the present invention, therefore, it is possible to
facilitate a necessary action to be taken when a spoken instruction is
erroneously recognized and thus a device is dangerously actuated against
the driver's will.
To achieve the above-mentioned primary object, the spoken-instruction
controlled system for an automotive vehicle according to the present
invention comprises means for outputting a stop command signal to the
vehicle actuator for a predetermined time period when the recognition
switch is turned on again within another predetermined time period after
the speech recognizer has outputted a recognition command signal to the
vehicle device actuator and means for enabling the speech recognizer to
recognize spoken instructions for another predetermined time period after
the stop command signal outputting means has been disabled. The
above-mentioned two means are mainly made up of a plurality of timer
units, OR gates, AND gates, inverters, etc.
To achieve the above-mentioned other object, the spoken-instruction
controlled system for an automotive vehicle according to the present
invention further comprises a reset switch, means for storing the original
operating conditions of the vehicle device actuators whenever the reset
switch is turned on within a predetermined time period after the speech
recognizer has outputted a recognition command signal, and means for
returning the operating conditions of the vehicle device actuator to the
original operating conditions another predetermined time period after the
reset switch has been turned on. The above-mentioned two means are mainly
made up of a clock pulse generator, a counter, a latch circuit, a
programmable subtract counter, a flip-flop, etc. in the case where the time
interval during which the vehicle device actuator has been operated is
important, for instance, in the case of a door window opening/closing
device; however, in the case where the on-off state in which the vehicle
device actuator has been operated is important, for instance, in the case
of a car-radio actuator, the above-mentioned two means are made up of a
latch circuit, a decoder, etc.
BRIEF DESCRIPTION OF THE DRAWINGS
The features and advantages of the spoken-instruction controlled system for
an automotive vehicle according to the present invention will be more
clearly appreciated from the following description taken in conjunction
with the accompanying drawings in which like reference numerals designate
corresponding elements or sections throughout the drawings and in which:
FIG. 1 is a schematic block diagram of a typical speech recognizer for
assistance in explaining the operation thereof;
FIG. 2 is a schematic block diagram of a first embodiment of the
spoken-instruction controlled system for an automotive vehicle according
to the present invention, by which a door window and a car radio are
actuated in response to spoken instructions;
FIG. 3 is a more-detailed circuit diagram of the actuator for a door window
opening/closing device and a car radio, which is shown in FIG. 2 above;
FIG. 4 is a more-detailed circuit diagram of a door window opening/closing
device shown in FIG. 3 above;
FIG. 5 is a schematic block diagram of a second embodiment of the
spoken-instruction controlled system for an automotive vehicle according
to the present invention, by which a door window opening/closing device is
operated in response to spoken instructions;
FIG. 6 is a timing chart showing the waveforms at each essential position
in the second embodiment shown in FIG. 5 above; and
FIG. 7 is a schematic block diagram of a third embodiment of the
spoken-instruction controlled system for an automotive vehicle according
to the present invention, by which an electronic tuner-type car radio is
actuated in response to spoken instructions.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
To facilitate understanding of the present invention, a brief reference
will be made to the principle or operation of a typical prior-art speech
recognizer, with reference to FIG. 1.
FIG. 1 shows a schematic block diagram of a typical speech recognizer 100.
To use the speech recognizer, the user must first record a plurality of
predetermined spoken instructions. Specifically, in this spoken
instruction recording mode (reference mode), the user first depresses a
record switch 1 disposed near the user. When the record switch 1 is
depressed, a switch input interface 4 detects the depression of the record
switch 1 and outputs a signal to a controller 5 via a wire 4a. In response
to this signal, the controller 5 outputs a recording mode command signal
to other sections in order to preset the entire speech recognizer to the
recording mode. In the spoken instruction recording mode, when the user
says a phrase to be used as a spoken instruction, such as "open doors",
near a microphone 2, the spoken phrase is transduced into a corresponding
electric signal through the microphone 2, amplified through a speech input
interface 6 consisting mainly of a spectrum-normalizing amplifier,
smoothed through a root-mean-square (RMS) smoother 15 including a
rectifier and a smoother, and finally inputted to a voice detector 7. This
voice detector 7 detects whether or not the magnitude of the spoken phrase
signals exceeds a predetermined level for a predetermined period of time
(150 to 250 ms) in order to determine the start of the spoken phrase input
signals and whether or not the magnitude of the signals drops below a
predetermined level for a predetermined period of time in order to
determine the end of the signals. Upon detection of the start of the
signals, this voice detector 7 outputs another recording mode command
signal to the controller 5. In response to this command signal, the
controller 5 activates a group of bandpass filters 8, so that the spoken
phrase signal from the microphone 2 is divided into a number of
predetermined frequency bands. Given to a parameter extraction section 9,
the frequency-divided spoken phrase signals are squared or rectified
therein in order to derive the voice power spectrum for each of the
frequency bands and then converted into corresponding digital time-series
matrix-phonetic pattern data (explained later). These data are then stored
in a memory unit 10. In this case, however, since the speech recognizer is
set to the spoken instruction recording mode by the depression of the
record switch 1, the time-series matrix-phonetic pattern data are
transferred to a reference pattern memory unit 11 and stored therein as
reference data for use in recognizing the speech instructions.
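The start and end rule applied by the voice detector 7 can be illustrated with a short sketch. This is not the circuit of FIG. 1: it assumes the smoothed RMS signal is available as one magnitude per fixed-length frame, and the function name, the threshold, the frame length and the hold time for the end of the phrase are all hypothetical. Only the 150 to 250 ms range for the start comes from the description above.

```python
def detect_endpoints(levels, threshold, frame_ms=10, hold_ms=200):
    """Return (start_index, end_index) of a spoken phrase, or None.

    levels    -- smoothed RMS magnitudes, one per frame (assumed input format)
    threshold -- hypothetical detection level
    hold_ms   -- time the level must stay above the threshold (start) or
                 below it (end); 200 ms lies in the 150 to 250 ms range
                 given for the start, and is assumed for the end as well
    """
    hold = max(1, hold_ms // frame_ms)
    start = None
    run = 0
    for i, level in enumerate(levels):
        if start is None:
            # Count consecutive frames that exceed the threshold.
            run = run + 1 if level > threshold else 0
            if run >= hold:
                start = i - hold + 1  # first frame of the qualifying run
                run = 0
        else:
            # Count consecutive frames that fall below the threshold.
            run = run + 1 if level < threshold else 0
            if run >= hold:
                return start, i - hold + 1  # first frame of the trailing silence
    # A phrase that never ends within the data runs to the last frame.
    return None if start is None else (start, len(levels))
```

A burst shorter than the hold time never produces a start, which is how brief noise is rejected in this sketch.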
After having recorded the reference spoken instructions, the user can input
speech instructions, such as "open doors", to the speech recognizer
through the microphone 2 while depressing a recognition switch 3.
When this recognition switch 3 is depressed, the switch input interface 4
detects the depression of the recognition switch 3 and outputs a signal to
the controller 5 via a wire 4b. In response to this signal, the controller
5 outputs a recognition mode command signal to other sections in order to
preset the entire speech recognizer to the recognition mode. In this
spoken phrase recognition mode, when the user says an instruction phrase
similar to the one recorded previously near the microphone 2 and when the
voice detector 7 outputs a signal, the spoken instruction is transduced
into a corresponding electric signal through the microphone 2, amplified
through the speech input interface 6, filtered and divided into voice
spectra across the frequency bands through the band pass filters 8,
squared or rectified and further converted into corresponding digital
time-series matrix-phonetic pattern data through the parameter extraction
section 9, and then stored in the memory unit 10, in the same manner as in
the recording mode.
Next, the time-series matrix-phonetic pattern data stored in the memory
unit 10 in the recognition mode are sequentially compared with the
time-series matrix-phonetic pattern data stored in the reference pattern
memory unit 11 in the recording mode by a resemblance comparator 12. The
resemblance comparator 12 calculates the level of correlation of the
inputted speech instruction to the reference speech instruction after time
normalization and level normalization to compensate for variable speaking
rate (because the same person might speak quickly and loudly at one time
but slowly and in a whisper at some other time). The correlation factor is
usually obtained by calculating the Tchebycheff distance (explained later)
between recognition-mode time-series matrix-phonetic pattern data and
recording-mode time-series matrix-phonetic pattern data. The correlation
factor calculated by the resemblance comparator 12 is next given to a
resemblance determination section 13 to determine whether or not the
calculated values lie within a predetermined range, that is, to evaluate
their cross-correlation. If within the range, a command signal, indicating
that a recognition-mode spoken instruction has adequate resemblance to
one of the recorded instruction phrases, is outputted to one of the actuators
14 in order to open the vehicle doors, for instance. The above-mentioned
operations are all executed in accordance with command signals outputted
from the controller 5.
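The description does not say how the time normalization and level normalization are carried out. One minimal sketch, assuming nearest-index resampling to a fixed number of time increments and scaling by the peak magnitude, is given below. Both choices and both function names are illustrative assumptions, not the method of the resemblance comparator 12.

```python
def time_normalize(frames, target_len=32):
    """Resample a non-empty sequence of frames to target_len frames.

    Uses nearest-index selection (an assumed, deliberately simple scheme)
    so that quickly and slowly spoken phrases yield the same frame count.
    """
    n = len(frames)
    return [frames[min(n - 1, (k * n) // target_len)] for k in range(target_len)]


def level_normalize(frames):
    """Scale all values so that the largest magnitude becomes 1.0.

    This makes a loudly spoken and a whispered phrase comparable.
    """
    peak = max(abs(v) for frame in frames for v in frame)
    if peak == 0:
        return [list(frame) for frame in frames]  # silent input, nothing to scale
    return [[v / peak for v in frame] for frame in frames]
```

Here each frame is a list holding one value per bandpass filter, and the default of 32 frames matches the number of time-series increments used in the example that follows in the text.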
Description has been made hereinabove of the case where the speech
recognizer 100 comprises various discrete elements or sections; however,
it is of course possible to embody the speech recognizer 100 with a
microcomputer including a central processing unit, a read-only memory, a
random-access memory, a clock oscillator, etc. In this case, the voice
detector 7, the parameter extraction section 9, the memory 10, the
reference pattern memory 11, the resemblance comparator 12 and the
resemblance determination section 13 can all be incorporated within the
microcomputer, executing the same or similar processes, calculations
and/or operations as explained hereinabove.
The digital time-series matrix-phonetic pattern data and the Tchebycheff
distance are defined as follows:
In the case where the number of the bandpass filters is four and the number
of time-series increments for each is 32, the digital recording-mode time
series matrix-phonetic pattern data can be expressed as
##EQU1##
where A designates a first recording-mode speech instruction (reference)
(e.g. OPEN DOORS), i denotes the filter index, and j denotes the time-series
increment index.
If a first recognition-mode speech instruction (e.g. OPEN DOORS) is denoted
by the character "B", the Tchebycheff distance can be obtained from the
following expression:
##EQU2##
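The two expressions are not reproduced in this text. As a rough sketch only, the comparison can be illustrated by assuming the conventional definition of the Chebyshev (Tchebycheff) distance, that is, the largest absolute element-wise difference between two patterns. The expression intended at EQU2 may differ, and the function names, the dictionary of references and the acceptance threshold are hypothetical.

```python
def chebyshev_distance(pattern_a, pattern_b):
    """Largest absolute difference between two equally sized patterns.

    Each pattern is a list of rows (one per bandpass filter), each row a
    list of values (one per time-series increment).
    """
    return max(
        abs(a - b)
        for row_a, row_b in zip(pattern_a, pattern_b)
        for a, b in zip(row_a, row_b)
    )


def recognize(pattern, references, max_distance):
    """Return the name of the closest reference, or None if none is close enough.

    references   -- dict mapping instruction name to its recorded pattern
    max_distance -- hypothetical limit playing the role of the
                    "predetermined range" of the determination section
    """
    best_name, best_dist = None, None
    for name, ref in references.items():
        d = chebyshev_distance(ref, pattern)
        if best_dist is None or d < best_dist:
            best_name, best_dist = name, d
    if best_dist is not None and best_dist <= max_distance:
        return best_name
    return None
```

With four filters and 32 increments, each pattern would be a 4 by 32 list. A smaller distance means a closer resemblance, so the determination step accepts only the best match that falls inside the limit.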
In view of the above description and with reference to the attached
drawings, the background and the embodiments of the present invention will
be explained with respect to its application to a door window
opening/closing device and a car-radio operating device used for an
automotive vehicle.
FIG. 2 is a schematic block diagram showing a first embodiment of the
present invention, by which a door window and a car radio can be actuated
in response to spoken instructions and the door window can be stopped
immediately from moving by depressing a recognition switch.
First, the system configuration will be described hereinbelow. In the
figure, the reference numeral 2 denotes a microphone for transducing a
predetermined spoken instruction uttered by the driver into an electric
signal, and the reference numeral 3 denotes a recognition switch such as a
push-button switch turned on while a spoken instruction is uttered. To
this recognition switch 3, a power supply voltage +Vc divided by two
resistors R.sub.1 and R.sub.2 is applied. Therefore, when the recognition
switch 3 is turned off, the switch output is at a logically-high voltage
level; when the recognition switch 3 is turned on, a logically-low voltage
level output signal is produced. Further, a zener diode ZD connected in
parallel with the resistor R.sub.2 serves to absorb the surge voltages
generated due to chattering of the recognition switch 3.
The reference numeral 100 denotes a speech recognizer which compares a
spoken instruction signal inputted through the microphone 2 with a
plurality of previously-stored reference spoken instruction data, and
outputs a command signal when the spoken instruction signal coincides with
or corresponds to one of the reference data within a predetermined range.
In this embodiment, a power-operated vehicle door window 20 and a car radio
21 can be actuated. Therefore, when a spoken instruction "Open window" is
recognized by the speech recognizer 100, a command signal to open the door
window 20 is applied to an actuator 14 via a signal line 100a; when
"Close window" is recognized, another command signal to close the door
window is applied to the actuator 14 via a signal line 100b; when
"Car-radio on" is recognized, another command signal to turn on the car
radio 21 is applied to the actuator 14 via a signal line 100c; when
"Car-radio off" is recognized, the other command signal to turn off the
car radio 21 is applied to the actuator 14 via a signal line 100d. The
actuator provided with relays for actuating the door window 20 or the car
radio 21 in response to these command signals from the speech recognizer
100 will be described later in more detail with reference to FIGS. 3 and
4.
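The assignment of recognized instructions to signal lines can be summarized as a small lookup. The sketch below only restates the mapping given above; the function name and the use of the strings "H" and "L" for logic levels are illustrative assumptions.

```python
# Instruction phrase -> signal line carrying its command signal.
SIGNAL_LINES = {
    "Open window": "100a",
    "Close window": "100b",
    "Car-radio on": "100c",
    "Car-radio off": "100d",
}


def command_outputs(recognized):
    """Return the level of each signal line for a recognized instruction.

    Only the line assigned to the recognized phrase goes to H-level; an
    unrecognized phrase (or None) leaves every line at L-level.
    """
    active = SIGNAL_LINES.get(recognized)
    return {line: ("H" if line == active else "L") for line in SIGNAL_LINES.values()}
```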
On the other hand, the output of the recognition switch 3 is applied to an
inverter 22. This inverter 22 outputs a L-level output signal when the
recognition switch 3 is off and a H-level output signal when the
recognition switch 3 is on. The output signal of the inverter 22 is given
to one input terminal of a first AND gate 25 and one input terminal of a
second AND gate 23 of a recognition inhibitor 24. This recognition
inhibitor 24 made up of the second AND gate 23 and an inverter 26 serves
to inhibit the operation of the speech recognizer 100.
The respective signal lines 100a and 100b from the speech recognizer 100
for producing the command signals for opening or closing the door window
20 are connected to the first OR gate 27 separately. The output signal of
this first OR gate 27 is inputted to a first timer unit 28 for outputting
a H-level signal for a predetermined time period T.sub.1. When either of
signal lines 100a or 100b changes to a H-level by a command signal from
the speech recognizer 100, the first timer unit 28 is activated via the
first OR gate 27. Therefore, the first timer unit 28 has a function to
hold a command signal corresponding to the signal lines 100a and 100b for
a predetermined time period T.sub.1. In this embodiment, it is also
possible to design the first timer unit 28 so as to keep outputting a
H-level signal while either of command signals develops in either of
signal lines 100a or 100b.
The output of the first timer unit 28 is applied to the second AND gate 23
via the inverter 26 of the recognition inhibitor 24 and also to the first AND
gate 25, the output of which activates a second timer unit 29. Therefore, the
second timer unit 29 outputs a H-level signal for another
predetermined time period T.sub.2 when the recognition switch 3 is
depressed within a predetermined time period T.sub.1 after the first timer
unit 28 has been actuated. When the output of this second timer unit 29
changes from a H-level to a L-level, this L-level signal is applied to the
actuator 14 to stop the operation of the door window 20. Therefore, the
time period T.sub.2 determined by the second timer unit 29 is so preset as
to be sufficient to stop the operation of the door window 20, completely.
The output of the second timer unit 29 is also applied to a third timer
unit 30 for outputting a H-level signal for a predetermined time period
T.sub.3. This third timer unit 30 is activated by an operation-stop
command signal generated when the output of the second timer unit 29
changes from a H-level to a L-level. The H-level signal from the third
timer unit 30 is applied to the speech recognizer 100 via the second OR
gate 31 during the predetermined time period T.sub.3. Therefore, the
speech recognizer 100 is enabled to operate for a predetermined time
period T.sub.3 after the stop command signal has been stopped from being
outputted from the second timer unit 29, so that spoken instructions from
the microphone 2 can be recognized.
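The gate connections described above can be restated as Boolean logic. The following is a sketch, not the circuit of FIG. 2: it treats True as H-level and False as L-level, ignores the timing of the timer units, and uses a hypothetical function name and dictionary keys.

```python
def gate_logic(switch_on, line_100a, line_100b, timer1_out, timer3_out):
    """Combinational part of the first embodiment (True = H-level).

    switch_on  -- recognition switch 3 is depressed
    line_100a  -- "open window" command signal
    line_100b  -- "close window" command signal
    timer1_out -- output of the first timer unit 28
    timer3_out -- output of the third timer unit 30
    """
    inv22 = switch_on                # inverter 22: H-level while the switch is on
    or27 = line_100a or line_100b    # first OR gate 27
    inv26 = not timer1_out           # inverter 26 of the recognition inhibitor 24
    and23 = inv22 and inv26          # second AND gate 23
    and25 = inv22 and timer1_out     # first AND gate 25
    or31 = and23 or timer3_out       # second OR gate 31
    return {
        "trigger_timer1": or27,       # starts the period T1
        "trigger_timer2": and25,      # starts the stop command period T2
        "recognizer_enabled": or31,   # enable input of the speech recognizer 100
    }
```

While the first timer unit is running, a press of the recognition switch therefore triggers the second timer unit instead of enabling recognition, and the third timer unit can enable recognition without the switch being pressed.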
Now follows a description of the operation of the first embodiment of the
present invention shown in FIG. 2.
When the recognition switch 3 is turned on and a spoken instruction "Radio"
is inputted through the microphone 2 in order to hear radio broadcasting,
the output of the inverter 22 becomes a H-level as described already. On
the other hand, since the first timer unit 28 is not yet activated and
therefore the output thereof is at a L-level, the inverter 26 in the
recognition inhibitor 24 outputs a H-level output signal. As a result, the
second AND gate 23 applies a H-level signal to the speech recognizer 100
via the second OR gate 31 to enable the speech recognizer 100 to operate.
Therefore, the spoken instruction signal transduced into an electric signal
through the microphone 2 to turn on the power supply of the car radio 21
is recognized in the speech recognizer 100. If recognized to be correct, a
H-level command signal is produced in the signal line 100c and a power
supply is supplied to the car radio 21 through the actuator 14. In
contrast with this, if an erroneous recognition of a spoken instruction is
made in the speech recognizer 100 for some reason such as an unclear
spoken instruction, a command signal, for instance, to open the door
window 20 may be developed in the signal line 100a. As a result, the
actuator 14 operates so as to lower the glass of the door window 20
erroneously. In this case, the H-level command signal through the signal
line 100a due to erroneous recognition of spoken instruction is inputted
to the first timer unit 28 via the first OR gate 27 to activate the first
timer unit 28 for a predetermined time period T.sub.1 so as to output a
H-level output signal. This H-level output signal from the first timer
unit 28 is inverted into a L-level signal by the inverter 26 of the
recognition inhibitor 24 to inhibit the H-level output signal outputted
from the second AND gate 23 which would otherwise be generated when the
recognition switch 3 is turned on. Therefore, when the recognition switch
3 is turned on, the speech recognizer 100 is disabled and any spoken
instruction phrases are not recognized for the time period T.sub.1 ;
however, the first AND gate 25 outputs a H-level signal because the
H-level output signal from the inverter 22 generated when the recognition
switch 3 is turned on is inputted to the first AND gate 25, so that the
second timer unit 29 is activated.
Therefore, in the case where the door window 20 is operated erroneously, if
the recognition switch 3 is turned on again within the predetermined time
period T.sub.1 during which the first timer unit 28 is outputting a
H-level signal, a H-level output signal from the inverter 22 is given to
the second timer unit 29 via the first AND gate 25 to activate it, so that
a L-level stop command signal is outputted over the predetermined time
period T.sub.2. This stop command signal (L-level) from the second timer
unit 29 is given to the actuator 14 to stop the door window now in
operation.
The L-level output signal from the second timer unit 29 is also given to
the third timer unit 30. Since this third timer unit 30 is activated when
the input changes from a H-level to a L-level, this third timer unit 30
generates a H-level output signal over the predetermined time period
T.sub.3. This H-level output signal from the third timer unit 30 is given
to the speech recognizer 100 via the second OR gate 31 to enable the
speech recognizer 100 to operate while the third timer unit 30 is
outputting a H-level output signal. Therefore, after the door window 20
has been stopped, when the same spoken instruction "Radio" for supplying a
power supply to the car radio is inputted without depressing the
recognition switch 3 again, this spoken instruction is recognized by the
speech recognizer 100 as the instruction for
turning on the car radio 21. Therefore, a H-level command signal is
produced through the signal line 100c to activate the actuator 14, so that
a power supply is supplied to the car radio 21. In this case, it is also
possible to utter a spoken instruction for closing the window.
In the system according to the present invention as described above, in the
case where the device is erroneously operated in response to an
erroneously recognized command signal, if the recognition switch 3 is
turned on again within the time period T.sub.1 (one to three seconds)
predetermined by the first timer unit 28, it is possible to immediately
stop the device now in operation erroneously. Additionally, when the
recognition switch 3 is depressed, since the speech recognizer 100 becomes
operative again in response to the stop command signal, without depressing
the recognition switch 3 again, it is possible to activate the actuator 14
correctly by uttering the same instruction phrase again clearly. That is
to say, it is possible to easily stop the device actuated erroneously and
to input a correct spoken instruction again after the device has been
stopped, by depressing the recognition switch only once.
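The resulting sequence of time periods can be sketched as a small calculation. The default for T1 is taken from the one to three second range given above; the values for T2 and T3 and the function name are hypothetical. The sketch also assumes that the period T3 begins when the stop command ends, which is one reading of the description.

```python
def recovery_timeline(t_command, t_press, t1=2.0, t2=0.5, t3=5.0):
    """Return (stop_interval, listen_interval) in seconds, or None.

    t_command -- time at which the erroneous command signal is outputted
    t_press   -- time at which the recognition switch is depressed again
    t1        -- period T1 of the first timer unit (one to three seconds)
    t2, t3    -- periods T2 and T3 (placeholder values, not from the source)
    """
    # The stop command is produced only if the switch is pressed within T1.
    if not (t_command <= t_press < t_command + t1):
        return None
    stop_start = t_press
    stop_end = stop_start + t2
    # Assumption: the recognizer is re-enabled for T3 once the stop command ends.
    return (stop_start, stop_end), (stop_end, stop_end + t3)
```

For example, a press one second after the erroneous command yields a stop interval followed by a listening interval during which the correct instruction can be uttered without pressing the switch again; a press after T1 has expired returns None, since it is then an ordinary recognition request.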
FIG. 3 shows an actual circuit configuration of the actuator 14 for use
with the first embodiment shown in FIG. 2, to which the signal lines 100a
to 100d from the speech recognizer 100 and the signal line 29a from the
second timer unit 29 are connected, respectively. The signal line 100a
through which a recognition command signal to open the door window is outputted is
connected to the base of a transistor T.sub.10. When the signal line 100a
becomes a H-level, the transistor T.sub.10 is turned on to energize the
first relay coil 40c of the first relay 40, so that the first relay
contacts 40a and 40b are both closed. If these first relay contacts 40a
and 40b are closed, since the terminals a and c in the door window
opening/closing device 50 are grounded, a motor in the door window device 50 is
rotated in the direction to lower the window glass, as explained in more
detail later with reference to FIG. 4. In the same way, with respect to
the signal line 100b through which a recognition command signal to close
the door window is outputted, since a transistor T.sub.20 and a second
relay 42 are provided, when a recognition command signal to close the door
window is outputted from the speech recognizer 100, the transistor
T.sub.20 is turned on to energize the second coil 42c of the second relay
42, so that the second relay contacts 42a and 42b are both closed. If
these second relay contacts 42a and 42b are closed, since the terminals a
and b in the door window device 50 are grounded, the motor in the door
window device 50 is rotated in the direction to lift the