BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to personal organizers for electronically
storing messages, reminders, phone numbers, addresses, and other such
data, and more particularly to personal organizers which are voice
activated.
2. History of the Prior Art
Many types of electronic personal organizers or "data banks" are presently
available. Such organizers range from simple devices that allow for
storage of phone numbers, addresses and appointments, to more complicated
devices that approach the capabilities of small computers. All such
devices require data to be entered using a keyboard. Simpler devices may
use a calculator type keyboard, while more complex devices typically
require a computer/typewriter type of keyboard.
In presently available electronic personal organizers, the user typically
selects a function by pressing one or more keys on the keyboard. The user
then enters data using the keyboard, usually filling out a predefined form
for the function selected. For example, a phone directory entry typically
requires the user to type the name of the person or organization to be
added to the directory, in a specific field. The user then indicates by
keystroke when the filling of the field is finished. The organizer then
automatically moves to the next field, where the user inputs the phone
number. This field may be further subdivided into area code and phone
number. When entry of the information is finished, the user indicates by
keypress that the entry is to be saved.
In conventional electronic personal organizers, retrieval of data is
accomplished by similar keypress operations. The user again begins by
selecting the function, following which a search for the desired
information is begun. In the case of the phone directory previously
referred to, the user may simply scroll through the directory looking for
the desired entry, with a single keypress being used to advance from one
entry to the next. A more sophisticated search is often provided, by which
the user may type the first letter (or perhaps more) of the name. This
causes the directory to skip to the appropriate alphabetical region.
In conventional electronic personal organizers, a second type of data is
often stored. Instead of being stored for later retrieval at the user's
initiation, this data is interpreted by the organizer so as to ultimately
cause the organizer to take a particular action at a later time, with no
further action on the part of the user being required. For example, a time
and date can be entered in a reminder function. The organizer keeps track
of the time in order to automatically alert the user when the selected
time arrives, with no user intervention being required. Thus, an alarm
function is performed. Typically, a message is associated with the alarm
function to provide the reminder with some context. The message as well as
the alarm time are entered by keystrokes in a form similar to that
described previously in connection with the phone directory example.
When using conventional electronic personal organizers in the manner
described, the user must type in information using a small keypad. The
keypad must be of minimum size in order for the keys to be usable. This
conflicts with the need to make the organizer as small and portable as
possible. Elimination of the need for a complete keypad for data input and
retrieval would eliminate the need for compromise, allowing the organizer
to be made small and portable and at the same time easily used. As
previously noted, conventional electronic personal organizers typically
require a computer/typewriter type keyboard for complete flexibility in
entry of number data, such as phone numbers, times and dates, and text
data, such as memos. This requires a certain level of skill on the user's
part, and can be quite time consuming. Also, the large number of keys
required results in the unit being relatively large.
For this reason, voice activation and other voice recognition techniques
have provided, in certain electronic devices, a useful alternative to
elaborate user interfacing through a large keyboard. Examples of voice
activated electronic devices include remote
controls which utilize sophisticated electronics to recognize spoken
words, translate the commands of the user into traditional digital remote
control signals, and transmit the control signals to a controlled device.
Examples of such systems are provided by co-pending application Ser. No.
07/915,112 of Bissonnette et al., entitled Voice Operated Remote Control
Device, by co-pending application Ser. No. 07/915,938 of Bissonnette et
al., entitled Voice Recognition Apparatus and Method, and by co-pending
application Ser. No. 07/915,114 of Fischer, entitled Remote Control
Device. All three applications were filed on Jul. 17, 1992 and are
commonly assigned with the present application.
A further example of a voice operated remote control system is provided by
co-pending application Ser. No. 08/113,394 of Fischer et al., entitled
Voice Operated Remote Control System. The Fischer et al. application,
which was filed Aug. 27, 1993 and which is commonly assigned with the
present application, describes a system which includes a remote control
device responsive to the voice commands of the user to transmit
representations of the voice commands to a controlled device. The
controlled device produces voice signals in response to the transmitted
representations, and includes voice recognition circuitry for recognizing
the transmitted voice commands and executing action routines denoted
thereby.
Voice recognition techniques have also been applied to systems capable of
performing organizer type functions. Typically, such systems are very
large in size in order to accommodate the data storage and other
functions. This severely limits their applicability to small, portable,
hand-held applications. An example of such systems is provided by U.S.
Pat. No. 5,014,317 of Kita et al., which describes recording and
reproducing apparatus in which externally input voice commands stating an
alarm time are converted into voice data for storage in a memory together
with an associated message. When the alarm time is reached, the
corresponding voice data stored in the memory is read out and audibly
reproduced so as to sound the alarm time and play back the associated
message.
The system described in Kita et al. is exemplary of extremely complex
systems which are difficult and expensive to implement, and yet which are
limited in terms of their flexibility in changing or correcting data and
in terms of the functions which they otherwise can perform. Such systems
typically carry out voice recognition and voice recording simultaneously,
thereby requiring a substantial amount of hardware.
In addition to the large, elaborate, computer type systems such as that
described in Kita et al., voice recognition techniques have been applied
to smaller systems where the functions may be simpler and easier to
perform in compact environments. An example of this is provided by U.S.
Pat. No. 4,882,685 of van der Lely, which patent describes a calculator
responsive to certain action words such as "add", "subtract", "multiply",
and "divide". Other examples of such systems are provided by patents
relating to automatic telephone dialers. Such patents include U.S. Pat.
No. 4,644,107 of Clowes et al., U.S. Pat. No. 5,007,081 of Schmuckal et
al., U.S. Pat. No. 4,928,302 of Kaneuchi et al., and U.S. Pat. No.
4,864,622 of Iida et al.
In developing electronic personal organizers, it has become apparent that
digital voice recording is a significantly easier and more natural method
of inputting and storing data than typed text. Furthermore, input data in
the form of
numbers, dates, times, and the like, can be handled in a more natural and
simpler way by utilizing voice recognition technology. However, while such
techniques greatly simplify use of the organizer, they do so at the
expense of considerably greater complexity in the implementation of the
organizer. This is a particular problem if the organizer is to be produced
in a small, portable form. Thus, whereas a text memo typed into a
conventional organizer using a keyboard will typically require 7 or 8 bits
for each character, and a simple message such as "Call the office and
speak with Bob" will typically require 238-272 bits, plus several
additional "overhead" bits to keep the stored information organized, an
organizer utilizing digital voice recording and voice recognition of data
input will typically require 16,000-32,000 bits for proper storage of a
sentence requiring only 1-2 seconds to speak. In addition to such storage
requirements, there is the added requirement of providing the electronics
for voice input and playback, including a microphone, a speaker, and
appropriate amplifiers.
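By way of illustration only, the storage comparison above can be checked with a short calculation. The sketch below counts the characters in the example message and multiplies by 7 or 8 bits per character. The voice figure assumes a rate of 16,000 bits per second, which is inferred from the 16,000-32,000 bit range quoted for 1-2 seconds of speech. The function names are hypothetical.

```python
# Illustrative arithmetic only; names and the 16,000 bit/s rate are assumptions
# inferred from the figures quoted in the passage above.

MESSAGE = "Call the office and speak with Bob"


def text_bits(message, bits_per_char):
    """Bits needed to store a typed message, excluding overhead bits."""
    return len(message) * bits_per_char


def voice_bits(seconds, bits_per_second=16_000):
    """Bits needed to store a voice recording of the given duration."""
    return int(seconds * bits_per_second)


print(len(MESSAGE))                                   # 34 characters
print(text_bits(MESSAGE, 7), text_bits(MESSAGE, 8))   # 238 272
print(voice_bits(1), voice_bits(2))                   # 16000 32000
```

On these assumptions a one-second recording needs roughly 60 times the storage of the same sentence held as 8-bit text.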
In such organizers utilizing voice recognition, there is the increased
requirement of additional componentry for implementing the voice
recognition process. There is also the need for sufficient processing
power to enable a voice recognition algorithm to be run, and increased
memory requirements both for the program memory, typically a ROM, for
storage of the recognition algorithm and other parts of the organizer
program, and for a read-write type memory, typically a RAM, for storage
of information pertaining to the voice of the user. Even so, such added
memory requirements are but a small fraction of the memory requirements
for voice recording.
To make the implementation practical, the voice recognition requirements
must be limited. The method of use should provide context for the
recognition function, to allow for voice recognition with the limited
processing power obtainable in a small, portable device. For example,
recognizing a few key words placed at varying points within a continuous
stream of speech would require a large, complex computer system.
Accordingly, there is a need for an electronic personal organizer capable
of digitally storing reasonable amounts of message data, and facilitating
the use thereof through appropriate voice recognition techniques. At the
same time, such an organizer must be capable of implementation in a small,
portable, hand-held package in order for it to be practical and to lend
itself to large-scale use.
BRIEF SUMMARY OF THE INVENTION
Briefly stated, the present invention provides an electronic personal
organizer in which both commands and data are entered and retrieved by
voice. Two types of voice interaction are provided: digital voice
recording and voice recognition. Only a minimal number of buttons or other
manual controls is required, which simplifies the device and its user
interface.
In electronic organizers according to the invention, voice recognition is
performed on words spoken by the user to input data into the organizer. At
the same time, voice messages from the user are recorded in the organizer.
The organizer follows a set routine so that it can readily be determined
when voice input from the user comprises input data for the voice
recognition process and when the voice input is a message to be stored.
The voice messages are preferably compressed and then converted into
digital signals for storage in a memory. The spoken words and the voice
messages may be input using a microphone.
In electronic organizers according to the invention, voice recognition is
carried out by implementing a voice recognition algorithm in conjunction
with templates previously made from a user's voice and stored. When
setting up the organizer for use, the user is required to speak each of a
limited vocabulary of key words into the organizer, for creation and
storage of the digital templates corresponding to the user's spoken words.
Thereafter, as the user speaks the various words, the spoken words are
compared with the stored templates in search of matches which denote
recognition of certain ones of the key words. The various templates are
trained until acceptable matches with the user's voice can be confirmed.
Thereafter, the templates can be periodically corrected or retrained as
appropriate.
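The template comparison described above can be sketched as a nearest-template search with a rejection threshold. This is a minimal sketch only: the text does not state how templates are represented or compared, so the fixed-length feature vectors, the Euclidean distance, the threshold, and the function names below are all assumptions made for illustration.

```python
import math

# Hypothetical sketch: templates are assumed to be fixed-length feature
# vectors, and "matching" is assumed to mean smallest Euclidean distance.


def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recognize(features, templates, threshold):
    """Return the key word whose template is closest to the spoken input,
    or None when no template lies within the acceptance threshold."""
    best_word, best_dist = None, float("inf")
    for word, template in templates.items():
        d = distance(features, template)
        if d < best_dist:
            best_word, best_dist = word, d
    return best_word if best_dist <= threshold else None


templates = {"yes": [1.0, 0.0, 0.0], "no": [0.0, 1.0, 0.0]}
print(recognize([0.9, 0.1, 0.0], templates, threshold=0.5))  # yes
print(recognize([5.0, 5.0, 5.0], templates, threshold=0.5))  # None
```

A rejection threshold of this kind is one way a small vocabulary can keep false matches low; a rejected input would prompt the user to repeat the word.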
The voice messages stored in the electronic organizer are selectively
played back by converting such messages into analog signals and amplifying
and filtering such signals before application to a speaker to produce the
audio sounds corresponding thereto.
The electronic organizer includes a liquid crystal display or similar
display together with a limited keypad. The keypad provides for manual
entry of a limited number of selections and commands in connection with
the voice recognition process. The display provides information feedback
to the user, to facilitate interaction between the user and the organizer.
The electronic organizer includes a microcontroller having a plurality of
different memories for storage of information together with a
microprocessor and a stored program. The program establishes a set
operating routine for the organizer, whereby various different
predetermined functions may be carried out. By having a set operating
routine, the organizer can determine which voice inputs require voice
recognition in accordance with the limited vocabulary of key words and
which voice inputs comprise voice messages to be stored.
Various functions which the electronic organizer is capable of performing
include memo record, reminder, manual reminder, timer setting, message
review, phone group select, number retrieval, add phone number, security,
and "no" logic.
BRIEF DESCRIPTION OF THE DRAWINGS
A better understanding of the invention may be had by reference to the
following specification in conjunction with the accompanying drawings, in
which:
FIG. 1 is a block diagram of a voice activated personal organizer in
accordance with the invention;
FIG. 2 is a plan view of the voice activated personal organizer
electronically represented by the block diagram of FIG. 1 and showing the
limited keypad made possible in accordance with the invention;
FIG. 3 is a pictorial representation of the different types of data stored
in the DRAM of the voice activated personal organizer of FIG. 1;
FIG. 4 is a flow diagram of the idle mode/select operation mode software
routine implemented in the control program of the voice activated personal
organizer of FIG. 1;
FIG. 5 is a flow diagram of the set clock operation mode software routine
implemented in the control program of the voice activated personal
organizer of FIG. 1;
FIG. 6 is a flow diagram of the voice training operation mode software
routine implemented in the control program of the voice activated personal
organizer of FIG. 1;
FIG. 7 is a flow diagram of the memo record operation mode software routine
implemented in the control program of the voice activated personal
organizer of FIG. 1;
FIG. 8 is a flow diagram of the reminder setting operation mode software
routine implemented in the control program of the voice activated personal
organizer of FIG. 1;
FIG. 9 is a flow diagram of the manual reminder operation mode software
routine implemented in the control program of the voice activated personal
organizer of FIG. 1;
FIG. 10 is a flow diagram of the timer setting operation mode software
routine implemented in the control program of the voice activated personal
organizer of FIG. 1;
FIG. 11 is a flow diagram of the message review operation mode software
routine implemented in the control program of the voice activated personal
organizer of FIG. 1;
FIG. 12 is a flow diagram of the waiting message operation mode software
routine implemented in the control program of the voice activated personal
organizer of FIG. 1;
FIG. 13 is a flow diagram of the calendar operation mode software routine
implemented in the control program of the voice activated personal
organizer of FIG. 1;
FIG. 14 is a flow diagram of the phone group select operation mode software
routine implemented in the control program of the voice activated personal
organizer of FIG. 1;
FIG. 15 is a flow diagram of the number retrieval operation mode software
routine implemented in the control program of the voice activated personal
organizer of FIG. 1;
FIG. 16 is a flow diagram of the add phone number operation mode software
routine implemented in the control program of the voice activated personal
organizer of FIG. 1;
FIG. 17 is a flow diagram of the security operation mode software routine
implemented in the control program of the voice activated personal
organizer of FIG. 1;
FIG. 18 is a flow diagram of the "No" logic operation mode software routine
implemented in the control program of the voice activated personal
organizer of FIG. 1; and
FIGS. 19A-19G are illustrations of different visual displays provided by
the voice activated personal organizer of FIG. 1 during various operations
thereof.
DETAILED DESCRIPTION
FIG. 1 is a block diagram of a voice activated personal organizer 10 in
accordance with the invention. The description of FIG. 1 and of the
various flow diagrams in subsequent figures of the drawings which relate
thereto is provided by way of example only. Accordingly, it will be
apparent to those skilled in the art that other arrangements and software
routines are possible in accordance with the invention.
In the present example, the organizer 10 includes a microcontroller 12,
which is the key component of the organizer 10 inasmuch as it manages
operation of the overall system of the organizer 10 in addition to
operating the voice recognition algorithms. The microcontroller 12
includes a ROM (read only memory) 14 which stores a program for operating
the organizer 10 as well as static data used in implementing the functions
of the organizer 10. The ROM 14 is shown in FIG. 1 as an internal part of
the microcontroller 12, but it will be understood that the ROM 14 and
other components like it can alternatively comprise separate components
located external to the microcontroller 12.
Also contained within the microcontroller 12 is a RAM (random access
memory) 16 which is used for local temporary storage of data necessary for
the microcontroller 12 to fully implement the functions required. The
microcontroller 12 further includes an A/D (analog-to-digital) converter
18 for converting inputted voice signals to digital form. The
microcontroller 12 includes LCD (liquid crystal display) drive circuitry
for driving an LCD 20.
In addition to the RAM 16 which is internal to the microcontroller 12, the
organizer 10 has a larger amount of RAM memory external to the
microcontroller 12, for more permanent storage of several types of data.
This is accomplished by a DRAM 22, although it should be understood that
other types of memories can also be used. As shown in dotted outline,
additional expansion memories can be provided as necessary. The DRAM 22
stores "voice templates" that are collected during the set-up process to
enable recognition of a specific user's voice. The DRAM 22 is also used to
store the dates and times for any reminders, as well as phone numbers for
the phone directory function. The DRAM 22 also contains "flags" for each
such item indicating, for example, that a phone number is a home number or
a work number, or that a reminder is to occur weekly, or daily, or one
time only. The bulk of the DRAM 22 is used for the storage of digital
voice recordings.
In the organizer 10 of FIG. 1, a sound transducer for incoming voice
commands and messages from the user is provided by a microphone 24. The
microphone 24 converts the acoustic waves generated by the user's voice
into analog electronic signals, which are amplified and filtered by an
analog input amplifier and filter 26. The analog input amplifier and
filter 26 amplifies and filters the signals from the microphone 24 in such
a way as to optimize the capabilities of the voice recognition algorithms
employed by the microcontroller 12. At the same time, the analog signal
is also amplified and filtered by the analog input amplifier and filter 26
in such a way as to optimize the recording quality. Consequently, the
overall transfer function of the signal path from the microphone 24 to a
voice compression and decompression circuit 28 is different from the
transfer function of the path from the microphone 24 to the A/D converter
18 used for the voice recognition algorithms. The difference is
necessary because the optimal signals for the two processes, namely
recording of messages and voice recognition, are different. Such
differences are in part made necessary by the hardware approach to
compression and decompression provided by the voice compression and
decompression circuit 28.
As previously noted, the voice signal received by the microphone 24 and
processed by the analog input amplifier and filter 26 is applied directly
to the A/D converter 18 of the microcontroller 12 for voice recognition.
The A/D converter 18, which could be external to the microcontroller 12
instead of forming an internal component thereof as shown, converts the
analog voice signal into a digital signal which the microcontroller 12 can
use. At the same time, voice signals to be recorded are provided by the
analog input amplifier and filter 26 to the voice compression and
decompression circuit 28. In the present example, the voice compression
and decompression circuit 28 implements a Continuously Variable Slope
Delta Modulation (CVSD) compression and decompression algorithm.
Consequently, the circuit 28 is a form of A/D converter, but one that
also significantly reduces the amount of digital data that results from
conversion of the analog voice signal to the digital voice signal. This
allows a minimum of memory to be used for
recording the voice messages. The data comprising such voice messages is
stored in the DRAM 22. The CVSD compression and decompression algorithm
also converts the stored compressed digital voice signals back into analog
voice signals for playback, via an analog output amplifier and filter 30
and a speaker 32. The analog output amplifier and filter 30 optimizes the
sound quality for reproduction by the speaker 32.
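For readers unfamiliar with CVSD, the following is a minimal software model of the general technique. It is not a description of the circuit 28, which is a hardware implementation whose parameters are not given here. CVSD emits one bit per sample indicating whether the input lies above or below a running estimate, and enlarges the step size when several consecutive bits agree. The step limits, the three-bit run rule, the decay and leak factors, and all names below are illustrative assumptions.

```python
class CvsdState:
    """Shared predictor state; encoder and decoder update it identically."""

    def __init__(self, min_step=10.0, max_step=1280.0,
                 step_gain=20.0, step_decay=0.98, leak=0.99):
        self.min_step, self.max_step = min_step, max_step
        self.step_gain, self.step_decay, self.leak = step_gain, step_decay, leak
        self.estimate = 0.0
        self.step = min_step
        self.history = []  # the most recent three bits

    def update(self, bit):
        """Adapt the step size from recent bits, then move the estimate."""
        self.history = (self.history + [bit])[-3:]
        if len(self.history) == 3 and len(set(self.history)) == 1:
            # Three equal bits in a row: the slope is too small, so grow it.
            self.step = min(self.step + self.step_gain, self.max_step)
        else:
            # Otherwise let the step size decay back toward its minimum.
            self.step = max(self.step * self.step_decay, self.min_step)
        delta = self.step if bit else -self.step
        self.estimate = self.estimate * self.leak + delta
        return self.estimate


def cvsd_encode(samples):
    """Return one bit per input sample."""
    state, bits = CvsdState(), []
    for sample in samples:
        bit = 1 if sample >= state.estimate else 0
        bits.append(bit)
        state.update(bit)
    return bits


def cvsd_decode(bits):
    """Rebuild an approximation of the waveform from the bit stream."""
    state = CvsdState()
    return [state.update(bit) for bit in bits]
```

At one bit per sample, a sampling rate of 16,000 samples per second would give the 16,000 bits per second implied by the storage figures quoted in the background; that sampling rate is an inference, not a stated value.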
The LCD (liquid crystal display) 20 is utilized to visually feed back
information to the user of the organizer 10. As shown in FIG. 2 as well as
in FIG. 1, the organizer 10 has a keypad 34 of limited size, to enable the
user to interact with the organizer 10. The keypad 34 has but 12 keys,
which are denoted and used to perform functions as follows:
______________________________________
Key      Function
______________________________________
record   used for making voice recordings
phone    used for phone directory input and retrieval
select   used to select functions for review/use
time     used for voice input of times/dates and other data
play     used for playing back recordings
next     used to advance to the next item
prev     used to move to the previous item
stop     used to abort the present operation
train    used for training the organizer to the user's voice
save     used to store information in the RAM
erase    used to eliminate information from the RAM
edit     used for entering editing and manual input modes
______________________________________
As shown in FIG. 1, the organizer 10 is powered by a primary battery
circuit 36 comprising several rechargeable batteries coupled in series, a
voltage regulator, and two voltage comparators. The comparators indicate
the status of the batteries, both to warn the user of the need for
recharging and to allow the microcontroller 12 to shut down all operations
other than maintenance of the time of day and the memory contents if the
batteries become dangerously low. Whenever the batteries in the primary
battery circuit 36 become low, a backup battery circuit 38 connects
non-rechargeable backup batteries to power the organizer 10. If a
comparator within the primary battery circuit 36 determines that the
primary batteries therein are nearly exhausted, the regulator shuts down,
and only the backup batteries within the backup battery circuit 38 are
used. In that instance, the microcontroller 12 immediately stops all
operations other than minimal maintenance to prevent loss of the memory
contents. A battery
charging circuit 40 provides a regulated current to the primary batteries
in the primary battery circuit 36, when an external charger is plugged
into a charger jack 42. The battery charging circuit 40 automatically
senses when a charger is plugged into the jack 42 and signals the
microcontroller 12 accordingly.
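The battery supervision described above amounts to a three-state decision driven by two comparator thresholds. The sketch below models that decision in software for illustration only. The threshold voltages, state names, and function names are hypothetical, and the actual comparators are analog circuits rather than code.

```python
from enum import Enum


class PowerState(Enum):
    NORMAL = "normal"            # primary batteries healthy
    LOW = "low"                  # warn the user; backup batteries connected
    BACKUP_ONLY = "backup_only"  # regulator off; minimal maintenance only


def power_state(primary_volts, low_volts=4.4, cutoff_volts=4.0):
    """Classify the primary battery voltage against two thresholds.
    Both threshold values are placeholders, not figures from the text."""
    if primary_volts <= cutoff_volts:
        return PowerState.BACKUP_ONLY
    if primary_volts <= low_volts:
        return PowerState.LOW
    return PowerState.NORMAL


def backup_connected(state):
    """Backup batteries are connected whenever the primary pack is low."""
    return state is not PowerState.NORMAL


def full_operation_allowed(state):
    """Only time-of-day and memory maintenance continue on backup power."""
    return state is not PowerState.BACKUP_ONLY


print(power_state(4.8), power_state(4.2), power_state(3.9))
```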
The DRAM 22 stores data which is generated as the user uses the organizer
10. As shown in FIG. 3, the DRAM 22 is divided into two basic storage
areas. A first such area 44, comprising the vast majority of the DRAM 22,
is used for voice recordings. The remainder of the DRAM 22, as represented
by a second area 46, is divided into five separate areas. A first one 48
of the five areas is an "overhead" storage area used in the operation of
the personal organizer 10. The area 48 stores data used in maintaining the
state of operation of the personal organizer 10. The area 48 is of fixed
size, and the various data fields thereof are also fixed within such area.
A second one 50 of the five areas within the storage area 46 is used to
store voice templates which are created when the user trains the personal
organizer 10 to his or her voice. Because the number of words stored for
recognition purposes is known, the size of the area 50 is fixed.
A third one 52 of the five areas within the storage area 46 contains data
pertaining to reminders and memos, which are described hereafter. The area
52 is divided into 255 small segments, one for each memo and reminder
allowed. Each segment holds status information indicating whether the item is a
reminder, recurring reminder, or memo, as well as an indication of which
recording it is associated with. The time of recording a memo, or the due
time for a reminder, and the period of recurrence for a recurring
reminder, are also stored in this area. The storage area 52 is of fixed
length.
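One way to picture the area 52 is as a fixed table of 255 slots, each holding the fields listed above. The sketch below is an illustration under assumptions: the field names, the use of Python date and time types, and the helper functions are hypothetical, and the actual layout of each segment is not specified here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import List, Optional

MAX_ITEMS = 255  # one segment per memo or reminder, as stated in the text


class Kind(Enum):
    MEMO = 1
    REMINDER = 2
    RECURRING_REMINDER = 3


@dataclass
class Item:
    kind: Kind
    recording: int                      # which voice recording this refers to
    when: datetime                      # time recorded (memo) or due (reminder)
    period: Optional[timedelta] = None  # recurrence period, if recurring


def new_item_table() -> List[Optional[Item]]:
    """A fixed-length table; None marks an unused segment."""
    return [None] * MAX_ITEMS


def add_item(table, item):
    """Store the item in the first free segment and return its index."""
    for index, slot in enumerate(table):
        if slot is None:
            table[index] = item
            return index
    raise MemoryError("all 255 memo/reminder segments are in use")


def due_reminders(table, now):
    """Indices of reminders (not memos) whose due time has arrived."""
    return [i for i, item in enumerate(table)
            if item is not None and item.kind is not Kind.MEMO
            and item.when <= now]
```

Because the table has a fixed number of fixed-size slots, the area occupies a constant amount of memory regardless of how many items are in use, consistent with the fixed length noted above.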
A fourth one 54 of the five areas within the storage area 46 contains data
pertaining to the phone directory, described hereafter. For each entry in
the phone directory, there is space for two voice templates for the name,
together with space for four phone numbers which may be up to 20 digits in
length, and an indication of which recording is associated with the
directory entry. This storage area is also of fixed length.
A fifth one 56 of the five areas in the storage area 46 comprises a data
table used to indicate where in the voice recording memory space each
recording resides. This table is similar to file allocation tables
utilized in managing disc drives in small computers.
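To make the file allocation table analogy concrete, the sketch below shows one common form of such a table, in which each entry names the next block of the same recording. This chained layout is an assumption made for illustration: the text says only that the table is similar to a file allocation table, and the marker values and function names are hypothetical.

```python
FREE = -1  # block is unused
END = -2   # block is the last one in its recording


def new_block_table(block_count):
    """One entry per storage block; all blocks start out free."""
    return [FREE] * block_count


def allocate(table, count):
    """Claim `count` free blocks, chain them, and return the first block."""
    if count < 1:
        raise ValueError("a recording needs at least one block")
    free = [i for i, entry in enumerate(table) if entry == FREE][:count]
    if len(free) < count:
        raise MemoryError("not enough free recording blocks")
    for current, following in zip(free, free[1:]):
        table[current] = following
    table[free[-1]] = END
    return free[0]


def chain(table, start):
    """List the blocks of a recording in playback order."""
    blocks, index = [], start
    while index != END:
        blocks.append(index)
        index = table[index]
    return blocks


def release(table, start):
    """Erase a recording by marking each of its blocks free again."""
    for block in chain(table, start):
        table[block] = FREE


table = new_block_table(8)
first = allocate(table, 3)
second = allocate(table, 2)
release(table, first)
third = allocate(table, 4)
print(chain(table, second), chain(table, third))  # [3, 4] [0, 1, 2, 5]
```

A chained table lets a new recording reuse blocks freed by erased recordings even when those blocks are not contiguous, as the last allocation in the usage example shows.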
The voice recording storage area 44 is logically divided into fixed blocks
58 that are 512 bytes long. Only a few of the blocks 58 are shown in FIG.
3, for simplicity of illustration. Each block 58 corresponds to
approximately one-fourth second of recording time. Each recording is
therefore a multiple of one-fourth second in length. As a recording is
made, the starting one of the blocks 58 thereof is noted in the table. The
data for each reminder, memo and the like "points