Claims
What is claimed is:
1. A voice-generating document making apparatus comprising:
a talking way data storing means for storing therein talking way data
comprising character string information comprising words, clauses, or
sentences; phoneme string information comprising phonemes each
corresponding to a character in said character string information; a
length of duration of each phoneme in said phoneme string information;
pitch information for specifying a relative pitch of said phoneme string
information at an arbitrary point of time; and velocity information for
specifying a volume of each phoneme in said phoneme string information for
each group of talking way data having the same character string
information according to character string information in said talking way
data;
a character string input means for inputting character strings each
comprising one of a word, a clause, or a sentence;
a retrieving means for retrieving groups, each having the same character
string information as said character string from said talking way data storing
means, by using a character string inputted from said character string
input means;
a voice tone data storing means for storing therein a plurality of voice
tone data each for adding a voice tone to a voice to be synthesized;
a voice synthesizing means for successively reading out talking way data in
the groups retrieved by said retrieving means and synthesizing a voice by
using the phoneme string information, duration length, pitch information,
and velocity information in the talking way data read out as well as one
of said plurality of voice tone data stored in said voice tone data
storing means;
a voice selecting means for selecting a desired voice from voices
synthesized by said voice synthesizing means; and
a voice-generating document storing means for storing therein the talking
way data corresponding to the voice selected by said voice selecting means
as a voice-generating document in correlation to the character string
inputted from said character string input means.
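The arrangement recited in claim 1 can be pictured with a minimal sketch. It is an illustration only, under stated assumptions: the names `TalkingWay`, `TalkingWayStore`, `synthesize`, and `make_document_entry` are hypothetical, the pitch information is assumed to be a list of (time, relative pitch) points, and the synthesizer is a stand-in that returns a text description rather than audio.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TalkingWay:
    """One stored way of speaking a character string (illustrative names)."""
    text: str                              # character string information
    phonemes: tuple[str, ...]              # phoneme string information
    durations_ms: tuple[int, ...]          # duration of each phoneme
    pitch: tuple[tuple[int, float], ...]   # assumed (time in ms, relative pitch) points
    velocity: tuple[int, ...]              # volume of each phoneme


class TalkingWayStore:
    """Keeps talking way data in groups that share the same character string."""

    def __init__(self):
        self._groups = {}

    def register(self, way):
        self._groups.setdefault(way.text, []).append(way)

    def retrieve(self, text):
        # Return every talking way whose character string equals the input.
        return list(self._groups.get(text, []))


def synthesize(way, voice_tone):
    """Stand-in for a synthesizer: describes the voice instead of producing audio."""
    total = sum(way.durations_ms)
    return f"{voice_tone}:{'-'.join(way.phonemes)}:{total}ms"


def make_document_entry(store, text, voice_tone, choose):
    """Retrieve the group, synthesize each candidate, and keep the chosen one.

    `choose` is a hypothetical callback that receives the synthesized voices
    and returns the index of the desired voice.
    """
    candidates = store.retrieve(text)
    if not candidates:
        return None
    voices = [synthesize(way, voice_tone) for way in candidates]
    picked = choose(voices)
    # The stored entry correlates the input string with the selected data.
    return (text, candidates[picked])
```

In this sketch the stored entry holds the talking way data itself rather than rendered audio, which mirrors the claim language that the talking way data is stored in correlation to the inputted character string.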
2. A voice-generating document making apparatus according to claim 1
further comprising:
a regeneration specifying means for specifying regeneration of the
voice-generating document stored in said voice-generating document storing
means; wherein, when regeneration of said voice-generating document is
specified, said voice synthesizing means successively reads out talking
way data in said voice-generating document to synthesize a voice.
3. A voice-generating document making apparatus according to claim 2,
wherein said regeneration specifying means is operative to specify
arbitrary units of character string, units of sentence, units of page in
said voice-generating document, or the entire voice-generating document as
an area in which said voice-generating document is to be regenerated.
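The regeneration areas named in claim 3 can be sketched as a single selection function. This is an illustration under assumptions: the document is modeled as a list of pages, each page as a list of sentences, and each sentence as a list of character string entries. The function name and its parameters are hypothetical.

```python
def select_for_regeneration(document, page=None, sentence=None, entry=None):
    """Return the flat list of entries that should be synthesized again.

    Assumed layout: document[page][sentence][entry].
    With no arguments the entire document is selected.
    """
    if page is None:
        return [e for p in document for s in p for e in s]
    if sentence is None:
        return [e for s in document[page] for e in s]
    if entry is None:
        return list(document[page][sentence])
    return [document[page][sentence][entry]]
```

Each narrower argument restricts the area further, so the same call covers a single character string, a sentence, a page, or the whole document.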
4. A voice-generating document making apparatus according to claim 1,
wherein said plurality of voice tone data comprises voice tone data each
of which can be identified respectively through a human sense and
comprises at least one of a male's voice, a female's voice, a child's
voice, an old person's voice, a husky voice, a clear voice, a deep voice,
a thin voice, a strong voice, a gentle voice, and a mechanical voice.
5. A voice-generating document making apparatus according to claim 1,
wherein said character string input means has a kana (Japanese
character)--kanji (Chinese character) converting function, and a character
string inputted by said character string input means is a text with kanji
and kana mixed therein having been converted by using said kana-kanji
converting function.
6. A voice-generating document making apparatus comprising:
a talking way data storing means for storing therein talking way data
comprising character string information comprising words, clauses, or
sentences; phoneme string information comprising phonemes each
corresponding to a character in said character string information; a
length of duration of each phoneme in said phoneme string information;
pitch information for specifying a relative pitch of said phoneme string
information at an arbitrary point of time; and velocity information for
specifying a relative volume of said phoneme string information at an
arbitrary point of time for each group of talking way data having the same
character string information according to character string information in
said talking way data;
a character string input means for inputting character strings each
comprising one of a word, a clause, or a sentence;
a retrieving means for retrieving groups, each having the same character
string information as said character string from said talking way data
storing means, by using a character string inputted from said character
string input means;
a voice tone data storing means for storing therein a plurality of voice
tone data each for adding a voice tone to a voice to be synthesized;
a voice synthesizing means for successively reading out talking way data in
the groups retrieved by said retrieving means and synthesizing a voice by
using the phoneme string information, duration length, pitch information,
and velocity information in the talking way data read out as
well as one of said plurality of voice tone data stored in said voice tone
data storing means;
a voice selecting means for selecting a desired voice from voices
synthesized by said voice synthesizing means; and
a voice-generating document storing means for storing therein the talking
way data corresponding to the voice selected by said voice selecting means
as a voice-generating document in correlation to the character string
inputted from said character string input means.
7. A voice-generating document making apparatus according to claim 6
further comprising:
a regeneration specifying means for specifying regeneration of the
voice-generating document stored in said voice-generating document storing
means; wherein, when regeneration of said voice-generating document is
specified, said voice synthesizing means successively reads out talking
way data in said voice-generating document to synthesize a voice.
8. A voice-generating document making apparatus according to claim 7,
wherein said regeneration specifying means is operative to specify
arbitrary units of character string, units of sentence, units of page in
said voice-generating document, or the entire voice-generating document as
an area in which said voice-generating document is to be regenerated.
9. A voice-generating document making apparatus according to claim 6,
wherein said plurality of voice tone data comprises voice tone data each
of which can be identified respectively through a human sense and
comprises at least one of a male's voice, a female's voice, a child's
voice, an old person's voice, a husky voice, a clear voice, a deep voice,
a thin voice, a strong voice, a gentle voice, and a mechanical voice.
10. A voice-generating document making apparatus according to claim 6,
wherein said character string input means has a kana (Japanese
character)--kanji (Chinese character) converting function, and a character
string inputted by said character string input means is a text with kanji
and kana mixed therein having been converted by using said kana-kanji
converting function.
11. A voice-generating document making apparatus comprising:
a talking way data storing means for storing therein talking way data
comprising character string information comprising words, clauses, or
sentences; phoneme string information comprising phonemes each
corresponding to a character in said character string information; a
length of duration of each phoneme in said phoneme string information;
pitch information for specifying a relative pitch of said phoneme string
information at an arbitrary point of time; and velocity information for
specifying a volume of each phoneme in said phoneme string information for
each group of talking way data having the same character string
information according to character string information in said talking way
data;
a character string input means for inputting character strings each
comprising one of a word, a clause, or a sentence;
a retrieving means for retrieving groups, each having the same character
string information as said character string from said talking way data
storing means, by using a character string inputted from said character
string input means;
a voice tone data storing means for storing therein a plurality of voice
tone data each for adding a voice tone to a voice to be synthesized;
a voice tone data specifying means for specifying one of the voice tone
data stored in said voice tone data storing means;
a voice synthesizing means for successively reading out talking way data in
the groups retrieved by said retrieving means and synthesizing a voice by
using the phoneme string information, duration length, pitch information,
and velocity information in the talking way data read out as
well as voice tone data specified by said voice tone data specifying
means;
a voice selecting means for selecting a desired voice from voices
synthesized by said voice synthesizing means; and
a voice-generating document storing means for storing therein the talking
way data and the voice tone data as a voice-generating document each
corresponding to the voice selected by said voice selecting means in
correlation to the character string inputted from said character string
input means.
12. A voice-generating document making apparatus according to claim 11
further comprising:
a talking way data making/registering means for making said talking way
data and registering the information in said talking way data storing
means.
13. A voice-generating document making apparatus according to claim 12,
wherein said talking way data making/registering means comprises:
a voice waveform data input means for receiving voice waveform data
previously recorded or a natural voice pronounced by a user, and
displaying the voice waveform data;
a duration length setting means for analyzing phonemes each obtained by
receiving the voice from the user or of said voice waveform data and
setting a duration length of each phoneme for displaying it;
a phoneme string information adding means for adding phoneme string
information corresponding to said set duration length;
a pitch curve displaying means for analyzing a pitch of said voice waveform
data and displaying a pitch curve;
a pitch information generating means for generating pitch information by
adjusting or adding thereto a relative pitch value of said phoneme string
information at an arbitrary point of time according to said displayed
pitch curve and phoneme string information;
a velocity information generating means for adjusting a volume of each
phoneme in said phoneme string information and generating velocity
information;
a character string information setting means for receiving a character
string corresponding to said voice waveform data and setting character
string information; and
a registering means for registering said character string information,
phoneme string information, duration length, pitch information, and
velocity information as talking way data in appropriate groups in said
talking way data storing means according to said character string
information.
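The making and registering flow of claim 13 can be sketched as follows. This is a minimal illustration, not the claimed implementation: phoneme boundaries are assumed to have already been set against the displayed waveform, relative pitch is assumed to be the ratio of the measured pitch to a reference pitch, and all function and field names are hypothetical.

```python
def durations_from_boundaries(boundaries_ms):
    """Turn phoneme boundary times into a duration for each phoneme."""
    return [end - start for start, end in zip(boundaries_ms, boundaries_ms[1:])]


def relative_pitch(pitch_curve_hz, reference_hz):
    """Express each (time, pitch) point relative to an assumed reference pitch."""
    return [(t, round(f / reference_hz, 3)) for t, f in pitch_curve_hz]


def build_talking_way(text, phonemes, boundaries_ms, pitch_curve_hz,
                      reference_hz, velocity):
    """Assemble one talking way record from the analyzed waveform data."""
    durations = durations_from_boundaries(boundaries_ms)
    if not (len(phonemes) == len(durations) == len(velocity)):
        raise ValueError("each phoneme needs exactly one duration and one velocity")
    return {
        "text": text,                      # character string information
        "phonemes": list(phonemes),        # phoneme string information
        "durations_ms": durations,         # duration length
        "pitch": relative_pitch(pitch_curve_hz, reference_hz),
        "velocity": list(velocity),        # volume of each phoneme
    }


def register(store, way):
    """File the record in the group that shares its character string."""
    store.setdefault(way["text"], []).append(way)
```

Keying the store on the character string is what lets several recordings of the same text end up in one group for later retrieval.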
14. A voice-generating document making apparatus according to claim 11
further comprising:
a regeneration specifying means for specifying regeneration of the
voice-generating document stored in said voice-generating document storing
means; wherein, when regeneration of said voice-generating document is
specified, said voice synthesizing means successively reads out talking
way data as well as voice tone data in said voice-generating document for
synthesizing a voice.
15. A voice-generating document making apparatus according to claim 14,
wherein said regeneration specifying means is operative to specify
arbitrary units of character string, units of sentence, units of page in
said voice-generating document, or the entire voice-generating document as
an area in which said voice-generating document is to be regenerated.
16. A voice-generating document making apparatus according to claim 11,
wherein said apparatus comprises a display means to display the
voice-generating document stored in said voice-generating document storing
means, specify an arbitrary character string of said displayed
voice-generating document, and change or input again said specified
character string by using said character string input means; and further,
wherein said apparatus comprises means to change talking way data and
voice tone data corresponding to said specified character string by
retrieving the information with said retrieving means, specifying voice
tone data with said voice tone data specifying means, and synthesizing a
voice with said voice synthesizing means as well as selecting a voice with
said voice selecting means by using said changed or re-inputted character
string.
17. A voice-generating document making apparatus according to claim 11,
wherein said plurality of voice tone data comprises voice tone data each
of which can be identified respectively through a human sense and
comprises at least one of a male's voice, a female's voice, a child's
voice, an old person's voice, a husky voice, a clear voice, a deep voice,
a thin voice, a strong voice, a gentle voice, and a mechanical voice.
18. A voice-generating document making apparatus according to claim 11,
wherein said character string input means has a kana (Japanese
character)--kanji (Chinese character) converting function, and a character
string inputted by said character string input means is a text with kanji
and kana mixed therein having been converted by using said kana-kanji
converting function.
19. A voice-generating document making apparatus according to claim 11
further comprising:
a classified type specifying means for specifying a classified type of said
talking way data;
wherein said talking way data has type information indicating classified
types of talking way data respectively in addition to said character
string information, phoneme string information, duration length, pitch
information, and velocity information;
said retrieving means retrieves talking way data which is a group having
the same character string information as said character string and has the
same type information as said specified classified type from said talking
way data storing means by using the character string inputted by said
character string input means as well as the classified type specified by
said classified type specifying means, when a classified type is specified
through said classified type specifying means; and
said voice synthesizing means reads out talking way data retrieved by said
retrieving means and synthesizes a voice by using phoneme string
information, a duration length, pitch information, and velocity
information in said read out talking way data as well as voice tone data
specified by said voice tone data specifying means.
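The type-restricted retrieval of claim 19 can be sketched as a filter applied on top of the character string lookup. The sketch assumes each talking way record is a dictionary carrying a hypothetical `"type"` field and that the store maps a character string to its group. The function name is also an assumption.

```python
def retrieve(store, text, classified_type=None):
    """Return the group for `text`, narrowed to one classified type if given."""
    group = store.get(text, [])
    if classified_type is None:
        # No type specified: behave like plain character string retrieval.
        return list(group)
    return [way for way in group if way.get("type") == classified_type]
```

A classified type here could stand for a regional pronunciation or an age group, as in the dependent claims that follow.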
20. A voice-generating document making apparatus according to claim 19,
wherein said classified types indicate types in which voices, each
corresponding to talking way data, respectively, are classified according
to pronunciation types each specific to a particular geographic area.
21. A voice-generating document making apparatus according to claim 19,
wherein said classified types indicate types in which voices, each
corresponding to talking way data, respectively, are classified according
to pronunciation types each specific to a person's age group.
22. A voice-generating document making apparatus according to claim 11,
wherein said character string input means comprises a display section
which is operative to change a font or a decorative method of a character
string to be displayed, and is operative to display the character string
on said display section according to voice tone data specified for each
character string of said voice-generating document.
23. A voice-generating document making apparatus comprising:
a talking way data storing means for storing therein talking way data comprising
character string information consisting of words, clauses, or sentences;
phoneme string information comprising phonemes each corresponding to a
character in said character string information; a length of duration of
each phoneme in said phoneme string information; pitch information for
specifying a relative pitch of said phoneme string information at an
arbitrary point of time; and velocity information for specifying a
relative volume of said phoneme string information at an arbitrary point
of time for each group of talking way data having the same character
string information according to character string information in said
talking way data;
a character string input means for inputting character strings each
comprising one of a word, a clause, or a sentence;
a retrieving means for retrieving groups, each having the same character
string information as said character string from said talking way data
storing means, by using a character string inputted from said character
string input means;
a voice tone data storing means for storing therein a plurality of voice
tone data each for adding a voice tone to a voice to be synthesized;
a voice tone data specifying means for specifying one of the voice tone
data stored in said voice tone data storing means;
a voice synthesizing means for successively reading out talking way data in
the groups retrieved by said retrieving means and synthesizing a voice by
using the phoneme string information, duration length, pitch information,
and velocity information in the talking way data read out as
well as voice tone data specified by said voice tone data specifying
means;
a voice selecting means for selecting a desired voice from voices
synthesized by said voice synthesizing means; and
a voice-generating document storing means for storing therein the talking
way data and the voice tone data as a voice-generating document each
corresponding to the voice selected by said voice selecting means in
correlation to the character string inputted from said character string
input means.
24. A voice-generating document making apparatus according to claim 23
further comprising:
a talking way data making/registering means for making said talking way
data and registering the information in said talking way data storing
means.
25. A voice-generating document making apparatus according to claim 24,
wherein said talking way data making/registering means comprises:
a voice waveform data input means for receiving voice waveform data
previously recorded or a natural voice pronounced by a user, and
displaying the voice waveform data;
a duration length setting means for analyzing phonemes each obtained by
receiving the voice from the user or of said voice waveform data and
setting a duration length of each phoneme for displaying it;
a phoneme string information adding means for adding phoneme string
information corresponding to said set duration length;
a pitch curve displaying means for analyzing a pitch of said voice waveform
data and displaying a pitch curve;
a pitch information generating means for generating pitch information by
adjusting or adding thereto a relative pitch value of said phoneme string
information at an arbitrary point of time according to said displayed
pitch curve and phoneme string information;
a velocity information generating means for adjusting a volume of each
phoneme in said phoneme string information and generating velocity
information;
a character string information setting means for receiving a character
string corresponding to said voice waveform data and setting character
string information; and
a registering means for registering said character string information,
phoneme string information, duration length, pitch information, and
velocity information as talking way data in appropriate groups in said
talking way data storing means according to said character string
information.
26. A voice-generating document making apparatus according to claim 23
further comprising:
a regeneration specifying means for specifying regeneration of the
voice-generating document stored in said voice-generating document storing
means; wherein, when regeneration of said voice-generating document is
specified, said voice synthesizing means successively reads out talking
way data as well as voice tone data in said voice-generating document for
synthesizing a voice.
27. A voice-generating document making apparatus according to claim 26,
wherein said regeneration specifying means can specify arbitrary units of
character string, units of sentence, units of page in said
voice-generating document, or the entire voice-generating document as an
area in which said voice-generating document is to be regenerated.
28. A voice-generating document making apparatus according to claim 23,
wherein said apparatus comprises a display means to display the
voice-generating document stored in said voice-generating document storing
means, specify an arbitrary character string of said displayed
voice-generating document, and change or input again said specified
character string by using said character string input means; and further,
wherein said apparatus comprises means to change talking way data and
voice tone data corresponding to said specified character string by
retrieving the information with said retrieving means, specifying voice
tone data with said voice tone data specifying means, and synthesizing a
voice with said voice synthesizing means as well as selecting a voice with
said voice selecting means by using said changed or re-inputted character
string.
29. A voice-generating document making apparatus according to claim 23,
wherein said plurality of voice tone data comprises voice tone data each of
which can be identified respectively through a human sense.
30. A voice-generating document making apparatus according to claim 23,
wherein said character string input means has a kana (Japanese
character)--kanji (Chinese character) converting function, and a character
string inputted by said character string input means is a text with kanji
and kana mixed therein having been converted by using said kana-kanji
converting function.
31. A voice-generating document making apparatus according to claim 23
further comprising:
a classified type specifying means for specifying a classified type of said
talking way data;
wherein said talking way data has type information indicating classified
types of talking way data respectively in addition to said character
string information, phoneme string information, duration length, pitch
information, and velocity information;
said retrieving means retrieves talking way data which is a group having
the same character string information as said character string and has the
same type information as said specified classified type from said talking
way data storing means by using the character string inputted by said
character string input means as well as the classified type specified by
said classified type specifying means, when a classified type is specified
through said classified type specifying means; and
said voice synthesizing means reads out talking way data retrieved by said
retrieving means and synthesizes a voice by using phoneme string
information, a duration length, pitch information, and velocity
information in said read out talking way data as well as voice tone data
specified by said voice tone data specifying means.
32. A voice-generating document making apparatus according to claim 31,
wherein said classified types indicate types in which voices, each
corresponding to talking way data, respectively, are classified according
to pronunciation types each specific to a particular geographic area.
33. A voice-generating document making apparatus according to claim 31,
wherein said classified types indicate types in which voices, each
corresponding to talking way data, respectively, are classified according
to pronunciation types each specific to a person's age group.
34. A voice-generating document making apparatus according to claim 23,
wherein said character string input means comprises a display section
which is operative to change a font or a decorative method of a character
string to be displayed, and is operative to display the character string
on said display section according to voice tone data specified for each
character string of said voice-generating document.
35. A voice-generating document making method comprising:
inputting character strings each constituting a word, a clause, or a
sentence;
retrieving a group having the same character string information as the
character string inputted in said inputting activity by consulting a
database storing therein talking way data comprising character string
information consisting of words, clauses, or sentences, phoneme string
information consisting of phonemes each corresponding to a character in
said character string information, a length of duration of each phoneme in
said phoneme string information, pitch information for specifying a
relative pitch of said phoneme string information at an arbitrary point of
time, and velocity information for specifying a volume of each phoneme in
said phoneme string information for each group of talking way data having
the same character string information according to character string
information in said talking way data;
specifying voice tone data for adding a voice tone to a voice to be
synthesized;
successively reading out talking way data in the groups retrieved in said
retrieving activity and synthesizing a voice by using the phoneme string
information, a duration length, pitch information, and velocity
information in the talking way data read out as well as voice tone data
specified in said specifying activity;
selecting a desired voice from voices synthesized in said reading out
activity; and
storing the talking way data corresponding to the voice selected in said
selecting activity as a voice-generating document in correlation to the
character string inputted by said inputting activity.
36. A voice-generating document making method according to claim 35 further
comprising:
specifying regeneration of the voice-generating document stored in said
storing activity; and
successively reading out talking way data and voice tone data in said
voice-generating document when regeneration of said voice-generating
document is specified and synthesizing a voice.
37. A voice-generating document making method according to claim 36,
wherein, in said specifying regeneration activity, arbitrary units of
character string, units of sentence, units of page in said
voice-generating document, or the entire voice-generating document can be
specified as an area in which said voice-generating document is to be
regenerated.
38. A voice-generating document making method according to claim 35 further
comprising:
displaying the voice-generating document stored in said storing activity,
specifying an arbitrary character string of said displayed
voice-generating document, and changing or inputting again said specified
character string; wherein said voice-generating document can be changed by
executing again said retrieving activity, voice specifying activity,
reading out activity, voice selecting activity, and storing activity with
the character string changed or re-inputted in said inputting again
activity.
39. A voice-generating document making method comprising:
a first step of inputting character strings each constituting a word, a
clause, or a sentence;
a second step of retrieving a group having the same character string
information as the character string inputted in said first step by
referring to a database storing therein talking way data comprising
character string information consisting of words, clauses, or sentences,
phoneme string information consisting of phonemes each corresponding to a
character in said character string information, a length of duration of
each phoneme in said phoneme string information, pitch information for
specifying a relative pitch of said phoneme string information at an
arbitrary point of time, and velocity information for specifying a
relative volume of said phoneme string information at an arbitrary point
of time for each group of talking way data having the same character
string information according to character string information in said
talking way data;
a third step of specifying voice tone data for adding a voice tone to a
voice to be synthesized;
a fourth step of successively reading out talking way data in the groups
retrieved in said second step and synthesizing a voice by using the
phoneme string information, duration length, pitch information, and
velocity information in the talking way data read out as well as voice
tone data specified in said third step;
a fifth step of selecting a desired voice from voices synthesized in said
fourth step; and
a sixth step of storing therein the talking way data corresponding to the
voice selected in said fifth step as a voice-generating document in
correlation to the character string inputted in said first step.
40. A voice-generating document making method according to claim 39 further
comprising:
a seventh step of specifying regeneration of the voice-generating document
stored in said sixth step; and
an eighth step of successively reading out talking way data and voice tone
data in said voice-generating document when regeneration of said
voice-generating document is specified and synthesizing a voice.
41. A voice-generating document making method according to claim 40,
wherein, in said seventh step, arbitrary units of character string, units
of sentence, units of page in said voice-generating document, or the
entire voice-generating document can be specified as an area in which said
voice-generating document is to be regenerated.
42. A voice-generating document making method according to claim 39 further
comprising:
a ninth step of displaying the voice-generating document stored in said
sixth step, specifying an arbitrary character string of said displayed
voice-generating document, and changing or inputting again said specified
character string; wherein said voice-generating document can be changed by
executing again said second step, third step, fourth step, fifth step, and
sixth step with the character string changed or re-inputted in said ninth
step.
43. A computer-readable medium from which a computer can read out a program
enabling execution by the program of a sequence for making a
voice-generating document used in the computer-readable medium; wherein
said computer-readable medium stores therein a program comprising:
a first sequence for inputting character strings each constituting a word,
a clause, or a sentence;
a second sequence for retrieving a group having the same character string
information as the character string inputted in said first sequence by
referring to a database storing therein talking way data comprising
character string information consisting of words, clauses, or sentences,
phoneme string information consisting of phonemes each corresponding to a
character in said character string information, a length of duration of
each phoneme in said phoneme string information, pitch information for
specifying a relative pitch of said phoneme string information at an
arbitrary point of time, and velocity information for specifying a volume
of each phoneme in said phoneme string information for each group of
talking way data having the same character string information according to
character string information in said talking way data;
a third sequence for specifying voice tone data for adding a voice tone to
a voice to be synthesized;
a fourth sequence for successively reading out talking way data in the
groups retrieved in said second sequence and synthesizing a voice by using
the phoneme string information, duration length, pitch information, and
velocity information in the talking way data read out as well as voice
tone data specified in said third sequence;
a fifth sequence for selecting a desired voice from voices synthesized in
said fourth sequence; and
a sixth sequence for storing therein the talking way data corresponding to
the voice selected in said fifth sequence as a voice-generating document
in correlation to the character string inputted in said first sequence.
44. A computer-readable medium from which a computer can read out a program
enabling execution by the program of a sequence for making
voice-generating documents used in the computer-readable medium according
to claim 43; wherein said computer-readable medium stores therein a program
further comprising:
a seventh sequence for specifying regeneration of the
voice-generating document stored in said sixth sequence; and
an eighth sequence for successively reading out talking way data and voice
tone data in said voice-generating document when regeneration of said
voice-generating document is specified, and synthesizing a voice.
45. A computer-readable medium from which a computer can read out a program
enabling execution by the program of a sequence for making
voice-generating documents used in the computer-readable medium according
to claim 44; wherein, in said seventh sequence, arbitrary units of
character string, units of sentence, units of page in said
voice-generating document, or the entire voice-generating document can be
specified as an area in which said voice-generating document is to be
regenerated.
46. A computer-readable medium from which a computer can read out a program
enabling execution by the program of a sequence for making
voice-generating documents used in the computer-readable medium according
to claim 43; wherein said computer-readable medium stores therein a program further
comprising:
a ninth sequence for displaying the voice-generating document stored in
said sixth sequence, specifying an arbitrary character string of said
displayed voice-generating document, and changing or inputting again said
specified character string;
wherein said voice-generating document can be changed by executing again
said second sequence, third sequence, fourth sequence, fifth sequence, and
sixth sequence with the character string changed or re-inputted in said
ninth sequence.
47. A computer-readable medium from which a computer can read out a program
enabling execution by the program of a sequence for making
voice-generating documents used in the computer-readable medium; wherein said
computer-readable medium stores therein a program comprising:
a first sequence for inputting character strings each constituting a word,
a clause, or a sentence;
a second sequence for retrieving a group having the same character string
information as the character string inputted in said first se