BACKGROUND OF THE INVENTION
This invention relates to a speech signal processor.
Attention has been drawn to techniques for extracting feature parameters
such as spectral information and excitation source information from the
speech signal to transmit them with reduced transmission bit rate. Of
these techniques, the linear predictive coding (LPC) technique is
extensively used because of its simple processing. The LPC technique
involves extracting linear predictive coefficients as spectral information
and predictive residual as excitation source information from the speech
signal on the transmission side and, on the receiver side, determining the
weighting coefficients of a synthesizing filter from the spectral
information and exciting the filter with the excitation source information
to synthesize reproduced speech. The speech synthesizer for such an LPC
technique is usually provided with a synthesizing filter including a
feedback loop. This makes the circuit construction complex, and
transmission errors and other causes can impair the stability of the
synthesizing filter.
Under these circumstances, Sagayama et al. proposed a structurally very
simple synthesizer needing no filter. Reference is made, for example, to
"Composite Sinusoid Modeling Applied to Spectrum Analysis of Speech" Data
S79-06 (May, 1979) and "Speech Synthesis by Composite Sinusoidal Wave"
Data S79-39 (Oct., 1979) Laboratory of Speech. The Acoustical Society of
Japan. This technique is termed CSM (acronym for Composite Sinusoid
Model).
The CSM represents the speech signal as the summation or combination of a
set of sinusoidal waves each having amplitude and frequency as freely
selectable parameters. The number of these sinusoidal waves is
predetermined and is at most 4 to 6. For CSM analysis, the frequency and
amplitude (CSM parameters) of each sinusoidal wave are determined for every
analysis frame so that the lowest N orders of autocorrelation coefficients
directly calculated from the speech signal are equal to the lowest N orders
of autocorrelation coefficients of the corresponding synthesized wave.
Simple summation (combination) of the CSM signals of every frequency cannot
reproduce the corresponding original speech. To reproduce the original
speech, it is necessary to attach pitch structure and impart a pitch
synchronous envelope to the summed CSM signal. The term "attachment of the
pitch structure" means that the phase of each sinusoidal wave is initialized
to "0" every pitch period for voiced speech. This is done to spread the line
spectrum structure so that it approaches the natural speech spectrum. For
unvoiced speech as well, the line spectrum structure is spread by random
phase initialization. The signal imparted with pitch structure as mentioned
above is useful for obtaining synthesized sound resembling speech.
Initialization of the sinusoidal wave phase to zero is accompanied by
discrete jumps in the waveform. To smooth out such jumps, the synthesized
speech signal is multiplied by an envelope synchronous with the pitch of the
speech signal, such as an attenuation curve following an exponential
function.
Additionally, the interval for the phase initialization mentioned above is
problematic if it is too narrow or too wide. Too narrow an initialization
interval causes whitening, so that no spectrum envelope appears, while too
wide an initialization interval gives insufficient frequency spread to
obtain an appropriate spectral envelope. The conventional CSM technique also
has the problem that, because random phase initialization is applied for
production of unvoiced sound, initialization is inevitably performed at both
too narrow and too wide intervals, with a resulting failure to obtain good
unvoiced speech.
In the conventional CSM technique, CSM parameters yielded by the analysis,
such as the frequency and amplitude representing characteristics of the
individual sinusoidal waves, are quantized separately, leaving the
relationship between parameters out of consideration. As a result, the
quantization does not adequately exploit the characteristics of the CSM
parameters, which causes problems in quantization efficiency.
At present, digital privacy telephone systems are widely used in which,
generally, the analog speech signal is converted into digital codes,
followed by a specified coding to keep the information of the original
speech secret before transmission, and the received signals are decoded
inversely to the coding, followed by D/A conversion to reproduce the
corresponding original speech signal. Such a digital communication system
has the disadvantage of requiring high performance of the transmission
line, such as transmission capacity and error rate.
There is also, for example, an analog privacy telephone system that
subjects the speech signal to spectral inversion or to spectral division
and interchange of relative positions before transmission. It generally
requires low transmission rates, but the spectrum envelope of the original
speech signal remains in some form, which tends to defeat the privacy of
the system.
SUMMARY OF THE INVENTION
Accordingly, it is an object of the invention to provide a CSM synthesizer
for reproducing better quality unvoiced speech.
Another object of the invention is to provide a CSM speech processor with
remarkably improved quantization efficiency.
A further object of the invention is to provide an analog telephone set
with high privacy.
A further object of the invention is to provide an analog telephone set
with improved privacy.
A further object of the invention is to provide a CSM synthesizer having
simplified structure and reproducing better quality unvoiced speech.
A further object of the invention is to provide a speech processor having
simplified structure without a filter and performing analysis and
synthesis of speech.
A further object of the invention is to provide a speech processor with
high stability.
According to one aspect of the invention there is provided a speech signal
processor comprising: an extractor for extracting, from a speech signal, the
amplitudes and frequencies of a set of sinusoidal wave signals
representative of said speech signal, a sinusoidal wave generator for generating
a set of sinusoidal wave signals having the extracted amplitudes and
frequencies, combination means for combining the set of sinusoidal wave
signals from the sinusoidal wave generator, a random code generator for
generating random code signals having a distribution defined by
predetermined finite upper and lower values, and a phase resetter for
phase-resetting the sinusoidal wave signals in response to the pitch of
the speech signal when the speech signal is voiced and at a period
determined in accordance with a random code signal when the speech signal
is unvoiced.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram of the basic construction of a speech signal
processor according to the invention;
FIG. 2 is an example of a speech characteristic vector pattern showing the
relationship among CSM parameters m.sub.i, .omega..sub.i and time;
FIG. 3 is a graph showing the relationship between the CSM line spectrum and
the LPC spectrum envelope obtained from the same speech sample;
FIGS. 4A and 4B are a spectrum distribution graph reflecting the summation
of a set of sinusoidal wave signals yielded by CSM analysis, and a
spectrum distribution graph associated with the frequency spread caused by
phase-resetting of the sinusoidal signals, respectively;
FIGS. 5A and 5B are waveforms of the outputs of the window function
generator 27 shown in FIG. 1;
FIG. 6 is a detailed block diagram of a variable frequency oscillator 24
shown in FIG. 1;
FIG. 7 is a detailed block diagram of a variable gain amplifier 25 of FIG.
1;
FIG. 8 is a detailed block diagram of a random code generator 23 shown in
FIG. 1;
FIGS. 9A and 9B are a detailed block diagram of a period calculator 22
shown in FIG. 1 and a distribution diagram of its output, respectively;
FIG. 10 is a detailed block diagram of a window function generator 27 shown
in FIG. 1;
FIG. 11 is a block diagram of the structure of the transmitter part of an
alternative embodiment according to the invention;
FIG. 12 is a detailed block diagram illustrating the functions of a CSM
quantizer 14 and a power quantizer 15 shown in FIG. 11;
FIGS. 13A and 13B represent bit distribution and bit allocation,
respectively, for explaining quantization of the CSM quantizer 14 shown in
FIG. 11;
FIGS. 14A and 14B are structural block diagrams of a further embodiment in
accordance with the invention;
FIGS. 15A through 15D are illustrations of the first parameter conversion
in the embodiment of FIG. 14;
FIGS. 16A and 16B are illustrations of the second parameter conversion in
the embodiment shown in FIG. 14; and
FIGS. 17 and 18 are a block diagram of another embodiment in accordance
with the invention and the output waveform from the sawtooth pulse
generator 51 therein, respectively.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
FIG. 1 is a block diagram illustrating the analyzer and synthesizer parts in
an embodiment of the invention. The fundamental structure is composed of a
transmitter part T where CSM analysis is performed and a receiver part R
where the original speech is reproduced on the basis of the received CSM
parameters. Before a concrete description is given with reference to
FIG. 1, the basic principle of the invention will be described.
The number n, frequencies .omega..sub.i (i=1, 2, . . . , n), and amplitudes
m.sub.i of sinusoidal waves to be combined and the CSM synthesized wave
y.sub.t are related by
##EQU1##
The autocorrelation coefficient r.sub.l of tap l of this synthesized wave is
readily given by
##EQU2##
Letting x.sub.t be a sample of the speech signal, the autocorrelation
coefficient v.sub.l of tap l is:
##EQU3##
where M is the number of samples per analysis frame.
CSM analysis determines m.sub.i and .omega..sub.i so that r.sub.l is equal
to v.sub.l for the lowest N orders, namely, r.sub.l =v.sub.l
(l=0, 1, 2, . . . , N). A concrete description of this method will be
given later. Here it is assumed that m.sub.i and .omega..sub.i are
obtained in sequence from the given speech signal for every analysis
frame.
FIG. 2 shows a speech characteristic vector pattern giving the relationship
between the thus obtained CSM parameters m.sub.i and .omega..sub.i
and time.
FIG. 3 shows the 9th-order (N=9) CSM line spectrum (number of sinusoidal
waves n=5) and the 9th-order LPC spectrum envelope (frequency transmission
characteristic of the LPC synthesis filter) obtained from the same
sample.
As described later, the order N is related to the number n of sinusoidal
waves by N=2n-1. From these drawings, it can be inferred that the CSM
parameters contain characteristic information extracted from the original
speech.
However, even if the n sinusoidal waves specified by the n parameter sets
(m.sub.i, the actual amplitude being .sqroot.m.sub.i as mentioned above,
and .omega..sub.i) yielded by CSM analysis are simply combined (summed),
the resulting synthesized sound cannot be heard as the original speech. The
simple combination of such sinusoidal waves generates a signal exhibiting a
spectrum having n discrete lines as shown in FIG. 4A. The speech signal, on
the other hand, has a continuous spectrum envelope. Voiced speech is
characterized by pitch structure, and unvoiced speech has a fine spectral
structure represented by a stochastic process. Therefore, to synthesize
speech, that is, to obtain a continuous spectrum, by the CSM technique, the
line spectrum must be spread; in other words, the spectrum pattern
characterized by the line spectrum must be changed to the corresponding
speech spectrum pattern.
According to the invention, the above-mentioned spectrum spreading for CSM
speech synthesis is accomplished by the following procedure:
For voiced speech, which has a distinct pitch structure, phase
initialization is performed; that is, the n sinusoidal waves specified by
m.sub.i and .omega..sub.i as stated above are reset with respect to phase
every pitch period. This simple operation generates the spectrum envelope
and the fine pitch structure of the spectrum. For unvoiced speech, the phase
initialization is performed at intervals determined by random codes whose
distribution has upper and lower limits.
Further, time window processing, which is described in detail in the
description of the embodiment, is applied together with the above-stated
phase initialization to eliminate the discontinuity of the synthesized
waveform observed at the time of phase resetting.
In this way, the CSM line spectrum shown in FIG. 4A is spread into the
corresponding spectrum having the spectrum envelope and fine pitch
structure shown in FIG. 4B. Experimental results have demonstrated that
this ensures reproduction of speech of a quality that is audibly
satisfactory for practical use.
The above-stated method of CSM synthesis requires no filters, which makes
consideration of the stability of the synthesis part (synthesis filter)
unnecessary, and it produces better speech quality than a vocoder when the
transmission performance of the channel is poor.
Returning to FIG. 1, the transmitter part T comprises an A/D converter 10,
a Hamming window processor 11, an autocorrelation coefficient calculator
12, a CSM analyzer 13, a CSM quantizer 14, a power quantizer 15, a pitch
extractor 16, a voiced/unvoiced (V/UV) discriminator 17, and a multiplexer
18.
The receiver part R comprises a combined unit of demultiplexer and decoder
19, an interpolator 20, a V/UV switch 21, a period calculator 22, a random
code generator 23, n variable frequency oscillators with phase resetting
function 24(1), 24(2), . . . . , 24(n), n variable gain amplifiers 25(1),
25(2), . . . . , 25(n), a combiner 26, a variable length window function
generator 27, and multipliers 28 and 29.
The speech waveform is converted into digital data quantized in respect to
amplitude and time in the A/D converter 10. The digital data output is
supplied to the Hamming window processor 11, the pitch extractor 16 and
the V/UV discriminator 17, respectively.
Digital data supplied to the Hamming window processor 11 is multiplied by a
known Hamming window function for every predetermined frame, and is then
applied in sequence to the autocorrelation coefficient calculator 12. The
autocorrelation coefficient calculator 12 yields the autocorrelation
coefficients v.sub.l of the lowest N orders (l=0, 1, 2, . . . , N) using
the above-described operation expressed by the
equation
##EQU4##
where x.sub.t (t=0, 1, . . . , M-1) denotes the data of one frame.
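The windowing and autocorrelation steps can be illustrated with a minimal Python sketch. The equation itself is not reproduced in this text, so the unnormalized sum of lagged products used here, and the function name frame_autocorrelation, are assumptions made for illustration only.

```python
import numpy as np

def frame_autocorrelation(frame, N):
    """Hamming-window one frame and return v_0 .. v_N.

    Assumption: v_l is the plain sum of lagged products over the frame,
    with no 1/M normalization.
    """
    x = np.asarray(frame, dtype=float) * np.hamming(len(frame))
    M = len(x)
    # v_l = sum over t of x_t * x_(t+l), for the samples that overlap
    return np.array([np.dot(x[:M - l], x[l:]) for l in range(N + 1)])

# Example: one 160-sample frame of a test sinusoid, N = 9 (n = 5 waves)
frame = np.cos(0.3 * np.arange(160))
v = frame_autocorrelation(frame, 9)
```

The zeroth coefficient v[0] is the windowed frame energy, which corresponds to the power information passed to the power quantizer 15.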
The v.sub.l thus obtained for each frame are applied to the CSM analyzer 13,
and, of these, v.sub.0
##EQU5##
is applied to the power quantizer 15 to provide power information about
the frame.
The CSM analyzer 13, having received the autocorrelation coefficients
v.sub.l of each frame, performs the operation described later to determine
the amplitudes m.sub.i and frequencies .omega..sub.i (i=1, 2, . . . , n) of
the n sinusoidal waves for the frame, and the resulting outputs are applied
to the CSM quantizer 14.
The CSM quantizer 14 quantizes the series of sinusoidal waves specified by
m.sub.i and .omega..sub.i at an appropriate quantization step, which is
chosen taking requirements for reproduced speech quality and transmission
capacity of the transmission channel into consideration, and its outputs
are supplied to the multiplexer 18. Also in the power quantizer 15
receiving v.sub.0, quantization is performed at an appropriate
quantization step chosen from a similar view point, and the output from
this is applied to the multiplexer 18. The pitch extractor 16 extracts
pitch period from the digital data from the A/D converter 10 and applies
it to the multiplexer 18. The V/UV discriminator 17 discriminates whether
the digital data indicates voiced or unvoiced speech and applies the
result in the form of binary signals to the multiplexer 18. The
multiplexer 18 combines these signals and transmits the combined signals
through the transmission channel.
At the receiver part R, the thus-transmitted coded signals are decoded and
separated in the combined unit of demultiplexer and decoder 19. The
decoded signals are applied to an interpolator 20. In response to the
interpolated .omega..sub.i (.omega..sub.1 through .omega..sub.n) of the n CSM
waves, the output frequencies of the n variable frequency oscillators with
phase resetting function 24(1) through 24(n) are controlled.
In addition, m.sub.1 through m.sub.n specifying the amplitudes of the n CSM
waves are applied to the gain control terminals of the n variable gain
amplifiers 25(1) through 25(n), and the oscillation powers of the
frequencies are thereby controlled to the specified values. The
thus-obtained n outputs are
combined or summed in a combiner 26 and the combined signal is applied to
the multiplier 28. The pitch period information from the combined unit 19
of demultiplexer and decoder is applied to the V/UV switch 21, if desired,
through the interpolator 20.
The random code signals generated by the random code generator 23 are
converted in the period calculator 22 into uniformly distributed random
code signals whose distribution width and lower limit, and hence whose
upper and lower limit values, are specified values. The random codes are
then applied to the V/UV switch 21 as a data sequence that determines the
phase-reset timing for unvoiced speech. As stated above, according to the
invention, the phase initialization is performed in accordance with
uniformly distributed random codes ranging between the specified upper and
lower limit values, and this enables the formation of an appropriate
spectrum envelope. The random code generator 23 and period calculator 22
are described more fully below.
The binary signal (V/UV) from the combined unit 19 of demultiplexer and
decoder, which indicates whether the speech is voiced or unvoiced, is
supplied as a switching control signal to the switch 21. If the binary
signal indicates voiced speech, the switch 21 supplies the above-mentioned
pitch period fed from the interpolator 20 to the window function generator
27. If, on the other hand, the binary signal indicates unvoiced speech, the
switch 21 supplies the random time interval generated by the period
calculator 22 to the window function generator 27.
The window function generator 27 generates phase resetting pulses and
window functions that eliminate the discontinuity appearing in the output
waveform at phase resetting, as shown in FIGS. 5A and 5B.
As mentioned above, the data sequence designating the intervals between
phase resetting pulses is supplied successively through the switch 21 to
the window function generator 27, which successively generates
impulses at the time intervals designated by the data sequence. These
impulses are applied to the phase reset terminals of the variable
frequency oscillators 24(1) through 24(n) for phase initialization. The
output of the window function generator 27 is applied also to the
interpolator 20 and used as timing signals for interpolating angular
frequency data .omega..sub.i and strength data m.sub.i.
The window function generator 27 generates, in synchronism with the phase
resetting pulses, the following variable length window function W(t).
Letting the interval between phase resetting pulses be T and the time
elapsed since the preceding phase resetting pulse be t, the generated
window function W(t) is expressed as
##EQU6##
where 0<t<T. The window function W(t) is shown in FIG. 5A. The value T
is the pitch period for voiced speech, and a random variable generated
by the stochastic process for unvoiced speech. The window function W(t)
therefore has variable length and is synchronous with the aforesaid phase
resetting pulses. In other words, the starting and terminating timings of
the window function coincide with those of the phase resetting pulses.
In response to the thus-generated window function, the multiplier 28
outputs the product of the n sinusoidal waveforms combined in the
combiner 26 and the above-mentioned window function W(t) generated in
synchronism with every phase resetting pulse. As a result of multiplication
by the window function W(t), the output waveform converges continuously to
"0" before each sinusoidal wave is phase reset. Moreover, at the time point
of phase resetting, each sinusoidal wave rises from "0", which ensures
continuity of the waveform.
The multiplier 29 multiplies the output of the multiplier 28 by the power
information of each frame applied thereto and generates synthetic
speech.
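The receiver-side chain described above (oscillators, amplifiers, combiner, window multiplier and power multiplier) can be sketched in Python as follows. This is an illustrative sketch only: the window equation is not reproduced in this text, so the linear decay 1 - t/T is a stand-in assumption that merely falls to zero at the end of each interval, and the names csm_synthesize, intervals and gain are hypothetical. Frequencies are taken in radians per sample.

```python
import numpy as np

def csm_synthesize(omega, m, intervals, gain=1.0):
    """Sum n phase-reset sinusoids, window each reset interval, apply gain.

    omega     : angular frequencies in radians per sample
    m         : CSM strengths; the wave amplitude is sqrt(m_i)
    intervals : reset intervals in samples (the pitch period for voiced
                speech, random values for unvoiced speech)
    """
    omega = np.asarray(omega, dtype=float)
    amp = np.sqrt(np.asarray(m, dtype=float))
    pieces = []
    for T in intervals:
        t = np.arange(T)                    # time since the last phase reset
        waves = amp[:, None] * np.sin(omega[:, None] * t)
        combined = waves.sum(axis=0)        # role of combiner 26
        window = 1.0 - t / T                # ASSUMED stand-in for W(t)
        pieces.append(gain * window * combined)
    return np.concatenate(pieces)

# Example: two sinusoids and three reset intervals of different length
y = csm_synthesize([0.3, 1.0], [1.0, 0.5], [40, 55, 30])
```

Because every sinusoid restarts at zero phase, the output is exactly zero at each reset instant, and the window brings the waveform close to zero just before the reset, which is the continuity property described above.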
As described above, in the embodiment according to the invention, the CSM
synthesis necessary for speech reproduction is performed at the receiver
part R, and good sound quality can be reproduced irrespective of the degree
of data compression and of errors in the transmission line.
The interpolation of the transmission data in the interpolator 20 can be
performed in various ways in accordance with the quantization step of the
transmission data at the transmitter part T. For example, linear
interpolation and more complicated function interpolations are usable.
Further, interpolation with respect to .omega..sub.i and m.sub.i can
advantageously be accomplished by choosing the interpolation points so that
interpolated data is given each time a phase resetting pulse is generated.
To ensure that the .omega..sub.i and m.sub.i values are renewed at this
timing, the phase resetting pulses are applied to the interpolator 20.
Thus, in actual processing, for example, resetting of phase and setting of
frequencies .omega..sub.i in the oscillators 24(1) to 24(n), and setting
of amplitude m.sub.i in the amplifiers 25(1) to 25(n), can be performed at
different times. As a countermeasure against this, the interpolator 20 is
provided with a memory for storing necessary data.
The next description concerns analysis by the CSM analyzer 13. CSM analysis
is performed to determine the frequencies .omega..sub.i and strengths or
power amplitudes m.sub.i at every analysis frame so that the lowest N orders
of tap values of the autocorrelation coefficients directly calculated from
the speech waveform are equal to the lowest N orders of tap values of the
synthesized wave consisting of n sinusoidal waves.
As described above, the autocorrelation coefficient r.sub.l of tap l is
represented as
##EQU7##
Further, the autocorrelation coefficient v.sub.l of tap l for a certain
frame is expressed by using speech samples x.sub.t as follows:
##EQU8##
By the use of the relationship
r.sub.l =v.sub.l (2)
where l=0,1, . . . , N (N=2n-1), the following matrix is obtained:
##EQU9##
The matrix equation cannot be solved by simple matrix operations because
both .omega..sub.i and m.sub.i in it are unknown. Therefore, using
.omega..sub.i =cos.sup.-1 x.sub.i (4)
the substitution
cos l.omega..sub.i =cos (l cos.sup.-1 x.sub.i)=T.sub.l (x.sub.i) (5)
is made, where T.sub.l (x) is a Tchebycheff polynomial. Thus equation (3)
may be expressed as
##EQU10##
Generally, x.sup.l can be expressed as a linear summation of T.sub.0 (x),
T.sub.1 (x), . . . , T.sub.l (x), namely
##EQU11##
where S.sub.j.sup.(l) is the inverse Tchebycheff coefficient. Using
S.sub.j.sup.(l), the linear summation A.sub.l of the above-mentioned sample
autocorrelation coefficients v.sub.j is defined by
##EQU12##
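As an illustration of the inverse Tchebycheff coefficients, the following Python sketch obtains them with NumPy's power-series-to-Chebyshev conversion and forms the sums A.sub.l. The function names are hypothetical, and the form A.sub.l = sum over j of S.sub.j.sup.(l) v.sub.j is inferred from the surrounding text rather than copied from the equation, which is not reproduced here.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

def inverse_chebyshev_coeffs(l):
    """Coefficients S_j^(l) with x**l = sum_j S_j^(l) * T_j(x), j = 0..l."""
    e = np.zeros(l + 1)
    e[l] = 1.0                  # power-series coefficients of x**l
    return C.poly2cheb(e)       # same polynomial in the Chebyshev basis

def power_moments(v):
    """A_l = sum_j S_j^(l) * v_j for l = 0 .. len(v) - 1 (assumed form)."""
    v = np.asarray(v, dtype=float)
    return np.array([np.dot(inverse_chebyshev_coeffs(l), v[:l + 1])
                     for l in range(len(v))])

# x**2 = (T_0 + T_2) / 2 and x**3 = (3*T_1 + T_3) / 4
print(inverse_chebyshev_coeffs(2))
print(inverse_chebyshev_coeffs(3))
```

If v.sub.j is itself a sum of m.sub.i cos(j .omega..sub.i), each A.sub.l computed this way reduces to the sum of m.sub.i x.sub.i.sup.l with x.sub.i =cos .omega..sub.i, which is what allows the later steps to work with ordinary powers of x.sub.i.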
Applying equations (7) and (8) to the left and right sides of equation (6)
gives
##EQU13##
Subsequently, the n-th degree polynomial P.sub.n (x) having zeros at
x.sub.1, x.sub.2, . . . , x.sub.n is defined as
##EQU14##
Using the defined P.sub.n (x) gives
##EQU15##
It is apparent that the above equation becomes "0". It can be rewritten as
##EQU16##
Thus, assuming l=0, 1, 2, . . . , n gives
##EQU17##
Taking p.sub.n.sup.(n) =1, it follows that
##EQU18##
The matrix involving A.sub.i on the left side is generally termed a
Hankel matrix. As stated above, A.sub.i is obtained by using equation (8)
from the sample autocorrelation coefficients v.sub.j of the speech waveform
to be represented and is hence known. Accordingly, p.sub.0.sup.(n),
p.sub.1.sup.(n), . . . , p.sub.n-1.sup.(n) can be obtained by solving
equation (10).
Substituting the obtained p.sub.i.sup.(n) values into the n-th degree
equation
##EQU19##
and solving it yields {x.sub.1, x.sub.2, . . . , x.sub.n }.
Using these values gives the CSM frequencies .omega..sub.i in accordance
with equation (4): .omega..sub.i =cos.sup.-1 x.sub.i. Likewise, the CSM
amplitudes m.sub.i can be obtained according to the equation which is
derived from equation (9), expressed by
##EQU20##
The matrix on the left side of the equation is generally termed a
Vandermonde matrix.
In summary, the algorithm of CSM analysis is as follows:
(1) Computation of autocorrelation coefficients in accordance with the
equation
##EQU21##
(2) Computation of A.sub.l using the inverse Tchebycheff coefficients as
##EQU22##
(3) Computation of p.sub.i.sup.(n) by solving the Hankel matrix equation of
A.sub.l
##EQU23##
(4) For n x.sub.i, solution of the n-th degree algebraic equation having as
coefficients
##EQU24##
(5) For CSM angular frequencies .omega..sub.i, performing the operation
.omega..sub.i =cos.sup.-1 x.sub.i
(6) For CSM amplitudes m.sub.i, solution of the Vandermonde matrix
equation
##EQU25##
These processing steps give the CSM frequencies {.omega..sub.1,
.omega..sub.2, . . . , .omega..sub.n } and CSM amplitudes {m.sub.1,
m.sub.2, . . . , m.sub.n }. As an efficient solution of the Hankel matrix
equation, a method of solving sequentially from a given initial condition
is known. The above-mentioned n-th degree algebraic equation has been
proved to have real roots only, and therefore can be solved, for example,
by the Newton-Raphson method. Also, as an efficient solution of the
Vandermonde matrix equation, it is possible to use the method of solving in
sequence by conversion into a triangular matrix.
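Steps (2) through (6) above can be sketched in Python as follows. This is a minimal illustration, not the embodiment's implementation: it assumes the model autocorrelation is the sum of m.sub.i cos(l .omega..sub.i) (inferred from substitution (5), since the equations are not reproduced in this text), uses general-purpose NumPy solvers in place of the sequential Hankel solution, the Newton-Raphson iteration and the triangular Vandermonde solution mentioned above, and the name csm_analyze is hypothetical.

```python
import numpy as np
from numpy.polynomial import chebyshev as C
from numpy.polynomial import polynomial as P

def csm_analyze(v, n):
    """Estimate n CSM frequencies and strengths from v_0 .. v_(2n-1)."""
    v = np.asarray(v, dtype=float)
    N = 2 * n - 1
    # Step (2): A_l = sum_j S_j^(l) v_j with inverse Tchebycheff coefficients
    A = np.empty(N + 1)
    for l in range(N + 1):
        e = np.zeros(l + 1)
        e[l] = 1.0
        A[l] = np.dot(C.poly2cheb(e), v[:l + 1])
    # Step (3): Hankel system sum_j p_j A_(l+j) = -A_(l+n), taking p_n = 1
    H = np.array([[A[l + j] for j in range(n)] for l in range(n)])
    p = np.linalg.solve(H, -A[n:2 * n])
    # Step (4): roots x_i of p_0 + p_1 x + ... + x**n
    x = P.polyroots(np.append(p, 1.0)).real
    x = np.clip(x, -1.0, 1.0)
    # Step (5): omega_i = arccos(x_i)
    omega = np.arccos(x)
    # Step (6): Vandermonde system sum_i m_i x_i**l = A_l, l = 0 .. n-1
    V = np.vander(x, n, increasing=True).T
    m = np.linalg.solve(V, A[:n])
    order = np.argsort(omega)
    return omega[order], m[order]

# Example: recover three known sinusoids from their model autocorrelation
m_true = np.array([1.0, 0.5, 0.25])
w_true = np.array([0.3, 1.0, 2.0])
v = [np.sum(m_true * np.cos(l * w_true)) for l in range(6)]
omega, m = csm_analyze(v, 3)
```

With autocorrelation values generated exactly from the model, the recovered frequencies and strengths match the originals to numerical precision; with coefficients measured from real speech the Hankel system may be less well conditioned.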
It is to be understood that the embodiment of the invention described above
does not limit the invention. While the above embodiment comprises
parameter interpolation by the interpolator at the time point of phase
resetting, this step may be omitted. Also, in place of the variable length
window function of the specified form used in the preferred embodiment,
other function forms can of course be used.
FIG. 6 shows an example of circuitry of the variable frequency oscillator 24
with a phase resetting function. A voltage applied to a frequency control
terminal 241 sets the constant current that flows through constant current
sources 242 and 243, whereby the current for charging or discharging a
capacitor 244 is controlled and the oscillation frequency is thereby made
variable. At point "v", a triangular waveform varying linearly between
reference voltages +V.sub.r and -V.sub.r is generated. Upon application of
an impulse to a phase reset terminal 245, point v is instantly grounded and
returned to zero potential. The triangular wave output is supplied to a
sinusoidal wave converter 246, which outputs a sinusoidal wave at a
terminal 247. The sinusoidal wave converter 246 can easily be realized, for
example, by reading sinusoidal function values stored in a ROM in
accordance with the input waveform. Such a variable frequency oscillator
with a phase resetting function can also simply be realized with a computer
program.
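A software realization of such an oscillator might look like the following Python sketch. It is a software analogue rather than a model of the circuit of FIG. 6: a phase accumulator stands in for the triangular-wave integrator, a precomputed sine table stands in for the ROM of converter 246, and the class name, method names and table size are assumptions chosen for illustration.

```python
import math

class PhaseResetOscillator:
    """Variable-frequency sine oscillator with a phase reset input."""

    def __init__(self, table_size=1024):
        # Sine table playing the role of the ROM in the sine converter
        self.table = [math.sin(2.0 * math.pi * k / table_size)
                      for k in range(table_size)]
        self.phase = 0.0            # current phase in cycles, 0 <= phase < 1

    def reset(self):
        """Phase reset pulse: return the phase to zero."""
        self.phase = 0.0

    def step(self, omega):
        """Output one sample, then advance by omega radians per sample."""
        index = int(self.phase * len(self.table)) % len(self.table)
        sample = self.table[index]
        self.phase = (self.phase + omega / (2.0 * math.pi)) % 1.0
        return sample

# Example: run ten samples, apply a reset, and continue from zero phase
osc = PhaseResetOscillator()
samples = [osc.step(0.5) for _ in range(10)]
osc.reset()
first_after_reset = osc.step(0.5)
```

Changing omega between calls to step corresponds to applying a new control voltage at the frequency control terminal, and reset corresponds to the impulse at the phase reset terminal.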
FIG. 7 shows an example of circuitry of a variable gain amplifier 25. A
signal to be amplified is applied to a terminal 251 and a control signal
to another terminal 252 to control the gain of the operational amplifier
253. The control signal supplied to an FET 255 controls the current in the
resistor 254, thereby controlling the gain of the amplifier 253.
In FIG. 8, an example of circuitry of the random code generator 23 is
shown, which comprises a 15-stage register array D.sub.1, D.sub.2, . . . ,
D.sub.15 and an exclusive-OR circuit 232 and generates pseudo-random
codes of a 15th-order M sequence having a period of 2.sup.15
-1. At a necessary point of time, a shift pulse is applied to a clock
terminal 231 and the next random code value is then output from an output
terminal group 233. In the example shown in FIG. 8, a 15th-order M sequence
is generated from the output terminal group 233, and each of the integers
1 to 32767 is generated once per period.
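A 15-stage shift register with exclusive-OR feedback can be sketched in Python as below. The text does not state which stages feed the exclusive-OR circuit 232, so the choice of the last two stages (corresponding to the primitive polynomial x^15 + x + 1) is an assumption, as are the function name and the seed value.

```python
from itertools import islice

def m_sequence_codes(seed=1):
    """Yield successive 15-bit register contents of a maximal-length LFSR.

    Assumption: feedback is the XOR of stages 15 and 14, one known
    maximal-length choice; the actual wiring of FIG. 8 may differ.
    """
    state = seed & 0x7FFF
    if state == 0:
        raise ValueError("seed must be nonzero")
    while True:
        yield state
        bit = ((state >> 14) ^ (state >> 13)) & 1      # exclusive-OR feedback
        state = ((state << 1) | bit) & 0x7FFF          # one shift pulse

# Example: the first five codes after seeding the register with 1
first_codes = list(islice(m_sequence_codes(), 5))
```

Each call for the next value corresponds to one shift pulse at the clock terminal 231; over one full period every integer from 1 to 32767 appears exactly once.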
FIG. 9A is a block diagram of the period calculator 22, which comprises a
constant multiplier 221 and a constant adder 222, and which converts
random codes uniformly distributed in the range of 1 to 32767 from the
random code generator 23 into codes having a distribution suitable for
use in specifying the time intervals of the phase resetting pulses for
unvoiced speech.
The constant multiplier 221 multiplies the output data (1 to
32767) from the random code generator 23 by a constant
(3.052.times.10.sup.-3 in the embodiment) to output uniformly distributed
data of 0 to 100. Processing of the fractional part is then performed. The
output of the constant multiplier 221 is applied to the constant adder 222,
where a constant (20 in the embodiment) is added to the respective data 0
to 100. Thus data uniformly distributed over the range of 20 to 120 is
obtained and used as the random interval (phase initialization interval)
for unvoiced speech generation. By the processing described above, an
appropriate distribution range, having for example the distribution width
D=100 and the lower limit L=20 of random codes as illustrated in FIG. 9B,
can be obtained. In this way, good unvoiced speech is produced by phase
initialization using the random code signal.
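The scaling and offset performed by the period calculator can be illustrated with the short Python sketch below. The constants are those given for the embodiment; treating the fractional part by truncation is an assumption, since the text does not say how fractions are handled, and the function name reset_interval is hypothetical.

```python
def reset_interval(code, scale=3.052e-3, offset=20):
    """Map a random code in 1..32767 to a phase-reset interval in 20..120.

    Assumption: the fractional part of code * scale is truncated.
    """
    if not 1 <= code <= 32767:
        raise ValueError("code must lie in 1..32767")
    scaled = int(code * scale)      # constant multiplier 221: 0 .. 100
    return scaled + offset          # constant adder 222: 20 .. 120

# Example: the smallest, a middle, and the largest code
intervals = [reset_interval(c) for c in (1, 16384, 32767)]
```

The distribution width D and lower limit L of FIG. 9B correspond here to the product of scale and the largest code (about 100) and to offset (20).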
FIG. 10 gives a block diagram of an example of window function generator 27
which comprises a register 271, a presettable down counter 272, a counter
273 and a read only memory (ROM) 274.
Data P from the switch 21 for specifying the phase resetting pulse interval
is stored in the register 271. The down counter 272, upon being preset to
data P read from the register 271, starts to count down in association with
a clock CLK. When the content of the counter 272 becomes zero, a pulse is
generated from the output (borrow) terminal "B" and applied to the down
counter 272 and the counter 273. The initial value of the down counter 272
is thereby preset again to P, and down counting from the initial value
starts again. As a result, a pulse train with a period proportional to the
interval P (for example, P/K, where K is the last address number set on a
ROM 274) is generated at the output terminal B. The pulse train is applied
to the counter 273 as clocks. The
count output of the counter 273 is applied as address to the RO