My invention relates to digital speech communication and more particularly
to digital speech signal coding and decoding arrangements.
The efficient use of transmission channels is of considerable importance in
digital communication systems where channel bandwidth is limited.
Consequently, elaborate coding, decoding, and multiplexing arrangements
have been devised to minimize the bit rate of each signal applied to the
channel. The lowering of signal bit rate permits a reduction of channel
bandwidth or an increase in the number of signals which can be multiplexed on
the channel.
Where speech signals are transmitted over a digital channel, channel
efficiency can be improved by compressing the speech signal prior to
transmission and constructing a replica of the speech from the compressed
speech signal after transmission. Speech compression for digital channels
removes redundancies in the speech signal so that the essential speech
information can be encoded at a reduced bit rate. The speech transmission
bit rate may be selected to maintain a desired level of speech quality.
One well known digital speech coding arrangement, disclosed in U.S. Pat.
No. 3,624,302 issued Nov. 30, 1971, includes a linear prediction analysis
of an input speech signal in which the speech is partitioned into
successive intervals and a set of parameter signals representative of the
interval speech are generated. These parameter signals comprise a set of
linear prediction coefficient signals corresponding to the spectral
envelope of the interval speech, and pitch and voicing signals
corresponding to the speech excitation. The parameter signals are encoded
at a much lower bit rate than required for encoding the speech signal as a
whole. The encoded parameter signals are transmitted over a digital
channel to a destination at which a replica of the input speech signal is
constructed from the parameter signals by synthesis. The synthesizer
arrangement includes the generation of an excitation signal from the
decoded pitch and voicing signals, and the modification of the excitation
signal by the envelope representative prediction coefficients in an
all-pole predictive filter.
While the foregoing pitch excited linear predictive coding is very
efficient in bit rate reduction, the speech replica from the synthesizer
exhibits a synthetic quality unlike the natural human voice. The synthetic
quality is generally due to inaccuracies in the generated linear
prediction coefficient signals which cause the linear prediction spectral
envelope to deviate from the actual spectral envelope of the speech signal
and to inaccuracies in the pitch and voicing signals. These inaccuracies
appear to result from differences between the human vocal tract and the
all pole filter model of the coder and the differences between the human
speech excitation apparatus and the pitch period and voicing arrangements
of the coder. Improvement in speech quality has heretofore required much
more elaborate coding techniques which operate at far greater bit rates
than does the pitch excited linear predictive coding scheme. It is an
object of the invention to provide natural sounding speech in a digital
speech coder at relatively low bit rates.
SUMMARY OF THE INVENTION
Generally, the synthesizer excitation generated during voiced portions of
the speech signal is a sequence of pitch period separated impulses. It has
been recognized that variations in the excitation pulse shape affect the
quality of the synthesized speech replica. A fixed excitation pulse shape,
however, does not result in a natural sounding speech replica. But,
particular excitation pulse shapes effect an improvement in selected
features. I have found that the inaccuracies in linear prediction
coefficient signals produced in the predictive analyzer can be corrected
by shaping the predictive synthesizer excitation signal to compensate for
the errors in the predictive coefficient signals. The resulting coding
arrangement provides natural sounding speech signal replicas at bit rates
substantially lower than other coding systems such as PCM, or adaptive
predictive coding.
The invention is directed to a speech processing arrangement in which a
speech analyzer is operative to partition a speech signal into intervals
and to generate a set of first signals representative of the prediction
parameters of the interval speech signal, and pitch and voicing
representative signals. A signal corresponding to the prediction error of
the interval is also produced. A speech synthesizer is operative to
produce an excitation signal responsive to the pitch and voicing
representative signals and to combine the excitation signal with the first
signals to construct a replica of the speech signal. The analyzer further
includes apparatus for generating a set of second signals representative
of the spectrum of the interval predictive error signal. Responsive to the
pitch and voicing representative signals and the second signals, a
predictive error compensating excitation signal is formed in the
synthesizer whereby a natural sounding speech replica is constructed.
According to one aspect of the invention, the prediction error compensating
excitation signal is formed by generating a first excitation signal
responsive to the pitch and voicing representative signals and shaping the
first excitation signal responsive to the second signals.
According to another aspect of the invention, the first excitation signal
comprises a sequence of excitation pulses produced jointly responsive to
the pitch and voicing representative signals. The excitation pulses are
modified responsive to the second signals to form a sequence of prediction
error compensating excitation pulses.
According to yet another aspect of the invention, a plurality of prediction
error spectral signals are formed responsive to the prediction error
signal in the speech analyzer. Each prediction error spectral signal
corresponds to a predetermined frequency. The prediction error spectral
signals are sampled during each interval to produce the second signals.
According to yet another aspect of the invention, the modified excitation
pulses in the speech synthesizer are formed by generating a plurality of
excitation spectral component signals corresponding to the predetermined
frequencies from the pitch and voicing representative signals and a
plurality of prediction error spectral coefficient signals corresponding
to the predetermined frequencies from the pitch representative signal and
the second signals. The excitation spectral component signals are combined
with the prediction error spectral coefficient signals to produce the
prediction error compensating excitation pulses.
BRIEF DESCRIPTION OF THE DRAWING
FIG. 1 depicts a block diagram of a speech signal encoder circuit
illustrative of the invention;
FIG. 2 depicts a block diagram of a speech signal decoder circuit
illustrative of the invention;
FIG. 3 shows a block diagram of a predictive error signal generator useful
in the circuit of FIG. 1;
FIG. 4 shows a block diagram of a speech interval parameter computer useful
in the circuit of FIG. 1;
FIG. 5 shows a block diagram of a prediction error spectral signal computer
useful in the circuit of FIG. 1;
FIG. 6 shows a block diagram of a speech signal excitation generator useful
in the circuit of FIG. 2;
FIG. 7 shows a detailed block diagram of the prediction error spectral
coefficient generator of FIG. 2; and
FIG. 8 shows waveforms illustrating the operation of the speech interval
parameter computer of FIG. 4.
DETAILED DESCRIPTION
A speech signal encoder circuit illustrative of the invention is shown in
FIG. 1. Referring to FIG. 1, a speech signal is generated in speech signal
source 101 which may comprise a microphone, a telephone set or other
electroacoustic transducer. The speech signal s(t) from speech signal
source 101 is supplied to filter and sampler circuit 103 wherein signal
s(t) is filtered and sampled at a predetermined rate. Circuit 103, for
example, may comprise a lowpass filter with a cutoff frequency of 4 kHz
and a sampler having a sampling rate of 8 kHz. The sequence of
signal samples S.sub.n is applied to analog-to-digital converter 105
wherein each sample is converted into a digital code s.sub.n suitable for
use in the encoder. A/D converter 105 is also operative to partition the
coded signal samples into successive time intervals or frames of 10 ms
duration.
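The partitioning into frames can be illustrated by the following minimal Python sketch. It assumes the 8 kHz sampling rate and 10 ms frame duration given above; the names `frame_samples` and `FRAME_LEN`, and the choice to drop a trailing partial frame, are illustrative assumptions and are not taken from the circuit of FIG. 1.

```python
SAMPLE_RATE_HZ = 8000   # sampling rate of filter and sampler circuit 103
FRAME_MS = 10           # frame duration used by A/D converter 105
FRAME_LEN = SAMPLE_RATE_HZ * FRAME_MS // 1000   # 80 samples per frame


def frame_samples(samples, frame_len=FRAME_LEN):
    """Split a list of samples into consecutive full frames.

    A trailing partial frame is dropped (an assumption of this sketch).
    """
    n_frames = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]


# 250 samples yield three full frames of 80 samples; 10 samples are left over.
frames = frame_samples(list(range(250)))
```

Each frame holds the 80 sample codes that the interval parameter computer processes as one speech interval.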
The signal samples s.sub.n from A/D converter 105 are supplied to the input
of prediction error signal generator 122 via delay 120 and to the input of
interval parameter computer 130 via line 107. Parameter computer 130 is
operative to form a set of signals that characterize the input speech but
can be transmitted at a substantially lower bit rate than the speech
signal itself. The reduction in bit rate is obtained because speech is
quasi-stationary in nature over intervals of 10 to 20 milliseconds. For
each interval in this range, a single set of signals can be generated
which signals represent the information content of the interval speech.
The speech representative signals, as is well known in the art, may
include a set of prediction coefficient signals and pitch and voicing
representative signals. The prediction coefficient signals characterize
the vocal tract during the speech interval while the pitch and voicing
signals characterize the glottal pulse excitation for the vocal tract.
Interval parameter computer 130 is shown in greater detail in FIG. 4. The
circuit of FIG. 4 includes controller 401 and processor 410. Processor 410
is adapted to receive the speech samples s.sub.n of each successive
interval and to generate a set of linear prediction coefficient signals, a
set of reflection coefficient signals, a pitch representative signal and a
voicing representative signal responsive to the interval speech samples.
The generated signals are stored in stores 430, 432, 434 and 436,
respectively. Processor 410 may be the CSP Incorporated Macro-Arithmetic
Processor system 100 or may comprise other processor or microprocessor
arrangements well known in the art. The operation of processor 410 is
controlled by the permanently stored program information from read only
memories 403, 405 and 407.
Controller 401 of FIG. 4 is adapted to partition each 10 millisecond speech
interval into a sequence of at least four predetermined time periods. Each
time period is dedicated to a particular operating mode. The operating
mode sequence is illustrated in the waveforms of FIG. 8. Waveform 801 in
FIG. 8 shows clock pulses CL1 which occur at the sampling rate. Waveform
803 in FIG. 8 shows clock pulses CL2, which pulses occur at the beginning
of each speech interval. The CL2 clock pulse occurring at time t.sub.1
places controller 401 in its data input mode, as illustrated in waveform
805. During the data input mode controller 401 is connected to processor
410 and to speech signal store 409. Responsive to control signals from
controller 401, the 80 sample codes inserted into speech signal store 409
during the preceding 10 millisecond speech interval are transferred to
data memory 418 via input/output interface circuit 420. While the stored
80 samples of the preceding speech interval are transferred into data
memory 418, the present speech interval samples are inserted into speech
signal store 409 via line 107.
Upon completion of the transfer of the preceding interval samples into data
memory 418, controller 401 switches to its prediction coefficient
generation mode responsive to the CL1 clock pulse at time t.sub.2. Between
times t.sub.2 and t.sub.3, controller 401 is connected to LPC program
store 403 and to central processor 414 and arithmetic processor 416 via
controller interface 412. In this manner, LPC program store 403 is
connected to processor 410. Responsive to the permanently stored
instructions in read only memory 403, processor 410 is operative to
generate partial correlation coefficient signals R=r.sub.1, r.sub.2, . . .
, r.sub.12, and linear prediction coefficient signals A=a.sub.1, a.sub.2 .
. . , a.sub.12. As is well known in the art, the partial correlation
coefficient is the negative of the reflection coefficient. Signals R and A
are transferred from processor 410 to stores 432 and 430, respectively,
via input/output interface 420. The stored instructions for the generation
of the reflection coefficient and linear prediction coefficient signals in
ROM 403 are listed in Fortran language in Appendix 1.
As is well known in the art, the reflection coefficient signals R are
generated by first forming the co-variance matrix P whose terms are
##EQU1##
and speech correlation factors
##EQU2##
Factors g.sub.1 through g.sub.12 are then computed in accordance with
##EQU3##
where T is the lower triangular matrix obtained by the triangular
decomposition of
[P.sub.ij ]=T T.sup.t (4)
in which T.sup.t denotes the transpose of T. The partial correlation
coefficients are then generated in accordance with
##EQU4##
c.sub.0 corresponds to the energy of the speech signal in the 10
millisecond interval. Linear prediction coefficient signals A=a.sub.1,
a.sub.2, . . . , a.sub.12, are computed from the partial correlation
coefficient signals r.sub.m in accordance with the recursive formulation
##EQU5##
The partial correlation coefficient signals R and the linear prediction
coefficient signals A generated in processor 410 during the linear
prediction coefficient generation mode are transferred from data memory
418 to stores 432 and 430, respectively, for subsequent use.
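By way of illustration only, the conversion from partial correlation coefficients to linear prediction coefficients may be sketched with the conventional step-up recursion. This is a minimal Python sketch under stated assumptions: the predictor is taken as a weighted sum of past samples, the sign convention is the common textbook one and may differ from that of equation 6, and the function name `parcor_to_lpc` is hypothetical.

```python
def parcor_to_lpc(r):
    """Step-up recursion from partial correlation coefficients to predictor
    coefficients a[1..p], for a predictor s_hat[n] = sum_k a[k] * s[n-k].

    Assumed convention: a_m(m) = r_m and
    a_i(m) = a_i(m-1) - r_m * a_(m-i)(m-1) for i = 1..m-1.
    """
    a = []
    for m, r_m in enumerate(r, start=1):
        prev = a
        # Update the first m-1 coefficients, then append the new one.
        a = [prev[i] - r_m * prev[m - 2 - i] for i in range(m - 1)]
        a.append(r_m)
    return a


coeffs = parcor_to_lpc([0.5, 0.25])   # a_1 = 0.5 - 0.25 * 0.5, a_2 = 0.25
```

The recursion builds the order-m predictor from the order-(m-1) predictor, so twelve passes would produce the twelve coefficient signals a.sub.1 through a.sub.12.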
After the partial correlation coefficient signals R and the linear
prediction coefficient signals A are placed in stores 430 and 432 (by time
t.sub.3), the linear prediction coefficient generation mode is ended and
the pitch period signal generation mode is started. At this time,
controller 401 is switched to its pitch mode as indicated in waveform 809.
In this mode, pitch program store 405 is connected to controller interface
412 of processor 410. Processor 410 is then controlled by the permanently
stored instructions of ROM 405 so that a pitch representative signal for
the preceding speech interval is produced responsive to the speech samples
in data memory 418 corresponding to the preceding speech interval. The
permanently stored instructions of ROM 405 are listed in Fortran language
in Appendix 2. The pitch representative signal produced by the operations
of central processor 414 and arithmetic processor 416 is transferred from
data memory 418 to pitch signal store 434 via input/output interface 420.
By time t.sub.4, the pitch representative signal is inserted into store
434 and the pitch period mode is terminated.
At time t.sub.4, controller 401 is switched from its pitch period mode to
its voicing mode as indicated in waveform 811. Between times t.sub.4 and
t.sub.5, ROM 407 is connected to processor 410. ROM 407 contains
permanently stored signals corresponding to a sequence of control
instructions for determining the voicing character of the preceding speech
interval from an analysis of the speech samples of that interval. The
permanently stored program of ROM 407 is listed in Fortran language in
Appendix 3. Responsive to the instructions of ROM 407, processor 410 is
operative to analyze the speech samples of the preceding interval in
accordance with the disclosure of the article "A Pattern-Recognition
Approach to Voiced-Unvoiced-Silence Classification With Applications to
Speech Recognition" by B. S. Atal and L. R. Rabiner appearing in the IEEE
Transactions on Acoustics, Speech, and Signal Processing, Vol. ASSP-24,
No. 3, June 1976. A signal V is then generated in arithmetic processor 416
which characterizes the speech interval as a voiced interval or as an
unvoiced interval. The resulting voicing signal is placed in data memory
418 and is transferred therefrom to voicing signal store 436 via
input/output interface 420 by time t.sub.5. Controller 401 disconnects ROM
407 from processor 410 at time t.sub.5 and the voicing signal generation
mode is terminated as indicated in waveform 811.
The reflection coefficient signals R and the pitch and voicing
representative signals P and V from stores 432, 434 and 436 are applied to
parameter signal encoder 140 in FIG. 1 via delays 137, 138 and 139
responsive to the CL2 clock pulse occurring at time t.sub.6. While a
replica of the input speech can be synthesized from the reflection
coefficient, pitch and voicing signals obtained from parameter computer
130, the resulting speech does not have the natural characteristics of a
human voice. The artificial character of the speech derived from the
reflection coefficient and pitch and voicing signals of computer 130 is
primarily the result of errors in the predictive reflection coefficients
generated in parameter computer 130. In accordance with the invention,
these errors in prediction coefficients are detected in prediction error
signal generator 122. Signals representative of the spectrum of the
prediction error for each interval are produced and encoded in prediction
error spectral signal generator 124 and spectral signal encoder 126,
respectively. The encoded spectral signals are multiplexed together with
the reflection coefficient, pitch, and voicing signals from parameter
encoder 140 in multiplexer 150. The inclusion of the prediction error
spectral signals in the coded signal output of the speech encoder of FIG.
1 for each speech interval permits compensation for the errors in the
linear predictive parameters during decoding in the speech decoder of FIG.
2. The resulting speech replica from the decoder of FIG. 2 is natural
sounding.
The prediction error signal is produced in generator 122, shown in greater
detail in FIG. 3. In the circuit of FIG. 3, the signal samples from A/D
converter 105 are received on line 312 after the signal samples have been
delayed for one speech interval in delay 120. The delayed signal samples
are supplied to shift register 301 which is operative to shift the
incoming samples at the CL1 clock rate of 8 kilohertz. Each stage of shift
register 301 provides an output to one of multipliers 303-1 through
303-12. The linear prediction coefficient signals for the interval
a.sub.1, a.sub.2, . . . , a.sub.12 corresponding to the samples being
applied to shift register 301 are supplied to multipliers 303-1 through
303-12 from store 430 via line 315. The outputs of multipliers 303-1
through 303-12 are summed in adders 305-2 through 305-12 so that the
output of adder 305-12 is the predicted speech signal
##EQU6##
Subtractor 320 receives the successive speech signal samples s.sub.n from
line 312 and the predicted value for the successive speech samples from
the output of adder 305-12 and provides a difference signal d.sub.n that
corresponds to the prediction error.
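The operation of the shift register, multipliers, adders and subtractor may be sketched in Python as follows. This is a minimal sketch, assuming the predicted sample is the coefficient-weighted sum of the preceding samples and that samples before the start of the list are zero; the function name `prediction_error` is illustrative only.

```python
def prediction_error(samples, coeffs):
    """Return d[n] = s[n] - sum_k a[k] * s[n-k] for k = 1..len(coeffs).

    Samples preceding the start of the list are taken as zero
    (an assumption of this sketch).
    """
    errors = []
    for n, s_n in enumerate(samples):
        predicted = 0.0
        for k, a_k in enumerate(coeffs, start=1):
            if n - k >= 0:
                predicted += a_k * samples[n - k]
        errors.append(s_n - predicted)
    return errors


# A signal obeying s[n] = 0.5 * s[n-1] is predicted exactly after its first sample.
signal = [1.0, 0.5, 0.25, 0.125]
residual = prediction_error(signal, [0.5])
```

When the coefficients match the signal, the residual is zero except at the initial sample; any mismatch in the coefficients appears as nonzero values of d.sub.n.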
The sequence of prediction error signals for each speech interval is
applied to prediction error spectral signal generator 124 from subtractor
320. Spectral signal generator 124 is shown in greater detail in FIG. 5
and comprises spectral analyzer 504 and spectral sampler 513. Responsive
to each prediction error sample d.sub.n on line 501 spectral analyzer 504
provides a set of 10 signals, c(f.sub.1), c(f.sub.2), . . . c(f.sub.10).
Each of these signals is representative of a spectral component of the
prediction error signal. The spectral component frequencies f.sub.1,
f.sub.2, . . . , f.sub.10 are predetermined and fixed. These predetermined
frequencies are selected to cover the frequency range of the speech signal
in a uniform manner. For each predetermined frequency f.sub.i, the
sequence of prediction error signal samples d.sub.n of the speech interval
is applied to the input of a cosine filter having a center frequency
f.sub.i and an impulse response h.sub.k given by
h.sub.k =(2/0.54) (0.54-0.46 cos 2.pi.f.sub.o kT) cos 2.pi.f.sub.i kT (8)
where
T .ident. sampling interval=125 .mu.sec
f.sub.o .ident. frequency spacing of filter center frequencies=300 Hz
k=0, 1, . . . , 26
and to the input of a sine filter of the same center frequency having an
impulse response h'.sub.k given by
h'.sub.k =(2/0.54) (0.54-0.46 cos 2.pi.f.sub.o kT) sin 2.pi.f.sub.i kT (9)
Cosine filter 503-1 and sine filter 505-1 each have the same center
frequency f.sub.1 which may be 300 Hz. Cosine filter 503-2 and sine filter
505-2 each have a common center frequency f.sub.2 which may be 600 Hz,
and cosine filter 503-10 and sine filter 505-10 each have a center
frequency f.sub.10 which may be 3000 Hz.
The output signal from cosine filter 503-1 is multiplied by itself in
squarer circuit 507-1 while the output signal from sine filter 505-1 is
similarly multiplied by itself in squarer circuit 509-1. The sum of the
squared signals from circuits 507-1 and 509-1 is formed in adder 510-1 and
square root circuit 512-1 is operative to produce the spectral component
signal corresponding to frequency f.sub.1. In like manner, filters 503-2,
505-2, squarer circuits 507-2 and 509-2, adder circuit 510-2 and square
root circuit 512-2 cooperate to form the spectral component c(f.sub.2)
corresponding to frequency f.sub.2. Similarly, the spectral component
signal of predetermined frequency f.sub.10 is obtained from square root
circuit 512-10. The prediction error spectral signals from the outputs of
square root circuits 512-1 through 512-10 are supplied to sampler circuits
513-1 through 513-10, respectively.
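The filter, squarer, adder and square root chain may be sketched in Python as follows. This is a minimal sketch under stated assumptions: the carrier terms are written as cos(2.pi.f.sub.i kT) and sin(2.pi.f.sub.i kT) with f.sub.i in Hz, the filters are evaluated only at the last sample of the interval, and the names `filter_pair`, `fir_last_output` and `spectral_component` are illustrative.

```python
import math

T = 125e-6     # sampling interval in seconds
F0 = 300.0     # spacing of filter center frequencies in Hz
TAPS = 27      # impulse response index k = 0 .. 26


def filter_pair(f_i):
    """Cosine and sine impulse responses in the form of equations 8 and 9."""
    cos_h, sin_h = [], []
    for k in range(TAPS):
        window = (2 / 0.54) * (0.54 - 0.46 * math.cos(2 * math.pi * F0 * k * T))
        cos_h.append(window * math.cos(2 * math.pi * f_i * k * T))
        sin_h.append(window * math.sin(2 * math.pi * f_i * k * T))
    return cos_h, sin_h


def fir_last_output(h, x):
    """Output of FIR filter h at the most recent sample of x."""
    return sum(h_k * x[-1 - k] for k, h_k in enumerate(h) if k < len(x))


def spectral_component(d, f_i):
    """Square root of the sum of the squared cosine and sine filter outputs."""
    cos_h, sin_h = filter_pair(f_i)
    c_out = fir_last_output(cos_h, d)
    s_out = fir_last_output(sin_h, d)
    return math.sqrt(c_out * c_out + s_out * s_out)


frequencies = [300.0 * i for i in range(1, 11)]   # f_1 .. f_10
tone = [math.cos(2 * math.pi * 900.0 * n * T) for n in range(80)]
spectrum = [spectral_component(tone, f) for f in frequencies]
```

For the 900 Hz test tone the largest of the ten values is the one for f.sub.3, with smaller values at the neighboring frequencies owing to the width of the window.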
In each sampler circuit, the prediction error spectral signal is sampled at
the end of each speech interval by clock signal CL2 and stored therein.
The set of prediction error spectral signals from samplers 513-1 through
513-10 are applied in parallel to spectral signal encoder 126, the output
of which is transferred to multiplexer 150. In this manner, multiplexer
150 receives encoded reflection coefficient signals R and pitch and
voicing signals P and V for each speech interval from parameter signal
encoder 140 and also receives the coded prediction error spectral signals
c(f.sub.n) for the same interval from spectral signal encoder 126. The
signals applied to multiplexer 150 define the speech of each interval in
terms of a multiplexed combination of parameter signals. The multiplexed
parameter signals are transmitted over channel 180 at a much lower bit
rate than the coded 8 kHz speech signal samples from which the parameter
signals were derived.
The multiplexed coded parameter signals from communication channel 180 are
applied to the speech decoder circuit of FIG. 2 wherein a replica of the
speech signal from speech source 101 is constructed by synthesis.
Communication channel 180 is connected to the input of demultiplexer 201
which is operative to separate the coded parameter signals of each speech
interval. The coded prediction error spectral signals of the interval are
supplied to decoder 203. The coded pitch representative signal is supplied
to decoder 205. The coded voicing signal for the interval is supplied to
decoder 207, and the coded reflection coefficient signals of the interval
are supplied to decoder 209.
The spectral signals from decoder 203, the pitch representative signal from
decoder 205, and the voicing representative signal from decoder 207 are
stored in stores 213, 215 and 217, respectively. The outputs of these
stores are then combined in excitation signal generator 220 which supplies
a prediction error compensating excitation signal to the input of linear
prediction coefficient synthesizer 230. The synthesizer receives linear
prediction coefficient signals a.sub.1, a.sub.2, . . . a.sub.12 from
coefficient converter and store 219, which coefficients are derived from
the reflection coefficient signals of decoder 209.
Excitation signal generator 220 is shown in greater detail in FIG. 6. The
circuit of FIG. 6 includes excitation pulse generator 618 and excitation
pulse shaper 650. The excitation pulse generator receives the pitch
representative signals from store 215, which signals are applied to pulse
generator 620. Responsive to the pitch representative signal, pulse
generator 620 provides a sequence of uniform pulses. These uniform pulses
are separated by the pitch periods defined by the pitch representative signal
from store 215. The output of pulse generator 620 is supplied to switch
624 which also receives the output of white noise generator 622. Switch
624 is responsive to the voicing representative signal from store 217. In
the event that the voicing representative signal is in a state
corresponding to a voiced interval, the output of pulse generator 620 is
connected to the input of excitation shaping circuit 650. Where the
voicing representative signal indicates an unvoiced interval, switch 624
connects the output of white noise generator 622 to the input of
excitation shaping circuit 650.
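The selection between the pulse generator and the white noise generator may be sketched in Python as follows. This is a minimal sketch, assuming the pitch period is expressed in samples, the pulses have unit amplitude, the noise is Gaussian, and the pulse timing restarts in each interval; the function name `excitation` and its parameters are illustrative only.

```python
import random


def excitation(pitch_period, voiced, n_samples, rng=None):
    """Return n_samples of unshaped excitation for one speech interval.

    Voiced: unit pulses spaced pitch_period samples apart.
    Unvoiced: zero-mean Gaussian white noise (an assumed noise model).
    """
    if voiced:
        return [1.0 if n % pitch_period == 0 else 0.0 for n in range(n_samples)]
    rng = rng or random.Random(0)   # fixed seed so the sketch is repeatable
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]


voiced_frame = excitation(pitch_period=40, voiced=True, n_samples=80)
unvoiced_frame = excitation(pitch_period=40, voiced=False, n_samples=80)
```

A practical generator would carry the pulse position across interval boundaries so that the pitch period is preserved from one interval to the next.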
The excitation signal from switch 624 is applied to spectral component
generator 603 which generator includes a pair of filters for each
predetermined frequency f.sub.1, f.sub.2, . . . , f.sub.10. The filter
pair includes a cosine filter having a characteristic in accordance with
equation 8 and a sine filter having a characteristic in accordance with
equation 9. Cosine filter 603-11 and sine filter 603-12 provide spectral component
signals for predetermined frequency f.sub.1. In like manner, cosine filter
603-21 and sine filter 603-22 provide the spectral component signals for
frequency f.sub.2 and, similarly, cosine filter 603-n1 and sine filter
603-n2 provide the spectral components for predetermined frequency
f.sub.10.
The prediction error spectral signals from the speech encoding circuit of
FIG. 1 are supplied to filter amplitude coefficient generator 601 together
with the pitch representative signal from the encoder. Circuit 601, shown
in detail in FIG. 7, is operative to produce a set of spectral coefficient
signals for each speech interval. These spectral coefficient signals
define the spectrum of the prediction error signal for the speech
interval. Circuit 610 is operative to combine the spectral component
signals from spectral component generator 603 with the spectral
coefficient signals from coefficient generator 601. The combined signal
from circuit 610 is a sequence of prediction error compensating excitation
pulses that are applied to synthesizer circuit 230.
The coefficient generator circuit of FIG. 7 includes group delay store 701,
phase signal generator 703, and spectral coefficient generator 705. Group
delay store 701 is adapted to store a set of predetermined delay times
.tau..sub.1, .tau..sub.2, . . . .tau..sub.10. These delays are selected
experimentally from an analysis of representative utterances. The delays
correspond to a median group delay characteristic of a representative
utterance which has also been found to work equally well for other
utterances.
Phase signal generator 703 is adapted to generate a group of phase signals
.PHI..sub.1, .PHI..sub.2, . . . , .PHI..sub.10 in accordance with
.PHI..sub.i =(.tau..sub.i /P) i=1,2, . . . , 10 (10)
responsive to the pitch representative signal from line 710 and the group
delay signals .tau..sub.1, .tau..sub.2, . . . , .tau..sub.10 from store
701. As is evident from equation 10, the phases for the spectral
coefficient signals are a function of the group delay signals and the
pitch period signal from the speech encoder of FIG. 1. The phase signals
.PHI..sub.1, .PHI..sub.2, . . . , .PHI..sub.10 are applied to spectral
coefficient generator 705 via line 730. Coefficient generator 705 also
receives the prediction error spectral signals from store 213 via line
720. A spectral coefficient signal is formed for each predetermined
frequency in generator 705 in accordance with
##EQU7##
As is evident from equations 10 and 11, phase signal generator 703 and
spectral coefficient generator 705 may comprise arithmetic circuits well
known in the art.
Outputs of spectral coefficient generator 705 are applied to combining
circuit 610 via line 740. In circuit 610, the spectral component signal
from cosine filter 603-11 is multiplied by the spectral coefficient signal
H.sub.1,1 in multiplier 607-11 while the spectral component signal from
sine filter 603-12 is multiplied by the H.sub.1,2 spectral coefficient
signal in multiplier 607-12. In like manner, multiplier 607-21 is
operative to combine the spectral component signal from cosine filter
603-21 and the H.sub.2,1 spectral coefficient signal from circuit 601
while multiplier 607-22 is operative to combine the spectral component
signal from sine filter 603-22 and the H.sub.2,2 spectral coefficient
signal. Similarly, the spectral component and spectral coefficient signals
of predetermined frequency f.sub.10 are combined in multipliers 607-n1 and
607-n2. The outputs of the multipliers in circuit 610 are applied to adder
circuits 609-11 through 609-n2 so that the cumulative sum of all
multipliers is formed and made available on lead 670. The signal on lead
670 may be represented by
##EQU8##
where C(f.sub.k) represents the amplitude of each predetermined frequency
component, f.sub.k is the predetermined frequency of the cosine and sine
filters, and .PHI..sub.k is the phase of the predetermined frequency
component in accordance with equation 10. The excitation signal of
equation 12 is a function of the prediction error of the speech interval
from which it is derived, and is effective to compensate for errors in the
linear prediction coefficients applied to synthesizer 230 during the
corresponding speech interval.
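The multiply and accumulate operation of combining circuit 610 may be sketched in Python as follows. This is a minimal sketch covering only the weighting and summation described for the multipliers and adders; the values of the spectral coefficient signals are taken as given inputs, and the function name `combine` and the example values are illustrative.

```python
def combine(cos_components, sin_components, h_cos, h_sin):
    """Weight each cosine and sine spectral component by its spectral
    coefficient signal and accumulate the products into one output sample.

    h_cos[k] and h_sin[k] stand for the coefficient signals applied to the
    cosine and sine filter outputs of the k-th predetermined frequency.
    """
    total = 0.0
    for x_c, x_s, h_c, h_s in zip(cos_components, sin_components, h_cos, h_sin):
        total += h_c * x_c + h_s * x_s
    return total


# Two frequencies only, with arbitrary example values.
sample = combine([1.0, 2.0], [0.5, 0.0], [2.0, 1.0], [4.0, 3.0])
```

Repeating the computation for each sample time yields the sequence of shaped excitation samples made available to the synthesizer.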
LPC synthesizer 230 may comprise an all-pole filter circuit arrangement
well known in the art to perform LPC synthesis as described in the article
"Speech Analysis and Synthesis by Linear Prediction of the Speech Wave" by
B. S. Atal and S. L. Hanauer appearing in the Journal of the Acoustical
Society of America, Vol. 50 pt 2, pages 637-655, August 1971. Jointly
responsive to the prediction error compensating excitation pulses and the
linear prediction coefficients for the successive speech intervals,
synthesizer 230 produces a sequence of coded speech signal samples
s.sub.n, which samples are applied to the input of the D/A converter 240.
D/A converter 240 is operative to produce a sampled signal S.sub.n which
is a replica of the speech signal applied to the speech encoder circuit of
FIG. 1. The sampled signal from converter 240 is lowpass filtered in
filter 250 and the analog replica output s(t) of filter 250 is available from
loudspeaker device 254 after amplification in amplifier 252.
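The all-pole synthesis operation may be sketched in Python as follows. This is a minimal sketch, assuming the direct-form recursion in which each output sample is the excitation sample plus the coefficient-weighted sum of preceding output samples, with zero initial conditions; the function name `lpc_synthesize` is illustrative and does not describe the internal arrangement of synthesizer 230.

```python
def lpc_synthesize(excitation, coeffs):
    """All-pole filter: s[n] = e[n] + sum_k a[k] * s[n-k], k = 1..len(coeffs).

    Output samples before the start of the list are taken as zero
    (an assumption of this sketch).
    """
    out = []
    for n, e_n in enumerate(excitation):
        acc = e_n
        for k, a_k in enumerate(coeffs, start=1):
            if n - k >= 0:
                acc += a_k * out[n - k]
        out.append(acc)
    return out


# A single pulse through a one-coefficient filter decays geometrically.
impulse_response = lpc_synthesize([1.0, 0.0, 0.0, 0.0], [0.5])
```

This recursion is the inverse of the prediction error computation: driving it with the exact prediction error and the same coefficients would reproduce the original samples.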
##SPC1##