A method for reducing the bandwidth used to transmit digitized voice packets is described. The method reduces the number of transmitted packets by suspending transmission during periods of silence or when only noise is present. The system determines whether a background noise update is warranted based on human auditory perception factors, rather than relying on an artificial limiter on excessive silence insertion descriptor (SID) packets. The system searches for perceptible changes in the background noise, rather than analyzing speech for improved audio compression. The invention weighs factors affecting the perception of sound, including frequency masking, temporal masking, loudness perception based on tone, and auditory perception differential based on tone.
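The update decision described above can be illustrated with a minimal sketch. It assumes the background noise is summarized as per-band energies and that perceptual effects such as masking are folded into per-band weights. The function names, the weights, and the 3 dB threshold are hypothetical and are not taken from the abstract.

```python
import math


def band_levels_db(band_energies):
    """Convert per-band energies to levels in dB (floored to avoid log(0))."""
    return [10.0 * math.log10(max(e, 1e-12)) for e in band_energies]


def noise_update_warranted(last_sent, current, weights, threshold_db=3.0):
    """Return True if the background noise changed enough to be noticed.

    last_sent / current: per-band noise energies for the last transmitted
    noise description and for the current frame.
    weights: hypothetical per-band perceptual weights in [0, 1]; a small
    weight stands in for a band where the change would be masked.
    threshold_db: hypothetical just-noticeable level change.
    """
    if not (len(last_sent) == len(current) == len(weights)):
        raise ValueError("band lists must have the same length")
    last_db = band_levels_db(last_sent)
    cur_db = band_levels_db(current)
    return any(
        w * abs(c - l) > threshold_db
        for l, c, w in zip(last_db, cur_db, weights)
    )
```

With equal weights, a small change such as `[1.0, 1.0, 1.0]` to `[1.1, 1.0, 1.0]` does not trigger an update in this sketch, while a fourfold energy change in one band does, unless that band's weight is low.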
An overflow problem in LSF quantization in G.729 Annex B speech encoding may lead to non-assignment of a codebook index. Preferred embodiments fix the problem with default or limited random variable assignments, or by flagging the overflow and adjusting the frame encoding, such as by limiting spectral components or changing quantization targets.
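As a generic illustration of how an overflow can leave a codebook index unassigned, the sketch below runs a nearest-entry search with saturating 32-bit accumulation. It does not reproduce the G.729 Annex B reference code; the search structure, the function names, and the choice of index 0 as the default are assumptions made only for this example.

```python
INT32_MAX = 2**31 - 1


def sat_add32(a, b):
    """Add two non-negative integers, saturating at the 32-bit maximum."""
    return min(a + b, INT32_MAX)


def search_codebook(target, codebook, default_index=0):
    """Return (index, overflowed) for the closest codebook entry.

    The running minimum starts at INT32_MAX and is updated with a strict
    comparison. If every distance saturates, no entry ever compares below
    the initial value and the index would remain unassigned. This sketch
    flags that case and falls back to a hypothetical default index.
    """
    best_index = None
    best_dist = INT32_MAX
    for i, entry in enumerate(codebook):
        dist = 0
        for t, c in zip(target, entry):
            d = t - c
            dist = sat_add32(dist, d * d)
        if dist < best_dist:
            best_dist = dist
            best_index = i
    if best_index is None:
        return default_index, True
    return best_index, False
```

The returned flag corresponds to the "flagging the overflow" option, so a caller could adjust the frame encoding instead of accepting the default.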
Devices, software, and methods for prioritizing among voice data packets for discard decisions. The perceptual importance of a voice data packet relative to the others is determined at encoding, preferably according to the content of the encoded sound. The relative importance is represented as a comparative discardability code in the packet. When a discard decision is made, it takes the packet's comparative discardability code into account, so that less important packets are discarded more often.
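A discard decision of this kind might look like the following sketch. The packet layout, the convention that a higher code means a packet is safer to drop, and the strictly deterministic drop order are all assumptions for illustration; the abstract only requires that less important packets be discarded more often.

```python
from dataclasses import dataclass


@dataclass
class VoicePacket:
    seq: int             # sequence number
    payload: bytes       # encoded voice frame
    discardability: int  # hypothetical code: higher = safer to discard


def drop_for_congestion(queue, n_drop):
    """Return the queue with the n_drop most discardable packets removed.

    Packets with the highest discardability code are dropped first; ties
    are broken by dropping the lowest sequence number first. The original
    order of the surviving packets is preserved.
    """
    if n_drop <= 0:
        return list(queue)
    victims = sorted(queue, key=lambda p: (-p.discardability, p.seq))[:n_drop]
    victim_seqs = {p.seq for p in victims}
    return [p for p in queue if p.seq not in victim_seqs]
```

For example, given codes `0, 3, 2, 0` on packets 0 through 3, dropping two packets removes packets 1 and 2 and keeps packets 0 and 3.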
Mechanisms are known that allow receivers to control the loudness of speech in broadcast signals, but these mechanisms require that an estimate of speech loudness be inserted into the signal. Disclosed techniques provide improved estimates of loudness. According to one implementation, an indication of the loudness of an audio signal containing speech and other types of audio material is obtained by classifying segments of audio information as either speech or non-speech. The loudness of the speech segments is estimated, and this estimate is used to derive the indication of loudness. The indication of loudness may be used to control audio signal levels so that variations in the loudness of speech between different programs are reduced. A preferred method for classifying speech segments is described.
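The classify-then-measure flow described above can be sketched as follows. The speech classifier is passed in as a hypothetical callable, mean power in dB stands in for a real perceptual loudness model, and the -24 dB target is an arbitrary example value; none of these come from the abstract.

```python
import math


def speech_loudness_db(segments, is_speech):
    """Estimate loudness from the speech segments only.

    segments: list of sample sequences (floats in [-1, 1]).
    is_speech: hypothetical classifier, callable(segment) -> bool.
    Returns the mean power of all speech samples in dB, or None if no
    segment was classified as speech. Mean power is a simple stand-in
    for a perceptual loudness measure.
    """
    speech = [s for s in segments if len(s) > 0 and is_speech(s)]
    if not speech:
        return None
    total_sq = sum(x * x for s in speech for x in s)
    total_n = sum(len(s) for s in speech)
    return 10.0 * math.log10(max(total_sq / total_n, 1e-12))


def normalization_gain_db(loudness_db, target_db=-24.0):
    """Gain in dB that would bring the estimated speech level to the target."""
    return target_db - loudness_db
```

Because non-speech segments are excluded before measuring, loud music or effects in a program do not pull the estimate away from the speech level that the gain is meant to equalize.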