Real-time speech and music classifier
   
Document Number
US Patent 6785645
Issued Date
August 31, 2004
Abstract
An efficient and accurate classification method for classifying speech and music signals, or other diverse signal types, is provided. The method and system are especially, although not exclusively, suited for use in real-time applications. Long-term and short-term features are extracted relative to each frame, whereby short-term features are used to detect a potential switching point at which to switch a coder operating mode, and long-term features are used to classify each frame and validate the potential switch at the potential switch point according to the classification and a predefined criterion.
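The two-stage decision described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the patent: the short-term feature (a frame-to-frame energy jump), the long-term feature (zero-crossing rate averaged over recent frames), the thresholds, and the names `TwoStageClassifier`, `jump_ratio`, and `zcr_threshold`.

```python
from collections import deque
from statistics import mean


def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return crossings / (len(frame) - 1)


class TwoStageClassifier:
    """Hypothetical sketch: a short-term cue proposes a switch, a long-term cue validates it."""

    def __init__(self, history=20, jump_ratio=4.0, zcr_threshold=0.15):
        self.zcr_history = deque(maxlen=history)  # long-term memory (assumed feature)
        self.prev_energy = None
        self.jump_ratio = jump_ratio              # assumed short-term threshold
        self.zcr_threshold = zcr_threshold        # assumed long-term criterion
        self.mode = "speech"                      # current coder operating mode

    def process(self, frame):
        energy = frame_energy(frame)
        self.zcr_history.append(zero_crossing_rate(frame))

        # Short-term feature: a large energy change marks a potential switching point.
        potential_switch = False
        if self.prev_energy is not None:
            low, high = sorted((energy, self.prev_energy))
            potential_switch = high > self.jump_ratio * max(low, 1e-12)
        self.prev_energy = energy

        # Long-term feature: classify the frame from the recent history.
        label = "music" if mean(self.zcr_history) < self.zcr_threshold else "speech"

        # Validate: switch only at a potential switching point whose
        # long-term classification disagrees with the current mode.
        if potential_switch and label != self.mode:
            self.mode = label
        return self.mode
```

In this sketch the mode never changes between potential switching points, even when the long-term classification drifts, which is one way to keep a coder from toggling on every ambiguous frame.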
Number of Claims:
17
Owner
Microsoft Corporation (Redmond, WA)
Published
August 31, 2004
Application Number
09/997,679
Filed
November 29, 2001
US Classification
704/216, 704/219, 704/500
Int'l Classification
G10L 19/14 (20060101), G10L 19/00 (20060101)
USPTO Field of Search
704/503, 704/501, 704/500, 704/270, 704/268, 704/267, 704/266, 704/265, 704/233, 704/223, 704/214, 704/216, 704/203
Related Patents
7191128 - Method and system for distinguishing speech from music in a digital audio signal in real time - Owned by LG Electronics Inc. (Seoul, KR)

The present invention relates to a method and system for distinguishing speech from music in a digital audio signal in real time. The method operates on sound segments that a segmentation unit has cut from the input signal of a digital sound processing system on the basis of the homogeneity of their properties, and comprises the steps of: (a) framing an input signal into a sequence of overlapped frames by a windowing function; (b) calculating a frame spectrum for every frame by FFT; (c) calculating a segment harmony measure on the basis of the frame spectrum sequence; (d) calculating a segment noise measure on the basis of the frame spectrum sequence; (e) calculating a segment tail measure on the basis of the frame spectrum sequence; (f) calculating a segment drag out measure on the basis of the frame spectrum sequence; (g) calculating a segment rhythm measure on the basis of the frame spectrum sequence; and (h) making the distinguishing decision based on the calculated characteristics.
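Steps (a) and (b) of the method above can be sketched as follows. The abstract does not define the harmony, noise, tail, drag out, or rhythm measures, so steps (c) through (h) are left out. The Hann window, the frame size, the hop size, and the function names are assumptions made for illustration, and a naive DFT stands in for the FFT so that the sketch needs only the standard library.

```python
import cmath
import math


def hann(size):
    """Symmetric Hann window (assumed; the abstract only says 'a windowing function')."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (size - 1)) for i in range(size)]


def overlapped_frames(signal, size, hop):
    """Step (a): cut the signal into windowed frames that overlap when hop < size."""
    window = hann(size)
    for start in range(0, len(signal) - size + 1, hop):
        yield [signal[start + i] * window[i] for i in range(size)]


def magnitude_spectrum(frame):
    """Step (b): magnitude of bins 0..N/2, computed with a naive O(N^2) DFT."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


# Usage: a sine that completes 4 cycles per 32-sample frame.
signal = [math.sin(2 * math.pi * 4 * t / 32) for t in range(64)]
spectra = [magnitude_spectrum(f) for f in overlapped_frames(signal, size=32, hop=16)]
```

A production implementation would replace `magnitude_spectrum` with a real FFT routine; the per-segment measures would then be computed over the resulting sequence of spectra.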

7596486 - Encoding an audio signal using different audio coder modes - Owned by Nokia Corporation (Espoo, FI)

The invention relates to a method for supporting an encoding of an audio signal, wherein a first coder mode and a second coder mode are available for encoding a respective section of an audio signal. The second coder mode enables a coding of a respective section based on a first coding model, which requires only information from the section itself to encode it, and based on a second coding model, which additionally requires an overlap signal with information from a preceding section. After a switch from the first coder mode to the second coder mode, the first coding model is always used for encoding the first section of the audio signal. This section can then be used to generate an artificial overlap signal for a subsequent section, which may be encoded with the second coding model.
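The switching rule in this abstract can be illustrated with a short sketch. The labels `"first"`, `"second"`, `"self_contained"`, and `"overlap"`, and the function `select_coding_models`, are hypothetical names chosen for this example; the abstract does not say how the preferred model for a section is chosen, so that preference is taken as an input.

```python
def select_coding_models(coder_modes, preferred_models):
    """Return the coding model used for each section.

    coder_modes:      "first" or "second" coder mode for each section.
    preferred_models: model the encoder would like to use in the second mode,
                      "self_contained" (first coding model) or
                      "overlap" (second coding model, needs a preceding overlap signal).
    Sections coded in the first coder mode get None.
    """
    chosen = []
    previous_mode = None
    for mode, preferred in zip(coder_modes, preferred_models):
        if mode == "first":
            chosen.append(None)
        elif previous_mode == "first":
            # Directly after a switch no overlap signal exists yet, so the
            # model that needs only the section itself is forced.
            chosen.append("self_contained")
        else:
            # Start of stream is not covered by the abstract; this sketch
            # simply passes the preference through in that case.
            chosen.append(preferred)
        previous_mode = mode
    return chosen


modes = ["first", "second", "second", "first", "second"]
wanted = ["overlap"] * 5
print(select_coding_models(modes, wanted))
```

The forced self-contained section is what makes it possible to build an artificial overlap signal for the section that follows it.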
