WikiPatents - Community Patent Review
Interactive speech pronunciation apparatus and method    
United States Patent 5393236
Link to this page: http://www.wikipatents.com/5393236.html
Inventor(s): Blackmer; Elizabeth R. (Lexington, VA); Ferrier; Linda J. (Lexington, MA)
Abstract: An interactive speech pronunciation system for teaching pronunciation and reducing the accent of a user includes a memory for storing a plurality of presequenced lessons, an input interface for allowing a user to select predetermined ones of the presequenced lessons, a processor for executing program steps corresponding to the lessons selected by the user, and a monitor for displaying visual indicators to the user of the system. The speech pronunciation system further includes an audio input device for recording sounds spoken by the user, an audio output device for transducing signals fed thereto to pre-recorded sounds, and a speech processor for providing stored signals to the audio output device.

Owner/Assignee: Northeastern University (Boston, MA)
Publication Date: February 28, 1995
Application Number: 07/951,675
Filing Date: September 25, 1992
US Classification: 434/169 434/185 434/308 704/3 704/270
Int'l Classification: G09B 005/00
Examiner: Apley; Richard J.
Assistant Examiner: Cheng; Joe H.
Attorney/Law Firm: Weingarten, Schurgin, Gagnebin & Hayes
USPTO Field of Search: 434/156 434/167 434/169 434/185 434/307 434/308 434/309 434/319 434/320 434/365 381/35 381/43 381/51 381/52 381/53 364/419 364/419.01 364/419.03 395/2 395/152 395/927
Patent Tags: interactive speech pronunciation

Claims

What is claimed is:

1. An interactive computer system for teaching pronunciation and accent reduction to a user comprising:

a processor;

a memory, coupled to said processor, said memory having stored therein a set of instructions corresponding to a plurality of presequenced accent reduction lessons for execution in a predetermined order by said processor wherein the predetermined order is determined by the user selecting a first one of:

(a) the native language of the user; and

(b) a subject area, wherein said subject area is selected from the group consisting of an engineering subject area and a physical science subject area;

an input interface, coupled to said processor, for providing a user with control over the sequence in which the accent reduction lessons are executed by said processor;

a monitor, coupled to said processor, for displaying visual indicators to the user, wherein said visual indicators correspond to said instructions;

an audio input/output device, coupled to said processor, for recording sounds spoken by a user and for playing prerecorded sounds stored in said memory; and

a speech processor, coupled to said processor and said audio input/output device, for providing stored signals to said audio input/output device, wherein in response to an instruction displayed on said monitor, a user speaks a sound into said audio input/output device and the sound is stored in said memory and the sound corresponds to a first one of:

(a) a word;

(b) a phrase; and

(c) a sentence; and

wherein said pre-recorded sounds in said memory correspond to stored digitized signals representative of the sound spoken by the user into said audio input/output device.

2. The interactive computer system of claim 1, wherein:

in response to a predetermined instruction, a user speaks a sound into said audio input/output device and said sound is stored in the memory as a stored signal and said sound corresponds to a first one of:

(a) a word;

(b) a phrase; and

(c) a sentence; and

said pre-recorded sounds stored correspond to sequenced and stored digitized signals representative of the sound spoken into said audio input/output device,

graphical user interface to select program options and perform selected lessons;

the processor calculates the percentage of correct and incorrect utterances spoken by the user; and

the processor stores a result corresponding to the percentage of correct and incorrect utterances spoken by the user in a first predetermined memory location dedicated to the user.

3. The interactive computer system of claim 2, wherein:

prior to said user speaking into said audio input/output device, said speech processor provides a signal to said audio input/output device wherein said signal corresponds to a model sound and the sound the user speaks into said audio input/output device corresponds to an imitation of said model sound;

in response to an instruction, said speech processor provides the signal corresponding to the model sound to said audio input/output device; and

in response to an instruction executed by said processor, said speech processor provides via said audio input/output device, an audio signal corresponding to the user's imitation of the model sound.

4. The interactive computer system of claim 3 wherein the model sound has a sound pattern having a particular complexity and when said audio input/output device provides the signal corresponding to the model sound, said audio input/output device also provides a presequenced series of sound targets sequenced according to the complexity of the sound pattern.

5. The interactive computer system of claim 1, wherein:

the input interface includes an input terminal and a graphical user interface, and the number of correct and incorrect utterances spoken by the user are recorded by an instructor using the input terminal while the user simultaneously utilizes the

6. A method for teaching pronunciation and accent reduction to a user, said method comprising the steps of:

displaying a first menu on a monitor, said first menu having a first plurality of user selectable options;

selecting, with a user interface device, a first user selectable option from said first plurality of user selectable options wherein said first user selectable option is selected according to a first one of the following criteria:

a first language corresponding to a native language of a user;

a first subject area, wherein said first subject area is selected from the group consisting of an engineering subject area and a physical science subject area; and a phonetic stress pattern; and

executing in a predetermined order by a processor, a plurality of steps determined by the first user selectable option selected in said selecting step, wherein when said first user selectable option is selected according to a first language, said plurality of steps correspond to a first lesson plan to teach pronunciation and accent reduction to a user, wherein said first lesson plan is organized according to the selected first language and when said first user selectable option is selected according to a second different language, said plurality of steps correspond to a second lesson plan to teach pronunciation and accent reduction to the user, wherein said second lesson plan is organized according to the selected second language and wherein the organization of the first and second lesson plans is different.

7. The method of claim 6 wherein:

said executing step further comprises the steps of:

displaying a second menu on said monitor, said second menu having a second plurality of user selectable options; and

selecting, with a user interface device, a first menu option from said second plurality of user selectable options.

8. The method of claim 7 further comprising the steps of:

executing said first menu option; and

displaying a first one of said first and second menus.

9. A method for teaching accent reduction and speech pronunciation to a user comprising the steps of:

performing a diagnostic test on the user;

computing the results of said diagnostic test; and

executing, in a processor, a plurality of lessons stored in a memory to teach pronunciation and accent reduction to the user wherein the order in which said lessons are executed is determined by the results of said diagnostic test computed in said computing step and wherein said diagnostic test comprises the steps of:

recording, with an audio input/output device, at least one word spoken by the user;

grading, by a second user, said at least one word with a predetermined grading criteria;

calculating, in said processor, a value corresponding to the results of the grading step; and

calculating a suggested lesson plan, in said processor, wherein said suggested lesson plan corresponds to a series of lessons having a predetermined order, said series of lessons corresponding to selected lessons from a plurality of lessons stored in a memory, wherein said predetermined order of lessons is determined in response to the calculating a value step.

10. The method of claim 9, wherein said method of performing the diagnostic test further comprises the steps of:

storing the suggested lesson plan in a first predetermined memory location;

storing the suggested lesson plan in a second predetermined memory location; and

displaying the suggested lesson plan on a monitor.

11. The method of claim 9, wherein the step of recording at least one word spoken by the user includes the step of playing each of said at least one words with the audio input/output device such that the user is able to listen to each of the at least one words.

12. The method of claim 9, wherein said grading step further includes the steps of:

indicating the number of correct and incorrect utterances spoken by the user; and

summing, by said processor, for each of the presequenced lessons the number of incorrect utterances spoken by the user in each particular presequenced lesson.

Description

FIELD OF THE INVENTION

This invention relates generally to language learning systems and more particularly to interactive speech pronunciation systems.

BACKGROUND OF THE INVENTION

As is known in the art, because of the increased mobility of people throughout the world, increasing numbers of people now reside in countries where their native language is not widely spoken. For example, immigrants who are non-native speakers of the English language are entering the United States. Many of these immigrants are proficient in written English; however, because of poor pronunciation due to accents, they are unintelligible when they speak. The problem of lack of intelligibility in non-native speakers of English may be particularly recognized at universities which have immigrants working as teaching assistants (TAs). This is particularly true in the science and engineering fields. The need has thus arisen for improved accent reduction services.

Courses which teach English as a second language, for example, have been used for this purpose. A widely accepted aspect of current theory in such courses includes training in self-monitoring. Self-monitoring in this context includes graded exercises in listening to oneself speak, while focusing on a particular production feature. Self-monitoring in the sense of listening to oneself during the act of speaking, while useful, is not particularly easy.

In traditional audiotape lessons the act of speaking may be separated from the act of listening to one's own speech. While self-recording is possible with most instructional audiotapes, learners very seldom actually listen to their recordings. Rewinding the tapes is awkward and results in an unacceptable delay. Listening to the self-recordings is not an integral part of using the tapes, but rather an additional feature which lengthens the time needed to complete the lesson, and thus to users this feature appears to be "extra work." Furthermore, in such systems user control of the timing and sequencing of exercises and the number of repetitions is often minimal. This results in a passive learning experience. Lastly, the tasks to be performed in the audiotape lessons generally focus on the learning of syntactic structures through exercises such as substitution drills, with improvement of pronunciation being a subsidiary goal.

While courses which teach immigrants a second language may be the natural context in which immigrants may improve their pronunciation, such courses are usually oversubscribed, the language is taught at too basic a level, and it is difficult to address the pronunciation problems of specific individuals. Furthermore, instructors in such courses often have little training in language pathology or articulation therapy.

Speech-language pathologists, having an educational background in phonetics and articulation therapy, are well-equipped to provide this service. Speech-language pathologists are appropriate instructors for immigrants and some speech-language pathologists have developed specializations in this area. Nevertheless, the need for such services seems to be growing more rapidly than the services available.

One solution to this problem has been to provide language learning systems which use visual displays to aid speakers in identifying problems in their speech. One such system employs a spectrographic display of speech signals which may be used to train speakers to correct articulation errors. However, several problems exist with the spectrogram technique: not all articulatory features may be visible in the spectrogram, and the use of spectrograms requires both considerable knowledge on the part of the clinician and extensive guidance to the client. Furthermore, it may be difficult to generalize the acquired articulations to spontaneous speech. Moreover, while visual feedback is useful in identifying problems, the user eventually relies on the auditory system to improve pronunciation.

SUMMARY OF THE INVENTION

In accordance with the present invention, an interactive speech pronunciation system for teaching pronunciation and reducing the accent of a user includes a memory for storing a plurality of presequenced lessons, an input interface, coupled to the memory, for allowing a user to select predetermined ones of the presequenced lessons, a processor, coupled to the memory and the input interface, for executing program steps corresponding to the lessons selected by the user, and a monitor, coupled to the processor, for displaying visual indicators to the user of the system. The speech pronunciation system further includes an audio input device, coupled to the processor, for recording sounds spoken by the user, an audio output device for transducing signals fed thereto to prerecorded sounds, and a speech processor, coupled to the processor and the audio output device, for providing stored signals to the audio output device. With this particular arrangement, an interactive speech pronunciation system for improving the pronunciation and reducing the accent of a user is provided. The system memory has a series of organized and sequenced lesson plans stored therein; thus an instructor (e.g. a speech-language pathologist) need not originate lesson plans or text and corresponding sounds for a user. The input interface allows the user, or alternatively an instructor, to control the sequence in which the program and lesson steps are performed. The processor executes program steps and provides visual indicators and instructions to the user via the monitor. Thus minimal supervision is required, which may lead to a concomitant reduction in the cost of speech lessons. The system provides a technique wherein a user may record their own voice and compare their recorded voice to a digitized auditory model. In contrast to visual feedback systems, the present invention relies on the auditory system of the user, employing a process referred to as "enhanced monitoring."

The lesson sequences may start with simple material and may lead to the mastery of a multi-syllabic professional vocabulary within phrasal contexts. Furthermore, since minimal supervision is required, the instructor (e.g. a speech-language pathologist) may more efficiently and creatively utilize the study time. Achieving a level of automaticity in speaking may require extensive drilling. The prior art approaches require the speech-language pathologist to provide intensive articulation therapy to clients on an individual basis, which is very time consuming. Thus, the processor controlled speech pronunciation system reduces the amount of time which speech-language pathologists must spend conducting such intensive articulation therapy.

Furthermore, the prestored lessons may be accessed according to the native language of a user. Thus the instructor may select appropriately structured drills for speakers of various languages. That is, the lesson sequences and contents may be selected for users having a particular native language, such as Chinese, who are trying to learn a second language, such as English.
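
As a rough illustration of how prestored lessons might be accessed by native language, the following Python sketch maps a language to an ordered lesson sequence. The lesson names, the `LESSON_PLANS` table, the `DEFAULT_PLAN` fallback, and the `lessons_for` function are all hypothetical; the text does not disclose the actual lesson contents at this point.

```python
# Hypothetical lesson sequences keyed by the user's native language.
# The lesson names are invented for illustration only.
LESSON_PLANS = {
    "Chinese": ["final consonants", "l/r contrast", "word stress"],
    "Spanish": ["vowel length", "b/v contrast", "word stress"],
}

# Fallback order used when no language-specific plan is stored (an assumption).
DEFAULT_PLAN = ["word stress"]


def lessons_for(native_language: str) -> list[str]:
    """Return a copy of the presequenced lessons for a native language."""
    return list(LESSON_PLANS.get(native_language, DEFAULT_PLAN))
```

Returning a copy lets an instructor reorder or extend one user's sequence without altering the stored plan.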

The present invention emphasizes articulation of phonemes, words, phrases and sentences. Information on semantics and pragmatics may also be provided in a supplementary manner. The system optionally allows the instructor to customize the content of the lessons by adding to a particular lesson vocabulary that may be of special difficulty or of special interest to the user. Thus, the sequenced lessons may be customized to meet the requirements of specific immigrant persons including those immigrants having a technical vocabulary.

The system also provides a plurality of frequently used technical terms which may be especially difficult for non-native speakers of a language to pronounce. Such technical terms may be organized by subject matter for example. The self-recording feature used in the lessons is available for practice of these technical terms. A user may practice pronouncing the technical terms as an adjunct to, or extension of, the lessons.

The user may store correct and incorrect dated versions of their utterances in digitized form in the memory. These stored utterances may be later retrieved from the memory and used to monitor user progress. Such stored utterances may also be used for record keeping by the instructor.

The system also provides each user with a particular file which may contain the correct and incorrect utterances of the user, the lesson sequence plan, notes made by either the user or the instructor, and recordings of vocabulary of particular difficulty and interest for the user. The self-recording feature used in the lessons is available for practice of these terms.
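
One possible shape for such a per-user file is sketched below in Python. The class and field names (`Utterance`, `UserFile`, `samples`, and so on) are assumptions made for illustration; the text does not specify a storage layout or an audio format.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Utterance:
    target: str        # word, phrase, or sentence the user attempted
    recorded_on: date  # date stamp stored with the recording
    correct: bool      # judgment entered by the user or the instructor
    samples: bytes     # digitized audio; the format is left unspecified


@dataclass
class UserFile:
    name: str
    lesson_plan: list[str] = field(default_factory=list)
    notes: list[str] = field(default_factory=list)
    utterances: list[Utterance] = field(default_factory=list)

    def store(self, utterance: Utterance) -> None:
        """Add one dated recording to the user's file."""
        self.utterances.append(utterance)

    def history(self, target: str) -> list[Utterance]:
        """Return stored attempts at one target, oldest first."""
        matches = [u for u in self.utterances if u.target == target]
        return sorted(matches, key=lambda u: u.recorded_on)
```

Retrieving the attempts at a single target in date order is one way the stored utterances could be used to monitor progress.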

The lessons and material are sequenced according to the complexity of the sound patterns to be mastered, thus providing the framework and material for an entire course or therapy plan. The content is organized with a sequence of sound targets that the user should practice and become proficient in the use thereof.

The system further provides an optional diagnostic test which may be used to determine which sounds the user has difficulty speaking and which lessons may be most useful for the user. Results from the diagnostic test may be linked to a specific lesson sequence in which particular sounds are identified to be practiced by the user. The diagnostic test may be displayed on the monitor and the client's responses may be recorded on an audiotape or in the memory and scored on the monitor by the instructor. The system provides a summary result and a suggested lesson plan based on the results of the diagnostic test.
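
The link between diagnostic results and a suggested lesson plan could be sketched as follows. This is only an illustration under stated assumptions: the function name `suggest_lesson_plan`, the pairing of each diagnostic word with a lesson, and the rule of ranking lessons by error count are hypothetical, since the text says only that the plan is based on the test results.

```python
from collections import Counter


def suggest_lesson_plan(graded_words):
    """Rank lessons by how many diagnostic words tied to them were missed.

    graded_words: iterable of (lesson_name, was_correct) pairs, one per
    diagnostic word scored by the instructor.
    """
    errors = Counter()
    for lesson, was_correct in graded_words:
        if not was_correct:
            errors[lesson] += 1
    # Most-missed lessons first; ties are broken alphabetically so the
    # suggested order is reproducible.
    return sorted(errors, key=lambda lesson: (-errors[lesson], lesson))
```

Lessons with no incorrect utterances are left out of the suggestion in this sketch; a real system might instead place them at the end.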

The system further includes a file which may be used by the instructor to enter information including but not limited to session notes, billing information, personal comments on the user, and long and short-term therapy goals.

The user interface device may include a graphical user interface (GUI) such as a mouse for example. The GUI may be used to select one of a plurality of program options which appear as icons on screens which are displayed on the monitor.

Another optional feature of the program allows the instructor to maintain a record of correct and incorrect utterances using the keyboard while the user simultaneously uses the GUI to select program options and perform selected lessons. The program calculates the percentage of correct and incorrect utterances for a set of target words, for example, and saves the results in the file corresponding to a particular user to thus document the user's progress.
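
The percentage calculation described above amounts to simple arithmetic. A minimal Python sketch follows; the function name `score_session` and the representation of the instructor's marks as a list of booleans are assumptions, not details from the text.

```python
def score_session(marks):
    """Return (percent_correct, percent_incorrect) for a list of booleans.

    Each entry is True if the instructor marked the utterance correct.
    """
    if not marks:
        raise ValueError("no utterances were scored")
    correct = sum(1 for mark in marks if mark)
    percent_correct = 100.0 * correct / len(marks)
    return percent_correct, 100.0 - percent_correct
```

For example, three correct utterances out of four target words give `score_session([True, True, False, True])` a result of `(75.0, 25.0)`, which could then be saved in the user's file.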

The speech pronunciation system also provides suggested lesson plans organized according to the first language (mother tongue) of the user to thus provide lesson planning assistance to the user or the clinician.

BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing features of this invention, as well as the invention itself, may be more fully understood from the following detailed description of the invention in conjunction with the drawings in which:

FIG. 1 is a block diagram of a language learning system in accordance with the present invention;

FIGS. 2-2B are a series of pictorial representations of a plurality of program stacks used in the language learning system of FIG. 1;

FIG. 2C is a pictorial representation of a card which may be of the type used in the program stacks of FIG. 2;

FIG. 3 is a flow type diagram of the program options provided from a Main Menu Stack;

FIG. 4 is a flow type diagram showing graphical illustrations of a plurality of screens available from the Main Menu screen;

FIGS. 5 and 5A are a series of flow type diagrams of the program options available from a Sample Stack;

FIGS. 5B-5E are a series of graphical illustrations of a series of screens available from the Sample Stack;

FIG. 6 is a flow type diagram of the program options available from a Unit Lesson Stack;

FIGS. 6A-6E are a series of graphical illustrations of screens available from the Unit Lesson Stack;

FIGS. 7 and 7A are a series of flow type diagrams illustrating the program options available in a Table of Contents menu;

FIG. 7B is a graphical illustration of a screen available from the Table of Contents menu;

FIG. 8 is a flow type diagram of the program flow in a Native Languages menu;

FIGS. 8A and 8B are a series of graphical illustrations of screens available from the Native Languages menu;

FIG. 9 is an exemplary flow type diagram of the options available in a Technical Vocabulary Stack;

FIGS. 9A and 9B are a series of graphical illustrations of screens available from the Technical Vocabulary Stack;

FIGS. 10 and 10A are a series of flow type diagrams illustrating the options available from a Diagnostics Stack;

FIGS. 10B-10D are a series of graphical illustrations of screens available from the Diagnostics Stack;

FIG. 11 is a flow type diagram of the program options available in a Clinician Stack of the language learning system of FIG. 1; and

FIG. 11A is a graphical illustration of a screen from the Clinician Stack.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

Referring now to FIG. 1, an interactive speech pronunciation system 10 for teaching pronunciation and accent reduction to a user is shown to include a processor 12 having an internal random access memory (RAM) 12a, a speech processor 14, an external memory 16, an input terminal 18, a graphical user interface 19, a monitor 20, and an audio input/output device 21. Each of the above recited components is coupled to the others via a data and communication bus 17.

The processor memory 12a may be used to store program steps to be herein described which may be executed by the processor 12. Suitable processors may be found, for example, in any of the so-called personal computers such as an IBM or Macintosh™ personal computer. The speech processor 14 processes prerecorded speech sounds and is preferably provided as the type which utilizes digitized recordings rather than synthesized speech; however, any means of providing realistic, exact and pleasant-sounding speech may be used. The speech processor 14 also converts analog audio signals fed thereto from the audio input/output device 21 to digitized data which may be stored in the memory 16. Thus, the speech processor 14 may include, for example, any of the commercially available devices such as a MacRecorder™ for digitizing speech.

The memory 16 may be provided as a so-called hard type disk drive, a so-called floppy disk drive and associated floppy disks, a so-called CD read only memory (CD ROM) or any other appropriate storage device which may be used to supplement the processor memory 12a. The memory 16 has stored thereon a plurality of presequenced lessons and program steps to be described below in conjunction with FIGS. 3-11 which the processor 12 may access and execute.

The input terminal 18 may be provided as a conventional keyboard and the GUI 19