WikiPatents - Community Patent Review
Intelligent recognition of speech signals using caller demographics    
United States Patent 5553119
Link to this page: http://www.wikipatents.com/5553119.html
Inventor(s): McAllister; Alex (Silver Spring, MD); Wise; Laird (Ellicott City, MD)
Abstract: In a switching system for connecting a call between a calling station and a called station, a system and method of voice recognition using a concentrated or distributed multiplicity of voice recognition and other resources with a facility for selecting an initial resource on the calling station going off-hook by accessing a demographic database using common channel signaling and selecting a prompt to be delivered to the caller from a multiplicity of preselected prompts and reacting to a response by the caller with further addressing of database information to continue to select from said multiplicity of resources the most appropriate resource or resources in reaction to caller utterances. According to another feature the selection of resources is aided by optical means at the calling station delivering information regarding characteristics of the caller including lip movement to permit lip reading.



Owner/Assignee     Bell Atlantic Network Services, Inc. (Arlington, VA)
Publication Date     September 3, 1996
Application Number     08/271,885
Filing Date     July 7, 1994
US Classification     379/88.01 379/88.22 379/221.09 379/230 704/231 704/235 704/270 704/270.1 704/275
Int'l Classification     H04M 003/64 G10L 009/06
Examiner     Hofsass; Jeffery
Assistant Examiner     Hunter; Daniel S.
Attorney/Law Firm     Lowe, Price, LeBlanc & Becker
USPTO Field of Search     379/67 379/88 379/89 379/265 379/266 379/207 395/2.4 395/2.79
References

U.S. References
Patent No.   Inventor          Classification   Date
5479488      Lennig            379/88.04        Dec. 1995
5335269      Steinlicht        -                Aug. 1994
5335266      Richardson, Jr.   -                Aug. 1994
5333185      Burke             379/127.01       Jul. 1994
5311572      Friedes           379/211.02       May 1994
5297183      Bareis            -                Mar. 1994
5297194      Hunt              -                Mar. 1994
5267304      Slusky            379/218.01       Nov. 1993
5185781      Dowden            379/88.04        Feb. 1993
5181237      Dowden            379/88.03        Jan. 1993
5163083      Dowden            379/88.03        Nov. 1992
5033088      Shipman           704/275          Jul. 1991
4979206      Padden            379/88.01        Dec. 1990
4922538      Tchorzewski       704/247          May 1990
4769845      Nakamura          704/231          Sep. 1988
4594476      Freeman           379/76           Jun. 1986
4546383      Abramatic         348/14.15        Oct. 1985
Claims


We claim:

1. In a switching system for connecting a calling station to a called station a method of establishing completion of said connection comprising the steps of:

responsive to said calling station dialing a destination identifier, establishing the identity of said calling station and using said identity to address a data base;

deriving from said data-base information relating to demographics of said calling station;

selecting from a plurality of speech recognition resources a first resource indicated by said demographic information;

establishing connection to said off-hook station;

inputting a spoken command from a caller at said calling station to said selected resource;

outputting from said first speech recognition resource a first output signal responsive to said spoken command;

selecting from said plurality of speech recognition resources a second resource responsive to said first output signal;

outputting from said second speech recognition resource a second output signal;

inputting a second spoken command from said caller at said calling station to said second resource;

outputting from said second speech recognition resource a third output signal responsive to the second spoken command;

determining the degree of traffic through said plurality of speech recognition resources;

comparing the determined degree of traffic to a predetermined traffic load; and

responsive to said determined degree of traffic being below said predetermined load, inputting at least one of said spoken commands to a plurality of said resources in parallel.

2. A method according to claim 1 including the steps of:

responsive to said second output signal selecting an audio request from a plurality of preestablished audio requests and outputting to said caller said selected audio request requesting said second command;

said caller inputting said second spoken command responsive to said selected audio request.

3. A method according to claim 2 including the steps of:

outputting multiple audio commands to said caller;

commencing timing of interaction with said caller subsequent to commencement of a spoken caller command;

comparing the duration of the interaction with said caller from said commencement of timing to a predetermined time duration;

connecting said caller to an operator station upon the timed duration of said timing exceeding said predetermined time duration.

4. A method according to claim 1 including the step of outputting from said plurality of speech recognition resources in parallel a fourth output signal.

5. A method according to claim 1 wherein said database is accessed responsive to common channel signaling in said switching system.

6. A method according to claim 5 wherein said common channel signaling accesses said database via a Signal Control Point (SCP) in said switching system.

7. A method according to claim 5 wherein said switching system is a Public Switched Telephone Network (PSTN).

8. A method according to claim 7 wherein the identity of said calling station is established responsive to a signal generated in said PSTN pursuant to the conventional operation of said PSTN.

9. A method according to claim 1 including the steps of:

sensing from said calling station characteristics of said caller through means other than audio sensing means and generating a signal representative of said characteristics;

inputting said signal representative of said characteristics to at least one of the resources in said plurality of resources; and

outputting a signal from said at least one resource substantially simultaneously with outputting of a signal from at least one other resource which is responsive to a spoken command from said caller.

10. A method according to claim 9 wherein said sensing means other than an audio sensing means is an optical sensing means.

11. A method according to claim 1 including the steps of:

monitoring command signals from said caller to detect the utterance of at least one of a plurality of predetermined utterances; and

outputting a control signal responsive to the detection of the utterance of at least one of said plurality of predetermined utterances.

12. A method according to claim 11 including the step of:

outputting an audio prompt to said caller responsive to said control signal.

13. A method according to claim 1 including the steps of:

sensing from said calling station characteristics of said caller through means other than audio sensing means and generating a signal representative of said characteristics; and

utilizing said signal representative of said characteristics to at least partially control said output from said plurality of resources responsive to said spoken command.

14. A method according to claim 13 including the step of:

utilizing said signal representative of said characteristics to at least partially control the selection of at least one resource in said plurality of resources.

15. A method according to claim 1 including the steps of:

optically sensing from said calling station lip movement of said caller;

generating a signal representative of the words indicated by said lip movement; and

utilizing said signal representative of the words indicated by said lip movement to at least partially control the output signal responsive to said spoken command.

16. In a Publicly Switched Telephone Network (PSTN) which includes Common Channel Signaling (CCS) and a Signal Control Point (SCP), a method of completing a call from a calling station to a called station comprising the steps of:

responsive to said calling station dialing a destination identifier, establishing the identity of said calling station through said CCS pursuant to the conventional functioning of said PSTN;

using said identity of said calling station to address a data base associated with said SCP;

deriving from said data base information relating to demographics of said calling station;

selecting from a plurality of resources a resource indicated by said demographic information;

establishing a connection to said off-hook station;

inputting a spoken command from a caller at said calling station to said selected resource;

sensing from said calling station characteristics of said caller through means other than audio sensing means and generating a signal representative thereof;

outputting to said caller an audio signal selected from a plurality of preestablished audio signals based on at least one of (a) audio sensing of said spoken command and (b) sensing other than audio;

inputting a second spoken command from said caller responsive to said audio signal outputted to said caller;

outputting from said plurality of resources a second output signal responsive to said second spoken command;

determining the degree of traffic through said plurality of resources;

comparing the determined degree of traffic to a predetermined traffic load; and

responsive to said determined degree of traffic being below said predetermined load, inputting at least one of said spoken commands to a plurality of said resources in parallel.

17. A method according to claim 16 including the steps of:

sensing said characteristics through said means other than audio means on a continuing basis including lip reading; and

generating from said lip reading an output signal responsive to said spoken commands.

18. A switching system including interconnected switching offices and stations connected to at least certain of said switching offices and a Common Channel Signaling System (CCSS) for controlling the connection of a calling station to a called station through said switching system;

a plurality of speech recognition resources connected to said switching system;

a data base associated with said CCSS having stored therein demographic information related to said stations connected to said switching stations;

means for addressing said data base in response to said calling station dialing a destination identifier to access demographic information relating to said dialing station;

means responsive to said accessed information to select at least one of said resources and to connect a signal from said calling station to said selected resource;

means associated with said plurality of resources for outputting a signal responsive to a spoken command to said calling station; and

audio response means for generating a plurality of predetermined audio responses, said audio response means generating an audio response responsive to said signal responsive to said spoken command;

said audio response inviting a further spoken command from said calling station, and including means associated with said plurality of resources for outputting a signal responsive to outputs from at least a pair of said resources operating in parallel.

19. A switching system according to claim 18 including:

sensing means associated with said calling station for sensing a characteristic other than an audible characteristic, said sensing means generating an output signal responsive to the characteristic sensed; and

means for rendering the signal outputted by said plurality of resources at least partially responsive to said output signal responsive to the characteristic sensed.

20. A switching system according to claim 19 wherein said sensing means is optical and the sensed characteristic is lip movement.

21. A switching system according to claim 18 including:

an operator station;

timing means;

means for initiating timing by said timing means at a predetermined occurrence after said calling station goes off-hook;

means establishing a predetermined time duration for timing by said timing means; and

means responsive to said timing means reaching said predetermined time duration to cause connection of said calling station to said operator station.
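The traffic-dependent step that closes claims 1 and 16 (send a spoken command to several resources in parallel only when the measured traffic is below a predetermined load) can be illustrated with a minimal sketch. This is an illustration under stated assumptions, not the patented implementation: the `dispatch` function, the representation of traffic as a plain number, the stand-in recognizers, and the choice to fan out to every available resource are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict

# A recognizer is modeled here as a function from a command to an output signal.
Recognizer = Callable[[str], str]


def dispatch(
    command: str,
    selected: str,
    resources: Dict[str, Recognizer],
    traffic: float,
    max_parallel_load: float,
) -> Dict[str, str]:
    """Run the command on the selected resource, or on all resources in
    parallel when the measured traffic is below the predetermined load."""
    if traffic < max_parallel_load:
        names = list(resources)  # assumption: fan out to every resource
    else:
        names = [selected]  # busy: use only the resource already chosen
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        futures = {name: pool.submit(resources[name], command) for name in names}
        return {name: future.result() for name, future in futures.items()}
```

With two stand-in recognizers, `dispatch("Hi", "a", {"a": str.upper, "b": str.lower}, traffic=0.2, max_parallel_load=0.5)` returns outputs from both, while the same call with `traffic=0.8` returns only the output of resource `"a"`.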
Description


TECHNICAL FIELD

This invention relates to methods and apparatus for automating various user initiated telephony processes, particularly through the use of improved recognition systems and methodology.

BACKGROUND ART

In the environment of telecommunications systems there has been a steady trend toward automating what was originally operator assistance traffic. Much current activity is directed to responding to directory assistance calls by processing voice frequency instructions from the caller without operator intervention. The instructions are used by an automatic speech recognition unit to generate data signals corresponding to recognized voice frequency signals. The data signals are then used to search a database for a directory listing to derive the desired directory number. A system of this type is described in U.S. Pat. No. 4,979,206 issued Dec. 18, 1990.

According to that patent such automated service is supplied by a switching system equipped with an automatic speech recognition facility for interpreting a spoken or keyed customer request comprising data for identifying a directory listing. In response to recognition of data conveyed by the request, the system searches a database to locate the directory number listing corresponding to the request. This listing is then automatically announced to the requesting customer. In implementing this system the calling customer or caller receives a prompting announcement requesting that the caller provide the zip code or spell the name of the community of the desired directory number. The caller is also prompted to spell the last name of the customer corresponding to the desired directory number. If further data is required, the caller may be prompted to spell the first name and street address of the desired party. Following responses to prompting announcements a search is made to determine if only one listing corresponds to the data supplied by the caller. When this occurs the directory number is announced to the caller. The aim of such a system has been to require a minimum of speech recognition capability by the speech recognition facility--namely, only letters of the alphabet and numbers.
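The prompting sequence described above can be pictured as successive filters over a table of listings. The following is a minimal sketch of that idea only, not the implementation of U.S. Pat. No. 4,979,206; the `Listing` record, the sample data, the prompt order, and the `narrow` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Listing:
    zip_code: str
    last_name: str
    first_name: str
    street: str
    number: str


# Hypothetical sample data; a real directory database is far larger.
LISTINGS: List[Listing] = [
    Listing("20901", "SMITH", "ANN", "1 OAK ST", "555-0101"),
    Listing("20901", "SMITH", "BOB", "9 ELM ST", "555-0102"),
    Listing("21043", "JONES", "CAL", "4 ASH RD", "555-0103"),
]

# Assumed order in which the caller is prompted for data.
PROMPT_ORDER = ["zip_code", "last_name", "first_name", "street"]


def narrow(answers: Dict[str, str], listings: List[Listing] = LISTINGS) -> Optional[str]:
    """Apply the caller's answers in prompt order and return the directory
    number as soon as exactly one listing remains."""
    candidates = list(listings)
    for field in PROMPT_ORDER:
        if field not in answers:
            break  # the caller has supplied no further data
        candidates = [c for c in candidates if getattr(c, field) == answers[field]]
        if len(candidates) == 1:
            return candidates[0].number  # unique listing: announce it
    return None  # ambiguous or no match: prompt further or use an operator
```

In this sketch a zip code and last name shared by two listings return `None`, and supplying the first name resolves the search to a single number.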

A typical public switched telephone network (PSTN) arrangement proposed to effect such a system is illustrated in block diagram form in FIG. 1 of the aforementioned patent (PRIOR ART). The network of FIG. 1 is here described in some detail as a typical environment in which the method and apparatus of the invention may be utilized. In FIG. 1 block 1 represents a telecommunications switching system, or switch operating under stored program control. Switch 1 may be a switch such as the 5ESS switch manufactured by AT&T Technologies, Inc., arranged to offer the Operator Services Position System (OSPS) features.

Shown within switch 1 are various blocks for carrying out the functions of a program controlled switch. Control 10 is a distributed control system operating under the control of a group of data and call processing programs to control various sections or elements of switch 1. Element 12 is a voice and data switching network capable of switching voice and/or data between inputs connected to that switching network, frequently referred to as the switch fabric or network. Connected to network 12 is a Voice Processing Unit (VPU) 14. Network 12 and VPU 14 operate under the control of control 10. Trunks 31 and 33, customer line 44, data link 35, and operator access facility 26 are connected to network 12 at input ports 31a, 33a, 44a, 35a, and 26a respectively, and control 10 is connected to network 12 via data channel 11 at input port 11a.

VPU 14 receives speech or customer keyed information from callers at calling terminals 40 or 42 and processes the voice signals or keyed tone signals from a customer station using well known automatic speech recognition techniques to generate data corresponding to the speech or keyed information. These data are used by Directory Assistance Computers (DAS/C) 56 in making a search for a desired telephone or directory number listing. When a directory assistance request comes from a customer terminal 42 via customer line 44, port 44a and switching network 12 to VPU 14, VPU 14 analyzes voice input signals to recognize individual ones of various elements corresponding to a predetermined list of spoken responses.

VPU 14 also generates voice messages or announcements to prompt a caller to speak information into the system for subsequent recognition by the voice processing unit. VPU 14 generates output data signals, representing the results of the voice processing. These output signals are sent to control 10 whence they may be transmitted via data link 59 to DAS/C computer 56, or be used within control 10 as an input to the program of control 10 for controlling establishment of connections in switching network 12 or requesting further announcements by VPU 14. VPU 14 includes announcement circuits 13 and detection circuits, i.e., automatic speech recognition circuits 15 both controlled by a controller of VPU 14. A Conversant 1 Voice System, Model 80, manufactured by AT&T Technologies, Inc., may be used to carry out the functions of the VPU 14.

When the DAS/C computer 56 completes its data search and locates the requested directory listing, it is connected via data link 58 to an Audio Response Unit (ARU) 60, which is connected to the voice and data switching network 12 for announcing the telephone number of an identified telephone listing. Computer Consoles, Inc. (CCI) manufactures an Audio Response Unit 60 and the DAS/C terminal 52 which may be used in this environment. As shown, the DAS/C computer 56 is directly connected to control 10 by data link 59 but could be connected to control 10 via a link to network 12 and a connection through network 12 via port 11a. After a directory listing is found the directory number is reported to audio response unit 60 for announcement to the caller.

Directory assistance calls can also be processed with the help of an operator if the VPU fails to recognize adequate oral information.

Connected to switch 1 are trunks 31 and 33 connected to local switch 30 and interconnection network 32. Local switch 30 is connected to calling customer terminal 40 and interconnection network 32 is connected to a called customer terminal 46. Switch 30 and network 32 connect customer terminal signals from customer terminals to switch 1. Also connected to switch 1 are customer lines including customer line 44 for connecting a customer terminal 42 to switch 1.

In an alternate connection calling terminal 40 is connected via local switch 30 to switch 1. In a more general case, other switches forming part of a larger public telephone network such as interconnection network 32 would be required to connect calling terminal 40 to switch 1. Generally speaking, calls are connected to switch 1 via communication links such as trunks 31 and 33 and customer line 44. In the alternate connection calling terminal 40 is connected by a customer line to a 1AESS 30, manufactured by AT&T Technologies, Inc., and used here as a local switch or end office. That switch is connected to trunk 31 which is connected to switch 1. Local switch 30 is also connected to switch 1 by a data link 35 used for conveying common channel signaling messages between these two switches. Such common channel signaling messages are used herein to request switch 30 to initiate the setting up of a connection, for example, between customer terminals 40 and 46. Switch 1 is connected in the example terminating connection to called terminal 46 via interconnection network 32. If the calling terminal is not directly connected to switch 1, the directory number of the calling terminal identified, for example, by Automatic Number Identification (ANI), is transmitted from the switch connected to the calling terminal to switch 1.

Operator position terminal 24 connected to switch 1 comprises a terminal for use by an operator in order to provide operator assistance. Data displays for the operator position terminal 24 are generated by control 10. Operator position terminal 24 is connected to switching network 12 by operator access facility 26 which may include carrier facilities to allow the operator position to be located far from switching network 12 or may be a simple voice and data access facility if the operator positions are located close to the switching network.

In order to handle directory assistance services, the directory assistance operator has access to two separate operator terminals; terminal 24 for communicating with the caller and switch 1 and terminal 52 used for communicating via data link 54 with DAS/C computer 56. The operator at terminals 24 and 52 communicates orally with a caller and on the basis of these communications keys information into the DAS/C terminal 52 for transmission to the DAS/C computer 56. The DAS/C computer 56 responds to such keyed information by generating displays of information on DAS/C terminal 52 which information may include the desired directory number. Until the caller provides sufficient information to locate a valid listing the caller is not connected to an audio response unit since there is nothing to announce. Further details of the operation of the system of FIG. 1 are set forth in U.S. Pat. No. 4,979,206.

Further examples of the use of voice recognition in automation of telephone operator assistance calls are found in U.S. Pat. Nos. 5,163,083, issued Nov. 10, 1992; 5,185,781, issued Feb. 9, 1993; and 5,181,237, issued Jan. 19, 1993, to Dowden et al.

Another proposed use for speech recognition in a telecommunications network is voice verification. This is the process of verifying the person's claimed identity by analyzing a sample of that person's voice. This form of security is based on the premise that each person can be uniquely identified by his or her voice. The degree of security afforded by a verification technique depends on how well the verification algorithm discriminates the voice of an authorized user from all unauthorized users. It would be desirable to use voice verification to verify the identity of a telephone caller. Such schemes to date, however, have not been implemented in a fully satisfactory manner. One such proposal for implementing voice verification is described in U.S. Pat. No. 5,297,194, issued Mar. 22, 1994, to Hunt et al. In an embodiment of such a system described in this patent a caller attempting to obtain access to services via a telephone network is prompted to enter a spoken password having a plurality of digits. Preferably, the caller is prompted to speak the password beginning with the first digit and ending with a last digit. Each spoken digit of the password is then recognized using a speaker-independent voice recognition algorithm. Following entry of the last digit of the password, a determination is made whether the password is valid. If so, the caller's identity is verified using a voice verification algorithm.

This method is implemented according to that patent using a system comprising a digital processor for prompting the caller to speak the password and then using speech processing means controlled by the digital processor for effecting a multi-stage data reduction process and generating resulting voice recognition and voice verification parameter data and voice recognition and verification routines.

Following the digit based voice recognition step, the voice verification routine is controlled by the digital processor and is responsive to a determination that the password is valid for determining whether the caller is an authorized user. This routine includes transformation means that receives the speech feature data generated for each digit and the voice verification feature transformation data and in response thereto generates voice verification parameter data for each digit. A verifier routine receives the voice verification parameter data and the speaker-relative voice verification class reference data and in response thereto generates an output indicating whether the caller is an authorized user.

In operation a caller places a call from a conventional calling station telephone to a financial institution or card verification company in order to access account information. The caller has previously enrolled in the voice verification database that includes his or her voice verification class reference data. The financial institution includes suitable input/output devices connected to the system (or integrally therewith) to interface signals to and from the telephone lines. Once the call set up has been established, the digital processor controls the prompt means to prompt the caller to begin digit-by-digit entry of the caller's preassigned password. The voice recognition algorithm processes each digit and uses a statistical recognition strategy to determine which digit (0-9 and "oh") is spoken. After all digits have been recognized, a test is made to determine whether the entered password is valid for the system. If so, the caller is conditionally accepted. In other words, if the password is valid the system "knows" who the caller claims to be and where the account information is stored.

Thereafter the system performs voice verification on the caller to determine if the entered password has been spoken by a voice previously enrolled in the voice verification reference database and assigned to the entered password. If the verification algorithm establishes a "match" access to the data is provided. If the algorithm substantially matches the voice to the stored version thereof but not within a predetermined acceptance criterion, the system prompts the caller to input additional personal information to further test the identity of the claimed owner of the password. If the caller cannot provide such information, the system rejects the access inquiry and the call is terminated.
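The three outcomes described above (a match, a near match that triggers a request for additional personal information, and a rejection) amount to a small decision procedure. The sketch below is illustrative only and is not taken from U.S. Pat. No. 5,297,194: the numeric score, both threshold values, and the names `Decision` and `decide` are hypothetical. It also assumes that an invalid password or a clear mismatch ends in rejection, and that a near match followed by correct additional information ends in access, neither of which is spelled out in the text.

```python
from enum import Enum
from typing import Callable


class Decision(Enum):
    ACCESS = "access"
    REJECT = "reject"


# Hypothetical thresholds on a similarity score in the range 0.0 to 1.0.
ACCEPT_THRESHOLD = 0.90  # at or above: the voice "matches"
NEAR_THRESHOLD = 0.70    # at or above: substantial match, ask for more


def decide(password_valid: bool, score: float, ask_extra: Callable[[], bool]) -> Decision:
    """Return the access decision for one verification attempt.

    ask_extra prompts the caller for additional personal information and
    returns True when the caller supplies it correctly.
    """
    if not password_valid:
        return Decision.REJECT  # assumption: invalid password ends the attempt
    if score >= ACCEPT_THRESHOLD:
        return Decision.ACCESS
    if score >= NEAR_THRESHOLD:
        # Substantial match outside the acceptance criterion: test further.
        return Decision.ACCESS if ask_extra() else Decision.REJECT
    return Decision.REJECT  # assumption: a clear mismatch is rejected
```

Passing `ask_extra` as a callable means the caller is prompted only in the near-match case, which mirrors the order of events in the description.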

Existing approaches for deploying speech recognition technology for universal application are based on creating speech models based on "average" voice features. This averaging approach tends to exclude persons with voice characteristics beyond the boundaries created by the averaging. The speech model averages are based on the training set used when the models are created. For example, if the models are created using speech samples for New Englanders then the models will tend to exclude voices with Southern accents or voices with Hispanic accents. If the models try to average an all inclusive population, the performance deteriorates for the entire spectrum.

BRIEF SUMMARY OF THE INVENTION

It is an object of the invention to provide a system and method for accomplishing universal speech recognition on a reliable basis using a unique combination of existing technologies and available equipment.

The new and improved methodology and system involves an initial two step passive and active procedure to preselect the most appropriate technology model or device for each type of caller. The passive feature may be based on numerous factors subject to determination without seeking active participation by the customer or user. One such factor is demographics which may be determined by identifying the geographic area of origin of the call. This may be accomplished through the use of ANI or Caller ID or any one of a number of other passively determinable factors such as ICLID, DNIC, NNX, area code, time of day, snapshot, date or biometrics. If the profile database constructed for the purpose of making an appropriate choice of recognition technology model or device on the basis of passive features is inconclusive, a second step or active procedure may be initiated. This may take the form of an automated oral query or prompt to solicit a customer or caller response that can be analyzed to select the appropriate recognition model or device following the caller active step.
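The two step procedure can be read as a lookup with a fallback: try the passive profile first, and prompt the caller only when the profile is inconclusive. The following minimal sketch assumes a profile database keyed on the first three digits of the ANI; the database contents, the model names, the prompt wording, and the functions `passive_select` and `select_model` are hypothetical.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical profile database: area code -> recognition model.
# None marks an area whose passive profile is inconclusive.
PROFILE_DB: Dict[str, Optional[str]] = {"301": "model_a", "410": None}


def passive_select(ani: str) -> Optional[str]:
    """Passive step: choose a model from the calling number alone."""
    return PROFILE_DB.get(ani[:3])


def select_model(
    ani: str,
    prompt_caller: Callable[[str], str],
    classify_reply: Callable[[str], str],
) -> Tuple[str, str]:
    """Return (model, step), where step records which procedure decided."""
    model = passive_select(ani)
    if model is not None:
        return model, "passive"  # no caller participation was needed
    # Active step: solicit a spoken response and analyze it.
    reply = prompt_caller("Please state your request.")
    return classify_reply(reply), "active"
```

Here `prompt_caller` and `classify_reply` stand in for the automated oral query and the analysis of the caller's response; how that analysis works is outside the scope of this sketch.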

It has been recognized by the inventor that a factor in obtaining high efficiency speech recognition is that the speech recognition products of different vendors perform more or less satisfactorily under differing specific circumstances. For example, the equipment of one vendor may provide the best performance for continuous digit recognition, the equipment of another vendor may provide the best performance for speaker dependent recognition, the equipment of still another vendor may provide the best performance for speaker independent/word spotting recognition, the equipment of another vendor or different equipment of the same vendor may provide the best performance for male voices or female voices, etc.

According to the invention this seeming limitation is utilized to advantage by providing a platform (which may be distributed) which includes the speech recognition equipment of multiple vendors. The recognition task is then handled by directing a specific recognition question to the type of equipment best able to handle that specific situation. Thus an optimal arrangement might incorporate the algorithms of multiple vendors within a single bus architecture so that multiple vendor boards are placed on the main machine and the operating program directs the signal to be recognized to the most appropriate board for processing.
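Directing each recognition task to the best-suited board may be sketched as a simple lookup over per-task performance figures. The board names and accuracy values below are invented for illustration; the patent does not specify how the relative performance of vendor equipment is measured or stored.

```python
# Hypothetical per-task accuracy figures for three vendor boards (illustrative only).
BOARD_SCORES = {
    "vendor_a": {"continuous_digits": 0.97, "speaker_dependent": 0.80, "word_spotting": 0.75},
    "vendor_b": {"continuous_digits": 0.85, "speaker_dependent": 0.95, "word_spotting": 0.78},
    "vendor_c": {"continuous_digits": 0.82, "speaker_dependent": 0.81, "word_spotting": 0.93},
}


def best_board(task: str, scores: dict = BOARD_SCORES) -> str:
    """Return the board with the highest recorded score for the given task."""
    candidates = {board: s[task] for board, s in scores.items() if task in s}
    if not candidates:
        raise ValueError(f"no board is rated for task {task!r}")
    return max(candidates, key=candidates.get)
```

An operating program built along these lines would call `best_board` once per recognition question and forward the signal to the board returned.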

In many cities it is known that certain areas are largely, if not completely, populated by particular ethnic groups. As a part of the passive step, the incoming call can be identified as to the area of call origin and that call directed at the outset to a voice recognition sub-system which is most effective for the language or accent of that ethnic group. This may be accomplished by creating a demographic database based on statistical data collected for the involved city. Thus each city may have its own unique demographic database.

According to a preferred embodiment the recognition device may then comprise a platform which includes multiple different recognition resources. Specific resources are then selected for their pre-established ability to handle different situations with high efficiency. With such resources available across a backbone, such as an Ethernet, an executive server can direct a speech input to a selected resource depending upon the ethnic vocabulary needed at that time. The demographic database may be advantageously associated with and controlled by the intelligence available in the AIN ISCP. The incoming call can trigger the ISCP via the AIN network on the basis of the ANI or Caller ID information to direct call setup to the selected resource prior to connection of the caller. This passive procedure is completely transparent to the caller.

Once the call is connected into a particular resource, a speech sample is obtained which can be used to confirm that the call is in the correct resource utilizing the appropriate models. If there is any question as to the correctness of this solution, a direct question can be triggered to obtain active caller participation. Thus the caller can be asked a question which would require an answer tailored to permit more specific language identification. In appropriate circumstances the caller may be instructed to converse in what is tentatively established to be his/her native language.
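The confirmation step may be sketched as a margin test on scores obtained from the speech sample. The scoring scheme, the margin value and the function name are assumptions made for illustration; the patent states only that a direct question can be triggered when the correctness of the selection is in question.

```python
def needs_active_query(scores: dict, current: str, margin: float = 0.1) -> bool:
    """Decide whether the caller should be asked a direct question.

    `scores` maps each candidate model to a hypothetical match score for the
    speech sample. The current model is treated as confirmed only when it
    beats every other candidate by at least `margin`.
    """
    if current not in scores:
        return True  # no evidence at all for the current model
    best_other = max(
        (score for model, score in scores.items() if model != current),
        default=float("-inf"),
    )
    return scores[current] - best_other < margin
```

A clear lead for the current model leaves the call where it is, while a narrow lead or a better-scoring alternative triggers the active step.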

In addition to the foregoing it is a feature of the invention that the intelligent recognition process can also detect behavioral information such as anxiety, anger, inebriation, etc. This aspect of the invention requires additional database data which may be provided for that purpose. As a last resort, a caller can be connected to a live operator.

The foregoing discussion is directed to the situation in which a particular call is directed to a single voice recognition resource selected on the passive and/or active basis described above. However, in times of low network traffic it is also a feature of the invention to process an incoming call through multiple resources in parallel to provide maximum reliability in recognition. As a further aid to the selection of resources, the involved telephone station, particularly a public station, may include a more or less sophisticated camera or optical/electronic device effective to accomplish lip reading, or to classify gender or other physical characteristics of the caller.
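Processing a call through multiple resources in parallel during periods of low traffic may be sketched as follows. The traffic measure, the threshold and the majority-vote rule are assumptions; the patent does not say how the parallel results are combined.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def recognize(audio, recognizers, traffic_load, low_traffic_threshold=0.3):
    """Use one resource under normal load, or all resources when traffic is low.

    `recognizers` is a list of callables taking audio and returning text; the
    first entry is the resource chosen by the passive/active procedure.
    `traffic_load` is a hypothetical utilization figure between 0 and 1.
    """
    if not recognizers:
        raise ValueError("at least one recognizer is required")
    if traffic_load >= low_traffic_threshold:
        return recognizers[0](audio)
    # Low traffic: run every resource concurrently and take the majority answer.
    with ThreadPoolExecutor(max_workers=len(recognizers)) as pool:
        results = list(pool.map(lambda r: r(audio), recognizers))
    return Counter(results).most_common(1)[0][0]
```

With a majority vote, a single resource that misrecognizes the utterance is outvoted whenever the others agree.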

After speech recognition has been achieved according to the invention, the resulting output signals may be utilized for any of a number of purposes, such as in the directory assistance procedure illustrated and described in relation to FIG. 1, or as a substitute for dialing where the desired directory number is merely spoken by the caller. Still further, the high reliability of the system makes possible enhanced services which would permit a user to speak a predetermined identification word and then say "home" or "office" to achieve automatic completion of a call to his/her home or office.
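The "home" or "office" service mentioned above may be sketched as a two-level lookup on the recognized words. The subscriber table, the identification word and the directory numbers are hypothetical.

```python
# Hypothetical subscriber table keyed by a spoken identification word.
SUBSCRIBERS = {
    "bluebird": {"home": "301-555-0101", "office": "301-555-0102"},
}


def resolve_spoken_call(recognized_words, subscribers=SUBSCRIBERS):
    """Map [identification word, destination] to a directory number, or None."""
    if len(recognized_words) != 2:
        return None
    ident, destination = (word.lower() for word in recognized_words)
    return subscribers.get(ident, {}).get(destination)
```

A return value of None would leave the call to be handled by other means, such as a further prompt or a live operator.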

Accordingly it is a primary object of the invention to provide an improved system and method for accomplishing universal speech recognition in the environment of a switched telephone network and most particularly a PSTN.

It is another object of the invention to provide a system and method for accomplishing universal speech recognition for purposes of the transfer of spoken intelligence as well as speaker authentication.

It is yet another object of the invention to provide an improved system and method for accomplishing universal speech recognition on an efficient and economic basis using features and technologies currently available to the public switched telephone network.

It is another object of the invention to provide such a system using a two step passive and active procedure wherein the passive