WikiPatents - Community Patent Review
Apparatus for generating text data on the basis of speech data input from terminal    
United States Patent 5956681
Link to this page: http://www.wikipatents.com/5956681.html
Inventor(s): Yamakita; Tooru (Fussa, JP)
Abstract: A speech signal input from the microphone of a mobile terminal having a PHS function in a communication or off-line state is sent from a PHS network to a speech control host unit connected to a LAN in a specific speech service provider through the Internet and recognized. The contents of the recognition result are automatically determined and shaped into text data of a format type designated from the mobile terminal, and more particularly, into E-mail text data or FAX text data. The formatted text data is returned to the mobile terminal in real time and edited on the mobile terminal as needed. Thereafter, the E-mail text data or FAX text data is transferred to the speech control host unit and transmitted. In this system, the mobile terminal does not require any advanced speech recognition environment and can have a speech recognition function having a practical accuracy at a low cost. The mobile terminal can also be equipped with an E-mail/FAX generation/transmission function based on the speech recognition result.
   














Inventor     Yamakita; Tooru (Fussa, JP)
Owner/Assignee     Casio Computer Co., Ltd. (Tokyo, JP)
Publication Date     September 21, 1999
Application Number     08/966,912
Filing Date     November 6, 1997
US Classification     704/260 704/231 704/235 704/270 704/270.1
Int'l Classification     G01L 009/06
Examiner     Hudspeth; David R.
Assistant Examiner     Chawan; Vijay B.
Attorney/Law Firm     Frishauf, Holtz, Goodman, Langer & Chick, P.C.
Parent Case     This application is a continuation of application Ser. No. 08/708,133, filed Aug. 30, 1996, now U.S. Pat. No. 5,734,933, which is a continuation of Ser. No. 08/321,916, filed Oct. 12, 1994 abandoned; which is a division of Ser. No. 08/053,961, filed Apr. 26, 1993 (U.S. Pat. No. 5,386,264); which is a continuation of Ser. No. 07/970,652, filed Oct. 30, 1992, abandoned, which is a continuation of Ser. No. 07/621,294, filed Jan. 23, 1991, abandoned, which is a division of Ser. No. 07/319,658, filed Mar. 6, 1989 (U.S. Pat. No. 5,012,270).
Priority Data     Dec 27, 1996 [JP] 8-350323
USPTO Field of Search     704/231 704/270 704/275 704/260 704/235 379/93.09 379/67 379/88 706/20 707/523 395/601

References

U.S. References (patent number, inventor, classification, date)
5758332, Hirotani, May 1998
5632002, Hashimoto, 704/231, May 1997
5625675, Katsumaru, 379/88.25, Apr 1997
5577165, Takebayashi, 704/275, Nov 1996
5465326, Sawada, 715/523, Nov 1995
5280520, Abe, 379/100.14, Jan 1994
5182765, Ishii, 379/88.04, Jan 1993
5163111, Baji, 706/20, Nov 1992
5128985, Yoshida, 379/93.09, Jul 1992
4712243, Ninomiya, 704/250, Dec 1987
Claims
 


I claim:

1. A speech control apparatus connected to a terminal through a communication network, comprising:

means for receiving speech data transmitted from said terminal;

processing means for recognizing the received speech data, converting the recognized speech data into document data, extracting a specific word from the converted document data, and generating formatted document data having a predetermined format by inserting the extracted word into a specified field of the converted document data; and

transmitting means for transmitting the generated formatted document data through said communication network.

2. An apparatus according to claim 1, wherein said processing means comprises means for extracting a word associated with a destination from the converted document data and inserting the extracted word into a field designating a destination of the formatted document data.

3. An apparatus according to claim 2, wherein said processing means specifies an E-mail destination as the destination and generates formatted E-mail text data as the formatted document data, and wherein the transmitting means transmits the formatted E-mail text data to the specified destination.

4. An apparatus according to claim 2, wherein said processing means specifies a FAX destination as the destination and generates formatted FAX text data as the formatted document data, and wherein the transmitting means transmits the formatted FAX text data to the specified destination.

5. An apparatus according to claim 2, wherein said terminal comprises means for receiving the formatted document data generated by said apparatus, means for editing the formatted document data, and means for transmitting the formatted document data to the destination.

6. An apparatus according to claim 1, wherein said terminal comprises means for designating a type of formatted document data, and said apparatus receives data representing the designated type, and extracts a word corresponding to the formatted document data of the designated type, thereby generating the formatted document data.

7. A speech control apparatus connected to a terminal through a communication network, comprising:

means for receiving speech data transmitted from said terminal;

means for recognizing the received speech data and converting the speech data into document data;

means for extracting a word relating to a destination from the converted document data to specify a destination; and

means for transmitting the converted document data to the specified destination.

8. An apparatus according to claim 7, further comprising an address database storing a correspondence between names and destinations, and wherein said means for extracting the word relating to the destination from the converted document data to specify the destination refers to the address database and specifies the destination from a name extracted as the word.

9. A portable terminal unit for obtaining text data from speech data through a network, comprising:

means for inputting speech data;

transmit control means for appending an identification code of the portable terminal unit to the input speech data and for transmitting the speech data to a speech control unit connected to the portable terminal unit through the network;

receive control means for receiving text data as a result of conversion of the speech data transmitted from the speech control unit to the portable terminal unit corresponding to the identification code; and

display means for displaying the received text data.

10. A speech control apparatus to which a plurality of terminal units are connected through a network, comprising:

means for receiving speech data and data identifying a format type transmitted from each of the terminal units;

document generating means for recognizing the received speech data, converting the recognized speech data into document data, and generating formatted document data having a format corresponding to the format type for each terminal unit; and

means for transmitting the generated formatted document data to the specified terminal unit through the network.

11. An article of manufacture comprising a computer usable medium having computer readable program code means embodied therein for causing speech data to be converted into formatted document data in a speech control apparatus to which a terminal unit is connected through a network, the computer readable program code means comprising:

means for causing a computer to receive speech data transmitted from the terminal unit;

means for causing the computer to recognize the received speech data, convert the recognized speech data into document data, extract a specific word from the converted document data, and insert the extracted word into a specific field of the converted document data to generate formatted document data having a predetermined format; and

means for causing the computer to transmit the generated formatted document data through the network.

12. An article of manufacture comprising a computer usable medium having computer readable program code means embodied therein for causing speech data to be converted into formatted document data in a speech control apparatus to which a plurality of terminal units are connected through a network, the computer readable program code means comprising:

means for causing a computer to receive speech data and data identifying a format type transmitted from each of the terminal units;

means for causing the computer to recognize the received speech data, convert the recognized speech data into document data, and generate document data having a format corresponding to the format type for each terminal unit; and

means for causing the computer to transmit the generated formatted document data to a specified terminal unit through the network.
Description
 


BACKGROUND OF THE INVENTION

The present invention relates to a technique of recognizing speech data such as communication speech data input from a mobile (portable) terminal and generating an E-mail document or a FAX document, i.e., text data formatted on the basis of the recognition result and, more particularly, to a technique of transmitting the generated document.

A speech recognition technique that recognizes a speech signal, converts it into character data, and stores the character data or uses the recognition result for various services has long been in demand in various industrial fields.

In recent years, along with advances in speech recognition algorithms, speech recognition systems using mainframe computers or workstation computers have been developed.

These systems, such as a bank balance inquiry system that receives telephone speech data, a seat reservation system, and a goods sorting system that automatically delivers goods upon recognizing the operator's voice, are being introduced in various industrial fields.

However, such speech recognition systems have only just reached a practical recognition accuracy in the environment of the above-described large-scale computer systems. In the environment of a small computer system such as a personal computer, no inexpensive speech recognition system having a practical recognition accuracy has been realized yet.

Alongside the above-described information processing technology, mobile terminals such as mobile phones, portable telephones, and PHSs (Personal Handyphone Systems) are rapidly becoming popular.

In particular, the PHS is compact and less expensive in terms of telephone charges than a mobile phone or portable telephone, and it is rapidly gaining popularity because of its characteristic feature, i.e., the capability of high-quality communication "with anybody anytime anywhere". In addition, the PHS uses a public network having an ISDN (Integrated Services Digital Network) as a backbone and therefore allows high-speed digital communication at a transfer rate of 32 kbits/sec, so that future applications to multimedia communication fields are also increasingly expected.

The PHS is also expected as a multimedia information management/communication terminal which can be used not only as a portable telephone but also as a portable information management device while exploiting the convenience of the mobile terminal. More specifically, such a mobile terminal is expected to have a home page access function and an E-mail communication function as functions of accessing the Internet or an intra-office network as well as a speech communication/FAX function. An information management function such as address management, schedule management, memo management, or database searching is also expected to be arranged.

Such a mobile terminal is required to have a user interface that is as user-friendly and natural as possible so that the user can readily use it. User interfaces currently in practical use include finger operation input from a keyboard or a mouse and handwriting input using an electronic pen. Ideally, the user interface should also support speech input. More specifically, if not only address input, schedule input, and memo input but also E-mail generation/transmission and FAX generation/transmission can be performed using a speech signal representing the speech contents as data, with the speech communication function as the basic function, the convenience of the mobile terminal can be greatly increased. This is the advantage of applying the speech recognition function as a user interface to the mobile terminal.

However, the mobile terminal is compact and has only a limited information processing capability. In addition, in current speech recognition processing, the practical recognition accuracy can be realized only in the environment of a main frame computer or workstation computer. Therefore, the speech recognition function as the user interface of a mobile terminal has not yet been realized.

BRIEF SUMMARY OF THE INVENTION

It is an object of the present invention to realize, in a communication environment using a mobile terminal, a speech recognition function as a user interface of the mobile terminal at a practical accuracy and cost and to enable generation/transmission of an E-mail or FAX document as formatted text data on the basis of the recognition result.

To achieve the above object, there is provided a speech control apparatus connected to a terminal through a communication network, comprising: means for receiving speech data transmitted from the terminal; means for recognizing the received speech data and converting the speech data into document data; means for extracting a word from the converted document data and generating formatted text data on the basis of the extracted word; and means for transmitting the generated formatted text data through the communication network.

According to the present invention, since speech recognition processing need not be performed on the terminal side, simplification of processing and size reduction of the terminal can be realized. Simply by inputting speech data from the terminal, data in another text format, such as E-mail data or FAX data, can be obtained. Therefore, the interface is easier to use than conventional text data input by key operation. In addition, an E-mail or FAX function can be added even when the terminal side has no special function.

Additional objects and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objects and advantages of the invention may be realized and obtained by means of the instrumentalities and combinations particularly pointed out in the appended claims.

BRIEF DESCRIPTION OF THE SEVERAL VIEWS OF THE DRAWING

The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate presently preferred embodiments of the invention, and together with the general description given above and the detailed description of the preferred embodiments given below, serve to explain the principles of the invention.

FIG. 1 is a block diagram showing the entire system configuration;

FIG. 2 is a perspective view showing the outer appearance of a mobile terminal;

FIG. 3 is a functional block diagram of the mobile terminal;

FIG. 4 is a flow chart of the entire processing of the mobile terminal;

FIG. 5 is a flow chart of transmission processing;

FIGS. 6A, 6B, and 6C are views showing the format of communication data;

FIGS. 7A and 7B are views showing the formats of an IP header and a TCP header, respectively;

FIG. 8 is a flow chart of call origination processing using PPP;

FIGS. 9A, 9B, and 9C are flow charts of the operation of a mobile terminal communication control section;

FIG. 10 is a view showing the data structure of a processing terminal registration table;

FIG. 11 is a block diagram of a text speech recognition section;

FIG. 12 is a flow chart of the operation of an input/output control section in the speech recognition section;

FIG. 13 is a flow chart of the operation of a formatted text generation section;

FIG. 14 is a flow chart of the operation of an input/output control section in the formatted text generation section;

FIG. 15 is a flow chart of the operation of a mail transmission/reception section; and

FIG. 16 is a flow chart of the operation of a FAX transmission/reception section.

DETAILED DESCRIPTION OF THE INVENTION

An embodiment of the present invention will be described below in detail with reference to the accompanying drawings.

System Configuration

FIG. 1 is a block diagram showing the entire system configuration of the embodiment of the present invention.

A mobile terminal 101 has a PHS terminal function and is connected to a PHS network 103 via a radio base station 102 in radio communication. The radio base station 102 is a public radio base station provided on a public telephone booth on a street, a utility pole, a building rooftop, or an underpass, or an extension telephone in a subscriber's house. When the mobile terminal 101 is connected to the extension telephone, it is directly connected to the public telephone network without interposing the PHS network. The mobile terminal 101 may be connected to the PHS network 103 or the public telephone network in wire communication via a wire connection unit in place of the radio base station 102.

The PHS network 103 is mutually connected to the public telephone network or an ISDN network, and these networks are connected to a mobile terminal control host unit 104 connected to the Internet 105 through a dedicated high-speed digital line or the like.

When the mobile terminal 101 automatically originates a dial-up call, through the radio base station 102 or the PHS network 103, to the mobile terminal control host unit 104 connected to the public telephone network or ISDN network, the mobile terminal 101 can be connected to the Internet 105.

A router unit 106 connected to a LAN 107 of a predetermined speech service provider through a high-speed digital leased line or the like is connected to the Internet 105. The LAN 107 is a local area network based on Ethernet, ATM (Asynchronous Transfer Mode), or FDDI. A speech control host unit 108 is also connected to the LAN 107.

After the mobile terminal 101 automatically originates a dial-up call to the mobile terminal control host unit 104, the mobile terminal 101 can communicate with the speech control host unit 108 through the Internet 105, the router unit 106, and the LAN 107.

When the user instructs communication with the speech control host unit 108 from the touch panel of an input section 109 in the mobile terminal 101, a control section 110 requests a communication section 111 to start communication with the speech control host unit 108.

If the mobile terminal 101 is not currently connected to the mobile terminal control host unit 104, the communication section 111 originates a call to the radio base station 102 by radio (or by wire) to connect the mobile terminal 101 to the PHS network 103 upon receiving the request for starting the communication from the control section 110, and thereafter, designates the access telephone number of the mobile terminal control host unit 104 and originates a dial-up call.

When the call terminates at the mobile terminal control host unit 104, the communication section 111 in the mobile terminal 101 first communicates with a connection establishment section 113 in the mobile terminal control host unit 104 to negotiate the establishment of a connection based on TCP/IP and PPP, the standard communication protocols on the Internet 105. As a result, the mobile terminal control host unit 104 assigns an IP address as an identification address on the Internet 105 to the communication section 111 in the mobile terminal 101, thereby allowing the mobile terminal 101 to access the Internet 105.

If the mobile terminal 101 is already connected to the mobile terminal control host unit 104, the communication section 111 in the mobile terminal 101 omits the dial-up call origination.
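
The connection decision described above can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of the embodiment: the class name `CommunicationSection`, its methods, the access number, and the fixed address `192.0.2.10` are hypothetical, and the PPP negotiation itself is reduced to a log entry.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CommunicationSection:
    """Hypothetical model of communication section 111 in the mobile terminal."""

    assigned_ip: Optional[str] = None  # set once the host unit has assigned an address
    log: List[str] = field(default_factory=list)

    def is_connected(self) -> bool:
        return self.assigned_ip is not None

    def start_communication(self, access_number: str) -> str:
        """Originate a dial-up call only when no connection exists yet."""
        if not self.is_connected():
            self.log.append(f"dial {access_number}")
            self.log.append("negotiate PPP / TCP/IP")
            # Assumption: the host unit would assign this address; a fixed
            # documentation address stands in for it here.
            self.assigned_ip = "192.0.2.10"
        else:
            self.log.append("already connected, dial-up omitted")
        return self.assigned_ip
```

Calling `start_communication` a second time returns the same address without dialing again, which mirrors the omission of the dial-up call origination for an already connected terminal.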

The communication section 111 in the mobile terminal 101 sends, to the Internet 105, a TCP/IP packet which stores a "destination IP address" serving as a predetermined IP address of the speech control host unit 108, a "transmission source IP address" serving as the IP address assigned by the mobile terminal control host unit 104, a "terminal identification code" (e.g., a PHS telephone number) for identifying the mobile terminal 101, and either a text speech recognition/formatting start request command together with format type data based on an instruction from the user, or a text speech recognition/formatting end request command.
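
As a rough illustration of the information carried by this request, the fields listed above can be modeled as a small record. The actual data layout is the one shown in FIGS. 6A to 6C and is not reproduced here; the JSON encoding, the class name `ControlRequest`, and the command strings "START" and "END" are assumptions made only for this sketch.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class ControlRequest:
    destination_ip: str                 # predetermined address of speech control host unit 108
    source_ip: str                      # address assigned by mobile terminal control host unit 104
    terminal_id: str                    # terminal identification code, e.g. a PHS telephone number
    command: str                        # "START" or "END" (hypothetical encodings)
    format_type: Optional[str] = None   # "E-mail" or "FAX"; accompanies a start request

    def to_payload(self) -> bytes:
        """Serialize the request; a start request must carry format type data."""
        if self.command == "START" and self.format_type is None:
            raise ValueError("a start request must carry format type data")
        return json.dumps(asdict(self)).encode("utf-8")
```

For example, `ControlRequest("192.0.2.1", "192.0.2.10", "050-1234-5678", "START", "E-mail").to_payload()` yields the bytes that would be placed in the packet body in this sketch.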

This TCP/IP packet is transferred to the router unit 106 in the speech service provider by a routing section 114 in the mobile terminal control host unit 104 and a relay host unit (not shown) in the Internet 105 on the basis of the "destination IP address" stored in the TCP/IP packet, and then transferred to a packet transmission/reception section 115 in the speech control host unit 108 through the LAN 107.

The packet transmission/reception section 115 extracts, from the received TCP/IP packet, the "transmission source IP address", the "terminal identification code", and the text speech recognition/formatting start request command and the format type data, or the text speech recognition/formatting end request command, and transfers these data to a mobile terminal communication control section 116 in the speech control host unit 108.

The mobile terminal communication control section 116 registers, in a processing terminal registration table (FIG. 10) to be described later, information associated with the transferred "transmission source IP address", "terminal identification code", and text speech recognition/formatting start request command and format type data, or text speech recognition/formatting end request command. Thereafter, the mobile terminal communication control section 116 requests the packet transmission/reception section 115 to return a TCP/IP packet storing transmission enable data to the mobile terminal 101.
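
A minimal sketch of this registration step is given below. The real structure of the processing terminal registration table is the one in FIG. 10; here it is approximated by a dictionary keyed by the terminal identification code. The names `RegistrationTable` and `TerminalEntry`, the command strings, and the choice to delete an entry on an end request are assumptions, since the text only states that the information is registered.

```python
from typing import Dict, NamedTuple, Optional


class TerminalEntry(NamedTuple):
    source_ip: str
    format_type: Optional[str]


class RegistrationTable:
    """Hypothetical stand-in for the processing terminal registration table."""

    def __init__(self) -> None:
        self._entries: Dict[str, TerminalEntry] = {}

    def handle(self, terminal_id: str, source_ip: str, command: str,
               format_type: Optional[str] = None) -> bool:
        """Register on START, drop on END; return True when transmission is enabled."""
        if command == "START":
            self._entries[terminal_id] = TerminalEntry(source_ip, format_type)
            return True
        if command == "END":
            # Assumption: ending a session removes the terminal's entry.
            self._entries.pop(terminal_id, None)
            return False
        raise ValueError(f"unknown command: {command}")

    def lookup(self, terminal_id: str) -> Optional[TerminalEntry]:
        return self._entries.get(terminal_id)
```

A `True` result corresponds to the point at which the host unit would return the transmission enable data to the terminal.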

The packet transmission/reception section 115 transmits the corresponding TCP/IP packet to the IP address corresponding to the mobile terminal 101.

In this way, the speech control host unit 108 can execute text speech recognition/formatting of speech data transferred from the mobile terminal 101. Upon receiving the TCP/IP packet storing the transmission enable data from the speech control host unit 108, the communication section 111 in the mobile terminal 101 transfers the transmission enable data stored in the TCP/IP packet to the control section 110.

Upon receiving the transmission enable data, the control section 110 in the mobile terminal 101 requests the communication section 111 to transmit, to the speech control host unit 108, speech data input from a microphone by a speech communication operation or a speech input operation in an off-line state.

The communication section 111 transmits the TCP/IP packet storing the speech data to the IP address corresponding to the speech control host unit 108.

This TCP/IP packet is transferred to the packet transmission/reception section 115 in the speech control host unit 108 through the routing section 114 in the mobile terminal control host unit 104, the relay host unit (not shown) in the Internet 105, the router unit 106 in the speech service provider, and the LAN 107 on the basis of the "destination IP address" stored in the TCP/IP packet.

The packet transmission/reception section 115 extracts speech data stored in the received TCP/IP packet and transfers the speech data to the mobile terminal communication control section 116 in the speech control host unit 108.

The mobile terminal communication control section 116 forwards the received speech data to a text speech recognition section 117. The text speech recognition section 117 executes text speech recognition processing for the speech data and transfers the recognition result, i.e., recognized speech text data, to a formatted text generation section 118. The formatted text generation section 118 determines the field of the recognized speech text data output from the text speech recognition section 117 using a format type field dictionary and the format type data which is designated from the mobile terminal 101 together with the text speech recognition/formatting start request command. The formatted text generation section 118 also deletes unnecessary words using an unnecessary word dictionary 1505 (FIG. 13), generates formatted text data, and transfers the formatted text data to the mobile terminal communication control section 116.

To generate E-mail text data, the user of the mobile terminal 101 designates "E-mail" as format type data together with a text speech recognition/formatting start request command. Next, the user sequentially pronounces, e.g., "the destination is taro@casio.co.jp", "the carbon copy is hanako@osuga.co.jp", or "the text is . . . ". To generate FAX text data, the user sequentially pronounces, e.g., "the destination number is 0425-79-7735", or "the text is . . . ". These pronounced contents are recognized as recognized speech text data by the text speech recognition section 117 in the speech control host unit 108. The formatted text generation section 118 determines the recognized speech text data as text data in, e.g., the "To" field, "Cc" field, or "text" field of E-mail text data. The formatted text generation section 118 deletes unnecessary words and generates formatted text data such as "To: taro@casio.co.jp", "Cc: hanako@osuga.co.jp", or "text: . . . ". Alternatively, the formatted text generation section 118 determines the recognized speech text data as text data in, e.g., the "destination number" field, or "text" field of FAX text data. The formatted text generation section 118 deletes unnecessary words and generates formatted text data such as "destination number: 0425-79-7735", or "text: . . . ".
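
The mapping from a recognized utterance to a formatted field can be sketched as below. This is a simplified illustration that assumes the spoken cue phrase at the start of each utterance both selects the field and is itself the unnecessary wording to delete; the dictionary contents and the function name `format_utterance` are hypothetical stand-ins for the format type field dictionary and the unnecessary word dictionary of the embodiment.

```python
# Hypothetical field dictionary: for each format type, spoken cue -> field name.
FIELD_DICTIONARY = {
    "E-mail": {
        "the destination is": "To",
        "the carbon copy is": "Cc",
        "the text is": "text",
    },
    "FAX": {
        "the destination number is": "destination number",
        "the text is": "text",
    },
}


def format_utterance(recognized: str, format_type: str) -> str:
    """Turn one recognized utterance into a 'Field: value' line."""
    text = recognized.strip()
    for cue, field_name in FIELD_DICTIONARY[format_type].items():
        if text.lower().startswith(cue):
            # Dropping the cue phrase plays the role of unnecessary word deletion.
            value = text[len(cue):].strip()
            return f"{field_name}: {value}"
    raise ValueError(f"no field matches: {recognized!r}")
```

With this sketch, `format_utterance("the destination is taro@casio.co.jp", "E-mail")` returns `"To: taro@casio.co.jp"`, matching the example in the text.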

The mobile terminal communication control section 116 requests the packet transmission/reception section 115 to return a TCP/IP packet storing the formatted text data to the mobile terminal 101.

The packet transmission/reception section 115 transmits the corresponding TCP/IP packet to the IP address corresponding to the mobile terminal 101.

Upon receiving the TCP/IP packet storing the formatted text data from the speech control host unit 108, the communication section 111 in the mobile terminal 101 transfers the formatted text data stored in the TCP/IP packet to the control section 110.

The control section 110 in the mobile terminal 101 inserts the formatted text data into text template data of a format type corresponding to the format type data designated by the user in advance and outputs the formatted text data to an output section 112. The output section 112 displays a text corresponding to the formatted text data on an LCD display section. The user can arbitrarily edit this text data.

When the user of the mobile terminal 101 instructs, from the touch panel of the input section 109, transmission of the E-mail text data or FAX text data which has undergone edit processing, the control section 110 requests the communication section 111 to transmit the E-mail text data or FAX text data to the speech control host unit 108. In this case, a "From" field representing the transmission source address is automatically added to the E-mail text data, or transmission source information is automatically added to the FAX text data.
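
The terminal-side handling described in the two preceding paragraphs, i.e., inserting the returned formatted text into a template, letting the user edit it, and adding the "From" field at transmission time, might look roughly like this for the E-mail case. The template field order, the function names, and the sender address are assumptions for illustration only.

```python
from typing import Dict, List

EMAIL_TEMPLATE = ["To", "Cc", "text"]  # hypothetical template field order


def fill_template(formatted_lines: List[str]) -> Dict[str, str]:
    """Insert 'Field: value' lines into an empty E-mail template."""
    fields = {name: "" for name in EMAIL_TEMPLATE}
    for line in formatted_lines:
        name, _, value = line.partition(":")
        if name.strip() in fields:
            fields[name.strip()] = value.strip()
    return fields


def prepare_for_sending(fields: Dict[str, str], sender: str) -> str:
    """Add the 'From' field automatically, then emit the fields in template order."""
    lines = [f"From: {sender}"]
    lines += [f"{name}: {fields[name]}" for name in EMAIL_TEMPLATE]
    return "\n".join(lines)
```

Between the two calls the dictionary returned by `fill_template` can be modified freely, which corresponds to the user editing the displayed text before instructing transmission.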

The communication section 111 transmits a TCP/IP packet storing the E-mail text data or FAX text data to the IP address corresponding to the speech control host unit 108.

This TCP/IP packet is transferred to the packet transmission/reception section 115 in the speech control host unit 108 through the routing section 114 in the mobile terminal control host unit 104, the relay host unit (not shown) in the Internet 105, the router unit 106 in the speech service provider, and the LAN 107 on the basis of the "destination IP address" stored in the TCP/IP packet.

The packet transmission/reception section 115 extracts the E-mail text data or FAX text data stored in the received TCP/IP packet and transfers the data to a mail transmission/reception section 119 or a FAX transmission/reception section 120 in the speech control host unit 108.

The mail transmission/reception section 119 inquires of a name solution server (not shown) to convert an E-mail address set in the "To" field and "Cc" field of the E-mail text data into an IP address, and requests the packet transmission/reception section 115 to transmit the E-mail text data to the IP address. The packet transmission/reception section 115 generates a TCP/IP packet storing the E-mail address and transmits the TCP/IP packet to the Internet 105.
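
As a small illustration of the first half of this step, the sketch below collects the domain parts of the addresses in the "To" and "Cc" fields, which are the names that would be handed to the name server. The lookup itself and the packet transmission are omitted, and the function name `recipient_domains` and the line-oriented text layout are assumptions.

```python
from typing import List


def recipient_domains(email_text: str) -> List[str]:
    """Return the distinct domains found in the To and Cc fields, in order."""
    domains: List[str] = []
    for line in email_text.splitlines():
        name, _, value = line.partition(":")
        if name.strip() in ("To", "Cc") and "@" in value:
            domain = value.strip().rsplit("@", 1)[1]
            if domain not in domains:
                domains.append(domain)
    return domains
```

Only the "To" and "Cc" fields are inspected, so an address that happens to appear in the "From" or "text" field does not trigger a lookup.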

The FAX transmission/reception section 120 dials, on a telephone line 121 (FIG. 1), the destination number set in the "destination number" field of the FAX text data, thereby transmitting the FAX text data to a partner FAX apparatus where the call has terminated.

Upon receiving the E-mail text data for the mobile terminal 101 from the Internet 105 through the packet transmission/reception section 115, the mail transmission/reception section 119 spools the data.

Similarly, upon receiving the FAX text data for the mobile terminal 101 from the telephone line 121, the FAX transmission/reception section 120 spools the data.

When the user of the mobile terminal 101 instructs, from the touch panel at an arbitrary time, reception of E-mail text data or FAX text data, the control section 110 requests the communication section 111 to transmit a mail reception request command or a FAX reception request command to the speech control host unit 108.

The communication section 111 transmits a TCP/IP packet storing the mail reception request command or FAX reception request command to the IP address corresponding to the speech control host unit 108.

This TCP/IP packet is transferred to the packet transmission/reception section 115 in the speech control host unit 108 through the routing section 114 in the mobile terminal control host unit 104, the relay host unit (not shown) in the Internet 105, the router unit 106 in the speech service provider, and the LAN 107 on the basis of a "destination IP address" stored in the TCP/IP packet.

The packet transmission/reception section 115 extracts the mail reception request command or the FAX reception request command stored in the received TCP/IP packet and transfers the command to the mail transmission/reception section 119 or the FAX transmission/reception section 120 in the speech control host unit 108.

Upon fetching the mail reception request command, the mail transmission/reception section 119 requests the packet transmission/reception section 115 to extract the E-mail text data which has been received for the mobile terminal 101 from a spool file corresponding to the "terminal identification code" transferred from the mobile terminal 101 together with the mail reception request command and transmit the data to the mobile terminal 101.

Similarly, upon fetching the FAX reception request command, the FAX transmission/reception section 120 requests the packet transmission/reception section 115 to extract FAX text data which has been received for the mobile terminal 101 from a spool file corresponding to the "terminal identification code" transferred from the mobile terminal 101 together with the FAX reception request command and transmit the data to the mobile terminal 101.
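
The spooling and retrieval keyed by the "terminal identification code" can be sketched as follows in Python. The `Spool` class, its method names, and the sample codes are hypothetical. Removing items from the spool once they are fetched is also an assumption, since the specification only says that the data is extracted.

```python
from collections import defaultdict


class Spool:
    """Per-terminal spool files, keyed by terminal identification code."""

    def __init__(self):
        self._files = defaultdict(list)

    def store(self, terminal_id: str, text_data: str) -> None:
        # Called when E-mail or FAX text data arrives for a terminal.
        self._files[terminal_id].append(text_data)

    def fetch(self, terminal_id: str) -> list:
        # Called on a reception request command.
        # Assumption: delivered items are removed from the spool.
        return self._files.pop(terminal_id, [])


spool = Spool()
spool.store("T-0001", "mail A")
spool.store("T-0001", "mail B")
spool.store("T-0002", "mail C")
delivered = spool.fetch("T-0001")   # items spooled for terminal T-0001 only
```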

The packet transmission/reception section 115 generates a TCP/IP packet storing the E-mail text data or the FAX text data and transmits the TCP/IP packet to the IP address corresponding to the mobile terminal 101.

Upon receiving the TCP/IP packet storing the E-mail text data or the FAX text data from the speech control host unit 108, the communication section 111 in the mobile terminal 101 transfers the E-mail text data or the FAX text data to the control section 110.

The control section 110 in the mobile terminal 101 displays the received E-mail text or FAX text on the LCD display section.

In addition to the communication with the speech control host unit 108, the mobile terminal 101 can also freely access a desired resource on the Internet 105 by originating a dial-up call to the mobile terminal control host unit 104 using a home page browser tool of the mobile terminal 101.

Outer Appearance of Mobile Terminal 101

FIG. 2 is a perspective view showing the outer appearance of the mobile terminal 101 shown in FIG. 1.

The mobile terminal 101 has the outer appearance of a compact portable information management device comprising a microphone 201 also serving as a transmitter for inputting speech data, a camera 202 for inputting image data, an LCD display section 203 which displays various kinds of information and has a touch panel function for receiving touch inputs or pen inputs, and a loudspeaker 204 also serving as a receiver for outputting speech data.

The mobile terminal 101 also has a radio antenna 205 for originating a call to the radio base station 102 shown in FIG. 1, and a socket 206 for connecting the mobile terminal 101 to a wire connection unit in place of the radio base station 102.

The mobile terminal 101 also has an IC card slot 207 for receiving various IC cards, and an optical transceiver 208 for performing infrared optical communication with another mobile terminal 101 or a personal computer.

A switch 209 is a power switch.

Functional Block Diagram of Mobile Terminal 101

FIG. 3 is a functional block diagram of the mobile terminal 101.

As shown in FIG. 1, the mobile terminal 101 comprises the input section 109, the control section 110, the communication section 111, and the output section 112, which are connected to each other via a bus 326.

The input section 109 is constituted by a speech input section, an image input section, and a touch panel mechanism (to be described later in association with the operation of the output section 112).

The speech input section comprises a microphone 301, an A/D conversion section 302, and a microphone control section 303.

The microphone 301 (the microphone 301 corresponds to the microphone 201 shown in FIG. 2) also serves as the transmitter of the PHS and is used to input the user's voice.

The A/D conversion section 302 converts an analog speech signal input from the microphone 301 into digital speech data and codes the digital speech data using ADPCM (Adaptive Differential Pulse Code Modulation) as the standard speech coding method of the PHS. This section has already been put into practice as an LSI circuit constituting a PHS terminal.

In speech communication, the microphone control section 303 transfers the coded speech data to a communication control section 321 in the communication section 111 and sends it to a speech channel. In text speech recognition/formatting, the microphone control section 303 transfers the coded speech data to a RAM 317 in the control section 110.
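
The two destinations of the coded speech data can be summarized in a short Python sketch. The `Mode` enumeration and the lists standing in for the speech channel and the RAM 317 are assumptions made for illustration and are not part of the embodiment.

```python
from enum import Enum


class Mode(Enum):
    SPEECH_COMMUNICATION = "speech communication"
    RECOGNITION = "text speech recognition/formatting"


def route_coded_speech(mode: Mode, coded: bytes,
                       speech_channel: list, ram: list) -> None:
    """Model of the microphone control section 303 choosing a destination."""
    if mode is Mode.SPEECH_COMMUNICATION:
        speech_channel.append(coded)   # via the communication control section 321
    elif mode is Mode.RECOGNITION:
        ram.append(coded)              # held in the RAM 317 for later transmission
    else:
        raise ValueError(f"unsupported mode: {mode!r}")


speech_channel, ram = [], []
route_coded_speech(Mode.SPEECH_COMMUNICATION, b"\x12\x34", speech_channel, ram)
route_coded_speech(Mode.RECOGNITION, b"\x56\x78", speech_channel, ram)
```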

The image input section is constituted by a CCD (Charge Coupled Device) camera 304, an A/D conversion section 305, a memory 306, and a camera control section 307.

The CCD camera 304 picks up an arbitrary image on the basis of the operation of the user.

The A/D conversion section 305 converts an analog image signal picked up by the CCD camera 304 into digital image data.

The memory 306 stores the digital image data in units of frames.

The camera control section 307 controls the operations of the CCD camera 304, the A/D conversion section 305, and the memory 306.

The output section 112 is constituted by a speech output section and an image output section.

The speech output section is constituted by a loudspeaker 308, a D/A conversion section 309, and a loudspeaker control section 310.

The loudspeaker control section 310 transfers PHS speech data received from the communication control section 321 in the communication section 111 or synthesized speech data received from the RAM 317 in the control section 110 to the D/A conversion section 309.

The D/A conversion section 309 decodes the received speech data, converts the data into an analog speech signal, and causes the loudspeaker 308 (the loudspeaker 308 corresponds to the loudspeaker 204 in FIG. 2) to output the speech signal as sound.

The image output section is constituted by an LCD display section 311 (the LCD display section 311 corresponds to the LCD display section 203 in FIG. 2), an LCD driver 312, a memory 313, and an LCD control section 314.

The LCD control section 314 causes the memory 313 to hold various image data such as character data, image data, and command button data from the RAM 317 in the control section 110 in units of frames and starts the LCD driver 312.

The LCD driver 312 displays image data read out from the memory 313 in units of frames on the LCD display section 311.

A transparent touch panel is arranged on the surface of the LCD display section 311 (203 in FIG. 2). The user can touch the touch panel with a finger or a pen in accordance with, e.g., command button data displayed on the LCD display section 311 to input a command. This input signal is transferred to the RAM 317 in the control section 110 by a touch panel control section 315.

The control section 110 comprises a CPU 316, the RAM 317, a ROM 318, an IC card interface section 319, and an IC card 320 inserted into the IC card slot 207 (FIG. 2) as needed. The IC card interface section 319 controls input/output of data to/from the IC card 320.

The CPU 316 controls the entire operation of the mobile terminal 101 using the RAM 317 as a work area in accordance with a control program stored in the ROM 318.

The communication section 111 comprises the communication control section 321, a radio driver 322, a radio antenna 323, a wire driver 324, and a socket 325.

The communication control section 321 executes PHS speech communication processing or TCP/IP communication processing (to be described later) with the Internet 105 and controls the radio driver 322 or the wire driver 324.

The radio driver 322 performs conversion between communication data and a PHS radio signal transmitted/received through the radio antenna 323 (the radio antenna 323 corresponds to the radio antenna 205 shown in FIG. 2) in the radio communication mode. The PHS radio signal is based on a radio frequency of 1.9 GHz, a carrier frequency interval of 300 kHz, a four-channel/carrier TDMA-TDD radio access scheme, a π/4-shift QPSK modulation scheme, and a radio transfer rate of 384 kbits/sec.
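
Two figures follow directly from the radio parameters listed above, as the Python arithmetic below shows. The calculation relies on QPSK carrying 2 bits per symbol, and on the assumption that each of the four TDD channels uses one uplink slot and one downlink slot, for eight time slots per carrier. The per-slot value is a gross share of the transfer rate and includes whatever overhead the slot format carries.

```python
BIT_RATE = 384_000            # radio transfer rate in bits/s, from the text
BITS_PER_SYMBOL = 2           # QPSK carries 2 bits per symbol
CHANNELS_PER_CARRIER = 4      # four-channel/carrier TDMA, from the text
SLOTS_PER_CARRIER = CHANNELS_PER_CARRIER * 2   # assumption: uplink + downlink slot each

symbol_rate = BIT_RATE // BITS_PER_SYMBOL          # symbols per second
gross_rate_per_slot = BIT_RATE // SLOTS_PER_CARRIER  # bits/s per time slot, gross

print(symbol_rate, gross_rate_per_slot)
```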

The wire driver 324 performs conversion between communication data and a wire signal transmitted/received through the socket 325 (the socket 325 corresponds to the socket 206 shown in FIG. 2). This wire signal is a general telephone band modem modulated signal.

The operation of the embodiment of the present invention having the above arrangement will be described below in detail.

Processing in Mobile Terminal 101

Processing in the mobile terminal 101 will be described first.

FIG. 4 is a flow chart showing the entire control operation realized as an operation of the CPU 316 in the control section 110 shown in FIG. 3, which executes a control program stored in the ROM 318 in the control section 110 after power-ON.

The control program for realizing functions shown in the flow charts of FIGS. 4, 5, and 8 and data necessary for the program may be stored in the IC card 320 detachably attached to the IC card slot 207 shown in FIG. 2 in the form of program codes which can be read by the CPU 316. The program codes may be directly executed by the CPU 316, or loaded in the RAM 317 or the programmable ROM 318, as needed, and executed by the CPU 316. Alternatively, the control program and data necessary for the program ma