WikiPatents - Community Patent Review
Dynamic destination-determined multimedia avatars for interactive on-line communications    

United States Patent 6453294
Link to this page: http://www.wikipatents.com/6453294.html
Inventor(s): Dutta; Rabindranath (Austin, TX); Paolini; Michael A. (Round Rock, TX)
Abstract: Transforms are used for transcoding input text, audio and/or video input to provide a choice of text, audio and/or video output. Transcoding may be performed at a system operated by the communications originator, an intermediate transfer point in the communications path, and/or at one or more system(s) operated by the recipient(s). Transcoding of the communications input, particularly voice and image portions, may be employed to alter identifying characteristics to create an avatar for a user originating the communications input.
   














Owner/Assignee     International Business Machines Corporation (Armonk, NY)
Publication Date     September 17, 2002
Application Number     09/584,599
Filing Date     May 31, 2000
US Classification     704/270.1 704/235 704/260 704/270 704/275
Int'l Classification     G10L 021/06 G10L 013/08 G10L 015/26
Examiner     Dorvil; Richemond
Assistant Examiner     Nolan; Daniel A
Attorney/Law Firm     Dawkins; Marilyn Smith Bracewell & Patterson, L.L.P.
Address
Parent Case    
Priority Data    
USPTO Field of Search     345/419 345/420 345/421 345/422 345/423 345/424 345/425 345/426 345/427 345/428 345/429 345/430 345/431 345/432 345/433 345/434 345/435 345/436 345/437 345/438 345/439 345/440 345/441 345/442 345/443 345/444 345/445 345/446 345/447 345/448 345/449 345/450 345/451 345/452 345/453 345/454 345/455 345/456 345/457 345/458 345/459 345/460 345/461 345/462 345/463 345/464 345/465 345/466 345/467 345/468 345/469 345/470 345/471 345/472 345/473 704/259 704/260 704/270 704/270.1 704/275 704/235
Patent Tags     dynamic destination-determined multimedia avatars interactive on-line communications
   
References

U.S. References

Patent No.   Inventor      Date        US Class
5983003      Lection       Nov. 1999   709/202
5977968      Le Blanc      Nov. 1999   715/706
5963217      Grayson       Oct. 1999   345/473
5956681      Yamakita      Sep. 1999   704/260
5956038      Rekimoto      Sep. 1999   345/419
5950162      Corrigan      Sep. 1999   704/260
5930752      Kawaguchi     Jul. 1999
5894305      Needham       Apr. 1999   715/733
5894307      Ohno          Apr. 1999   715/757
5884029      Brush, II     Mar. 1999   709/202
5880731      Liles         Mar. 1999   715/758
5841966      Irribarren    Nov. 1998   709/206
5812126      Richardson    Sep. 1998   715/741
5802296      Morse         Sep. 1998   709/208
5736982      Suzuki        Apr. 1998   715/706

Foreign References
Other References
Claims
 


What is claimed is:

1. A method for controlling communications, comprising:

receiving communications content and determining a text, audio, or video input mode of the content;

determining a user-specified text, audio, or video output mode for the content for delivering the content to a destination; and

transcoding the content from the text, audio, or video input mode to the user-specified text, audio, or video output mode prior to delivering the content to the destination utilizing a transcoder selected from the group consisting of a text-to-text transcoder, a text-to-audio transcoder, a text-to-video transcoder, an audio-to-text transcoder, an audio-to-audio transcoder, an audio-to-video transcoder, a video-to-text transcoder, a video-to-audio transcoder, and a video-to-video transcoder.

2. The method of claim 1, wherein the step of transcoding the content from the text, audio, or video input mode to the user-specified text, audio, or video output mode prior to delivering the content to the destination further comprises:

transcoding the content at a system at which the content is initially received.

3. The method of claim 1, wherein the step of transcoding the content from the text, audio, or video input mode to the user-specified text, audio, or video output mode prior to delivering the content to the destination further comprises:

transcoding the content at a system intermediate to a system at which the content is initially received and a system to which the content is delivered.

4. The method of claim 1, wherein the step of transcoding the content from the text, audio, or video input mode to the user-specified text, audio, or video output mode prior to delivering the content to the destination further comprises:

transcoding the content at a system to which the content is delivered.

5. The method of claim 1, wherein the step of transcoding the content from the text, audio, or video input mode to the user-specified text, audio, or video output mode prior to delivering the content to the destination further comprises:

creating an avatar for an originator of the content by altering identifying characteristics of the content.

6. The method of claim 5, wherein the step of creating an avatar for an originator of the content by altering identifying characteristics of the content further comprises:

altering speech characteristics of the originator.

7. The method of claim 5, wherein the step of creating an avatar for an originator of the content by altering identifying characteristics of the content further comprises:

altering pitch, tone, bass or mid-range of the content.

8. A system for controlling communications, comprising:

means for receiving communications content and determining a text, audio, or video input mode of the content;

means for determining a user-specified text, audio, or video output mode for the content for delivering the content to a destination; and

means for transcoding the content from the text, audio, or video input mode to the user-specified text, audio, or video output mode prior to delivering the content to the destination utilizing a transcoder selected from the group consisting of a text-to-text transcoder, a text-to-audio transcoder, a text-to-video transcoder, an audio-to-text transcoder, an audio-to-audio transcoder, an audio-to-video transcoder, a video-to-text transcoder, a video-to-audio transcoder, and a video-to-video transcoder.

9. The system of claim 8, wherein the means for transcoding the content from the text, audio, or video input mode to the user-specified text, audio, or video output mode prior to delivering the content to the destination further comprises:

means for transcoding the content at a system at which the content is initially received.

10. The system of claim 8, wherein the means for transcoding the content from the text, audio, or video input mode to the user-specified text, audio, or video output mode prior to delivering the content to the destination further comprises:

means for transcoding the content at a system intermediate to a system at which the content is initially received and a system to which the content is delivered.

11. The system of claim 8, wherein the means for transcoding the content from the text, audio, or video input mode to the user-specified text, audio, or video output mode prior to delivering the content to the destination further comprises:

means for transcoding the content at a system to which the content is delivered.

12. The system of claim 8, wherein the means for transcoding the content from the text, audio, or video input mode to the user-specified text, audio, or video output mode prior to delivering the content to the destination further comprises:

means for creating an avatar for an originator of the content by altering identifying characteristics of the content.

13. The system of claim 12, wherein the means for creating an avatar for an originator of the content by altering identifying characteristics of the content further comprises:

means for altering speech characteristics of the originator.

14. The system of claim 12, wherein the means for creating an avatar for an originator of the content by altering identifying characteristics of the content further comprises:

means for altering pitch, tone, bass or mid-range of the content.

15. A computer program product within a computer usable medium for controlling communications, comprising:

instructions for receiving communications content and determining a text, audio, or video input mode of the content;

instructions for determining a user-specified text, audio, or video output mode for the content for delivering the content to a destination; and

instructions for transcoding the content from the text, audio, or video input mode to the user-specified text, audio, or video output mode prior to delivering the content to the destination utilizing a transcoder selected from the group consisting of a text-to-text transcoder, a text-to-audio transcoder, a text-to-video transcoder, an audio-to-text transcoder, an audio-to-audio transcoder, an audio-to-video transcoder, a video-to-text transcoder, a video-to-audio transcoder, and a video-to-video transcoder.

16. The computer program product of claim 15, wherein the instructions for transcoding the content from the text, audio, or video input mode to the user-specified text, audio, or video output mode prior to delivering the content to the destination further comprises:

instructions for transcoding the content at a system at which the content is initially received.

17. The computer program product of claim 15, wherein the instructions for transcoding the content from the text, audio, or video input mode to the user-specified text, audio, or video output mode prior to delivering the content to the destination further comprises:

instructions for transcoding the content at a system intermediate to a system at which the content is initially received and a system to which the content is delivered.

18. The computer program product of claim 15, wherein the instructions for transcoding the content from the text, audio, or video input mode to the user-specified text, audio, or video output mode prior to delivering the content to the destination further comprises:

instructions for transcoding the content at a system to which the content is delivered.

19. The computer program product of claim 15, wherein the instructions for transcoding the content from the text, audio, or video input mode to the user-specified text, audio, or video output mode prior to delivering the content to the destination further comprises:

instructions for creating an avatar for an originator of the content by altering identifying characteristics of the content.

20. The computer program product of claim 19, wherein the instructions for creating an avatar for an originator of the content by altering identifying characteristics of the content further comprises:

instructions for altering speech characteristics of the originator.

21. The computer program product of claim 19, wherein the instructions for creating an avatar for an originator of the content by altering identifying characteristics of the content further comprises:

instructions for altering pitch, tone, bass or mid-range of the content.
Description
 


BACKGROUND OF THE INVENTION

1. Technical Field

The present invention generally relates to interactive communications between users and in particular to altering identifying attributes of a participant during interactive communications. Still more particularly, the present invention relates to altering identifying audio and/or video attributes of a participant during interactive communications, whether textual, audio or motion video.

2. Description of the Related Art

Individuals use aliases or "screen names" in chat rooms and instant messaging rather than their real name for a variety of reasons, not the least of which is security. An avatar, an identity assumed by a person, may also be used in chat rooms or instant messaging applications. While an alias typically has little depth and is usually limited to a name, an avatar may include many other attributes such as physical description (including gender), interests, hobbies, etc. for which the user provides inaccurate information in order to create an alternate identity.

As available communications bandwidth and processing power increase while compression/transmission techniques simultaneously improve, the text-based communications employed in chat rooms and instant messaging are likely to be enhanced and possibly replaced by voice or auditory communications or by video communications. Audio and video communications over the Internet are already being employed to some extent for chat rooms, particularly those providing adult-oriented content, and for Internet telephony. "Web" motion video cameras and video cards are becoming cheaper, as are audio cards with microphones, so the movement to audio and video communications over the Internet is likely to expand rapidly.

For technical, security, and aesthetic reasons, a need exists to allow users control over the attributes of audio and/or video communications. It would also be desirable to allow user control over identifying attributes of audio and video communications to create avatars substituting for the user.

SUMMARY OF THE INVENTION

It is therefore one object of the present invention to improve interactive communications between users.

It is another object of the present invention to alter identifying attributes of a participant during interactive communications.

It is yet another object of the present invention to alter identifying audio and/or video attributes of a participant during interactive communications, whether textual, audio or motion video.

The foregoing objects are achieved as is now described. Transforms are used for transcoding input text, audio and/or video input to provide a choice of text, audio and/or video output. Transcoding may be performed at a system operated by the communications originator, an intermediate transfer point in the communications path, and/or at one or more system(s) operated by the recipient(s). Transcoding of the communications input, particularly voice and image portions, may be employed to alter identifying characteristics to create an avatar for a user originating the communications input.

The above as well as additional objectives, features, and advantages of the present invention will become apparent in the following detailed written description.

BRIEF DESCRIPTION OF THE DRAWINGS

The novel features believed characteristic of the invention are set forth in the appended claims. The invention itself however, as well as a preferred mode of use, further objects and advantages thereof, will best be understood by reference to the following detailed description of an illustrative embodiment when read in conjunction with the accompanying drawings, wherein:

FIG. 1 depicts a data processing system network in which a preferred embodiment of the present invention may be implemented;

FIGS. 2A-2C are block diagrams of a system for providing communications avatars in accordance with a preferred embodiment of the present invention;

FIG. 3 depicts a block diagram of communications transcoding among multiple clients in accordance with a preferred embodiment of the present invention;

FIG. 4 is a block diagram of serial and parallel communications transcoding in accordance with a preferred embodiment of the present invention; and

FIG. 5 depicts a high level flow chart for a process of transcoding communications content to create avatars in accordance with a preferred embodiment of the present invention.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

With reference now to the figures, and in particular with reference to FIG. 1, a data processing system network in which a preferred embodiment of the present invention may be implemented is depicted. Data processing system network 100 includes at least two client systems 102 and 104 and a communications server 106 communicating via the Internet 108 in accordance with the known art. Accordingly, clients 102 and 104 and server 106 communicate utilizing HyperText Transfer Protocol (HTTP) data transactions and may exchange HyperText Markup Language (HTML) documents, Java applications or applets, and the like.

Communications server 106 provides "direct" communications between clients 102 and 104--that is, the content received from one client is transmitted directly to the other client without "publishing" the content or requiring the receiving client to request the content. Communications server 106 may host a chat facility or an instant messaging facility or may simply be an electronic mail server. Content may be simultaneously multicast to a significant number of clients by communications server 106, as in the case of a chat room. Communications server 106 enables clients 102 and 104 to communicate, either interactively in real time or serially over a period of time, through the medium of text, audio, video or any combination of the three forms.

Referring to FIGS. 2A through 2C, block diagrams of a system for providing communications avatars in accordance with a preferred embodiment of the present invention are illustrated. The exemplary embodiment, which relates to a chat room implementation, is provided for the purposes of explaining the invention and is not intended to imply any limitation. System 200 as illustrated in FIG. 2A includes browsers with chat clients 202 and 204 executing within clients 102 and 104, respectively, and a chat server 206 executing within communications server 106. Communications input received from chat clients 202 and 204 by chat server 206 is multicast by chat server 206 to all participating users, including clients 202 and 204 and other users.

In the present invention, system 200 includes transcoders 208 for converting communications input into a desired communications output format. Transcoders 208 alter properties of the communications input received from one of clients 202 and 204 to match the originator's specifications 210 and also to match the receiver's specifications 212. Because communications capabilities may vary (e.g., communications access bandwidth may effectively preclude receipt of audio or video), transcoders provide a full range of conversions as illustrated in Table I:

TABLE I

               Receives Audio    Receives Text    Receives Video
Origin Audio   Audio-to-Audio    Audio-to-Text    Audio-to-Video
Origin Text    Text-to-Audio     Text-to-Text     Text-to-Video
Origin Video   Video-to-Audio    Video-to-Text    Video-to-Video
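As an illustration only, the nine conversions of Table I can be enumerated as a lookup keyed by origin mode and receiving mode. The names `Mode`, `transcoder_name` and `TABLE_I` are hypothetical and do not appear in the patent; the sketch merely shows that every origin/receiving pair resolves to exactly one transcoder type.

```python
from enum import Enum


class Mode(Enum):
    """The three communication modes named in Table I."""
    TEXT = "text"
    AUDIO = "audio"
    VIDEO = "video"


def transcoder_name(origin: Mode, receives: Mode) -> str:
    """Return the Table I entry for an (origin, receiving) mode pair."""
    return f"{origin.value}-to-{receives.value}"


# All nine combinations from Table I, keyed by (origin mode, receiving mode).
TABLE_I = {(o, r): transcoder_name(o, r) for o in Mode for r in Mode}
```

For example, `TABLE_I[(Mode.AUDIO, Mode.TEXT)]` yields the audio-to-text entry in the first row of the table.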

Through audio-to-audio (speech-to-speech) transcoding, the speech originator is provided with control over the basic presentation of their speech content to a receiver, although the receiver may retain the capability to adjust speed, volume and tonal controls in keeping with basic sound system manipulations (e.g., bass, treble, midrange). Intelligent speech-to-speech transforms alter identifying speech characteristics and patterns to provide an avatar (alternative identity) to the speaker. Natural speech recognition is utilized for input, which is contextually mapped to output. As available processing power increases and natural speech recognition techniques improve, other controls may be provided such as contextual mapping of speech input to different speech characteristics--such as adding, removing or changing an accent (e.g., changing a Southern U.S. accent to a British accent), changing a child's voice to an adult's or vice versa, and changing a male voice to a female voice or vice versa--or to a different speech pattern (e.g., changing a New Yorker's speech pattern to a Londoner's speech pattern).

For audio-to-text transcoding the originator controls the manner in which their speech is interpreted by a dictation program, including, for example, recognition of tonal changes or emphasis on a word or phrase which is then placed in boldface, italics or underlined in the transcribed text, and substantial increases in volume resulting in the text being transcribed in all capital characters. Additionally, intelligent speech-to-text transforms would transcode statements or commands to text shorthand, subtext or "emoticon". Subtext generally involves delimited words conveying an action (e.g., "<grin>") within typed text. Emoticons utilize various combinations of characters to convey emotions or corresponding facial expressions or actions. Examples include: :) or :-) or :-D or d;) for smiles, :( for a frown, ;-) or ;-D for a wink, :-P for a "raspberry" (sticking out tongue), and :-|, :-> or :-x for miscellaneous expressions. With speech-to-text transcoding in the present invention, if the originator desired to present a smile to the receiver, the user might state "big smile", which the transcoder would recognize as an emoticon command and generate the text ":-D". Similarly, a user stating "frown" would result in the text string ":-(" within the transcribed text.
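The command handling described above might be sketched as a post-processing pass over a dictation transcript. This is a minimal sketch under stated assumptions: the function names, the per-word loudness flag, and the idea of running after speech recognition are all hypothetical; only the two command phrases and their emoticons come from the text.

```python
import re

# Command vocabulary: only "big smile" and "frown" are taken from the text.
EMOTICON_COMMANDS = {
    "big smile": ":-D",
    "frown": ":-(",
}


def apply_emoticon_commands(transcript: str) -> str:
    """Replace recognized spoken commands with their emoticon text."""
    # Longest phrases first so multi-word commands win over shorter ones.
    phrases = sorted(EMOTICON_COMMANDS, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(p) for p in phrases) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: EMOTICON_COMMANDS[m.group(1).lower()], transcript)


def apply_volume(words):
    """Render (word, loud) pairs as text; loud words become all capitals.

    The boolean 'loud' flag is an assumed output of the recognizer.
    """
    return " ".join(word.upper() if loud else word for word, loud in words)
```

A real dictation front end would also need some way to tell a command from ordinary speech (a speaker saying "don't frown" is not asking for an emoticon); the sketch ignores that distinction.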

For text-to-audio transcoding, the user is provided with control over the initial presentation of speech to the receiver. Text-to-audio transcoding is essentially the reverse of audio-to-text transcoding in that text entered in all capital letters would be converted to increased volume on the receiving end. Additionally, shorthand chat symbols (emoticons) would convert to appropriate sounds (e.g., ":-P" would convert to a raspberry sound). Some aspects of speech-to-speech transcoding may also be employed to generate a particular accent or age/gender characteristics. The receiver may also retain rights to adjust speed, volume, and tonal controls in keeping with basic sound system manipulations (e.g., bass, treble, midrange).
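One way to picture the reverse mapping is as a tokenizer that turns typed text into a list of playback events for a speech synthesizer. The sketch below is an assumption-laden illustration: `SpeechEvent`, the cue name `"raspberry"`, and the 1.5 loudness factor are hypothetical, and no actual audio is produced.

```python
from dataclasses import dataclass

# Sound-cue table: only the ":-P" -> raspberry pairing comes from the text.
EMOTICON_SOUNDS = {":-P": "raspberry"}


@dataclass
class SpeechEvent:
    kind: str      # "speak" for a word, "sound" for a non-speech cue
    payload: str   # the word to speak, or the cue name
    volume: float  # 1.0 is normal; the louder value is an assumed setting


def text_to_audio_events(text: str, loud_volume: float = 1.5):
    """Convert typed text into speech events, honoring capitals and emoticons."""
    events = []
    for token in text.split():
        if token in EMOTICON_SOUNDS:
            # Emoticons become sound cues instead of spoken words.
            events.append(SpeechEvent("sound", EMOTICON_SOUNDS[token], 1.0))
        elif token.isupper() and len(token) > 1:
            # All-capital words are spoken at increased volume.
            events.append(SpeechEvent("speak", token.lower(), loud_volume))
        else:
            events.append(SpeechEvent("speak", token, 1.0))
    return events
```

Single capital letters such as "I" are deliberately treated as normal volume, which is a design choice of this sketch rather than something the text specifies.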

Text-to-text transcoding may involve translation from one language to another. Translation of text between languages is currently possible, and may be applied to input text converted on the fly during transmission. Additionally, text-to-text conversion may be required as an intermediate step in audio-to-audio transcoding between languages, as described in further detail below.

Audio-to-video and text-to-video transcoding may involve computer generated and controlled video images, such as anime (animated cartoon or caricature images) or even realistic depictions. Text or spoken commands (e.g., "<grin>" or "<wink>") would cause generated images to perform the corresponding action.

For video-to-audio and video-to-text transcoding, origin video typically includes audio (for example, within the well-known layer 3 of the Motion Pictures Expert Group specification, more commonly referred to as "MP3"). For video-to-audio transcoding, simple extraction of the audio portion may be performed, or the audio track may also be transcoded utilizing the audio-to-audio transcoding techniques described above. For video-to-text transcoding, the audio track may be extracted and transcribed utilizing the audio-to-text transcoding techniques described above.

Video-to-video transcoding may involve simple digital filtering (e.g., to change hair color) or more complicated conversions of video input to corresponding computer generated and controlled video images described above in connection with audio-to-video and text-to-video transcoding.

In the present invention, communication input and reception modes are viewed as independent. While the originator may transmit video (and embedded audio) communications input, the receiver may lack the ability to effectively receive either video or audio. Chat server 206 thus identifies the input and reception modes, and employs transcoders 208 as appropriate. Upon "entry" (logon) to a chat room, participants such as clients 202 and 204 designate both the input and reception modes for their participation, which may be identical or different (e.g., both send and receive video, or send text and receive video). Server 206 determines which transcoding techniques described above are required for all input modes and all reception modes. When input is received, server 206 invokes the appropriate transcoders 208 and multicasts the transcoded content to the appropriate receivers.

With reference now to FIG. 3, a block diagram of communications transcoding among multiple clients in accordance with a preferred embodiment of the present invention is depicted. Chat server 206 utilizes transcoders 208 to transform communications input as necessary for multicasting to all participants. In the example depicted, four clients 302, 304, 306 and 308 are currently participating in the active chat session. Client A 302 specifies text-based input to chat server 206, and desires to receive content in text form. Client B 304 specifies audio input to chat server 206, and desires to receive content in both text and audio forms. Client C 306 specifies text-based input to chat server 206, and desires to receive content in video mode. Client D 308 specifies video input to chat server 206, and desires to receive content in both text and video modes.

Under the circumstances described, chat server 206, upon receiving text input from client A 302, must perform text-to-audio and text-to-video transcoding on the received input, then multicast the transcoded text form of the input content to client A 302, client B 304, and client D 308, transmit the transcoded audio mode content to client B 304, and multicast the transcoded video mode content to client C 306 and client D 308. Similarly, upon receiving video mode input from client D 308, server 206 must initiate at least video-to-text and video-to-audio transcoding, and perhaps video-to-video transcoding, then multicast the transcoded text mode content to client A 302, client B 304, and client D 308, transmit the transcoded audio mode content to client B 304, and multicast the (transcoded) video mode content to client C 306 and client D 308.
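The delivery plan for the four-client example can be worked out mechanically from each client's declared modes. The following is a sketch, not the server's actual logic: `CLIENTS` and `plan_delivery` are hypothetical names, and same-mode delivery is labeled with the Table I entry (for example "text-to-text") even where the server might simply pass content through.

```python
# Client preferences as described for FIG. 3: (input mode, reception modes).
CLIENTS = {
    "A": ("text", {"text"}),
    "B": ("audio", {"text", "audio"}),
    "C": ("text", {"video"}),
    "D": ("video", {"text", "video"}),
}


def plan_delivery(sender: str, clients=CLIENTS):
    """Map each required transcoder to the sorted clients receiving its output."""
    input_mode = clients[sender][0]
    plan = {}
    # Every participant, including the sender, receives the multicast content.
    for name, (_, receives) in clients.items():
        for out_mode in receives:
            plan.setdefault(f"{input_mode}-to-{out_mode}", []).append(name)
    return {transcoder: sorted(names) for transcoder, names in plan.items()}
```

Calling `plan_delivery("A")` reproduces the distribution in the passage: text mode content to clients A, B and D, audio mode content to client B, and video mode content to clients C and D.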

Referring back to FIG. 2A, transcoders 208 may be employed serially or in parallel on input content. FIG. 4 depicts serial transcoding of audio mode input to obtain video mode content, using audio-to-text transcoder 208a to obtain intermediate text mode content and text-to-video transcoder 208b to obtain video mode content. FIG. 4 also depicts parallel transcoding of the audio input utilizing audio-to-audio transcoder 208c to alter identifying characteristics of the audio content. The transcoded audio is recombined with the computer-generated video to achieve the desired output.
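The serial and parallel arrangement of FIG. 4 can be pictured as function composition. In this sketch the helpers `serial` and `parallel` are hypothetical, and the three transcoders are placeholders that only tag strings so the data flow is visible; they stand in for transcoders 208a, 208b and 208c and perform no real conversion.

```python
def serial(*stages):
    """Compose transcoders so each stage consumes the previous stage's output."""
    def run(content):
        for stage in stages:
            content = stage(content)
        return content
    return run


def parallel(**branches):
    """Apply each transcoder to the same input and collect outputs by name."""
    def run(content):
        return {name: branch(content) for name, branch in branches.items()}
    return run


# Placeholder transcoders: they wrap the input in a tag instead of converting it.
def audio_to_text(audio):
    return f"text({audio})"


def text_to_video(text):
    return f"video({text})"


def audio_to_audio(audio):
    return f"altered({audio})"


# FIG. 4: audio -> text -> video in series, alongside audio -> altered audio.
fig4 = parallel(
    video=serial(audio_to_text, text_to_video),
    audio=audio_to_audio,
)
```

Here "parallel" means only that both branches read the same input; the branches run one after another in this sketch, and the returned dictionary plays the role of the recombined audio and video output.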

By specifying the manner in which input is to be transcoded for all three output forms (text, audio and video), a user participating in a chat session on chat server 206 may create avatars for their audio and video representations. It should be noted, however, that the processing requirements for generating these avatars through transcoding as described above could overload a server. Accordingly, as shown in FIGS. 2B and 2C, some or all of the transcoding required to maintain an avatar for the user may be transferred to the client systems 102 and 104 through the use of client-based transcoders 214. Transcoders 214 may be capable of performing all of the different types of transcoding described above prior to transmitting content to chat server 206 for multicasting as appropriate. The elimination of transcoders 208 at the server 106 may be appropriate where, for example, content is received and transmitted in all three modes (text, audio and video) to all participants, which selectively utilize one or more modes of the content. Retention of server transcoders 208 may be appropriate, however, where different participants have different capabilities (e.g., one or more participants cannot receive video transmitted without corresponding transcoded text by another participant).

With reference now to FIG. 5, a high level flow chart for a process of transcoding communications content to create avatars in accordance with a preferred embodiment of the present invention is depicted. The process begins at step 502, which depicts content being received for transmission to one or more intended recipients. The process passes first to step 504, which illustrates determining the input mode(s) (text, speech or video) of the received content.

If the content was received in at least text-based form, the process proceeds to step 506, which depicts a determination of the desired output mode(s) in which the content is to be transmitted to the recipient. If the content is to be transmitted in at least text form, the process then proceeds to step 508, which illustrates text-to-text transcoding of the received content. If the content is to be transmitted in at least audio form, the process then proceeds to step 510, which depicts text-to-audio transcoding of the received content. If the content is to be transmitted in at least video form, the process then proceeds to step 512, which illustrates text-to-video transcoding of the received content.

Referring back to step 504, if the received content is received in at least audio mode, the process proce