Description
BACKGROUND OF THE INVENTION
1. Field of the Invention
The present invention relates to the transmission of digital audio signals,
particularly full frequency digital audio signals.
2. Background Art
The telephone provides convenient audio communications worldwide. However,
the frequency response of a telephone line is limited. Thus, a telephone
is incapable of transmitting high fidelity or full frequency sound. While
the telephone provides satisfactory communications of human speech when
only the content of the speech, not the characteristics of the speech, is
of interest, the telephone is unsuitable for transmission of music or
professional quality audio. Additionally, the telephone requires two or
more parties to be connected simultaneously and in real time. If one party
is unavailable at a particular time, the other party must call again at a
later time to attempt to communicate with the called party. Real time
refers to the actual time during which something takes place. In this
context, real time refers to conditions where the amount of time required
to transmit a sound is substantially equal to the duration of the original
sound, such that the transmitted sound is not of a substantially longer
duration than the original sound and the transmitted sound may be
reproduced with substantially no delay relative to the original sound, if
simultaneous reproduction is desired.
The telephone has been inadequate for certain applications within the
entertainment industry, particularly in the production of international
versions of domestic entertainment materials, such as movies. For
international versions to be produced, the script must be translated into
other languages and performers must be located to provide new soundtrack
speech in the other languages. In casting these performers, it is
desirable to allow domestic casting personnel to listen to auditions in
foreign countries. An "audition" or "audition performance" is a trial
performance by an actor to demonstrate his or her suitability or skill,
and in particular his or her vocal and verbal suitability or skill. To
adequately evaluate the performers, the casting personnel need to hear
high-fidelity full frequency sound. At times, casting personnel may wish
to listen to and direct the auditions in real time from various distant
locations. At other times, however, casting personnel may wish to have the
audition recorded for later listening or review. Thus, a system that
allows monitoring of recorded and real time sound samples and remote
casting is desirable.
Since performers are located in countries geographically remote from the
casting personnel, it has been difficult for casting personnel to attend
performances or participate in recording studio sessions. Listeners have
had to travel to distant lands to direct and evaluate a performance, then
travel back to their homeland. This involves much time and expense. To
avoid such extensive travel, audio recordings of the performances have
been mailed or otherwise transported as an alternative. However,
the shipment of audio recordings is slow and their use often inconclusive,
requiring additional recordings to be made and shipped. Thus, a system
that allows listening to a performance from a remote location at a base
location or some other remote location is desirable.
It is also useful for the remote auditioning locations to be able to hear
stock sound samples for comparison with the auditioning performers. These
sound samples may be compared with audition materials to help casting
personnel make the best actor selection. Comparison may be accomplished
both audibly and visually using integrated software of a type well known
in the art, for example, SoundEdit Pro, manufactured by Macromedia of San
Francisco, Calif., that converts audio signals into printable or
displayable graphic voice patterns. Since stock sound samples may
constitute large amounts of information, it may be difficult to transport
them in a portable form. Thus, it is desirable to provide a method of
communicating with a fixed data base of stock sound samples.
In the past, it was necessary to make tape recordings of performances and
ship the tapes to the desired listener. Tapes provide high fidelity
recording and do not require the listener to be present in real time,
i.e., at the time the performance is given. However, tapes also do not
allow the listener to listen in real time if desired. There is a
substantial delay between performance and listening while the tapes are
being transported. No realtime dialog between the performer and the
listener is possible with tape recordings. The listener cannot make
comments to the performer during the performance or between performances.
Moreover, this method also results in substantial shipping costs.
Alternatively, if many performances are planned or if a lengthy performance
is expected, the listener would often travel to the remote site and attend
the performance in person. This approach avoids any impairment of sound
quality by transmission media and allows realtime listening, but requires
costly and time-consuming travel by important personnel.
Production of entertainment materials that include soundtrack materials
from remote locations has also been difficult. One example of such a
situation might involve the inclusion of the voice of an actor in Germany
in an entertainment program or promotional message produced in France. In
such a case, the actor's voice would have to be recorded on high fidelity
media, such as magnetic tape, which would have to be transported from
Germany to France. As in the case of casting, a substantial delay occurs
and shipping costs are incurred during transportation of the media, and no
realtime interaction is possible between the actor in Germany and the
production personnel in France.
SUMMARY OF THE INVENTION
The present invention provides a method and apparatus for interactive
transmission of full frequency broadcast quality audio signals over a
network. The invention allows transmission of audio frequencies covering
substantially the entire audible spectrum. The present invention also
allows retrieval of sound samples from a remote location and comparison of
a signal from an audio source to the sound samples. The comparison may be
presented audibly or visually.
The present invention allows realtime, interactive full frequency audio
communication plus data communication over the public switched telephone
network (PSTN). The present invention provides portability and may be
easily and quickly installed at any location with suitable access to the
PSTN. The present invention avoids the need for traveling and shipping of
tapes and the delays and expense associated therewith. The invention
allows non-realtime and/or interactive realtime review of previous
performances, allowing a listener to listen to a performance at a time
convenient for the listener.
The invention is significant to the entertainment (e.g. motion picture)
industry since it may be used for remote casting and production. For
example, the present invention may be used to cast character voices.
Production of international theatrical versions may be expedited by
eliminating the need for extensive travel by creative executives and
replacing the current audition process involving multiple international
audio cassette shipments. Thus, revenues from international exhibition may
be realized sooner while simultaneously reducing production costs.
In the present invention, microphones, recording equipment, speakers, and a
computer are provided at a remote site. A medium allowing the transmission
of full frequency audio plus data is provided between the remote site and
a base site. A sound sample server may be located at the base site.
Optionally, microphones, recording equipment and speakers may be provided
at the base site. Audio at the remote site is digitally recorded and may
optionally be stored in the computer at the remote site. The digital
recording is compressed and transmitted to the base site. At the base
site, the transmission is received and decompressed. The reconstructed
full frequency audio may be reproduced for the listener and/or recorded on
the sound sample server or on recording equipment (e.g. a digital audio
tape (DAT) recorder) at the base site.
The present invention may also be used from the remote site to access a
sound sample from a sound sample server at the base site. To access a
sound sample, the computer at the remote site accesses the sound sample
server at the base site, preferably through the public switched telephone
network (PSTN) using an ISDN BRI connection or through a high-speed
digital network. The sound sample server locates the requested sound
sample stored digitally in a data file and transmits the sound sample to
the remote site as a realtime digital audio signal. The computer at the
remote site receives the realtime digital audio signal and reproduces the
audio signal and/or stores it at the remote site for later reproduction.
Software of a type well known in the art for comparing signals, such as
speech signals, may be used to compare the sound sample to an audio
signal, such as an actor's performance from an audition. Comparing the
sound sample to the audition performance helps casting personnel select
the actor best matched to the part being cast.
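As a rough illustration of the kind of signal comparison described above,
the following Python sketch computes a magnitude spectrum for two short
signals with a naive discrete Fourier transform and reports a distance
between the spectra. It is not the comparison software referred to in this
description. The function names, the synthetic test tones, and the choice
of a Euclidean distance are assumptions made only for illustration.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Return DFT magnitudes for bins 0..N/2 (naive O(N^2) transform)."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        spectrum.append(abs(acc) / n)
    return spectrum


def spectral_distance(a, b):
    """Euclidean distance between the magnitude spectra of two equal-length signals."""
    sa, sb = magnitude_spectrum(a), magnitude_spectrum(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(sa, sb)))


def tone(cycles, n=64):
    """Hypothetical test signal: a sine wave with `cycles` periods in n samples."""
    return [math.sin(2 * math.pi * cycles * t / n) for t in range(n)]


reference = tone(5)    # stands in for a stored sound sample
audition_a = tone(5)   # same pitch as the reference
audition_b = tone(9)   # different pitch from the reference

reference_spectrum = magnitude_spectrum(reference)
peak_bin = reference_spectrum.index(max(reference_spectrum))

print("peak bin of reference:", peak_bin)
print("distance to audition A:", spectral_distance(reference, audition_a))
print("distance to audition B:", spectral_distance(reference, audition_b))
```

A smaller distance indicates a closer spectral match under this simple
measure. A practical comparison of voices would likely use short-time
spectra over many frames, as in the spectrograms of FIGS. 3 through 6.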
The preferred embodiment of the present invention uses a fully duplexed
integrated services digital network (ISDN) basic rate interface (BRI) line
to allow bidirectional transmission of full frequency audio and data. An
ISDN BRI line comprises two 64 kbps B channels plus one 16 kbps D channel.
Digitized audio and data are combined and transmitted over the two B
channels of the BRI line. Optionally, data may be transmitted over the D
channel. At the receiver, the digitized audio and data are received from
the BRI line, separated and passed on to their respective destinations.
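The channel rates stated above imply the following capacity arithmetic,
shown here as a short Python sketch. The constant and function names are
illustrative only and are not part of the described apparatus.

```python
B_CHANNEL_KBPS = 64   # rate of each B channel, as stated above
D_CHANNEL_KBPS = 16   # rate of the D channel, as stated above
B_CHANNELS = 2        # a BRI line carries two B channels


def bri_capacity_kbps(use_d_channel=False):
    """Capacity of one BRI line in kbps, optionally counting the D channel."""
    capacity = B_CHANNELS * B_CHANNEL_KBPS
    if use_d_channel:
        capacity += D_CHANNEL_KBPS
    return capacity


print(bri_capacity_kbps())                    # two B channels only
print(bri_capacity_kbps(use_d_channel=True))  # two B channels plus the D channel
```

The two B channels together provide 128 kbps for the combined audio and
data, and the line carries 144 kbps in total when the D channel is counted.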
To practice the present invention with ISDN, analog audio signals at the
remote site may be applied to a high fidelity coder/decoder (codec) at the
remote site. The codec converts the analog audio signals to digital audio
signals, which may be applied to a terminal adapter. A terminal adapter
transmits the digital audio signals over the ISDN. At the base site, a
second terminal adapter supplies the digital audio signals to a second
codec. The second codec receives the signal from the terminal adapter and
converts it back to a high fidelity analog audio signal. The invention may
also be practiced to transmit audio signals from the base site to the
remote site by applying the above procedure in the opposite direction. By
using the present invention for transmission in both directions,
simultaneous bidirectional communication is provided in real time. A sound
server at the base site may also be accessed from the remote site over the
ISDN.
As described above, the present invention allows realtime interactive full
frequency audio and data communications between one or more remote sites
and a base site with the option of realtime interactive
listening/directing or delayed review of transmitted audio. Thus, the
disadvantages of the prior art have been overcome.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram illustrating the present invention.
FIG. 2 is a block diagram illustrating the preferred embodiment of the
present invention.
FIG. 3 is a sound spectrogram of the audition performance of a first actor.
FIG. 4 is a sound spectrogram of the audition performance of a second
actor.
FIG. 5 is a sound spectrogram of the audition performance of a third actor.
FIG. 6 is a sound spectrogram of a sound sample.
DETAILED DESCRIPTION OF THE INVENTION
A method and apparatus for transmission of full frequency digital audio
signals over a telecommunications network is described. In the following
description, numerous specific details are set forth in order to provide a
more thorough understanding of the present invention. It will be apparent,
however, to one skilled in the art, that the present invention may be
practiced without these specific details. In other instances, well-known
features have not been described in detail in order not to unnecessarily
obscure the present invention.
In the entertainment industry it is often desirable to release movies and
other works in foreign countries as well as domestically. To do so,
however, requires translation of the audio portion of the work into native
languages of the foreign countries. It is desirable to maintain similar
vocal qualities even in foreign language translations. Thus, domestic
casting personnel often assist in the selection of performers in foreign
countries. Casting personnel often have to travel to geographically remote
locations to hear auditions by performers. To avoid having important
personnel travel to remote locations, tape recordings are sometimes made
of the performance and shipped to the listener. However, shipping tapes
results in delays before the recordings reach the listener. Regular
telephone communications are not a suitable medium because the limited
range of frequencies that may be communicated does not provide broadcast
quality in real time and because they do not allow review of previous
performances. Thus, there is a need for a method of listening to remote
performances in high fidelity audio with the capability of reviewing
previous performances.
The present invention allows bidirectional, interactive, realtime high
fidelity broadcast quality audio and data communications over a
telecommunications network, such as the public switched telephone network
(PSTN). The present invention also allows a sound sample server to be
accessed from a geographically remote site and for high fidelity sound
samples to be digitally retrieved in real time from the remote site. The
present invention allows a listener to listen or interactively direct and
record for broadcast a geographically remote performance in high fidelity
and in real time and also allows review of previous performances in high
fidelity. The present invention may be easily transported and installed
quickly anywhere suitable communications facilities are present. The
present invention may be practiced with a plurality of sites coupled
together to form a network, allowing a listener to hear performances from
several different sites or many listeners at different sites to hear
performances at one or more sites. Thus listeners and performers are made
much more accessible to one another.
A block diagram of the present invention is illustrated in FIG. 1.
Microphone 108 is coupled to network interface 102 via coupling 118.
Alternatively, microphone 108 is coupled to computer 104 via coupling 122.
Speaker 110 is coupled to network interface 102 via coupling 120.
Alternatively, speaker 110 is coupled to computer 104 via coupling 124.
Computer 104 is coupled to storage device 106 via coupling 116. Computer
104 is also coupled to network interface 102 via coupling 114. Network
interface 102 is coupled to network 101 via coupling 112.
Microphone 109 is coupled to network interface 103 via coupling 119.
Alternatively, microphone 109 is coupled to computer 105 via coupling 123.
Speaker 111 is coupled to network interface 103 via coupling 121.
Alternatively, speaker 111 is coupled to computer 105 via coupling 125.
Computer 105 is coupled to storage device 107 via coupling 117. Computer
105 is also coupled to network interface 103 via coupling 115. Network
interface 103 is coupled to network 101 via coupling 113.
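The couplings of FIG. 1 listed above may be modeled as a small graph, which
makes it easy to check which couplings a signal traverses between two
elements. The following Python sketch is one such model. The dictionary
layout and the function name are assumptions made for illustration, and the
graph is treated as undirected for simplicity even though each signal
travels in one direction.

```python
from collections import deque

# Couplings of FIG. 1 as described in the text: number -> (endpoint, endpoint).
COUPLINGS = {
    112: ("network interface 102", "network 101"),
    113: ("network interface 103", "network 101"),
    114: ("computer 104", "network interface 102"),
    115: ("computer 105", "network interface 103"),
    116: ("computer 104", "storage device 106"),
    117: ("computer 105", "storage device 107"),
    118: ("microphone 108", "network interface 102"),
    119: ("microphone 109", "network interface 103"),
    120: ("speaker 110", "network interface 102"),
    121: ("speaker 111", "network interface 103"),
    122: ("microphone 108", "computer 104"),
    123: ("microphone 109", "computer 105"),
    124: ("speaker 110", "computer 104"),
    125: ("speaker 111", "computer 105"),
}


def shortest_route(start, goal):
    """Breadth-first search; returns the coupling numbers traversed, or None."""
    adjacency = {}
    for number, (a, b) in COUPLINGS.items():
        adjacency.setdefault(a, []).append((b, number))
        adjacency.setdefault(b, []).append((a, number))
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, route = queue.popleft()
        if node == goal:
            return route
        for neighbour, number in adjacency.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, route + [number]))
    return None


print(shortest_route("microphone 108", "speaker 111"))
print(shortest_route("microphone 109", "speaker 110"))
```

The shortest route in each direction passes through the two network
interfaces and network 101. The alternative routes through computers 104
and 105 are longer and correspond to the "alternatively" couplings above.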
Signals picked up by microphone 108 are provided to either network
interface 102 through coupling 118 or to computer 104 through coupling
122. If the microphone signals are provided directly to network interface
102, network interface 102 digitizes and compresses the signals and
formats them for transmission through coupling 112 and over network 101.
If the microphone signals are provided to computer 104, computer 104
digitizes the signals and provides them through coupling 114 to network
interface 102. Although the signals passing through coupling 114 are
preferably digital signals, they may alternatively be analog signals.
Optionally, computer 104 sends digital audio signals through coupling 116
and stores them on storage device 106. Network interface 102 formats the
signals for transmission through coupling 112 and over network 101.
The signals originally picked up by microphone 108 pass through network 101
and through coupling 113 to network interface 103. Network interface 103
performs digital-to-analog conversion of the signals and provides an
analog output signal to speaker 111. Alternatively, network interface 103
provides signals through coupling 115 to computer 105. The signals passing
through coupling 115 are preferably digital signals, but may alternatively
be analog signals. If the signals are provided to computer 105, computer
105 performs digital-to-analog conversion on the signals and provides an
analog output signal through coupling 125 to speaker 111. Optionally,
computer 105 may send digital audio signals through coupling 117 and store
them on storage device 107.
The present invention also allows for signals to pass through network 101
in the opposite direction simultaneously (fully duplexed). Signals picked
up by microphone 109 are provided to either network interface 103 through
coupling 119 or to computer 105 through coupling 123. If the microphone
signals are provided directly to network interface 103, network interface
103 digitizes and compresses the signals and formats them for transmission
through coupling 113 and over network 101. If the microphone signals are
provided to computer 105, computer 105 digitizes the signals and provides
them through coupling 115 to network interface 103. Although the signals
passing through coupling 115 are preferably digital, they may
alternatively be of an analog nature. Optionally, computer 105 sends
digital audio signals through coupling 117 and stores them on storage
device 107. Network interface 103 formats the signals for transmission
through coupling 113 and over network 101.
The signals originally picked up by microphone 109 pass through network 101
and through coupling 112 to network interface 102. Network interface 102
performs digital-to-analog conversion of the signals and provides an
analog output signal to speaker 110. Alternatively, network interface 102
provides signals through coupling 114 to computer 104. The signals passing
through coupling 114 are preferably digital signals, but may alternatively
be analog signals. If the signals are provided to computer 104, computer
104 performs digital-to-analog conversion on the signals and provides an
analog output signal through coupling 124 to speaker 110. Optionally,
computer 104 sends digital audio signals through coupling 116 and stores
them on storage device 106.
A block diagram of the preferred embodiment of the present invention is
illustrated in FIG. 2. Audio gear 206 is coupled through couplings 212 and
214 to APT DSM100 codec 204. Coupling 212 couples signals from audio gear
206 to codec 204. Coupling 214 couples signals from codec 204 to audio gear
206. Audio gear 206 is coupled through coupling 220 to computer 104. Codec
204 is coupled to computer 104 through coupling 216. Computer 104 is
coupled to storage device 106 through coupling 116. Codec 204 may be
coupled to terminal adapter 202 through coupling 210. Alternatively or
supplementally, computer 104 may be coupled to terminal adapter 202
through coupling 218. Terminal adapter 202 is coupled to ISDN 201 through
coupling 208.
Audio gear 207 is coupled through couplings 213 and 215 to APT DSM100 codec
205. Coupling 213 couples signals from audio gear 207 to codec 205.
Coupling 215 couples signals from codec 205 to audio gear 207. Audio gear
207 is coupled through coupling 221 to computer 105. Codec 205 is coupled
to computer 105 through coupling 217. Computer 105 is coupled to storage
device 107 through coupling 117. Codec 205 is coupled to terminal adapter
203 through coupling 211. Alternatively or supplementally, computer 105 is
coupled to terminal adapter 203 through coupling 219. Terminal adapter 203
is coupled to ISDN 201 through coupling 209.
A coder/decoder (codec), that is, an analog-to-digital (A/D) and
digital-to-analog (D/A) conversion device that uses a data compression
algorithm to allow bidirectional full frequency audio to be transmitted
over digital channels, may be employed. The codec is preferably the DSM100
made by Audio Processing Technology. The DSM100 is coupled to the ISDN BRI
line through an ISDN BRI
terminal adapter (TA). One DSM100 is used at each site connected to the
ISDN.
The APT DSM100 digital audio transceiver is produced by Audio Processing
Technology Limited (HQ), 21 Stranmillis Road, Northern Ireland, BT9 5AF.
The DSM100 uses an apt-X100 digital audio data compression system to
provide a 4:1 compression ratio. The DSM100 may be configured to transmit
one audio signal (mono mode) or two simultaneous audio signals (stereo
mode) with selectable bandwidths between 6.2 kHz and 22.5 kHz. The DSM100
provides balanced XLR analog audio inputs and outputs for coupling the
DSM100 to analog audio apparatus. Alternatively, Sony®/Philips®
Digital Interface Format (SPDIF) and Audio Engineering Society/European
Broadcasting Union (AES/EBU) digital audio inputs and outputs are provided
for coupling to digital audio apparatus. The DSM100 also provides an
RS449/RS422/X.21 compatible interface that allows connection to an ISDN
BRI TA.
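The effect of the 4:1 compression ratio on the required channel rate may be
estimated with the following Python sketch. The 16-bit sample size and the
sampling rates used are assumptions chosen for illustration and are not
taken from the DSM100 specification. Only the 4:1 ratio and the 128 kbps
capacity of the two B channels come from this description.

```python
BRI_B_CHANNELS_KBPS = 2 * 64  # two 64 kbps B channels


def compressed_rate_kbps(sample_rate_hz, channels=1, bits_per_sample=16, ratio=4):
    """Bit rate in kbps after compressing linear PCM by the given ratio."""
    pcm_bps = sample_rate_hz * bits_per_sample * channels
    return pcm_bps / ratio / 1000


def fits_bri(sample_rate_hz, channels=1):
    """True if the compressed stream fits within the two B channels."""
    return compressed_rate_kbps(sample_rate_hz, channels) <= BRI_B_CHANNELS_KBPS


# Assumed example configurations (hypothetical sampling rates).
print(compressed_rate_kbps(32000), fits_bri(32000))        # mono at 32 kHz
print(compressed_rate_kbps(16000, 2), fits_bri(16000, 2))  # stereo at 16 kHz
print(compressed_rate_kbps(32000, 2), fits_bri(32000, 2))  # stereo at 32 kHz
```

Under these assumptions, one 32 kHz mono signal or one 16 kHz stereo pair
occupies exactly the 128 kbps of the two B channels, while a 32 kHz stereo
pair would require 256 kbps and would not fit in a single BRI line.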
Full frequency audio from audio gear 206 is provided to an audio input of
DSM100 codec 204 through coupling 212. DSM100 codec 204 digitizes (if the
audio input is analog) and compresses the audio signal and provides a
digital output through coupling 210 to TA 202. TA 202 transmits the
digital signal through coupling 208 and ISDN 201. The signal remains in
digital form while transmitted via ISDN 201. At the opposite end of ISDN
201, a second TA 203 receives the signal through coupling 209 and provides
it through coupling 211 to a second DSM100 codec 205. DSM100 codec 205
decompresses the signal, converts it to analog form and provides a full
frequency analog output signal through coupling 215 that may be monitored,
amplified or otherwise processed by audio gear 207. Alternatively, the
signal may be left in digital form and provided as an uncompressed digital
audio output signal through coupling 215. DSM100 codec 205 can also
transmit full frequency audio in the opposite direction through ISDN 201
for reception by DSM100 codec 204. Couplings 213 and 214 are used to
couple signals during transmission in the opposite direction. Depending on
the situation, bidirectional or unidirectional communication may occur
in either a full or half duplex mode. Full duplex communication allows
signals to be sent in both directions simultaneously, while half duplex
communication allows signals to be sent in only one direction at a time.
Although a half duplex mode may be used for unidirectional or alternating
bidirectional communication, it is preferable to use a full duplex mode
for both unidirectional and bidirectional communication. Thus,
bidirectional full frequency audio transmission is provided.
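The practical difference between the two modes may be seen in a simplified
timing model, sketched below in Python. The function name and the example
durations are hypothetical, and the model ignores line turnaround time and
transmission latency.

```python
def exchange_time(seconds_a_to_b, seconds_b_to_a, full_duplex):
    """Time needed for both sites to finish sending their audio.

    Full duplex: both directions proceed at once, so the longer one dominates.
    Half duplex: only one direction is active at a time, so the times add.
    """
    if full_duplex:
        return max(seconds_a_to_b, seconds_b_to_a)
    return seconds_a_to_b + seconds_b_to_a


# Assumed example: a 30 second performance and 20 seconds of direction.
print(exchange_time(30, 20, full_duplex=True))
print(exchange_time(30, 20, full_duplex=False))
```

In this model the full duplex exchange completes in 30 seconds, while the
half duplex exchange requires 50 seconds because the two parties must take
turns.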
Digital data may also be transmitted along with full frequency audio.
Digital data at one end of the ISDN line may be provided from computer
104 through coupling 216 to DSM100 codec 204, which combines the data with
the digital audio signal and compresses the combined signal. The combined audio and data
signal in compressed form is provided through coupling 210 to TA 202,
which transmits it over ISDN 201