Description
TECHNICAL FIELD
This invention relates to optimizing techniques for use with executable computer code and more particularly, relates to computerized systems and methods for dynamically linking code segments.
BACKGROUND ART
In a digital signal processor implementation of some complex functions, it is common to utilize basic building blocks or code segments which may be interconnected in multiple ways. These code segments are analogous to separate hardware modules
which might previously have been found in various systems employing patch cords or switchable hardwired interconnections to permit selection and configuration of the modules.
As but one example, although the invention is not intended to be so limited and admits to numerous embodiments and applications, in the implementation of a music synthesizer with a digital signal processor, such modules might include analogous
software implementations of hardware modules such as oscillators, filters, or voltage controlled amplifiers. These modules were found in music synthesizers of the 1970s wherein they were interconnected as desired and as previously described with patch
cords or other means.
In a DSP implementation of such complex functions, in many applications it may be desirable to reconfigure the system in real time. Again, using the musical application as but one example, it may be desirable to provide a DSP implementation of a
music synthesizer wherein the synthesizer may be capable of being reconfigured in real time to permit generation of several different sounds at the same time, such capability being referred to in the art as "polytimbral".
In such an implementation of function wherein real time reconfiguration is required (typically in DSP systems and code, although the invention is not so limited), one apparent solution to the problem of real time configuration is to group
together all the possible code segments needed to implement a function such as the particular sound of one instrument, a filter, or the like. Each code sequence necessary to implement that function is made a "callable" routine. In this manner, a given
configuration then need only be a list of such subroutines and the order in which they appear. It would appear by such an implementation, that this approach would allow for great flexibility in the configuration of the modules. However, it was found in
the art that in fact such implementations suffered significant performance penalties resulting from the time involved to effect the necessary calls and returns made to each such code segment.
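By way of illustration only, the callable-routine approach described above may be sketched as follows. The three routines and their arithmetic are hypothetical placeholders, not actual DSP code; the sketch merely shows that a configuration reduces to an ordered list of subroutines and that every module executed costs one call and one return.

```python
# Hypothetical stand-ins for DSP code segments; each is a "callable" routine.
def oscillator(x):
    return x * 2.0

def lowpass_filter(x):
    return x * 0.5

def amplifier(x):
    return x * 0.8

# A configuration is only the list of subroutines and the order in which they appear.
CONFIGURATION = [oscillator, lowpass_filter, amplifier]

def run_configuration(configuration, x):
    """Call each routine in order; return the result and the number of calls made."""
    calls = 0
    for routine in configuration:
        x = routine(x)   # one call and one return per module
        calls += 1
    return x, calls

result, calls = run_configuration(CONFIGURATION, 1.0)
```

Reordering or replacing entries of CONFIGURATION reconfigures the chain without touching the routines themselves, but the number of calls grows with the number of modules and is incurred on every pass through the chain.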
In the development of the art, a solution eventually appeared wherein the required code modules could be linked together as needed. This linking was a common programming procedure employed by program compilers, for example, which typically would
collect multiple code segments together into a single code sequence.
One problem with this approach, however, relates to the particular application being discussed wherein various code modules in need of execution are changing over time, as is the case, for example, when a DSP, in order to implement a polytimbral
combination of brass and woodwind sounds, for example, would be required to first implement the brass code module followed very quickly by that of the woodwind in order for the sound to be perceived as a simultaneous voicing of both instrument types.
Moreover, the problem is compounded in certain applications due to the nature of shared memory DSP co-processor systems which may have a relatively small amount of shared memory on the order of 8K, for example, which must execute code resident therein at
a very rapid rate. Due to the small memory size, in order to achieve the desired multi-timbral and other effects, it is necessary to periodically load other code from the host into this shared memory for execution by the DSP. However, due to the
additional factor of this code having to execute very rapidly (for example to generate satisfying synthesized acoustic sound), the problem arose of how to interject such additional code modules into this limited memory in such a way so as to avoid the
host attempting to update the buffer of such a shared memory while code in the buffer was being executed by the DSP.
Usually, the DSP is disabled during these periods when the host is writing additional DSP program modules to the shared memory so as to prevent the DSP from attempting to execute such instructions which have only partially been written out by the
host processor to the shared memory. A "ping-pong" form of double buffering has long been known in the data processing arts wherein one buffer contains code being executed while a second buffer is being filled. The situation is then reversed wherein
the code in the just-filled buffer executes while additional code is being loaded into the first buffer. However, this conventional practice, while useful in preventing the host from updating the buffer being executed by the DSP in some instances, is
not entirely satisfactory. For example, if the host attempted to update buffers twice in rapid succession such a double buffering technique would not prevent the undesired updating by the host of the buffer being executed by the DSP.
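The failure case just described may be illustrated with a small sequential model. The class and method names are hypothetical, and the model assumes that the DSP picks up the current buffer only at the start of each loop, while the host always fills the buffer opposite the one most recently marked current.

```python
class PingPong:
    """Minimal model of conventional ping-pong double buffering (illustrative only)."""

    def __init__(self, initial_code):
        self.buffers = [initial_code, None]
        self.current = 0    # buffer the DSP should start at on its next loop
        self.executing = 0  # buffer the DSP is actually running right now

    def dsp_start_loop(self):
        # The DSP latches the current buffer at the top of each loop.
        self.executing = self.current
        return self.buffers[self.executing]

    def host_update(self, new_code):
        # The host fills the buffer opposite the current one, then flips.
        target = 1 - self.current
        collided = (target == self.executing)  # True if the DSP is running this buffer
        self.buffers[target] = new_code
        self.current = target
        return collided

pp = PingPong("brass")
pp.dsp_start_loop()                   # DSP is now executing buffer 0
first = pp.host_update("woodwind")    # fills buffer 1: safe
second = pp.host_update("strings")    # no DSP loop in between: fills buffer 0
```

Under these assumptions the first update is harmless, but the second, arriving before the DSP has started another loop, lands on the buffer the DSP is still executing.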
Accordingly, for these and other reasons, a system and method were long desired for dynamically linking code segments wherein executable code is optimized by performing linking of code segments in real time.
SUMMARY OF THE INVENTION
A system is provided including a host processor and an audio capture and playback adapter having a DSP co-processor. The adapter includes shared memory accessible from both the DSP and the host. A DSP program is periodically written to the
shared memory by the host and executed by the DSP. A non ping-pong dual buffer technique is disclosed wherein alternately one buffer is executed by the DSP while the remaining buffer is updated or linked by the host. In one embodiment two pointer
variables are used, each indicating respectively which buffer is currently being executed by the DSP and which has been updated by the host. Initially, both pointer A and pointer B point to buffer A containing the initial DSP code. Each time the DSP
requires execution of the configurable program, it reads pointer B, copies it to pointer A, and then branches to the buffer pointer A points to. When the host begins to relink to a buffer, it first sets pointer B equal to pointer A, relinks into the
buffer opposite the one to which pointer A points, and then sets pointer B to this opposite buffer. The host is thereby prevented from updating a buffer currently being executed by the DSP.
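The two-pointer scheme summarized above may be sketched, under stated assumptions, as the following sequential simulation. The class, method, and buffer-content names are hypothetical, and the sketch does not model the memory arbitration or instruction-level timing of real hardware; it shows only the order of the pointer updates.

```python
class DualBufferLinker:
    """Sequential model of the pointer A / pointer B scheme (illustrative only)."""

    def __init__(self, initial_code):
        self.buffers = {"A": initial_code, "B": None}
        self.pointer_a = "A"  # buffer currently being executed by the DSP
        self.pointer_b = "A"  # buffer most recently made ready by the host

    def dsp_dispatch(self):
        # DSP: read pointer B, copy it to pointer A, branch to that buffer.
        self.pointer_a = self.pointer_b
        return self.buffers[self.pointer_a]

    def host_relink(self, new_code):
        # Host step 1: set pointer B equal to pointer A, so that a DSP dispatch
        # occurring during the relink stays in the buffer it already executes.
        self.pointer_b = self.pointer_a
        # Host step 2: relink into the buffer opposite the one pointer A indicates.
        target = "B" if self.pointer_a == "A" else "A"
        self.buffers[target] = new_code
        # Host step 3: publish the freshly linked buffer through pointer B.
        self.pointer_b = target
        return target

linker = DualBufferLinker("brass")
linker.dsp_dispatch()                    # DSP executes buffer A
t1 = linker.host_relink("woodwind")      # host writes buffer B
t2 = linker.host_relink("strings")       # second rapid update also goes to buffer B
after = linker.dsp_dispatch()            # DSP now moves to buffer B
t3 = linker.host_relink("horn")          # host writes buffer A, which is now idle
```

In this model two relinks in rapid succession both target the idle buffer, because the target is derived from pointer A (what the DSP is executing) rather than from the last buffer the host filled.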
BRIEF DESCRIPTION OF THE DRAWINGS
The novel features believed to be characteristic of the invention are set forth in the appended claims. The invention itself, however, as well as other features and advantages thereof, will be best understood by reference to the following
description of the preferred embodiment when read in conjunction with the accompanying figures where:
FIG. 1 is a high level schematic illustration of a computerized audio capture and playback adapter system employing the linking system of the invention.
FIG. 2 is a block diagram of representative computer code for generating an acoustic sound by the system of FIG. 1 depicting a representative module and submodule building blocks.
FIG. 3 is a schematic illustration of code segments comprising multiple modules such as that depicted in FIG. 2 intended to be executed by the digital signal processor system portion of the computerized audio system of FIG. 1.
FIG. 4 is a functional block diagram of an audio capture and playback adapter card for use in the system of FIG. 1.
FIG. 5 is an illustration of a double buffering technique known in the prior art for handling execution of code modules while simultaneously loading additional modules for execution.
FIGS. 6A-6D are schematic illustrations of the sequential status of pointers employed in the linking system of the invention.
FIG. 7 is an illustration of a specific state of pointers in operation of a linking system which causes malfunctioning of the system of FIG. 1 when the features of the linking system of the invention are not employed.
FIG. 8 is a flow diagram of the various states of the system of FIG. 1 as they relate to the pointers of FIG. 6 when operating in accordance with the invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT
First, a generalized description will be provided of an illustrative environment to which the invention may be adapted related to the generation of sound. In connection with this, the general functions provided by the digital signal processor or
DSP component of the system of FIG. 1 will next be described with reference to FIGS. 2 and 3 and the code segments executed by this DSP to effect a musical sound generally (FIG. 2 showing typical illustrative code segments required for creation of sound
generally and FIG. 3 showing the sequence of code segments to be executed for a user to perceive different multiple simultaneous sounds referred to as "multi-timbral" in the musical arts). Disclosure of a representative audio capture and playback
adapter including such a DSP as a component of the system of FIG. 1 will then be provided with reference to FIG. 4. This will be followed by a discussion of the novel linking system of the invention employing a set of pointers to program addresses
wherein various code segments of DSP code reside with reference to FIGS. 6A-6D.
Referring now to FIG. 1, a computerized audio capture and playback system 10 is depicted in simplified form. The system includes a computer 12 including a host CPU 22, and an I/O Bus 20. Interconnected to the Bus 20 are numerous peripheral
devices well known in the art such as a keyboard 12, display 14, microphone 16, and speaker 18. The system 10 with the exception of an audio capture and playback adapter 24 to be hereinafter described may take the form of any conventional personal
computer such as the PS/2 (PS/2 is a trademark of the IBM Corp.) computer marketed by the IBM Corporation. With respect to this audio capture and playback adapter (ACPA) 24, in a preferred embodiment it will take the form of an adapter card which may be
plugged into an expansion slot which forms a portion of the computer 12. This ACPA card will preferably contain a digital signal processor 26 such as the TMS 320 C25 signal processor available from the Texas Instruments Corporation. Additionally the
ACPA 24 will include shared memory 28 accessible both by the DSP 26 and the host CPU 22 through the I/O bus 20.
One purpose of the ACPA card 24 is to receive analog audio data from an audio source such as microphone 16 through the bus 20, and to "capture" or digitize this audio data, store it, and then later retrieve it whereupon it is converted back to
analog form and sent out on the I/O bus 20 to an appropriate sound transducer such as a speaker 18. The keyboard 12 and display 14 serve conventional purposes well known in the art of facilitating human interface to the computer 12 and are thus not
discussed herein. For simplicity, other necessary components of the computer 12 which are well known in the art, such as disk memory and the like, are omitted.
Yet an additional important function of the adapter card 24 is to execute code which may generate synthesized sounds. In this form of operation, executable code is stored in the shared memory 28 which is accessed and executed by the DSP 26, the
output of which is passed through the I/O bus 20 to the speaker 18 if desired, recorded, or the like. An important notion with respect to the system of FIG. 4 which will be hereinafter described in greater detail relates to the fact that the system 10
of FIG. 1 in which the invention is implemented is a tightly coupled co-processing system wherein shared memory is included and accessed by multiple processors such as the host CPU 22 and the DSP 26. Although in the implementation shown in FIG. 1
one of the processors is a digital signal processor well known in the art, it will be readily appreciated that the invention is not intended to be so limited and admits to other implementations not involving DSPs but some other form of multiple
processors accessing this shared memory 28 to form a tightly coupled co-processing system.
It will be noted that, as will become clearer in the disclosure to follow, this shared memory 28 will contain memory locations which may be read by the host CPU 22 as well as the DSP 26 and written into by the CPU 22. This shared memory 28 will
include code segments or modules which will be executed by the DSP 26 to generate the desired sounds. However, due to the limitations on size of this shared memory 28 and the need for it to execute rapidly, it is necessary for the host CPU 22 to load in
from time to time additional such code segments for sequential execution by the DSP. Inasmuch as both the DSP 26 and the CPU 22 have access to this common memory, this gives rise to a problem addressed by the subject invention wherein contention or
conflicts may occur with respect to this shared memory 28. This relates to the fact that shared memory is accessed in a co-processing environment by multiple processors such as the DSP 26 and CPU 22.
Referring now to FIG. 2, a functional block diagram is shown of typical code segments which must be implemented and executed by the DSP 26 to generate audio sounds. It will be noted in passing that although an audio application is discussed,
the invention is not to be so limited but has application to a variety of code modules. It will further be appreciated that the particular function being performed by this code segment 80 of FIG. 2 executed by the DSP 26 could be essentially any
function although in the particular embodiment being discussed these functions happen to relate to generation of audio sounds. In a digital signal processor implementation of sound, such as in the case of a DSP emulation of a music synthesizer, it is
common to utilize basic building blocks or code segments such as code segment 80 which contains sub blocks such as oscillator 82, filter 84, voltage controlled amplifier (VCA) 86 and various operators 90-94 which may be interconnected in multiple ways to
generate many different sounds as desired. Such code segments are analogous to the separate hardware modules found on larger music synthesizers of the 1970s. In the latter case, it was conventional either to use patch cords or a limited set of
switchable hardwire interconnections to allow selection and configuration of the modules as desired. However, in the modern implementation of a music synthesizer under discussion, it is desirable to reconfigure in real time to allow the execution of
several different sounds at the same time, such function being referred to as polytimbral. Such operation necessitates sequential execution of different versions of modules such as that shown in FIG. 2. It will be noted from the line 96 of FIG. 2
that some of the functions necessary to generate sound for a given module such as those above line 96 may be implemented by the DSP 26 executing code corresponding to these functions in the code segment 80 resident in the shared memory 28.
Additional functions below line 96 such as the operators 90-94 may be executed by the host CPU 22. The functions appearing above line 96 would be those requiring rapid sequential execution wherein the code segments of the submodules such as the
oscillator 82, filter 84, and VCA 86 must execute rapidly in a looping successive fashion to generate the desired digital output 88 which may be converted to perceptible analog audio output. Such a loop through the entire collection of code segments
shown in box 80 may typically occur at the audio digitizing or sample rate of, for example, 44 kilohertz, i.e. the code segment 80 will be looped through and executed once for every sample generated for an audio signal from the audio source 16.
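As a rough illustration of such a per-sample loop, the following sketch runs an oscillator, a filter, and an amplifier stage once for every output sample. The sine oscillator, the one-pole low-pass filter, and all parameter values are assumptions chosen for brevity and are not taken from the disclosure. At a 44 kHz sample rate the entire loop body has roughly 1/44,000 second, or about 22.7 microseconds, in which to complete.

```python
import math

SAMPLE_RATE_HZ = 44_000  # example sample rate from the text

def render(num_samples, frequency_hz=440.0, gain=0.5, smoothing=0.2):
    """Run oscillator -> filter -> amplifier once per output sample."""
    phase = 0.0
    filtered = 0.0
    output = []
    for _ in range(num_samples):
        raw = math.sin(2.0 * math.pi * phase)              # oscillator stage
        phase = (phase + frequency_hz / SAMPLE_RATE_HZ) % 1.0
        filtered += smoothing * (raw - filtered)           # filter stage (assumed one-pole low-pass)
        output.append(gain * filtered)                     # amplifier stage
    return output

samples = render(SAMPLE_RATE_HZ // 100)          # 10 milliseconds of audio
budget_microseconds = 1e6 / SAMPLE_RATE_HZ       # time available per loop pass
```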
In contrast, additional code of FIG. 2 below the line 96 may execute at a much slower rate such as 100-300 hertz which may thus typically be executed by the CPU 22. Examples of such operators being implemented by the CPU 22 at the same time the
code of segment 80 is being executed by the DSP 26 include amplitude envelopes for example. In addition to the rapid execution required of the code segment 80 yet an additional reason for limiting the amount of code executed by the DSP relates to the
nature of the onboard shared memory 28 associated with the DSP 26, namely that such memory is quite limited in size relative to the conventional memory normally associated with a computer system 12. As an example, it may be common in such systems to
have only 8K of shared memory. Due to the need to have different types of code modules such as that of FIG. 2 executing sequentially to achieve the desired multi-timbral effect and the constraint of a limited memory size, it will be readily apparent
that it may be necessary for the host CPU 22 from time to time to load into the shared memory 28 additional such code segments such as that of FIG. 2. This may be seen illustrated with reference to FIG. 3.
Referring now to FIG. 3 in greater detail, a code segment 98 is shown which may correspond to the code segment 80 of FIG. 2 including the necessary DSP code such as oscillators, filters, and the like to generate a clarinet sound. Also shown in
FIG. 3 are additional code segments 100 and 102 similar to that of code segment 98 and the generalized representation of code for generating an instrument such as the code segment 80 of FIG. 2. It will be noted from FIG. 3 that in order to generate a
different instrument such as a trumpet or horn represented by code segments 100 or 102, a different form of oscillator 82, filter 84 or other submodules or combinations thereof are required. By the DSP 26 executing these various code segments 100, 98
and 102 sequentially as shown schematically by the arrow 104 and by doing so rapidly enough, the audio data thereby made available from the ACPA 24 as indicated by output 88 of FIG. 2 will achieve the desired polytimbral effect wherein a human may
perceive the simultaneous sounding of the trumpet, clarinet, and horn corresponding to these code segments 100, 98, and 102.
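The sequential execution of several instrument segments within each sample period may be sketched as follows. The three voices are hypothetical sine generators standing in for the trumpet, clarinet, and horn code segments; the frequencies and levels are arbitrary assumptions.

```python
import math

SAMPLE_RATE_HZ = 44_000

def make_voice(frequency_hz, level):
    """Return a hypothetical code segment producing one sample of one instrument."""
    def voice(n):
        return level * math.sin(2.0 * math.pi * frequency_hz * n / SAMPLE_RATE_HZ)
    return voice

# Segments listed in the order they would be executed within one sample period.
SEGMENTS = [
    ("trumpet", make_voice(440.0, 0.3)),
    ("clarinet", make_voice(330.0, 0.3)),
    ("horn", make_voice(220.0, 0.3)),
]

def mix_sample(n):
    """Execute every segment in turn for sample n and sum their contributions."""
    total = 0.0
    for _name, segment in SEGMENTS:
        total += segment(n)
    return total

mixed = [mix_sample(n) for n in range(440)]
```

Because every segment contributes to each output sample, the listener hears the instruments together even though the segments execute one after another.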
It will be recalled that due to the limited size of the shared memory 28, it is sometimes not possible to include all of the code segments 100, 98 and 102 in the shared memory. This is illustrated by the box 106 and arrow 108 indicating that the
shared memory 28 may perhaps be large enough only to contain the code segment associated with the clarinet 98. Accordingly, it will be readily perceived that a need exists at the appropriate time for the host CPU 22 to load these additional code
segments such as that of the trumpet 100 or horn 102 into the shared memory 28. Yet an additional reason for providing the facility of being able to load additional code segments from the CPU 22 into the shared memory 28 for execution by the DSP 26 in
real time is that the necessary or desired additional code segments might be determined on the fly in real time and thus change over time as a function of other parameters or considerations. One problem in providing the facility in a co-processing
environment for permitting the CPU 22 to load these additional code segments into the shared memory 28 is that this shared memory is also being accessed simultaneously by the DSP 26.
Accordingly, a problem arises in being able to asynchronously permit the CPU 22 to download appropriate additional code segments as desired into this shared memory 28 without affecting the accessing of the shared memory 28 by the DSP 26 and thus
without interfering with the sequential execution of code in the shared memory 28 by this DSP 26. As a simple illustrative example, the DSP 26 may be in the middle of executing code in the shared memory 28 associated with the oscillator function 82 of
FIG. 2 when, due to the asynchronous operation of the CPU 22, the CPU 22 begins downloading another code segment into the shared memory 28 thereby overwriting code yet to be executed by the DSP 26 related to this oscillator 82 function.
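The hazard in this example may be illustrated with a toy model in which shared memory is a list of instruction labels. The labels and the function name are hypothetical; the point is only that an in-place overwrite part way through a pass leaves the DSP executing a mixture of old and new code.

```python
def dsp_run_with_interruption(shared_memory, interrupt_at, host_code):
    """Execute shared_memory in order; the host overwrites it before address interrupt_at runs."""
    executed = []
    for address in range(len(shared_memory)):
        if address == interrupt_at:
            shared_memory[:] = host_code   # asynchronous host download, in place
        executed.append(shared_memory[address])
    return executed

clarinet = ["clar_osc_1", "clar_osc_2", "clar_filter", "clar_vca"]
trumpet = ["trum_osc_1", "trum_osc_2", "trum_filter", "trum_vca"]

# The host interrupts after the first half of the clarinet oscillator has executed.
executed = dsp_run_with_interruption(list(clarinet), 1, trumpet)
```

Here the pass begins with the clarinet oscillator and finishes with trumpet code, so the executed sequence corresponds to neither instrument.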
Before providing a detailed description of prior art double buffering techniques with respect to FIG. 5 and the techniques of the invention with respect to FIGS. 6-8, a detailed description will be provided of a representative Audio Capture and
Playback Adapter or ACPA 24 of FIG. 1 which may be employed in the computer system 12 to effect the dynamic linking of code segments in real time in accordance with the invention.
Thus referring now to FIG. 4, there is depicted a block diagram of an audio adapter which includes digital signal processor 26 which may be utilized to implement the method and apparatus of the present invention. As discussed above, this audio
adapter may be simply implemented utilizing the IBM Audio Capture & Playback Adapter (ACPA) which is commercially available. In such an implementation digital signal processor 26 is provided by utilizing a Texas Instruments TMS 320C25, or other suitable
digital signal processor.
As illustrated, the interface between processor 22 and digital signal processor 26 is I/O bus 30. Those skilled in the art will appreciate that I/O bus 30 may be implemented utilizing the Micro Channel or PC I/O bus which are readily available
and understood by those skilled in the personal computer art. Utilizing I/O bus 30, processor 22 can access the host command register 32. Host command register 32 and host status register 34 are used by processor 22 to issue commands and monitor the
status of the audio adapter depicted within FIG. 4.
Processor 22 may also utilize I/O bus 30 to access the address high byte latched counter and address low byte latched counter which are utilized by processor 22 to access shared memory 48 (also shown in FIG. 1 as shared memory 28) within the
audio adapter depicted within FIG. 4. Shared memory 48 is preferably an 8K x 16 fast static RAM which is "shared" in the sense that both processor 22 and digital signal processor 26 may access that memory. As will be discussed in greater detail
herein, a memory arbiter circuit is utilized to prevent processor 22 and digital signal processor 26 from accessing shared memory 28 simultaneously.
As is illustrated, digital signal processor 26 also preferably includes digital signal processor control register 36 and digital signal processor status register 38 which are utilized, in the same manner as host command register 32 and host
status register 34, to permit digital signal processor 26 to issue commands and monitor the status of various devices within the audio adapter.
Processor 22 may also be utilized to couple data to and from shared memory 48 via I/O bus 30 by utilizing data high byte bi-directional latch 44 and data low byte bi-directional latch 46, in a manner well known in the art.
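As a simplified illustration of byte-wide access to a 16-bit-wide memory, the following sketch splits an address and a data word into high and low bytes and reassembles them on the memory side. The function names are hypothetical, and the actual latch and counter behavior of the adapter is not modeled.

```python
SHARED_MEMORY_WORDS = 8 * 1024   # 8K words of 16 bits each

def split_word(value):
    """Split a 16-bit value into (high byte, low byte)."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("value does not fit in 16 bits")
    return (value >> 8) & 0xFF, value & 0xFF

def join_word(high, low):
    """Reassemble a 16-bit value from its high and low bytes."""
    return ((high & 0xFF) << 8) | (low & 0xFF)

def host_write(memory, address, value):
    """Write one 16-bit word by passing address and data as byte pairs."""
    addr_high, addr_low = split_word(address)
    data_high, data_low = split_word(value)
    memory[join_word(addr_high, addr_low)] = join_word(data_high, data_low)

memory = [0] * SHARED_MEMORY_WORDS
host_write(memory, 0x1FFF, 0xBEEF)   # last word of the 8K address space
```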
Sample memory 50 is also depicted within the audio adapter of FIG. 4. Sample memory 50 is preferably a 2K x 16 static RAM which is utilized by digital signal processor 26 for outgoing samples to be played and incoming samples of digitized
audio. Sample memory 50 may be utilized as a temporary buffer to store decompressed digital audio samples and MIDI synthesized music samples for simultaneous output. Those skilled in the art will appreciate that by decompressing digital audio data and
by creating synthesized music from MIDI files until a predetermined amount of each data type is stored within sample memory 50, it will be a simple matter to combine these two outputs if desired.
Control logic 56 is also depicted within the audio adapter of FIG. 4. Control logic 56 is preferably a block of logic which, among other tasks, issues interrupts to processor 22 after a digital signal processor 26 interrupt request, controls the
input selection switch and issues read, write and enable strobes to the various latches and memory devices within the audio adapter depicted. Control logic 56 preferably accomplishes these tasks utilizing control bus 58.
Address bus 60 is depicted and is preferably utilized, in the illustrated embodiment of the present invention, to permit addresses of various samples and files within the system to be coupled between appropriate devices in the system. Data bus
62 is also illustrated and utilized to couple data among the various devices within the audio adapter depicted.
As discussed above, control logic 56 also uses memory arbiter logic 64 and 66 to control access to shared memory 48 and sample memory 50 to ensure that processor 22 and digital signal processor 26 do not attempt to access either memory
simultaneously. This technique is well known in the art and is necessary to ensure that memory deadlock or other such symptoms do not occur.
Finally, digital-to-analog converter 52 is illustrated and is utilized to convert the decompressed digital audio or digital MIDI synthesized music signals to an appropriate analog signal. The output of digital-to-analog converter 52 is then
coupled to analog output section 68 which preferably includes suitable filtration and amplification circuitry. Similarly, the audio adapter depicted within FIG. 4 may be utilized to digitize and store audio signals by coupling those signals into analog
input section 70 and thereafter to analog-to-digital converter 54. Those skilled in the art will appreciate that such a device permits the capture and storing of analog audio signals by digitization and storing of the digital values associated with that
signal.
Now that a description of the audio adapter card has been provided, a brief explanation will be given as to why conventional double buffering techniques well known in the art will be ineffective in solving the problem presented with reference to
FIG. 5. As indicated therein, the traditional "ping-pong" method of double buffering, i.e. writing to one buffer while the remaining buffer is being read, will not work in the disclosed implementation for dynamic linking of code segments in real time.
As shown in FIG. 5, in accordance with conventional practice in the computer science art, when it is necessary for data or code to be received and stored while a processing system is continuing smoothly to execute code, it has long been known to provide
for two buffers 112 and 114 in a memory system 110. It is further conventional to provide for a pointer such as pointers 116, 118, and 120 whose function it is to point to an address location of interest. Thus pointer 116 for example could be pointing
to code in buffer 112 which is executing. When the host 22 has a need for execution of additional functions not provided by the code resident in the left buffer, upon an appropriate request from the host during execution of the code in the left buffer
112, the host might then begin filling the right buffer 114. Upon completion of filling the buffer 114 the current pointer would be changed as shown schematically by pointer 118 to point to the right, i.e. at the next loop through the code the pointer
would provide a starting address location of the code in the right buffer 114 containing the new function to be executed whereupon execution would begin by looping through and performing the new code now contained in the right buffer 114. The next time
the host required still additional function to be provided by code not then contained in the right buffer 114, in a manner correlative to the previously described steps, the host would thence begin to fill the left buffer 112. In a manner similar to the
middle figure of FIG. 5, once filling of the left buffer 112 with code for performing t