Claims
What is claimed is:
1. A sound processing apparatus comprising a sound information input device, a recording device to record said sound information, a converting device to convert said sound
information into image information, and a display device to display said image information, said display device being such that the vertical and horizontal directions of said display device are time axes and the unit of one of the time axes is longer
than the unit of the other time axis.
2. The sound processing apparatus of claim 1, further comprising a selecting device to select the image information displayed on said display device, whereby the sound information can be selected.
3. The sound processing apparatus of claim 1, further comprising a time measuring device and wherein said time is recorded in said recording device, and said image information and said time are displayed on said display device.
4. The sound processing apparatus of claim 1, further comprising:
a sound-free portion detecting device to detect a sound-free portion in which there is no sound of a predetermined level or higher for a predetermined time or longer, wherein first image information made from said sound information recorded before the sound-free portion detected by said sound-free portion detecting device and image information made from said sound information recorded after the sound-free portion are separated from each other by starting a new line and displayed on said display device.
5. The sound processing apparatus of claim 4, further comprising a selecting device and wherein the image information displayed on said display device is selected, whereby the sound information can be selected.
6. The sound processing apparatus of claim 4, further comprising a time measuring device and wherein said time is recorded in said recording device, and said image information and said time are displayed on said display device.
7. The sound processing apparatus of claim 1, further comprising:
a frequency detecting device to detect the frequency component of said sound information within a predetermined time, and a converting device to convert said sound information into image information corresponding to said frequency component,
wherein said image information is displayed on said display device in colors set in correspondence with a frequency.
8. The sound processing apparatus of claim 7, further comprising a selecting device and wherein the image information displayed on said display device is selected, whereby the sound information can be selected.
9. The sound processing apparatus of claim 7, wherein the predetermined time for detecting the frequency component is at least 0.3 second.
10. The sound processing apparatus of claim 7, further comprising a compressing device using a discrete cosine transformation device to compress the sound information and wherein said discrete cosine transformation device is used as said frequency
detecting device.
11. The sound processing apparatus of claim 7, further comprising a time measuring device and wherein said time is recorded in said recording device and said image information and said time are displayed on said display device.
12. The sound processing apparatus of claim 1, further comprising:
a frequency detecting device to detect the frequency component of said sound information within a predetermined time,
wherein when the difference between the frequency component of first sound information recorded with the lapse of time and the frequency component of second sound information recorded thereafter is a predetermined value or greater, image
information made from said first sound information and image information made from said second sound information are separated from each other and displayed.
13. The sound processing apparatus of claim 12, further comprising a selecting device and wherein the image information displayed on said display device is selected, whereby the sound information can be selected.
14. The sound processing apparatus of claim 12, wherein the predetermined time for detecting the frequency component is at least 0.3 second.
15. The sound processing apparatus of claim 12, further comprising a compressing device using a discrete cosine transformation device to compress the sound information and wherein said discrete cosine transformation device is used as said
frequency detecting device.
16. The sound processing apparatus of claim 12, further comprising a time measuring device and wherein said time is recorded in said recording device, and said image information and said time are displayed on said display device.
17. The sound processing apparatus of claim 1, further comprising:
a frequency detecting device to detect the frequency component of said sound information within a predetermined time,
wherein said converting device converts said sound information into image information corresponding to said frequency component, and wherein when the difference between the frequency component of first sound information recorded with the lapse of
time and the frequency component of second sound information recorded thereafter is a predetermined value or greater, image information made from said first sound information and image information made from said second sound information are separated
from each other and displayed.
18. The sound processing apparatus of claim 17, further comprising a selecting device and wherein the image information displayed on said display device is selected, whereby the sound information can be selected.
19. The sound processing apparatus of claim 17, wherein the predetermined time for detecting the frequency component is at least 0.3 second.
20. The sound processing apparatus of claim 17, further comprising a compressing device using a discrete cosine transformation device to compress the sound information and wherein said discrete cosine transformation device is used as said
frequency detecting device.
21. The sound processing apparatus of claim 17, further comprising a time measuring device and wherein said time is recorded in said recording device, and said image information and said time are displayed on said display device.
22. The sound processing apparatus of claim 1, further comprising:
a frequency detecting device to detect the frequency component of said sound information within a predetermined time; and
a sound-free portion detecting device to detect a sound-free portion in which there is no sound of a predetermined level or higher for a predetermined time or longer,
wherein when the difference between the frequency component of first sound information recorded with the lapse of time and the frequency component of second sound information recorded thereafter is a predetermined value or greater, and when said
sound-free portion is detected between said first sound information and said second sound information by said frequency detecting device, image information made from said first sound information and image information made from said second sound
information are separated from each other and displayed.
23. The sound processing apparatus of claim 22, further comprising a selecting device and wherein the image information displayed on said display device is selected, whereby the sound information can be selected.
24. The sound processing apparatus of claim 22, wherein the predetermined time for detecting the frequency component is at least 0.3 second.
25. The sound processing apparatus of claim 22, further comprising a compressing device using a discrete cosine transformation device to compress the sound information and wherein said discrete cosine transformation device is used as said
frequency detecting device.
26. The sound processing apparatus of claim 22, further comprising a time measuring device and wherein said time is recorded in said recording device, and said image information and said time are displayed.
27. The sound processing apparatus of claim 1, further comprising:
a frequency detecting device to detect the frequency component of said sound information within a predetermined time; and
an output device to output sound information including a predetermined frequency component from among a plurality of bits of sound information recorded in said recording device.
28. The sound processing apparatus of claim 27, wherein the predetermined time for detecting the frequency component is at least 0.3 second.
29. The sound processing apparatus of claim 27, further comprising a compressing device using a discrete cosine transformation device to compress the sound information and wherein said discrete cosine transformation device is used as said
frequency detecting device.
30. The sound processing apparatus of claim 1, comprising:
an image reproducing device to reproduce still image information; and
a sound reproducing device to reproduce sound information corresponding to said still image information,
wherein said still image information is displayed for a time necessary to reproduce the sound information corresponding to said still image information.
31. A sound processing apparatus comprising a sound information input device, a recording device to record said sound information, a frequency detecting device to detect the frequency component of said sound information within a predetermined
time, a converting device to convert said sound information into first image information corresponding to said frequency component, an image pickup device to convert an object image into second image information, a compressing device to compress said
second image information by the use of discrete cosine transformation, and an image recording device to record said compressed information, wherein said frequency detecting device uses the discrete cosine transformation.
32. A sound processing apparatus, comprising:
a sound information input device;
a recording device to record said sound information;
a display device;
a frequency detecting device to detect the frequency component of said sound information within a predetermined time;
a converting device to convert said sound information into image information corresponding to said frequency component; and
a selecting device to select the image information displayed on said display device and to select the sound information displayed on said display device,
wherein said image information is displayed on said display device in colors set in correspondence with a frequency.
Description
BACKGROUND OF THE INVENTION
1. Field of the Invention
This invention relates to a sound processing apparatus.
2. Related Background Art
There are known tape recorders for recording and reproducing sound, and sound recording electronic cameras or the like capable of recording and reproducing both sound and images.
Such an apparatus is provided with a so-called counter, and is designed such that the display of the counter changes with the lapse of time or the running of a tape.
In such a sound processing apparatus, when sound is to be reproduced, it has been necessary to look for the location of the desired sound using the counter display as a guide. When the desired sound is not found, it has been necessary to fast-forward or rewind the tape and look for the sound with the help of the counter and intuition, which has made such apparatus very difficult to operate.
Also, there has been software for displaying sound information on personal computers or the like, but some of this software merely simulates the above-described sound processing apparatus, and the operability has not been particularly improved.
Other software simulates an oscilloscope and displays sound as a waveform. With such software it has been possible to select, by selecting means, the portion on a monitor whose sound reproduction is desired.
However, even when the kind of sound being recorded changes, for example when the speaker changes, a similar waveform is displayed, and it has been impossible to recognize the slight difference in the waveform with the naked eye and presume the generation source of the sound. Accordingly, trial and error has been required, such as reproducing the sound and then reproducing the portions before or after it, and thus the convenience of use has been poor.
Also, in a sound processing apparatus of this kind, sound is generally represented as a graph on a monitor, in which the vertical direction is a sound pressure axis representing the strength of the waveform and the horizontal direction is a time axis. Therefore, when an attempt is made to display sound recorded for a long time all at once, it has been necessary to reduce the whole, for example by changing the scale of the abscissa of the graph from five seconds per 1 cm to one minute per 1 cm. If this is done, there arises the problem that when a short sound is uttered in a portion thereof, the graph representing this short sound becomes too small to be recognized.
SUMMARY OF THE INVENTION
It is an object of the present invention to provide a sound processing apparatus which can quickly effect the retrieval of desired sound information.
To achieve the above object, according to a first aspect of the present invention, there is provided a sound processing apparatus provided with sound information input means, recording means for recording the sound information, converting means
for converting the sound information into image information, and display means for displaying the image information, the display means being such that the vertical and horizontal directions of the display means are time axes and the unit of one of the
time axes is longer than the unit of the other time axis.
According to a second aspect of the present invention, there is provided a sound processing apparatus provided with sound information input means, recording means for recording the sound information, converting means for converting the sound information into image information, display means for displaying the image information, and sound-free portion detecting means for detecting a sound-free portion in which there is no sound of a predetermined level or higher for a predetermined time or longer, wherein first image information made from sound information recorded before the sound-free portion detected by the sound-free portion detecting means and image information made from sound information recorded after the sound-free portion are separated from each other and displayed on the display means.
According to a third aspect of the present invention, there is provided a sound processing apparatus provided with sound information input means, recording means for recording the sound information, converting means for converting the sound
information into image information, display means for displaying the image information, and sound-free portion detecting means for detecting a sound-free portion in which there is no sound of a predetermined level or higher for a predetermined time or
longer, wherein the image information differs between the sound-free portion and a non-sound-free portion.
According to a fourth aspect of the present invention, there is provided a sound processing apparatus provided with sound information input means, recording means for recording the sound information, display means, frequency detecting means for
detecting the frequency component of the sound information within a predetermined time, and converting means for converting the sound information into image information corresponding to the frequency component, wherein the image information is displayed
on the display means.
According to a fifth aspect of the present invention, there is provided a sound processing apparatus provided with sound information input means, recording means for recording the sound information, display means, frequency detecting means for
detecting the frequency component of the sound information within a predetermined time, and converting means for converting the sound information into image information, wherein when the difference between the frequency component of first sound
information recorded with the lapse of time and the frequency component of second sound information recorded thereafter is a predetermined value or greater, image information made from the first sound information and image information made from the
second sound information are separated from each other and displayed.
According to a sixth aspect of the present invention, there is provided a sound processing apparatus provided with sound information input means, recording means for recording the sound information, display means for displaying image information, frequency detecting means for detecting the frequency component of the sound information within a predetermined time, and converting means for converting the sound information into image information corresponding to the frequency component, wherein when the difference between the frequency component of first sound information recorded with the lapse of time and the frequency component of second sound information recorded thereafter is a predetermined value or greater, image information made from the first sound information and image information made from the second sound information are separated from each other and displayed.
According to a seventh aspect of the present invention, there is provided a sound processing apparatus provided with sound information input means, recording means for recording the sound information, display means, frequency detecting means for
detecting the frequency component of the sound information within a predetermined time, sound-free portion detecting means for detecting a sound-free portion in which there is no sound of a predetermined level or higher for a predetermined time or
longer, and converting means for converting the sound information into image information, wherein when the difference between the frequency component of first sound information recorded with the lapse of time and the frequency component of second sound
information recorded thereafter is a predetermined value or greater, and when the sound-free portion is detected between the first sound information and the second sound information by the frequency detecting means, image information made from the first
sound information and image information made from the second sound information are separated from each other and displayed.
According to an eighth aspect of the present invention, there is provided a sound processing apparatus provided with sound information input means, recording means for recording the sound information, frequency detecting means for detecting the
frequency component of the sound information within a predetermined time, and output means for outputting sound information including a predetermined frequency component from among a plurality of bits of sound information.
According to a ninth aspect of the present invention, there is provided a sound processing apparatus provided with sound information input means, recording means for recording the sound information, display means, converting means for converting
the sound information into image information, and frequency detecting means for detecting the frequency component of the sound information within a predetermined time, wherein only image information made from one of a plurality of bits of sound
information of which the frequency component is within a predetermined value is displayed.
According to a tenth aspect of the present invention, there is provided a sound processing apparatus provided with sound information input means, sound recording means for recording the sound information, display means, frequency detecting means for detecting the frequency component of the sound information within a predetermined time, converting means for converting the sound information into first image information corresponding to the frequency component, image pickup means for converting an object image into second image information, compressing means for compressing the second image information by the use of discrete cosine transformation, and image recording means for recording the compressed information, wherein said frequency detecting means uses the discrete cosine transformation.
According to an eleventh aspect of the present invention, there is provided a sound processing apparatus provided with image reproducing means for reproducing image information, and sound reproducing means for reproducing sound information
corresponding to the image information, wherein the image information is displayed for a time necessary to reproduce the sound information corresponding to the image information.
The above and other objects, features and advantages of the present invention will be explained hereinafter and may be better understood by reference to the drawings and the descriptive matter which follows.
BRIEF DESCRIPTION OF THE DRAWINGS
FIGS. 1A and 1B are schematic views of a sound processing apparatus according to the present invention.
FIG. 2 is a circuit block diagram of the sound processing apparatus according to the present invention.
FIG. 3 is a schematic view of the display unit of the sound processing apparatus of the present invention.
FIG. 4 is a graph of a raw sound waveform.
FIG. 5 shows the display by a personal computer.
FIG. 6 shows the display on the display unit of the sound processing apparatus of the present invention in which sound-free portions are represented with dotted lines or colors changed as at 53e and 53f.
FIG. 7 is a block diagram illustrating in detail the operations performed by the digital signal processor (DSP) shown in FIG. 2 according to the present invention.
DESCRIPTION OF THE PREFERRED EMBODIMENTS
FIGS. 1A and 1B are schematic views of an electronic camera apparatus according to the present invention. The electronic camera apparatus 1 is provided with a power source switch 10 and a liquid crystal display 2 (hereinafter referred to as the LCD, the size of which is 6 cm × 4 cm) for displaying reproduced still images and various kinds of data. A stroboscopic lamp 5, a finder 6, a photo-taking lens 7 and a release button 8 are used for the recording of an image, and a microphone 3, an earphone jack 4, a recording button 9 and a speaker 12 are used for the recording and reproduction of sound. A switch button 11 is a switch with which a user effects various settings. Also, on the surface of the LCD 2, there is provided a so-called touch tablet 13 which, when touched by a pen-like indicating member, inputs the indicated position. This touch tablet 13 is formed of transparent resin, and the LCD 2 beneath it can be observed through the touch tablet 13.
FIG. 2 is a circuit block diagram. Sound is inputted from the microphone 3, is converted into digital data by an A/D converting circuit 21, and is inputted to a digital signal processor 26 (shown as DSP in the figure). The digitized sound signal is compressed in the digital signal processor 26, and is recorded in a memory 31 via a CPU 29 and an interface 30.
This compression of the sound is effected by performing discrete cosine transformation, then quantizing the result and Huffman-coding it. As will be described later, this makes it possible to analyze the frequency by using the result of the discrete cosine transformation. The sound need not be compressed by such a method; a compressing system that uses discrete cosine transformation for the compression of image information (for example, the JPEG compressing system) may be used instead, and its discrete cosine transformation means may be used for the frequency analysis of the sound information.
The image will now be described.
As regards an object image, a light beam condensed by the photo-taking lens 7 is imaged on a CCD 23 which is an image pickup device. The photoelectrically converted image information is converted into digital data by an A/D converter 25 via a correlated double sampling circuit 24 (shown as CDS in the figure). The digital data is compressed by the digital signal processor 26 and is accumulated in the memory 31 via the CPU 29 and the interface 30. Here, the compression effected is the JPEG compressing system comprising a combination of discrete cosine transformation, quantization and Huffman coding.
The information compressed and accumulated in the memory 31 can be displayed on the LCD 2 provided on the back of the apparatus 1. The information in the memory 31 is read by the CPU 29 via the interface 30, is expanded by the digital signal processor 26, again passes through the CPU 29, is temporarily stored in a frame memory 27, and then is displayed on the LCD 2. Here, in the case of image information, the expanded image data is stored as a bit map in the frame memory 27 and is displayed. Further, as required, the bit map data is thinned and reduced into a so-called thumbnail image, which is sent to the frame memory 27 and displayed by the LCD 2.
On the other hand, when sound information is to be reproduced, the sound expanded by the digital signal processor 26 is visualized, and the resulting bit map data is sent and displayed as a bar graph, as will be described later.
Also, a clock circuit for keeping the date and time is contained in the CPU 29, and the date and time at which the sound information and the image information are recorded can be recorded together with the sound information and the image information.
FIG. 3 shows the content displayed by the LCD 2. This display is the screen shown when image photographing and sound recording have already been completed and the recorded information is reproduced.
On this display screen, the sound information is visualized and displayed as a bar graph 53a. When the recorded sound is short, the bar graph is displayed short. Also, when a sound-free state, in which the sound is smaller than a predetermined volume, continues for a predetermined time, or when the frequency band of the sound has changed (for example, between a man's voice and a woman's voice, or between a background sound such as a small stream and a human voice), the subsequent sound is displayed as a bar graph 53b one row lower. Further, the bar graphs 53a and 53b are displayed in colors corresponding to the frequencies of the sounds by a method which will be described later.
From this, a user can see by looking at the bar graphs 53a and 53b that the content of the recorded conversation has changed or that the speaker has changed, and this serves as a guide when the sound is reproduced later. The above-mentioned sound-free state will hereinafter be referred to as the sound-free portion.
When the same continuous sound is recorded for a long time (e.g. 2 minutes and 30 seconds), the information recorded during each predetermined time (e.g. one minute) is displayed as a bar graph 53b (corresponding to one minute), continues on a new line as a bar graph 53c (corresponding to one minute), and, in this case, ends as a bar graph 53d (corresponding to 30 seconds).
As described above, the axis of abscissas of the display is used as a time axis in which the longest bar graph represents one minute, and the axis of ordinates is used as a time axis in which one line represents one minute, whereby long sound information, i.e., the bar graphs 53b, 53c and 53d, and short sound information 53a can be recognized at a glance.
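By way of illustration only, the two-time-axis arrangement described above may be sketched as follows. The function name layout_rows and the parameter row_unit_s are hypothetical and do not appear in the embodiment; the sketch merely assumes that each recording is cut into rows of at most one minute, each row being one bar graph.

```python
def layout_rows(durations_s, row_unit_s=60.0):
    """Split each recording into display rows of at most row_unit_s seconds.

    Returns a list of (recording_index, row_length_s) tuples, one per row.
    The horizontal axis covers up to row_unit_s; each further row on the
    vertical axis represents the next row_unit_s of the same recording.
    """
    rows = []
    for index, duration in enumerate(durations_s):
        remaining = duration
        while remaining > 0:
            length = min(remaining, row_unit_s)
            rows.append((index, length))
            remaining -= length
    return rows


# A 20-second recording followed by one of 2 minutes 30 seconds.
rows = layout_rows([20.0, 150.0])
```

In this sketch the second recording occupies three rows of 60, 60 and 30 seconds, corresponding to the bar graphs 53b, 53c and 53d, while the short recording remains visible as a row of its own.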
This display of the sound information is not limited to bar graphs; for example, a plurality of marks "*" may be arranged side by side in accordance with the recording time. Also, the marks or the pattern of the bar graphs may be changed according to the frequency of the sound.
The time 51 of sound recording is displayed at the left of the bar graph. The displayed sound recording time may be the time at the start or the end of the sound recording, or the average of the start and end times. Further, the recording time may be displayed beside or below the sound recording time.
The design is such that when the date of recording has changed, date information 58 is displayed. By this, when information recorded on a later date is to be reproduced, it becomes possible to quickly look for the desired portion to be reproduced.
The reference character 52a designates a so-called thumbnail image in which photographed image information is displayed small, and this is displayed beside the sound information when the image is recorded simultaneously with sound. When image information alone is recorded and sound information is not recorded, the image information alone is displayed as indicated at 52c. Also, when it is difficult in terms of the processing capability of the CPU 29 to reduce and display the image information, a mark "*", for example, may be displayed in its place as indicated at 52d and 52e.
The detection of the sound-free portion will now be described with reference to FIG. 4.
The waveform 40 of sound can be divided broadly into a sound-containing portion 41, a sound-free portion 42 and a sound-free portion 43. Here, waveforms of a predetermined amplitude or less are defined as sound-free portions, and the amplitude P below which a portion is recognized as sound-free can be selected by the user. As represented by Δt in FIG. 4, the human voice generally includes very short sound-free portions, as when consonants are pronounced. Therefore, the design is such that only sound-free portions of a predetermined time or longer are recognized, so that such short sound-free portions are not detected. The minimum length of the sound-free portions can be selected by the user between about 0.3 sec. and about 1 sec. As previously described, only the sound-free portion 42, which is smaller than the predetermined amplitude and longer than the predetermined time, is recognized, and the bar graph following it is displayed on a new line. Also, by mode setting means, not shown, it is possible to display the sound-free portions with dotted lines or changed colors as indicated at 53e and 53f in FIG. 6. By this, the presence and the lengths of the sound-free portions can be visually recognized.
Besides this, the sound-free portions may be displayed by the use of a special mark indicating the absence of sound, for example, a rest as used in musical notation. Further, sound data in which a sound-free portion has once been found may be recorded again in the memory with a special code inserted at the sound-free portion. In this case, there is the advantage that when the bar graph of the sound is displayed again, the process of looking for the sound-free portion becomes simple and the display speed of the bar graph is improved. Also, besides the display in which the bar graph is moved down one row at the sound-free portion, provision may be made of a mode in which the sound-free portion is also displayed as a bar graph and a mode in which the sound-free portion is not displayed.
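The detection of the sound-free portion described above may, purely as a sketch, be expressed as follows. The function name find_sound_free_portions and its parameters are hypothetical; the sketch assumes that the sound is available as a sequence of amplitude samples, that a sample is quiet when its magnitude is below the level P, and that a quiet run counts as a sound-free portion only when it lasts at least the predetermined time.

```python
def find_sound_free_portions(samples, sample_rate_hz, level, min_duration_s):
    """Return (start, end) sample-index pairs of sound-free portions.

    A portion is a run of samples whose magnitude stays below `level`
    for at least `min_duration_s` seconds. `end` is exclusive.
    Shorter quiet runs (such as the brief gaps around consonants)
    are ignored.
    """
    min_len = int(min_duration_s * sample_rate_hz)
    portions = []
    run_start = None
    for i, value in enumerate(samples):
        if abs(value) < level:
            if run_start is None:
                run_start = i  # a quiet run begins here
        else:
            if run_start is not None and i - run_start >= min_len:
                portions.append((run_start, i))
            run_start = None
    # A quiet run that lasts until the end of the recording.
    if run_start is not None and len(samples) - run_start >= min_len:
        portions.append((run_start, len(samples)))
    return portions


# Toy signal at 10 samples per second: a 0.2 s gap and a 0.6 s gap.
signal = [5, 5, 0, 0, 5, 0, 0, 0, 0, 0, 0, 5, 5]
gaps = find_sound_free_portions(signal, sample_rate_hz=10, level=1, min_duration_s=0.5)
```

With the toy signal, only the longer gap is reported, so a new line would be started only after it; the short gap is treated like the Δt of FIG. 4.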
The detection of the frequency of sound will now be described.
The present apparatus incorporates, in the digital signal processor, hardware for compressing image information and sound information. Generally, the compression comprises discrete cosine transformation (DCT), quantization and two-dimensional Huffman coding. The DCT is not restricted to hardware, but may be carried out by software.
Here, when there are eight input data x, the DCT is represented by the transformation of mathematical expression 1. [Mathematical expression 1]
Here, sound data are put into x0-x7, whereby values corresponding to different frequencies are obtained in y0-y7. While eight data are used here, sixteen data may be used.
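Since mathematical expression 1 is not reproduced in this text, the following is only a sketch that assumes the commonly used DCT-II form with orthonormal scaling, y_k = c_k Σ x_n cos((2n+1)kπ/16) for n = 0 to 7; the scaling actually used in the embodiment may differ. The function name dct8 is hypothetical.

```python
import math


def dct8(x):
    """DCT-II of a block of samples (eight in the example of the text).

    Assumed convention: orthonormal scaling, i.e. c_0 = sqrt(1/N) and
    c_k = sqrt(2/N) for k > 0. y[0] reflects the average level of the
    block and higher indices reflect progressively higher frequencies.
    """
    n = len(x)
    y = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(
            x[i] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
            for i in range(n)
        )
        y.append(scale * total)
    return y


# A constant block has energy only in the lowest coefficient y0.
flat = dct8([1.0] * 8)
```

Because the function uses the block length rather than a fixed eight, the same sketch applies to the sixteen-data case mentioned above.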
Now, assuming that each block contains eight sampling data and the sampling frequency is 1 kHz, 125 sets of values of y0-y7 are obtained per second. When these values are averaged for each of y0-y7, the fluctuation of frequency caused by the utterance of each sound, e.g., "a" or "i", is averaged out, and a value conforming to the frequency of the utterer's voice is obtained. When the change in the value of y from one second to the next becomes greater than a predetermined value, it is judged that the utterer has changed, or that the utterer has stopped speaking and only the background noise has been recorded, and the bar graph is displayed on a new line.
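The averaging and comparison described above may be sketched as follows. The text does not state how the change in y is measured, so the sketch assumes, as one possibility only, that a change is judged when any one averaged coefficient differs by more than the predetermined value; it also assumes that each coefficient has already been reduced to a non-negative value of 0 to 255. The names average_coefficients and speaker_changed are hypothetical.

```python
def average_coefficients(blocks):
    """Average each of y0..y7 over the blocks obtained in one period.

    With eight samples per block at 1 kHz, one second yields 125 blocks.
    """
    count = len(blocks)
    return [sum(block[k] for block in blocks) / count for k in range(8)]


def speaker_changed(previous_avg, current_avg, threshold):
    """Judge a change when any averaged coefficient moved by more than
    `threshold` (an assumed criterion; the text gives no exact metric)."""
    return any(abs(c - p) > threshold for p, c in zip(previous_avg, current_avg))


# Two consecutive seconds of 125 blocks each (illustrative values only).
first_second = [[10] * 8 for _ in range(125)]
next_second = [[10, 10, 10, 10, 10, 10, 10, 40] for _ in range(125)]

avg_a = average_coefficients(first_second)
avg_b = average_coefficients(next_second)
start_new_line = speaker_changed(avg_a, avg_b, threshold=20)
```

When start_new_line is true, the bar graph for the following sound would be started on a new line.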
Further, when a bar graph is to be displayed in a mixture of the colors R, G and B, the level of R is determined as a function of the values of y0, y1 and y2, the level of G is determined from y3, y4 and y5, and the level of B is determined from y6 and y7. Specifically, each value of y assumes a value of 0 to 255, and therefore the calculation is made as
R = (y0 × 65536 + y1 × 256 + y2) ÷ 65536
G = (y3 × 65536 + y4 × 256 + y5) ÷ 65536
B = (y6 × 256 + y7) ÷ 256.
Here, B alone is calculated from two values of y, but the color calculated from two values is not restricted to B and may instead be R or G.
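The color calculation above may be sketched as follows. The function name bar_color is hypothetical, and the truncation to an integer and the clamping to 255 are assumptions added so that the result can be used directly as an 8-bit color level; the text specifies only the three expressions.

```python
def bar_color(y):
    """Map averaged coefficients y0..y7 (each 0 to 255) to an (R, G, B) tuple.

    R, G and B follow the expressions in the text. Because y1, y2, y4,
    y5 and y7 are divided by 256 or 65536, each level is dominated by
    the first coefficient of its group (y0, y3 and y6 respectively).
    """
    r = (y[0] * 65536 + y[1] * 256 + y[2]) / 65536
    g = (y[3] * 65536 + y[4] * 256 + y[5]) / 65536
    b = (y[6] * 256 + y[7]) / 256
    # Assumed post-processing: truncate and clamp to the 8-bit range.
    return tuple(min(255, int(v)) for v in (r, g, b))


color = bar_color([255, 255, 255, 0, 0, 0, 128, 64])
```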
By this, it is possible to analyze the frequency of sound by utilizing the DCT used in the compression, to start a new line, and to classify the bar graphs by color. Therefore, the retrieval of the user's voice can be effected quickly, and software or hardware for the analysis of the frequency need not be newly prepared; thus, a decrease in cost becomes possible and the efficiency of processing is improved.
The predetermined time for averaging the frequency is not limited to one second; however, the longer the time becomes, the greater is the possibility that a short utterance such as a brief word of agreement cannot be detected. Conversely, if the predetermined time is too short, the detection may be swayed by each individual sound in a pronunciation, and therefore it has been found experimentally desirable that the predetermined time be 0.3 second or longer. By this, the length and frequency of sound that a person can recognize as at least a voice are detected, whereby it becomes possible to discriminate between the voices of a plurality of persons, or between a human voice and noise or the like. Also, if, for example, the difference between the frequency averaged during one second and the frequency averaged during the next one second is equal to or less than a predetermined value, the difference is treated as variation within the same person's pronunciation and display is effected in the same color.
When, among the bar graphs classified by color as described above, a bar graph of a particular color is touched twice on the touch tablet 13 by an indicating member, only the bar graphs of that particular color are displayed and the bar graphs of the other colors temporarily disappear from the display screen. By this, it becomes possible to select only the sound of a particular speaker or sound source. When the switch button 11 is then depressed, only the sound of the particular frequency corresponding to the bar graphs of the selected color is reproduced. By this, it becomes possible to reproduce only a particular speaker's sound.
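The selection by color described above may be sketched as a simple filter. The representation of each bar graph as a dictionary with "start_s", "length_s" and "color" entries, and the name select_by_color, are hypothetical and serve only to illustrate the operation.

```python
def select_by_color(bars, touched_index):
    """Return only the bars sharing the color of the touched bar.

    `bars` is a list of dicts with "start_s", "length_s" and "color"
    (an assumed representation). The bars returned are those that stay
    on the screen and whose sound would be reproduced.
    """
    color = bars[touched_index]["color"]
    return [bar for bar in bars if bar["color"] == color]


bars = [
    {"start_s": 0, "length_s": 20, "color": (200, 40, 10)},
    {"start_s": 20, "length_s": 35, "color": (30, 90, 220)},
    {"start_s": 55, "length_s": 12, "color": (200, 40, 10)},
]
selected = select_by_color(bars, touched_index=0)
```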
Further, when the frequency varies periodically in various ways, it is highly likely that music has been recorded, and therefore it is possible to display a musical note mark at the left end of the bar graph and also to display the bar graph in a color differing from the others.
A method of reproducing sound and image information will now be described.
When only the bar graph 53a in the display of FIG. 3 is touched by a pen-like indicating member, not shown, and the switch button 11 is depressed, only the sound corresponding to the bar graph 53a is reproduced.