WikiPatents - Community Patent Review
Method for automatically transcribing music and apparatus therefore    
United States Patent 5038658
Link to this page: http://www.wikipatents.com/5038658.html
Inventor(s): Tsuruta; Schichirou (Osaka, JP); Takashima; Yosuke (Tokyo, JP); Fujimoto; Masaki (Tokyo, JP); Mizuno; Masanori (Tokyo, JP)
Abstract: An automatic music transcription method and system for generating a musical score from an input acoustic signal. The acoustic signal may include vocal songs, vocal humming, and music from musical instruments. The system comprises means for extracting pitch information and power information from the input acoustic signal, for correcting the pitch information based on the deviation of the acoustic signal relative to an absolute musical scale, for dividing the acoustic signal into a set of single-sound segments using the corrected pitch information, for dividing the acoustic signal into a second set of single-sound segments using changes in the power information, for dividing the acoustic signal in still greater detail using information contained in both previous segmentations, for associating each segment with a musical interval of an absolute musical scale, for determining single-sound segments depending on whether or not the musical intervals of adjacent segments are identical, for determining the key of the acoustic signal, for correcting the placement of the segments on the musical scale of the determined key using the pitch information, for determining the time and tempo of the acoustic signal using this placement, and for compiling musical score data using the determined musical scale, sound length, key, time, and tempo of the acoustic signal.
Title Information
Inventor     Tsuruta; Schichirou (Osaka, JP); Takashima; Yosuke (Tokyo, JP); Fujimoto; Masaki (Tokyo, JP); Mizuno; Masanori (Tokyo, JP)
Owner/Assignee     NEC Home Electronics Ltd. (Osaka, JP); NEC Corporation (Tokyo, JP)
Publication Date     August 13, 1991
Application Number     07/315,761
Filing Date     February 27, 1989
US Classification     84/461 84/475 84/616
Int'l Classification     G09B 015/02
Examiner     Stephan; Steven L.
Assistant Examiner     Voeltz; Emanuel Todd
Attorney/Law Firm     Cushman, Darby & Cushman
Priority Data     Feb 29, 1988 [JP]: 63-46111, 63-46112, 63-46113, 63-46114, 63-46115, 63-46116, 63-46117, 63-46118, 63-46119, 63-46120, 63-46121, 63-46122, 63-46123, 63-46124, 63-46125, 63-46126, 63-46127, 63-46128, 63-46129, 63-46130
USPTO Field of Search     84/461 84/462 84/475 84/603 84/616 84/477 R
References
U.S. References
Patent No.   Inventor     Class    Date
4603386      Kjaer        84/461   Jul. 1986
4479416      Clague       84/462   Oct. 1984
4392409      Coad, Jr.    84/462   Jul. 1983
3647929      Milde, Jr.   84/642   Mar. 1972
Claims


What is claimed is:

1. A method for transcribing music onto an absolute musical interval axis with predetermined frequencies marking boundaries of each interval, comprising the steps of:

inputting an acoustic signal;

extracting pitch information and power information from said acoustic signal;

correcting said pitch information by determining a musical interval axis of said pitch information according to a predetermined algorithm and then shifting the pitch of said pitch information so that a musical interval axis of the shifted pitch information according to said algorithm matches the absolute musical interval axis;

first dividing said acoustic signal into first single sound segments on the basis of said corrected pitch information while second dividing said acoustic signal into second single sound segments on the basis of power changes in said power information;

third dividing said acoustic signal into third single sound segments on the basis of both said first and second single sound segments;

identifying musical intervals in said acoustic signal by matching each of said third single sound segments to one of said predetermined frequencies marking the boundaries of the absolute musical interval axis;

fourth dividing said acoustic signal again into fourth single sound segments by combining adjacent third single sound segments which are matched to the same predetermined marking frequency;

determining a key inherent in said acoustic signal on the basis of the pitch information extracted in said extracting pitch information step;

correcting the matching of said fourth dividing step using said determined key;

fifth dividing said acoustic signal again into fifth single sound segments by combining adjacent third single sound segments which are matched to the same predetermined marking frequency;

determining a time and tempo inherent in said acoustic signal on the basis of said corrected segment information; and

compiling musical score data from the fifth single sound segments, the predetermined marking frequency on the absolute musical interval axis to which each of the fifth single sound segments is matched, the key, the time and the tempo.
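
The fourth and fifth dividing steps of claim 1 both combine adjacent segments that were matched to the same marking frequency. A minimal Python sketch of that merging follows. The tuple layout `(start, end, note)` and the name `merge_same_note` are assumptions made for illustration; the claim does not prescribe a data format.

```python
def merge_same_note(segments):
    """Merge adjacent segments matched to the same note.

    segments: list of (start, end, note) tuples, sorted by time, where
    `end` is exclusive and `note` is a hypothetical integer note index
    on the absolute musical interval axis.
    """
    merged = []
    for start, end, note in segments:
        # Extend the previous segment when it touches this one
        # and carries the same note; otherwise start a new segment.
        if merged and merged[-1][2] == note and merged[-1][1] == start:
            merged[-1] = (merged[-1][0], end, note)
        else:
            merged.append((start, end, note))
    return merged


third_segments = [(0, 4, 60), (4, 7, 60), (7, 9, 62), (9, 12, 62), (12, 15, 60)]
fourth_segments = merge_same_note(third_segments)
print(fourth_segments)
```

Segments with the same note that are not adjacent in time (the first and last in the example) stay separate, since they represent distinct sounds.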

2. The method for transcribing music of claim 1, further comprising the step of:

eliminating noise from and interpolating said extracted pitch and power information, the noise eliminating and interpolating step being performed after said step of extracting pitch and power information and before said step of correcting said pitch information.

3. The method for transcribing music of claim 1, wherein said second dividing step comprises the steps of:

comparing said power information to a predetermined value and dividing said acoustic signal into a first section larger than said predetermined value while recognizing said first section as an effective section and also dividing said acoustic signal into a second section smaller than said value while recognizing said second section as an invalid section;

extracting a point of change where said power information rises with respect to said effective section;

dividing said effective segment into smaller parts at said point of change;

measuring the length of said segments of both of said effective and invalid sections; and

connecting any segment with a length shorter than a predetermined length to the preceding segment to form one segment.
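
The power-based segmentation of claim 3 can be sketched roughly as follows in Python. This is only an illustration under stated assumptions: `power` is a list of per-frame power values, and the parameters `threshold`, `rise` (the minimum frame-to-frame increase treated as a "point of change where the power rises"), and `min_len` are hypothetical names; the claim does not give concrete values or a rise criterion.

```python
def segment_by_power(power, threshold, rise, min_len):
    """Return a list of (start, end) frame ranges, with `end` exclusive."""
    n = len(power)
    if n == 0:
        return []
    bounds = [0]
    for i in range(1, n):
        effective_now = power[i] > threshold
        effective_prev = power[i - 1] > threshold
        if effective_now != effective_prev:
            # Boundary between an effective and an invalid section.
            bounds.append(i)
        elif effective_now and power[i] - power[i - 1] >= rise:
            # Inside an effective section: cut where a rise begins
            # (skip frames that merely continue an ongoing rise).
            already_rising = i >= 2 and power[i - 1] - power[i - 2] >= rise
            if not already_rising:
                bounds.append(i)
    bounds.append(n)
    segments = [(bounds[k], bounds[k + 1]) for k in range(len(bounds) - 1)]

    # Connect any segment shorter than min_len to the preceding segment.
    merged = []
    for start, end in segments:
        if merged and end - start < min_len:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


power = [0, 0, 0, 5, 5, 5, 5, 2, 2, 6, 6, 6, 6, 0, 0, 0]
segments = segment_by_power(power, threshold=1, rise=3, min_len=3)
print(segments)
```

In the example the effective section from frame 3 to frame 12 is split at frame 9, where the power rises again after a dip, which is the behaviour the claim attributes to a new sound.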

4. The method for transcribing music of claim 1, wherein said second dividing step comprises the steps of:

comparing said power information to a predetermined value and dividing said acoustic signal into a first section larger than said predetermined value while recognizing said first section as an effective section and also dividing said acoustic signal into a second section smaller than said value while recognizing said second section as an invalid section;

extracting a point of change where said power information rises with respect to said effective section; and

dividing said acoustic signal on the basis of said extracted point of change.

5. The method for transcribing music of claim 1, wherein said second dividing step comprises the steps of:

dividing said acoustic signal into a first section larger than a predetermined value while recognizing said first section as an effective section and into a second section smaller than said predetermined value while recognizing said second section as an invalid section;

measuring the length of both said first and second sections; and

connecting any segment with a length shorter than a predetermined length to the preceding segment.

6. The method for transcribing music of claim 1, wherein said second dividing step comprises the steps of:

extracting a point of change where said power information rises; and

dividing said acoustic signal with respect to said point of change.

7. The method for transcribing music of claim 1, wherein said second dividing step comprises the steps of:

extracting a point of change where said power information rises;

dividing said acoustic signal with respect to said point of change; and

connecting any segment with a length shorter than a predetermined length to the preceding segment.

8. The method for transcribing music of claim 1 wherein the acoustic signal is sampled into individual sampling points, wherein said first dividing step comprises the steps of:

analyzing said individual sampling points of the acoustic signal using said extracted pitch information to determine a length of a series of said sampling points in which the pitch of said sampling points remains in a range;

detecting a section in which said determined length of said series exceeds a predetermined value;

identifying the sampling point beginning the series having the maximum series length of said detected sections to be the typical point;

detecting the amount of the variation in said pitch information between adjacent typical points with respect to the individual sampling points between them when the difference in said pitch information at two adjacent typical points exceeds a predetermined value; and

dividing said acoustic signal at one of said sampling points between adjacent typical points where the amount of variation between said one sampling point and an adjacent sampling point is maximum.

9. The method for transcribing music of claim 1, wherein said third dividing step comprises the steps of:

determining a standard length of a note corresponding to a predetermined duration of time on the basis of the length of each of said first single sound segments divided in said first dividing step; and

dividing each of said first single sound segments on the basis of said determined standard length and dividing said single sound segments again which have lengths longer than said predetermined duration of time of said note.

10. The method for transcribing music of claim 1, wherein said step of identifying musical intervals comprises the steps of:

calculating the differences in pitch between the pitches of each of said third single sound segments and said predetermined frequencies of said absolute musical interval;

detecting the smallest difference; and

recognizing the musical interval of said third single sound segment to be at said predetermined frequency on said absolute musical interval axis in relation to which the pitch of said third single sound segment has said smallest difference.

11. The method for transcribing music of claim 1, wherein said step of identifying musical intervals comprises the steps of:

calculating an average value of all said pitch information of each of said third single sound segments; and

recognizing the musical interval of each of said third single sound segments to be at the predetermined frequency on said absolute musical interval axis in relation to which said calculated average pitch value of said third single sound segment is closest.

12. The method for transcribing music of claim 1, wherein said step of identifying musical intervals comprises the steps of:

extracting an intermediate value of said pitch information of each of said third single sound segments; and

recognizing the musical interval of each of said third single sound segments to be at the predetermined frequency on said absolute musical interval axis in relation to which said intermediate value is closest.

13. The method for transcribing music of claim 1, wherein said step of identifying musical intervals comprises the steps of:

extracting the most frequent value of said pitch information of each of said third single sound segments; and

recognizing the musical interval of each of said third single sound segments to be at the predetermined frequency on said absolute musical interval axis in relation to which said most frequent value is closest.
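
Claims 11 to 13 differ only in the representative pitch value chosen for a segment (average, intermediate, or most frequent value) before it is matched to the closest predetermined frequency. The Python sketch below illustrates this under several assumptions that are not in the claims: pitch values are expressed in cents, the marking frequencies lie at multiples of 100 cents (equal-tempered semitones), the "intermediate value" is read as the median, and values are rounded to whole cents before the most frequent one is taken.

```python
import math
import statistics


def nearest_semitone(cents):
    """Index of the closest semitone, with semitones at multiples of 100 cents."""
    return int(math.floor(cents / 100.0 + 0.5))


def identify_interval(segment_cents, method="mean"):
    """Match one segment's pitch track to a semitone index."""
    if method == "mean":        # claim 11: average value
        representative = statistics.mean(segment_cents)
    elif method == "median":    # claim 12: intermediate value (assumed median)
        representative = statistics.median(segment_cents)
    elif method == "mode":      # claim 13: most frequent value
        representative = statistics.mode([round(c) for c in segment_cents])
    else:
        raise ValueError("unknown method: " + method)
    return nearest_semitone(representative)


# A segment sung near 400 cents with one outlying frame.
segment = [395, 402, 410, 398, 402, 460]
results = {m: identify_interval(segment, m) for m in ("mean", "median", "mode")}
print(results)
```

The median and the most frequent value are less sensitive to the outlying frame than the average, which may matter when the neighbouring note is only a semitone away.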

14. The method for transcribing music of claim 1, wherein said step of identifying musical intervals comprises the steps of:

extracting the peak point pitch value of said power information for each of said third single sound segments; and

recognizing the musical interval of each of said third single sound segments to be at the predetermined frequency on said absolute musical interval axis in relation to which said peak point pitch value is closest.

15. The method for transcribing music of claim 1, wherein the acoustic signal is sampled into individual sampling points, wherein the step of identifying musical intervals comprises the steps of:

analyzing said individual sampling points of the acoustic signal using said extracted pitch information to determine a series for each of said sampling points in which the pitch of said sampling points in the series remains in a range;

identifying which of said series in each of said third single sound segments has the longest length;

finding an analytical point for said series of longest length in each of said third single sound segments, the analytical point being the sampling point about which the pitches of all other sampling points fall within half of said range; and

identifying each of said third single sound segments with a predetermined pitch of the absolute musical interval axis by matching the pitch of the analytical point to the closest predetermined pitch on the absolute musical interval axis.

16. The method for transcribing music of claim 1, wherein said step of identifying musical intervals comprises the steps of:

extracting segments with lengths lower than a predetermined value;

extracting segments which have changes in pitch information of a particular constant inclination;

detecting the differences in pitch between the identified musical interval of each of said extracted segments and adjacent segments;

identifying the musical interval of both the extracted segment and the adjacent segment to be the predetermined marking frequency of the absolute musical interval axis which is closest to either of the extracted segment and the adjacent segment which is smaller than a predetermined value as an actual musical interval.

17. The method for transcribing music of claim 1, wherein said step of identifying musical intervals comprises the steps of:

extracting segments of said acoustic signal which begin and end according to a half step above and a half step below each of the predetermined frequencies of the absolute musical interval axis;

classifying totals of each of said extracted segments in said acoustic signal which corresponds to the same predetermined frequency on the absolute musical interval axis; and

identifying the musical interval of each of said segments in accordance with said classified totals.

18. The method for transcribing music of claim 1, wherein said key determining step comprises the steps of:

classifying totals of said pitch information with respect to the absolute musical interval axis;

extracting a frequency of occurrence of each of said predetermined frequencies on the absolute musical interval axis;

calculating product sums of predetermined weighting coefficients and said extracted frequency of occurrence of each of said predetermined frequencies on the absolute musical interval axis, a different calculation being performed for each musical key; and

identifying the key of the acoustic signal to be the particular musical key resulting in the maximum product sum calculation.
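
The product-sum key determination of claim 18 can be illustrated with the following Python sketch. The weighting coefficients are not given in this part of the text, so the sketch assumes a hypothetical weighting of 1 for tones of the major scale of a candidate key and 0 for all other tones, and it considers major keys only. Notes are given as pitch classes 0 to 11 with 0 = C, which is also an assumption.

```python
MAJOR_SCALE = (0, 2, 4, 5, 7, 9, 11)   # scale degrees in semitones above the tonic
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def determine_key(pitch_classes, weights=None):
    """Return (tonic, scores) where tonic maximizes the product sum."""
    if weights is None:
        # Hypothetical coefficients: 1 for scale tones, 0 otherwise.
        weights = [1.0 if degree in MAJOR_SCALE else 0.0 for degree in range(12)]

    # Frequency of occurrence of each pitch class.
    counts = [0] * 12
    for pc in pitch_classes:
        counts[pc % 12] += 1

    # One product sum per candidate key: weights are rotated to the tonic.
    scores = []
    for tonic in range(12):
        scores.append(sum(weights[(pc - tonic) % 12] * counts[pc] for pc in range(12)))

    best = max(range(12), key=lambda t: scores[t])
    return best, scores


melody = [7, 9, 11, 0, 2, 4, 6, 7]     # G A B C D E F# G
tonic, scores = determine_key(melody)
print(NOTE_NAMES[tonic], scores[tonic])
```

With these flat weights, a melody that omits the tones distinguishing neighbouring keys produces tied scores, and `max` then returns the first tied key; graded coefficients would be needed to break such ties.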

19. The method for transcribing music of claim 1, wherein said step of extracting pitch information comprises the steps of:

converting said acoustic signal into digital form;

calculating an autocorrelation function of said acoustic signal in the digital form;

detecting an amount of deviation giving the maximum of the local maximum for said calculated autocorrelation functions by an amount of deviation other than zero;

detecting an approximate curve through which said autocorrelation functions of a plurality of sampling points including that giving said amount of deviation pass;

determining an amount of deviation resulting in the local maximum of said autocorrelation on said calculated approximate curve; and

detecting a pitch frequency in accordance with said determined amount of deviation.
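
The pitch extraction of claim 19 refines the lag of the autocorrelation peak by fitting a curve through neighbouring autocorrelation values. The Python sketch below is one possible reading, not the patented implementation: it assumes the "approximate curve" is a parabola through three adjacent lags, and the search range `fmin` to `fmax` and the function names are hypothetical.

```python
import math


def autocorr(x, lag):
    """Unnormalized autocorrelation of the sequence x at a given lag."""
    return sum(x[i] * x[i + lag] for i in range(len(x) - lag))


def estimate_pitch(x, fs, fmin=80.0, fmax=1000.0):
    """Estimate the pitch frequency (Hz) of the samples x taken at rate fs."""
    lo = max(2, int(fs / fmax))          # smallest lag searched (never zero)
    hi = int(fs / fmin)                  # largest lag searched
    if len(x) <= hi + 1:
        raise ValueError("signal too short for the requested fmin")

    r = {lag: autocorr(x, lag) for lag in range(lo - 1, hi + 2)}
    best = max(range(lo, hi + 1), key=lambda k: r[k])

    # Parabola through (best-1, best, best+1); its vertex gives a
    # fractional lag between the integer sampling points.
    a, b, c = r[best - 1], r[best], r[best + 1]
    denom = a - 2.0 * b + c
    offset = 0.5 * (a - c) / denom if denom < 0 else 0.0
    return fs / (best + offset)


fs = 8000
signal = [math.sin(2 * math.pi * 220.0 * n / fs) for n in range(1024)]
pitch = estimate_pitch(signal, fs)
print(round(pitch, 1))
```

Without the interpolation the estimate is limited to `fs / lag` for integer lags, so the resolution depends on the sampling cycle; the fitted curve improves the resolution without raising the sampling frequency.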

20. The method for transcribing music of claim 1, wherein said step of extracting pitch information comprises the steps of:

converting said acoustic signal into digital form;

calculating an autocorrelation function of said acoustic signal in the digital form;

detecting a pitch information in accordance with the maximum information of said calculated autocorrelation function;

judging whether the local maximum point of said autocorrelation function exists approximate to two-times of the largest frequency component of said detected pitch information; and

outputting pitch information corresponding to said local maximum if the result of said judging is positive.

21. The method for transcribing music of claim 1, wherein said step of correcting said pitch information comprises the steps of:

classifying totals of said pitch information;

detecting a deviation from the absolute musical interval axis using said classified totals; and

shifting the pitch of said pitch information by the amount of said detected deviation.
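
The pitch correction of claim 21 can be sketched as follows in Python. The sketch assumes that pitch values are expressed in cents relative to a reference on the absolute axis, that the absolute axis has marks every 100 cents, and that the "classified totals" are a histogram of each frame's deviation from its nearest mark; the bin width and function name are hypothetical.

```python
from collections import Counter


def correct_tuning(pitches_cents, bin_width=10):
    """Shift all pitches by the most common deviation from the absolute axis.

    Returns (corrected_pitches, shift), with shift in cents.
    """
    # Deviation of each frame from its nearest 100-cent mark, in [-50, 50).
    deviations = [((p + 50.0) % 100.0) - 50.0 for p in pitches_cents]

    # Classify totals: histogram of deviations in bins of bin_width cents.
    histogram = Counter(round(d / bin_width) for d in deviations)
    peak_bin = max(histogram, key=histogram.get)
    shift = peak_bin * bin_width

    # Shift every pitch value by the detected deviation.
    return [p - shift for p in pitches_cents], shift


# A singer who is consistently about 20 cents sharp.
sung = [20, 120, 221, -79, 318, 22]
corrected, shift = correct_tuning(sung)
print(shift, corrected)
```

This simple histogram does not treat deviations near -50 and +50 cents as neighbours, so a singer who is close to a quarter tone off would need a circular histogram instead.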

22. An apparatus for transcribing music, comprising:

means for inputting an acoustic signal;

means for amplifying said inputted acoustic signal;

means for converting the analog acoustic signal into digital form;

means for processing said digital acoustic signal for extracting pitch information and power information;

means for storing the processing program;

means for controlling said signal processing program; and

means for displaying the transcribed music,

wherein said means for amplifying, said means for converting, and said means for processing are formed in a hardware construction.
Description
 


BACKGROUND OF THE INVENTION

The present invention relates to automatically transcribing music (vocal music, vocal humming, and sounds of musical instruments) into a musical score.

In such an automatic music transcription system, it is necessary to detect the basic items of information in musical scores: sound lengths, musical intervals, keys, times, and tempos.

Generally, since acoustic signals are the kind of signals which contain repetitions of fundamental waveforms in continuum, it is not possible immediately to obtain the above-mentioned items of information.

Therefore, the present applicants have already proposed an automatic music transcription system as disclosed, for example, in Unexamined Patent Application No. 62-178409.

This automatic music transcription system is shown in FIG. 1. The system is provided with autocorrelation analyzing means 14 for converting hummed vocal sound signals 11 into digital signals by means of analog/digital (A/D) converter 12. The digitized sound is called vocal sound data 13. Pitch information and sound power information 15 is then extracted from the vocal sound data 13. Segmenting means 16 divides the input song or hummed sounds into a plural number of segments on the basis of the sound power information. Musical interval identifying means 17 identifies the musical interval on the basis of the afore-mentioned pitch data with respect to each of the segments as established by the afore-mentioned segmenting means. Key determining means 18 determines the key of the input song or hummed vocal sounds on the basis of the musical interval as identified by the afore-mentioned musical interval identifying means. Tempo and time determining means determines the tempo and time of the input song or hummed vocal sounds on the basis of the segments established by division by the afore-mentioned segmenting means. Musical score data compiling means 110 prepares musical score data on the basis of the output of the afore-mentioned segmenting means, musical interval identifying means, key determining means, and tempo and time determining means. Musical score data outputting means 111 generates musical score data 112 prepared by the afore-mentioned musical score compiling means 110.

It is to be noted in this regard that such acoustic signals as those of vocal sounds in songs, hummed voices, and musical instrument sounds consist of repetitions of fundamental waveforms. In an automatic music transcription system for transforming such acoustic signals into musical score data, it is necessary first to extract for each analytical cycle the repetitive frequency of the fundamental waveform in the acoustic signal. This frequency is hereinafter referred to as "the pitch frequency". The corresponding cycle is called "the pitch cycle." This "pitch" information is taken into account, in order accurately to determine various kinds of information on such items as musical interval and sound length in acoustic signals.

Two extracting methods, frequency analysis and autocorrelation analysis, have been developed in the fields of vocal sound synthesis and vocal sound recognition. Autocorrelation analysis has hitherto been employed because it extracts pitch without being affected by noises in the environment and because it permits easy processing.

In the automatic music transcription system mentioned above, the system calculates the autocorrelation function after it converts acoustic signals into digital signals. Therefore, an autocorrelation function can be calculated for each analytical cycle.

Pitch extraction accuracy is similarly dependent upon the sampling cycle. If the resolution of a pitch so extracted is low, then the musical interval and sound length determined by the processes described later will have a low degree of accuracy.

It is conceivable to use a higher frequency for sampling, but such an approach is liable to result in the inability of the system to perform real-time processing, as well as a larger-sized, more expensive, automatic music transcription system apparatus. The disadvantages are a consequence of the increase in the amount of data processed in arithmetic operations such as the autocorrelation function.

Acoustic signals have the characteristic feature that their power is augmented immediately after a change in sound. This feature is utilized in the segmentation of the acoustic signal on the basis of power information.

Unfortunately, acoustic signals, particularly those appearing in songs sung by a man, do not necessarily take any specific pattern in the change of their power information. Songs have fluctuations in relation to the pattern of change. In addition, the sound to be transcribed also often contains abrupt sounds, such as outside noises. In these circumstances, a simple segmentation of sound with attention paid to the change in the power information has not necessarily led to any good division of individual sounds.

In this regard, it is noted that acoustic signals generated by a man are not stable in sound length, either. That is, such signals fluctuate considerably in pitch. This has been an obstacle to good segmentation based on pitch information.

Thus, in view of the fluctuations existing in pitch information, conventional systems sometimes treat two or more sounds as a single segment.

With existing transcription equipment, even sounds generated by musical instruments do not readily lend themselves to segmentation based on pitch information. This shortcoming is due to ambient noises intruding into the pitch information after capture by the acoustic signal input apparatus for converting acoustic signals into electrical signals.

When musical intervals, times, tempos, etc. are determined on the basis of sound segments (sound length), the process of segmentation becomes a very important factor in the preparation of musical score data. A low accuracy of segmentation reduces the accuracy of the ultimately developed musical score data. A high initial accuracy of segmentation is therefore desired when final segmentation utilizes the results of the power information. A high initial accuracy is also desired when final segmentation utilizes the results of both pitch information segmentation and the results of power information segmentation.

Acoustic signals, particularly those acoustic signals uttered by a man, are not stable in their musical interval. These signals have considerable fluctuations in pitch even when the same pitch (one tone) is intended. Accordingly, it is very difficult to identify musical intervals in such signals.

When a transition occurs from one sound to another, it often happens that a smooth transition is not made to the pitch of the following sound. Pitch fluctuations occur before and after the transition. Consequently, the segments on either side are often mistaken for another sound segment. The result is that sound segments with pitch transitions are often identified as belonging to a different pitch level in the identification of a musical interval.

In order to explain this in specific terms, methods permitting simplicity in arithmetic operation are considered for the automatic music transcription system mentioned above. For example, a given sound can be identified with the pitch closest on the absolute axis to the average value of the pitch information within the segment. The sound can also be identified with the pitch closest on the absolute axis to the median value of the pitch information of the segment.

With a method like this, it is possible to identify the musical interval well when the interval difference between two adjacent sounds is a whole tone, for example do and re on the C-major scale. But if the difference between two adjacent sounds is a semitone, for example mi and fa on the C-major scale, the musical interval may sometimes be identified inaccurately. For example, a sound intended to be mi on the C-major scale can be identified as fa.

In addition to sound length, the musical interval is a fundamental element. It is therefore necessary to identify the interval accurately. If it cannot be identified accurately, the accuracy of the resulting musical score data will be low.

The key, on the other hand, is not merely an element of musical score data. The key gives an important clue to the determination of a musical interval. A key has a certain relationship to a musical interval and to the frequency of occurrence of a musical interval. In improving the accuracy of the musical interval, it is desirable to determine the key and to review the identified musical interval.

Furthermore, as mentioned above, the musical intervals of acoustic signals, particularly those of vocal music, deviate from the absolute musical interval. The greater the deviation, the more inaccurate the musical interval identified on the musical interval axis. The deviation of the musical intervals in vocal music heretofore has resulted in lower accuracy in music transcription.

In summary, the automatic music transcription system and apparatus disclosed in the present applicants' published patent application No. 62-178409 may generate musical score data with low accuracy. It has therefore not found widespread practical use.

SUMMARY OF THE INVENTION

The present invention has been made in consideration of the problems mentioned hereinabove. Therefore, a primary object of the invention is to provide a practically usable automatic music transcription system and apparatus which improves the accuracy of the final musical score data.

Another object of the present invention is to provide an automatic music transcription method and apparatus which further improves the accuracy of the final musical score data by segmentation based on power information segmentation and pitch information segmentation. This accuracy is to be achieved without being influenced by fluctuations in acoustic signals or abrupt intrusions of outside sounds.

The present invention is a method of identifying musical intervals which both identifies musical scales with accuracy and also provides for an automatic music transcription system for further improving the accuracy of the final musical score data.

Still another object of the present invention is to provide an automatic music transcription method and apparatus which further improves the accuracy of the final musical score data by obtaining more accurate information on the musical interval. The more accurate musical interval is achieved through correction of the pitch of segments (identified with musical intervals whose pitch differs from those pitches intended by the singer due to pitch fluctuations occurring at the time of transition from one sound to the next). The pitch of the segment is corrected with reference to musical interval information on the preceding segment and on the following segment.

Still another object of the present invention is to provide an automatic music transcription method and apparatus capable of accurately determining the key of acoustic signals.

Still another object of the present invention is to provide an automatic music transcription method and apparatus capable of detecting the amount of deviation of the musical interval axis of an acoustic signal in relation to the axis of the absolute musical interval, correcting the pitch information in proportion to the detected deviation, and making it possible to compile musical score data more accurately in the subsequent process.

Still another object of the present invention is to provide a pitch extracting method and pitch extracting apparatus capable of extracting the pitch of an acoustic signal with high accuracy without employing a higher sampling frequency.

In order to attain these and other objects, the automatic music transcription method according to the present invention involves extracting pitch information and power information from the input acoustic signal, correcting the pitch information in proportion to the deviation of the musical interval axis of the acoustic signal from the absolute musical interval axis, dividing the acoustic signal into single-sound segments on the basis of the corrected pitch information and on the basis of changes in the power information, making more detailed divisions of the acoustic signal on the basis of both sets of segment information, identifying the musical intervals of the individual segments with reference to the pitch information, dividing the acoustic signal again into single-sound segments on the basis of whether or not the identified musical intervals of consecutive segments are identical, determining the key of the acoustic signal on the basis of the extracted pitch information, correcting the prescribed musical interval on the musical scale for the determined key on the basis of the pitch information, determining the time and tempo of the acoustic signal on the basis of the segment information, and finally compiling musical score data from the information on the determined musical interval, sound length, key, time, and tempo.

Similarly, the automatic music transcription system according to the present invention comprises a means for extracting from the input acoustic signal the pitch information and the power information thereof, a means for correcting the pitch information in accordance with the amount of deviation of the musical interval for the acoustic signal in relation to the axis of the absolute musical interval, a means for dividing the acoustic signal into single-sound segments on the basis of the corrected pitch information, a means for dividing the acoustic signal into single-sound segments on the basis of the changes in the power information, a means for making further divisions of the acoustic signal into segments on the basis of both of these sets of segment information thus made available, a means for identifying the musical intervals for the acoustic signals in the individual segments along the axis of the absolute musical interval, a means for dividing the acoustic signal again into single-sound segments on the basis of whether or not the musical intervals of the identified segments in continuum are identical, a means for determining the key for the acoustic signal on the basis of the extracted pitch information, a means for correcting the prescribed musical interval on the determined key on the basis of the pitch information, a means for determining the time and tempo of the acoustic signal on the basis of the segment information, and a means for compiling musical score data from the information on the musical interval, sound length, key, time and tempo so determined.

The automatic music transcription system according to the present invention is further characterized by a means for inputting acoustic signals, a means for amplifying the acoustic signals thus input, a means for converting the amplified analog signals into digital signals, a means for extracting the pitch information by performing autocorrelation analysis of the digital acoustic signals and extracting the power information by performing the operations for finding the square sum (the means for extracting the pitch information and the power information being constructed in hardware), a storage means for keeping in memory the prescribed music-transcribing procedure, a controlling means for executing the music-transcribing procedure kept in memory in the storage means, a means for starting the processing by the controlling means, and a means for generating the output of the musical score data obtained by the processing.

The present invention has made it possible to provide an automatic music transcription system with sufficient capabilities for its practical application owing to the extremely significant improvement in its accuracy in generating the final musical score data. This is so because the system accurately extracts pitch information and power information from acoustic signals such as vocal songs, humming voices, and musical instrument sounds, divides the acoustic signals accurately into single-sound segments on the basis of such information, and identifies the musical interval and the key with high accuracy. These performance features therefore have proven effective in reducing the influence of noise and power fluctuations in the processing of acoustic signals.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram illustrating the automatic music transcription system leading to the present invention.

FIG. 2 is a block diagram illustrating the first hardware embodiment of the automatic music transcription system according to the present invention.

FIG. 3 is a flow chart showing the automatic music transcription process in the first embodiment of the present invention.

FIG. 4 is a summary flow chart illustrating the segmentation process based on the power information pertinent to the present invention.

FIG. 5 is a flow chart illustrating an example of the segmentation process in greater detail.

FIG. 6 is a characteristic curve chart illustrating one example of segmentation by such a process.

FIG. 7 is a summary flow chart illustrating another example of the segmentation process based on the power information according to the present invention.

FIG. 8 is a flow chart illustrating the segmentation process in greater detail.

FIG. 9 is a flow chart illustrating an example of the segmentation process based on the power information according to the present invention.

FIG. 10 is a characteristic curve chart presenting the chronological change of the power information together with the results of the segmentation.

FIG. 11 is a flow chart illustrating an example of the segmentation process based on the power information according to the present invention.

FIG. 12 is a characteristic curve chart presenting the chronological changes of the power information and those of the rise extracting functions, together with the results of the segmentation.

FIG. 13 and FIG. 14 are flow charts each illustrating an example of the segmentation process based on the power information according to the present invention.

FIG. 15 is a characteristic curve chart presenting the chronological changes of the power information and the rise extracting functions, together with the results of the segmentation.

FIG. 16 and FIG. 17 are flow charts each illustrating an example of the segmentation process based on the pitch information according to the present invention.

FIG. 18 is a schematic drawing providing an explanation of the length of the series.

FIG. 19 is a flow chart illustrating the reviewing process for the segmentation according to the present invention.

FIG. 20 is a schematic drawing provided for an explanation of the reviewing process.

FIG. 21 is a flow chart illustrating the musical interval identifying process according to the present invention.

FIG. 22 is a schematic drawing providing an explanation of the distance of the pitch information to the axis of the absolute musical interval in each segment.

FIG. 23 is a flow chart illustrating an example of the musical interval identifying process according to the present invention.

FIG. 24 is a schematic drawing illustrating one example of such a musical interval identifying process.

FIG. 25 is a flow chart illustrating an example of the musical interval identifying process according to the present invention.

FIG. 26 is a schematic drawing illustrating one example of such a musical interval identifying process.

FIG. 27 is a flow chart illustrating one example of the musical interval identifying process according to the present invention.

FIG. 28 is a schematic drawing showing one example of such a musical interval identifying process.

FIG. 29 is a flow chart illustrating an example of the process for correcting the identified musical interval according to the present invention.

FIG. 30 is a schematic drawing illustrating one example of the correction of such an identified musical interval.

FIG. 31 is a flow chart illustrating an example of the musical interval identifying process according to the present invention.

FIG. 32 is a schematic drawing illustrating one example of such a musical interval identifying process.

FIG. 33 is a flow chart illustrating an example of the musical interval identifying process according to the present invention.

FIG. 34 is a chart for explaining the length of the series applicable to the present invention.

FIG. 35 is a schematic drawing illustrating one example of such a musical interval identifying process.

FIG. 36 is a flow chart illustrating an example of the process for correcting the identified musical interval according to the present invention.

FIG. 37 is a schematic drawing explaining such a correcting process for the identified musical interval.

FIG. 38 is a flow chart illustrating an example of the key determining process according to the present invention.

FIG. 39 is a table presenting some examples of the weighting coefficients for each musical scale established in accordance with each key.

FIG. 40 is a flow chart illustrating an example of the key determining process according to the present invention.

FIG. 41 is a flow chart illustrating an example of the tuning process according to the present invention.

FIG. 42 is a histogram showing the state of distribution of the pitch information.

FIG. 43 is a flow chart showing an example of the pitch extracting process according to the present invention.

FIG. 44 is a schematic drawing presenting the autocorrelation function curves to be used for the pitch extracting process.

FIG. 45 is a flow chart illustrating an example of the pitch extracting process according to the present invention.

FIG. 46 is a schematic drawing showing the autocorrelation function curves used in the pitch extracting process.

FIG. 47 is a block diagram illustrating the second embodiment of the construction of the automatic music transcription system.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

Detailed descriptions of the various embodiments of the present invention with reference to the accompanying drawings are given below.

FIG. 2 is a block diagram illustrating the construction of the automatic music transcription system to which the first embodiment according to the present invention is applied. FIG. 3 is a flow chart illustrating the processing procedure for the system.

In FIG. 2, the Central Processing Unit (CPU) 1 performs overall control of the entire system and executes the music score processing program shown in FIG. 3. This program is stored in the main storage device 3. CPU 1 and main storage device 3 are connected to bus 2, to which keyboard input device 4, display unit 5 serving as the output device, auxiliary memory device 6 for use as working memory, and analog/digital converter 7 are also connected.

Acoustic signal input device 8, which is composed of a microphone, is connected to analog/digital converter 7. This acoustic signal input device 8 captures the acoustic signals of vocal songs and transforms them into electrical signals. The electrical signals are supplied to analog/digital converter 7.

CPU 1 begins the music transcription process when it receives a command to that effect entered on the keyboard input device 4. CPU 1 then executes the program stored in the main storage device 3, temporarily storing in the auxiliary memory device 6 the acoustic signals that have been converted into digital signals by the analog/digital converter 7. CPU 1 thereafter converts these acoustic signals into musical score data by executing the above-mentioned program so that the musical score data may be output as required.

After CPU 1 has input the acoustic signals, processing for musical score transcription occurs. This processing is described in detail with reference to the flow chart shown in terms of functional levels in FIG. 3.

First, CPU 1 extracts pitch information from the acoustic signals for each analytical cycle through autocorrelation analysis of the acoustic signals. CPU 1 also extracts power information for each analytical cycle by first processing the acoustic signals to find the square sum and then performing post-treatments, which may include noise elimination and an interpolation operation (Steps SP 1 and SP 2). Thereafter, CPU 1 calculates, with respect to the pitch information, the amount of deviation of the musical interval axis of the acoustic signal in relation to the axis of the absolute musical interval. This deviation is calculated on the basis of the distribution of the pitch information around the musical interval axis. CPU 1 then performs the tuning process (Step SP 3), which involves shifting the pitch information in proportion to the amount of deviation of the musical interval axis. In other words, the CPU corrects the pitch information to reduce the difference between the musical interval axis of the singer or musical instrument and the axis of the absolute musical interval.
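
Steps SP 1 to SP 3 can be sketched roughly as follows. This is a minimal illustration in Python and not the implementation of the embodiment: the function names, the 80 to 1000 Hz search range, the 440 Hz reference pitch, and the 1-cent histogram bins are all assumptions made for the example, and the post-treatments (noise elimination, interpolation) are omitted.

```python
import math

def frame_power(frame):
    """Power of one analytical cycle as the sum of squared samples."""
    return sum(x * x for x in frame)

def autocorr_pitch(frame, sample_rate, fmin=80.0, fmax=1000.0):
    """Estimate pitch in Hz from the strongest autocorrelation peak.

    fmin and fmax bound the lag search (assumed values).
    Returns 0.0 for a silent frame.
    """
    if frame_power(frame) == 0:
        return 0.0
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(len(frame) - 1, int(sample_rate / fmin))
    best_lag, best_val = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        val = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return sample_rate / best_lag if best_lag else 0.0

def hz_to_cents(freq, ref=440.0):
    """Pitch in cents relative to an assumed reference of 440 Hz."""
    return 1200.0 * math.log2(freq / ref)

def tune(cents_values, bins=100):
    """Shift pitch values so the dominant offset from the semitone grid is zero.

    The offset is taken from the peak of a histogram of pitch modulo
    100 cents (an assumed way of measuring the axis deviation).
    """
    width = 100.0 / bins
    hist = [0] * bins
    for c in cents_values:
        hist[int((c % 100.0) / width) % bins] += 1
    peak = max(range(bins), key=hist.__getitem__)
    offset = peak * width
    if offset > 50.0:          # treat large offsets as flat rather than sharp
        offset -= 100.0
    return [c - offset for c in cents_values], offset
```

In this sketch a singer who is consistently 20 cents sharp produces a histogram peak at 20, so every pitch value is shifted down by 20 cents before segmentation.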

Then, CPU 1 executes the segmentation process. This process divides the acoustic signals into single-sound segments, each of which has a continuous duration of pitch information. CPU 1 treats each resulting segment as indicating one musical interval. The CPU then executes the segmentation process again on the basis of the changes in the obtained power information (Steps SP 4 and SP 5). Each resulting set of segment information has continuous pitch. CPU 1 then calculates the standard lengths corresponding respectively to the time lengths of a half note, an eighth note, and so forth, and executes the segmentation process in further detail on the basis of these standard lengths (Step SP 6).
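
The two segmentation passes can be illustrated with the following Python sketch. It is a simplified illustration under stated assumptions, not the procedure of FIGS. 4 to 20: the function names, the 50-cent continuity tolerance, and the fixed power threshold are hypothetical, unvoiced frames are marked with `None`, and the review based on standard note lengths (Step SP 6) is replaced here by a plain intersection of the two segmentations.

```python
def segment_by_pitch(cents, tolerance=50.0):
    """Frame ranges (start, end) in which pitch stays continuous.

    A segment ends at an unvoiced frame (None) or when pitch jumps by
    more than `tolerance` cents from the previous frame.
    """
    segments, start = [], None
    for i, c in enumerate(cents):
        voiced = c is not None
        if start is not None and (not voiced or abs(c - cents[i - 1]) > tolerance):
            segments.append((start, i))
            start = None
        if voiced and start is None:
            start = i
    if start is not None:
        segments.append((start, len(cents)))
    return segments

def segment_by_power(power, threshold):
    """Frame ranges (start, end) during which power is at or above threshold."""
    segments, start = [], None
    for i, p in enumerate(power):
        if p >= threshold and start is None:
            start = i
        elif p < threshold and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(power)))
    return segments

def refine(pitch_segments, power_segments):
    """Keep every boundary from either segmentation by intersecting ranges."""
    out = []
    for a0, a1 in pitch_segments:
        for b0, b1 in power_segments:
            lo, hi = max(a0, b0), min(a1, b1)
            if lo < hi:
                out.append((lo, hi))
    return sorted(out)
```

Ranges are half-open, so `(1, 4)` covers frames 1, 2, and 3. Intersecting the two results means a frame belongs to a segment only if both the pitch and the power criteria accept it.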

CPU 1 then identifies the musical interval of each segment with the musical interval on the absolute musical interval axis to which the pitch information of that segment is considered to be closest. This determination is made on the basis of the pitch information of the segment obtained by segmentation. CPU 1 then executes the segmentation process once more on the basis of whether or not the identified musical intervals of consecutive segments are identical (Steps SP 7 and SP 8).
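
Steps SP 7 and SP 8 can be sketched in Python as below. This is an assumed, simplified reading and not the identifying process of FIGS. 21 to 37: the function names are hypothetical, pitch is taken to be in cents with semitones spaced 100 cents apart on the absolute axis, the segment is represented by its mean pitch, and only segments that touch each other are merged.

```python
def identify_interval(segment_cents):
    """Index of the semitone on the absolute axis nearest the mean pitch."""
    mean = sum(segment_cents) / len(segment_cents)
    return round(mean / 100.0)

def merge_identical(segments, intervals):
    """Join consecutive segments that were identified with the same interval.

    `segments` holds half-open (start, end) frame ranges in time order and
    `intervals` holds the semitone index identified for each of them.
    """
    merged = []
    for seg, note in zip(segments, intervals):
        if merged and merged[-1][1] == note and merged[-1][0][1] == seg[0]:
            # Same interval and no gap: extend the previous segment.
            merged[-1] = ((merged[-1][0][0], seg[1]), note)
        else:
            merged.append((seg, note))
    return merged
```

For example, two touching segments whose mean pitches are 1203 and 1195 cents both map to semitone 12 and are therefore merged into one single-sound segment.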

After that, CPU 1 finds the product sum of the frequency of occurrence of the musical interval.