Monaural Musical Melody Extraction Based On Computational Auditory Scene Analysis

Posted on: 2016-04-16
Degree: Master
Type: Thesis
Country: China
Candidate: B Y Li
GTID: 2298330467492079
Subject: Signal and Information Processing
Abstract/Summary:
Melody is defined as the fundamental frequency contour of the voice in music, and melody extraction underlies many research areas such as song structure analysis and query by humming. Computational auditory scene analysis (CASA) aims to segregate speech from a variety of background noise, so in theory it is a feasible approach to melody extraction. However, CASA requires mixture signals with a high SNR, which is why it has so far been used mainly to segregate voice from common noise. Music does not meet this high-SNR requirement, and the instrumental accompaniment resembles the voice far more than background noise does: it has a harmonic structure of its own and therefore interferes with melody extraction. How to apply CASA to extract melody from music is thus a worthwhile research topic. This thesis proposes a CASA-based melody extraction system. The main work is as follows:
(1) Two preprocessing methods for music signals are studied. Harmonic instruments and the higher harmonic components of music interfere with CASA-based melody extraction, and two preprocessing methods are used to address this. The first is harmonic and percussive signal separation (HPSS), which targets the interference from harmonic instruments. Because the harmonic and percussive components of a music signal are anisotropic on the spectrogram, HPSS is used to suppress the harmonic component and so make the subsequent processing easier. The second is a low-pass filter, which targets the higher harmonic components: given the characteristics of voice and instruments, low-pass filtering attenuates the high-frequency content of the music and raises the SNR. Experiments show that these preprocessing methods effectively improve the accuracy of melody extraction.
(2) A CASA-based melody extraction algorithm is studied. Music contains a large number of sound sources, which makes CASA-based melody extraction challenging, and several methods are proposed to deal with this. First, the signal is passed through an auditory periphery model, and a six-dimensional feature vector is extracted for each time-frequency (T-F) unit from the correlation and instantaneous frequency of the gammatone filter response and of its envelope. Second, three multi-layer perceptrons (MLPs) are used to find the most probable dominant frequency in each T-F unit, and this estimate is used to determine the ideal binary mask (IBM). Finally, the estimates and the IBM are used to obtain up to 2 dominant frequencies per frame, and melody contours are formed using short-term continuity. In the last stage, the algorithm iteratively refines the estimates of the pitch contours and the mask. These steps keep up to 2 pitches in each time frame; a single dominant pitch is then chosen from the two candidates using the temporal continuity of the signal, according to the decision algorithm proposed in this thesis. Experiments show that the algorithm extracts the melody of the music accurately.
(3) A pitch estimation method based on the harmonic energy ratio is studied. Voice and accompaniment signals have different energy distributions in the high-frequency region: the voice is clearly attenuated there, whereas the accompaniment is attenuated only slightly. For each possible pitch in the potential pitch range of the voice, the energy of the low-frequency harmonic components and of all harmonic components is computed, and then the ratio of the former to the latter. The initial pitch estimate for each time frame is obtained by sorting these energy ratios and filtering out false pitch candidates according to a decision rule. Finally, this estimate is used as the initial estimate for the melody extraction algorithm, which produces the final pitch contour. Experimental results show that using the energy ratio feature greatly improves melody extraction compared with using the acoustic features.
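The harmonic energy ratio in item (3) can be illustrated with a minimal sketch. It is not the thesis implementation. The 1000 Hz boundary between "low-frequency" and "all" harmonics, the 100 to 500 Hz candidate range, the nearest-bin lookup, and the names `harmonic_energy_ratio`, `rank_pitch_candidates` and `toy_frame` are all assumptions made for illustration. The decision rule that filters false candidates is not described in the abstract and is left out.

```python
import numpy as np


def harmonic_energy_ratio(power, freqs, f0, low_cutoff_hz=1000.0):
    """Low-frequency harmonic energy divided by total harmonic energy for candidate f0.

    power: power spectrum of one frame; freqs: frequency of each bin in Hz.
    low_cutoff_hz is an assumed boundary, not a value taken from the thesis.
    """
    # Harmonic frequencies f0, 2*f0, ... up to the top of the frequency grid.
    harmonics = np.arange(f0, freqs[-1] + 1e-9, f0)
    # Nearest spectrum bin for each harmonic.
    bins = np.abs(freqs[None, :] - harmonics[:, None]).argmin(axis=1)
    energy = power[bins]
    total = energy.sum()
    if total <= 0.0:
        return 0.0  # no harmonic energy at all for this candidate
    low = energy[harmonics <= low_cutoff_hz].sum()
    return float(low / total)


def rank_pitch_candidates(power, freqs, candidates, low_cutoff_hz=1000.0):
    """Return (f0, ratio) pairs sorted by descending energy ratio."""
    scored = [(float(f0), harmonic_energy_ratio(power, freqs, f0, low_cutoff_hz))
              for f0 in candidates]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def toy_frame():
    """Synthetic frame: a voice-like source at 200 Hz plus an accompaniment at 300 Hz."""
    freqs = np.arange(0.0, 4001.0, 10.0)  # 10 Hz bins from 0 to 4000 Hz
    power = np.zeros_like(freqs)
    for k in range(1, 21):   # voice-like: harmonic power decays as 1/k^2
        power[20 * k] += 1.0 / k**2
    for m in range(1, 14):   # accompaniment-like: harmonics do not decay
        power[30 * m] += 1.0
    return freqs, power


freqs, power = toy_frame()
ranked = rank_pitch_candidates(power, freqs, np.arange(100.0, 501.0, 10.0))
print(ranked[:3])  # highest-ratio pitch candidates for this frame
```

In this toy frame the source whose harmonics fade at high frequencies gets a larger ratio than the source with flat harmonics, which is the contrast between voice and accompaniment that the method relies on. On real music the ranking alone would not be enough, which is presumably why the thesis adds a decision rule to discard false candidates.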
Keywords/Search Tags: CASA, melody extraction, multi-layer perceptron (MLP), HPSS, harmonic energy ratio