Font Size: a A A

Music Similarity Comparison And Approximate Search

Posted on:2010-06-15Degree:MasterType:Thesis
Country:ChinaCandidate:L YeFull Text:PDF
GTID:2178360272997089Subject:Computer software and theory
Abstract/Summary:PDF Full Text Request
There are many MIDI files on the Internet,MIDI files is different from sample waveform files,which use MIDI message to record each note,it is equivalent to the preservation of the music on the computer,this allows these computer to identify at these notes to skip the analog signal the conversion process.Therefore,these MIDI files for the bulk of the analysis,comparative music data provided strong support.Typically,songs are searched through its name,author,artist,lyrics,such as keywords, but sometimes the user doesn't know above-mentioned information or ignorant confusion, for them only as a search keyword is the fragment of melody,then the melody-based search engine will help them find the target song.This paper completely builds a search engine which uses melody fragments as key words,including the download of data,data processing,search the three main modules.In the download module,the paper uses Heritrix and a self-implement spider program. Spider is an important part of search engine,it download page on the World Wide Web. The traditional spider start from one or a number of pages,it gets URLs from these pages, in the process of crawling web pages,the spider continuously extracted from the current page URL Add a new queue,until the system meet the conditions to stop.Heritrix is the Internet Archive's open-source,extensible,web-scale,archival-quality web crawler.The best of Heritrix is developers can extend its various components,to achieve their own crawling logic.This paper describes the configuration and manipulation of Heritrix,and uses its download a large number of MIDI files.However,Heritrix can't download the resources which need user to register and log in.For this problem,we designed a special spider to download dynamic pages,and it can analog user's log in.Nowadays the Internet, static pages are more and more less,in particular,information or resources are included in the larger web,have to use dynamic pages to display,because this paper is to download MIDI files,so during visiting the website which includes such resources,the steps to download each file are the same,that is,the user opens the order of dynamic pages are fixed,such as to open a page,click some links,and then in the new page open new links, until the page contains the resources is found.Therefore start from the home,users traverse all the pages,forming a tree structure.In this paper,the path of the tree stored in the XML file,using JDOM to parse the XML file,the tree will be passed on to the spider,spider program using HttpClient to download web pages and maintaining the client's Cookie,with HttpParser analysis of web content,in accordance with the tree path,the URLs in pages will be flitted or downloaded,finally,MIDI files are gotten. In each track of MIDI files,all notes are express with note on and note off message,in order to facilitate the analysis,this paper constructs the following objects with Java Sound API:ScoreTrack,Bar,TimeStamp,Note.These objects were included in the relationship. Among them,the Track,Bar and Note in the same meaning with the score,The TimeStamp is a collection which contains the Notes in the same time.This paper presents a method for auto main melody track identification.This method is mainly divided into four steps:1.Exclude the track that can't be melody.2.For each track,build a Boolean array with the length of the number of bars in that track,called characteristic array,for the identification of each bar in the track has the characteristics of melody.Analyze each bar,for the array assignment.If the invalid bar,the array element values recorded as false.3.Analyze the characteristics Boolean array to filter out unqualified tracks,add the remaining to candidate track set.4.By comparing the candidate tracks,select the main melody track.Experimental results show that the correct recognition rate of the method is 91%, compared with methods based on probability and statistics,this method eliminates manual marking steps,and for the canonical MIDI files which indeed contain the main melody track,the correct recognition rate reaches 96.8 percent,this method provides a strong support to the melody search engine.After extracting the melody,it is transformed to string and stored in database.This paper implements an approximate search algorithm,which is divided into two parts: multi-pattern search and check the edit distance.In the single pattern matching algorithm, BM algorithm is a classic matching algorithm,QS algorithm is a simplified form BM,full use of pattern unmatched.Aho-Corasick algorithm is a classic multi-pattern matching algorithm,this paper build a multi-pattern search algorithm base on QS algorithm,the Aho-Corasick algorithm is used to the match step of the QS algorithm,this can reduce multi string comparison to once.This paper obtains edit distance between strings by the dynamic programming.Finally,this paper discusses two approaches to compare music similarity.In music theory,there is no specific criteria for measuring the similarity,the so-called similarity is often from people's feeling,but because of the number of people heard the music are limited,so the job to look for similarity among large numbers of music works is fit by the computer.Starting from this point to this paper discussed these two methods,aims to narrow the scope of comparison.One is to extract the chord sequence of the work,anther is count the notes in each section.The central idea of these two methods are trying to convert the music segment fragment to a character,and extract patterns from character sequence,so the problem is converted to comparison between pattern strings.The experimental results show that the match the number of the first method is larger;the second is too small,this is because the similar requirement of the first is more lenient.Although both methods should not become a sufficient condition(in fact there is no sufficient condition exist) for similarity comparison,but the first method,when the chord sequences extracted are accurate,it can be a necessary condition to narrow the scope of comparison,has a reference value.
Keywords/Search Tags:melody search, main melody extract, music database, MIDI
PDF Full Text Request
Related items