Hierarchical Multiword Expression-Based Text Matching Research

Posted on: 2012-07-11
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zhao
Full Text: PDF
GTID: 2178330335460779
Subject: Control theory and control engineering
Abstract/Summary:
The rapid growth of information on the Internet has made information retrieval (IR) an important way of obtaining information. Keyword-based IR has been extensively researched and applied, but in many situations it cannot meet the growing demand for diverse kinds of information acquisition. Consider searches that pair two sides of supply and demand, such as job hunting: a more effective approach is to use the resume itself as the input and match it directly against the job-description texts in a job database. The retrieval problem is then no longer keyword-based matching against the retrieval source but text-based matching against it. Such texts contain many multiword expressions (MWEs), such as company names, position descriptions, place names, and other fixed phrases, which are crucial to text matching. This paper therefore proposes an MWE-based text representation and a corresponding text-matching technique to meet this kind of retrieval demand. Building on the MWE-based representation, the paper extends the minimum edit distance (MED) between two strings to a distance between two sets of strings, and proposes an MED-based measure of the similarity between two sets of MWEs, which is used to calculate the degree of matching between two documents. The matching algorithm is tested in a job-search system. Experiments show that the new measure performs better than cosine distance.
Keywords/Search Tags: multiword expression, text matching, text similarity, minimum edit distance
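
The idea of extending minimum edit distance from a pair of strings to a pair of string sets can be illustrated with a minimal Python sketch. The abstract does not state the thesis's actual formula, so everything specific here is an assumption made for illustration: normalizing edit distance by the longer string's length, scoring each MWE by its best match in the other set, averaging the two directions, and the function names edit_distance, string_similarity, directed_similarity and mwe_set_similarity. It should not be read as the author's method.

```python
from typing import Iterable


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insert, delete, substitute, each with cost 1."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def string_similarity(a: str, b: str) -> float:
    """Assumed normalization: 1 - MED / length of the longer string."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def directed_similarity(source: Iterable[str], target: Iterable[str]) -> float:
    """Average, over source MWEs, of the best similarity found in target."""
    source, target = list(source), list(target)
    if not source or not target:
        return 0.0
    best = [max(string_similarity(s, t) for t in target) for s in source]
    return sum(best) / len(best)


def mwe_set_similarity(a: Iterable[str], b: Iterable[str]) -> float:
    """Symmetric set-level score (assumed): mean of the two directed scores."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return 0.5 * (directed_similarity(a, b) + directed_similarity(b, a))


if __name__ == "__main__":
    # Hypothetical MWE sets extracted from a resume and a job description.
    resume = {"software engineer", "Beijing"}
    job = {"software engineers", "Beijing"}
    print(round(mwe_set_similarity(resume, job), 4))
```

Unlike a cosine score over exact term matches, this kind of measure gives partial credit to near-identical expressions such as "software engineer" and "software engineers". Whether that is the source of the improvement reported in the thesis is not stated in the abstract.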