The Research And Realization Of Text Retrieval Technology Based On Semantic Field

Posted on: 2013-10-21
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wang
Full Text: PDF
GTID: 2248330374490025
Subject: Computer application technology
Abstract/Summary:
Human society has entered the information age, and vast amounts of information are being produced at explosive speed. In this ever-changing sea of information, helping people find the information they are interested in more quickly, effectively, and accurately has become an important problem.

The main subject of this thesis is finding, in a specialized document database, the documents relevant to a text supplied in advance by the user. Unlike a general search engine, the user's retrieval request is a whole document rather than a simple keyword. The amount of information in the request therefore increases greatly, which makes keyword extraction harder. The difficulty lies in extracting keywords accurately enough to represent the documents that are to be checked and retrieved. The traditional vector space model (VSM) relies on keyword frequency and ignores the role of semantics in the text, so its results often fall short of what users expect.

This thesis uses semantic field theory, an important theory in modern semantics, with HowNet as a tool. The semantic relations among words in the document context are combined with the traditional VSM to resolve word ambiguity. Building various types of semantic fields makes keyword extraction more accurate and comprehensive and improves the recall and precision of the text retrieval system.

The specific work is as follows:

1. The internal structure of HowNet is analyzed in depth, and a program based on HowNet is designed.

2. Using HowNet, a new variable coefficient for computing homonym similarity is proposed, determined by the number of homonyms. Part of speech is also taken into account: the thesis argues that homonyms with different parts of speech contribute differently to word similarity, and it removes combinations of homonyms whose parts of speech differ. Experiments show that the improved method gives better results with lower complexity and higher efficiency.

3. Using HowNet, a new algorithm for semantic relevancy is proposed. Based on the transverse relationships formed when HowNet interprets sememes, the relationship between the sememes' interpretations is taken as the relevancy of the sememes: the similarities between the interpretations of the sememes are computed, and the maximum value is taken as the sememes' relevancy. Experimental results indicate that, for the same amount of computation, the relevancy computed by this method is effectively improved and matches human understanding much more closely.

4. Semantic field theory is applied to the vector space model. Constructing various types of semantic fields for the extracted keywords greatly improves retrieval recall and precision, and the search results are more in line with people's expectations...
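
The abstract does not give the thesis's actual formulas, so the following is only a minimal sketch of the general idea of combining semantic fields with a VSM. It assumes a toy hand-written lexicon in place of HowNet, plain term-frequency weights, and a fixed discount for field members. The names SEMANTIC_FIELDS, FIELD_WEIGHT, and expand_with_fields are hypothetical and do not come from the thesis or from any HowNet API.

```python
import math
from collections import Counter

# Hypothetical toy semantic fields. The thesis derives its fields from HowNet;
# this hand-written dictionary only stands in for that resource.
SEMANTIC_FIELDS = {
    "car": {"automobile", "vehicle"},
    "doctor": {"physician", "hospital"},
}

# Assumed discount applied to field members relative to the keyword itself.
FIELD_WEIGHT = 0.5


def tf_vector(text):
    """Build a plain term-frequency vector from lowercase whitespace tokens."""
    return Counter(text.lower().split())


def expand_with_fields(vector, fields=SEMANTIC_FIELDS, weight=FIELD_WEIGHT):
    """Add every keyword's semantic-field members at a discounted weight."""
    expanded = dict(vector)
    for term, freq in vector.items():
        for related in fields.get(term, ()):
            expanded[related] = expanded.get(related, 0.0) + weight * freq
    return expanded


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


query = tf_vector("car repair")
document = tf_vector("automobile repair manual")

# Frequency-only VSM: "car" and "automobile" share nothing.
plain = cosine(query, document)
# After field expansion the query also carries weight on "automobile".
with_fields = cosine(expand_with_fields(query), document)
```

In this toy case the expanded query scores higher against the document than the frequency-only query, because "car" and "automobile" are linked through the assumed field. How the thesis weights field members and which field types it builds are not stated in the abstract.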
Keywords/Search Tags: semantic field, HowNet, semantic similarity, semantic relevancy, VSM