
Research on Word Sense Disambiguation Based on the Strategy of Field Priority Selection

Posted on: 2022-03-07
Degree: Master
Type: Thesis
Country: China
Candidate: L Zhang
GTID: 2518306338470734
Subject: Electronic Science and Technology
Abstract/Summary:
Ambiguity is common in human language and makes natural language processing difficult. Word sense disambiguation (WSD) removes ambiguity at the word level: it determines the meaning of a polysemous word from its context. Solving this problem matters for downstream applications such as machine translation and content analysis. WSD techniques usually target general-purpose text and rely on contextual knowledge and a disambiguation model. For text from a specific domain, WSD still faces a domain adaptation problem, and existing work pays too little attention to mining and using domain knowledge. As research on domain-specific WSD grows, how to fully mine and use domain knowledge to improve WSD performance has become a pressing question. This thesis addresses that question, taking the mining and use of domain knowledge as its starting point. The main work covers three aspects:

1. To address the low disambiguation recall caused by the low quality of the domain-related words that current disambiguation algorithms extract, a WSD method based on an improved log-likelihood ratio, PPRank-LLRF (LLRF: Log-Likelihood Ratio and word Frequency), is proposed. The method combines the log-likelihood ratio with word frequency to extract related words that are more relevant to the target domain, then introduces a graph model and determines the sense of each ambiguous word with the Personalized PageRank algorithm. Tested on the Koeling dataset, the method improves disambiguation recall in the Sports domain by 1.81% over the previous method, which verifies its effectiveness.

2. To address the problem that existing semantic understanding methods for domain disambiguation do not determine the domain reliably, a semantic understanding method based on the Field Preference Strategy (FPS) is proposed. The method considers both word sense domain and document domain information to determine the true domain of an ambiguous sentence, then selects the corresponding disambiguation context to build a graph model that represents the semantics. Tested on the Koeling dataset, it improves recall on the Sports and Finance domain data by 0.11% and 0.31%, respectively, over the previous method, which verifies the feasibility and effectiveness of the field preference strategy.

3. To address the insufficient use of domain knowledge in general text, the FPS algorithm is generalized into an Improved Field Preference Strategy (IFPS), which introduces extended domain knowledge to adapt to the general-text WSD task. Tested on the BNC examples in the Koeling dataset, the method improves WSD recall by 0.11% over the previous method, which verifies the effectiveness of IFPS.
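
The graph-based sense selection mentioned in the first contribution can be illustrated with a small sketch. The Python code below is a generic power-iteration Personalized PageRank, not the thesis's implementation. The toy graph, the sense labels such as `pitch#field`, the seed weights, and the damping factor of 0.85 are all assumptions made for illustration; how the thesis actually builds its graph and weights its domain-related words is not stated in the abstract.

```python
from collections import defaultdict


def personalized_pagerank(edges, seeds, damping=0.85, iterations=50):
    """Power-iteration Personalized PageRank on an undirected graph.

    edges: iterable of (node_a, node_b) pairs.
    seeds: dict mapping node -> non-negative weight. Teleport mass is
           spread over these nodes only (the "personalization").
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    nodes = sorted(set(neighbors) | set(seeds))

    total = sum(seeds.values())
    if total <= 0:
        raise ValueError("seeds must carry positive total weight")
    teleport = {n: seeds.get(n, 0.0) / total for n in nodes}

    rank = dict(teleport)  # start from the teleport distribution
    for _ in range(iterations):
        new_rank = {n: (1.0 - damping) * teleport[n] for n in nodes}
        dangling = 0.0
        for n in nodes:
            if neighbors[n]:
                # Split this node's damped rank evenly among its neighbors.
                share = damping * rank[n] / len(neighbors[n])
                for m in neighbors[n]:
                    new_rank[m] += share
            else:
                # Isolated node: hand its mass back to the seed nodes.
                dangling += damping * rank[n]
        for n in nodes:
            new_rank[n] += dangling * teleport[n]
        rank = new_rank
    return rank


# Hypothetical toy graph: two candidate senses of "pitch" plus context words.
edges = [
    ("pitch#field", "football"),
    ("pitch#field", "goal"),
    ("football", "goal"),
    ("football", "crowd"),
    ("crowd", "music"),
    ("music", "pitch#tone"),
]
# Assumed domain-related words for a Sports text, used as equal-weight seeds.
seeds = {"football": 1.0, "goal": 1.0}

ranks = personalized_pagerank(edges, seeds)
candidate_senses = ["pitch#field", "pitch#tone"]
best_sense = max(candidate_senses, key=ranks.get)
print(best_sense, round(ranks[best_sense], 4))
```

In this toy graph the sports sense sits next to both seed words and therefore collects more rank than the acoustic sense, which is the intuition behind seeding the walk with domain-related words.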
Keywords/Search Tags: Natural Language Processing, Word Sense Disambiguation, Domain Knowledge, Graph Model