
Research On Chinese Word Sense Disambiguation Based On Knowledge

Posted on: 2022-12-21
Degree: Master
Type: Thesis
Country: China
Candidate: Y C Huang
Full Text: PDF
GTID: 2518306758491784
Subject: Automation Technology
Abstract/Summary:
Ambiguity exists in every language, and ambiguous words appear throughout natural language text. A word often carries multiple senses. The task of word sense disambiguation (WSD) is to determine the specific sense of a word from the context in which it appears, and thereby the semantics of the whole passage. This is easy for people: readers can usually judge the meaning of an ambiguous word quickly and accurately from contextual information and so understand the whole sentence, but it remains a great challenge for computers. As a foundational problem in natural language processing, word sense disambiguation occupies an important position in many research areas and directly affects downstream tasks such as machine translation, information retrieval, text classification, text generation, and emotion recognition. How to identify the sense of an ambiguous word quickly and accurately is therefore an urgent and significant research problem.

However, most current word sense disambiguation models focus only on the context of the word to be disambiguated and ignore relevant external knowledge about it. People can distinguish word senses quickly and accurately not only because of context but also because of the external knowledge accumulated in daily life. Adding external knowledge to a disambiguation model can therefore improve its effectiveness. To address the lack of machine-readable knowledge for word sense disambiguation, this paper first builds a Chinese knowledge base that integrates most of the available Chinese dictionary data and constructs a concept semantic network modeled on the structure of the English WordNet knowledge base. On this basis, it constructs Bert-Sense, a knowledge-based Chinese word sense disambiguation model that fully integrates Chinese external knowledge. The specific research contributions are as follows:

(1) This paper builds a Chinese knowledge base. To integrate multiple Chinese dictionary sources and give Chinese words richer semantic information, thereby providing abundant external knowledge for the disambiguation model, the paper first defines the construction framework of the knowledge base and then its data schema layer, covering the conceptual entities and their relationships. There are seven kinds of conceptual entities: glyphs, radicals, characters, word senses, words, word-sense sets, and synonym sets; the relationships among these entities are also organized. The main data come from three open-access Chinese dictionaries. After data acquisition and preprocessing, the knowledge base is stored in the MySQL relational database and the Neo4j graph database. To capture the semantic relations between words as external knowledge, the paper also builds a concept semantic network in imitation of WordNet. The Chinese knowledge base supplies a large amount of high-quality external knowledge for the proposed Bert-Sense model; word senses, example sentences, and inter-word semantic relations are selected as the model's external knowledge input.

(2) This paper constructs Bert-Sense, a knowledge-based word sense disambiguation model that uses the pre-trained language model BERT to encode the relevant texts so that their features are fully extracted. The model consists of four modules: an input module, a context encoder module, an external knowledge encoder module, and a fusion module. The input module preprocesses the model's input; the context encoder encodes the context of the word to be disambiguated; the external knowledge encoder encodes the external knowledge that assists disambiguation, namely word senses, example sentences, and inter-word semantic relations; and the fusion module fuses this external knowledge into the disambiguation model. The experiments include comparative and ablation studies. The results show that Bert-Sense achieves the best disambiguation performance, and that adding word senses, example sentences, and inter-word semantic relations significantly improves performance over a model that uses only lexical context, further verifying that external knowledge has a positive impact on word sense disambiguation.
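The gloss-matching idea behind knowledge-based disambiguation can be sketched in a few lines. The snippet below is a toy illustration only: the thesis's Bert-Sense internals are not specified here, and a simple bag-of-words vector stands in for the BERT context and knowledge encoders. The `senses` dictionary, the word "bank", and all glosses are hypothetical examples, not data from the thesis.

```python
# Toy sketch of knowledge-based WSD: score each candidate sense by
# comparing an encoding of the context against an encoding of that
# sense's external knowledge (gloss + example sentences).
from collections import Counter
import math

def encode(text):
    """Stand-in encoder: a bag-of-words vector (BERT embeddings in a real model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse bag-of-words vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def disambiguate(context, senses):
    """senses maps sense_id -> external knowledge text for that sense.
    Returns the sense whose knowledge encoding best matches the context."""
    ctx_vec = encode(context)
    scores = {sid: cosine(ctx_vec, encode(knowledge))
              for sid, knowledge in senses.items()}
    return max(scores, key=scores.get)

# Hypothetical sense inventory for the ambiguous English word "bank":
senses = {
    "bank_river": "sloping land beside a river",
    "bank_money": "a financial institution that handles deposits of money",
}
print(disambiguate("she went to the bank to deposit money", senses))  # prints bank_money
```

In Bert-Sense the stand-in encoder would be replaced by BERT, and the fusion module would combine the context and knowledge representations rather than comparing them by raw cosine similarity; the skeleton above only shows why richer external knowledge (glosses, example sentences, relations) gives the scorer more signal to separate senses.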
Keywords/Search Tags: Word sense disambiguation, BERT, knowledge base