
Research On Transfer Learning Method Oriented To Knowledge Graph Reconstruction

Posted on: 2022-09-30
Degree: Master
Type: Thesis
Country: China
Candidate: J Z Li
Full Text: PDF
GTID: 2518306539469224
Subject: Control Science and Engineering
Abstract/Summary:
Industrial knowledge graphs are an important foundation of industrial knowledge automation, and knowledge extraction (named entity recognition, NER, and relation extraction, RC) is essential to their construction. For both NER and RC, recent results on texts from vertical, domain-specific industries are scarce, and directly reusing network models built for other domains inevitably runs into the peculiarities of such texts. Knowledge extraction in the industrial domain currently faces two main difficulties: on the one hand, because the texts are dense with specialist knowledge, constructing the schema layer of the graph is hard and annotation costs are high; on the other hand, industrial scenarios are numerous, relevant texts are difficult to obtain and small in volume, and the trained models generalize poorly. Further research is therefore needed, and in response to these problems this thesis carries out the following work.

For industrial-text named entity recognition, in order to exploit the multiple meanings of Chinese character-word combinations, this thesis feeds the model a dual input of character vectors and word vectors, making full use of the lexical features of the domain corpus. A bidirectional LSTM in the downstream model attends to contextual semantic relationships, and the predicted label sequence is obtained through a linear transformation followed by log-softmax. Comparison experiments show that the model is competitive with recent state-of-the-art models.

To further improve the NLP model under small-sample conditions, five refinements are made: (1) transferring related knowledge, (2) hyperparameter tuning, (3) transfer over a joint data set, (4) training the embedding vectors, and (5) optimizing the handling of out-of-vocabulary words. After these five steps, the F1 score of the dual-feature-input LSTM model improves by a solid 9.5 points.

For industrial-text relation extraction, this thesis applies a prototype network based on contextual attention to few-shot relation classification. The contextual attention mechanism assigns weights to the instances under each relation prototype so as to highlight the important ones, generating better prototypes and alleviating the problem of prototype deviation. Experimental results show that the model improves both the accuracy and the convergence speed of few-shot relation classification, achieving a best accuracy of 92.36 on the target data set of this thesis.

Finally, for the project behind this work, a knowledge graph visualization platform was developed on Springboot, Neo4j, and related technologies; it is reported at the end of the thesis, and screenshots of the platform appear in the appendix.
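The dual-feature input for the NER model can be illustrated with a minimal sketch: each Chinese character's input vector is the concatenation of its character embedding and the embedding of the word it belongs to, after which a linear layer and log-softmax produce per-character tag scores. All names, dimensions, and the toy random embeddings below are illustrative assumptions, not the thesis's actual tables; the bidirectional LSTM that sits between the features and the output layer is omitted here for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
CHAR_DIM, WORD_DIM, N_TAGS = 4, 6, 3  # hypothetical sizes

# Toy embedding tables; in the thesis these would be trained lookups.
char_emb = {c: rng.normal(size=CHAR_DIM) for c in "轴承温度过高"}
word_emb = {w: rng.normal(size=WORD_DIM) for w in ["轴承", "温度", "过高"]}

def dual_features(words):
    """Each character's input = [its char vector ; its word's vector]."""
    feats = []
    for w in words:
        for c in w:
            feats.append(np.concatenate([char_emb[c], word_emb[w]]))
    return np.stack(feats)  # shape: (n_chars, CHAR_DIM + WORD_DIM)

def log_softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    return x - np.log(np.exp(x).sum(axis=-1, keepdims=True))

X = dual_features(["轴承", "温度", "过高"])          # 6 characters
# A BiLSTM would transform X here; we go straight to the output layer.
W = rng.normal(size=(CHAR_DIM + WORD_DIM, N_TAGS))
scores = log_softmax(X @ W)   # per-character tag log-probabilities
tags = scores.argmax(axis=-1)  # predicted label sequence
```

The point of the concatenation is that an ambiguous character receives different inputs depending on the word it occurs in, which is how the word-level features disambiguate character-level labels.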
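The contextual-attention prototype network can likewise be sketched in a few lines: instead of averaging the K support instances of each relation uniformly, the instances are weighted by a softmax over their similarity to the query, and the query is assigned to the nearest resulting prototype. The embeddings and relation names below are invented toy values, and dot-product similarity with Euclidean nearest-prototype classification is one simple instantiation, not necessarily the exact scoring used in the thesis.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def contextual_prototypes(support, query):
    """support: dict relation -> (K, d) array of instance embeddings.
    For each relation, weight its K instances by softmax similarity to
    the query, then sum: a query-conditioned prototype."""
    protos = {}
    for rel, inst in support.items():
        scores = inst @ query        # (K,) similarity of each instance
        w = softmax(scores)          # attention weights over instances
        protos[rel] = w @ inst       # (d,) weighted prototype
    return protos

def classify(support, query):
    protos = contextual_prototypes(support, query)
    # assign the query to the nearest prototype (Euclidean distance)
    return min(protos, key=lambda r: np.linalg.norm(protos[r] - query))

# Toy 2-way 2-shot episode with 2-dimensional embeddings.
support = {
    "part-of": np.array([[1.0, 0.0], [0.9, 0.1]]),
    "made-of": np.array([[0.0, 1.0], [0.1, 0.9]]),
}
query = np.array([0.95, 0.05])
```

Because outlier support instances that are dissimilar to the query receive small attention weights, the prototype is pulled toward the representative instances, which is how the mechanism alleviates prototype deviation.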
Keywords/Search Tags: Knowledge Graph, Named Entity Recognition, Relation Extraction, Few-Shot Learning, Natural Language Processing