Research On Named Entity Recognition And Knowledge Graph Construction Of Chinese Classical Literature Texts

Posted on:2022-07-23

Degree:Master

Type:Thesis

Country:China

Candidate:Z Yang

Full Text:PDF

GTID:2505306557468204

Subject:Software engineering

Abstract/Summary:

As living standards rise, the economy grows, and artificial intelligence advances rapidly, people engage with classical Chinese literature in increasingly diverse ways. Natural language processing and knowledge graph technologies have been shown to facilitate the dissemination and development of classical Chinese academic texts, and they have become necessary for the dissemination of traditional Chinese classical literature. Ancient Chinese, represented by classical Chinese literature, is an important part of the Chinese language, and its grammar and vocabulary are more complex and elaborate than those of modern Chinese, presenting both opportunities and challenges for Chinese natural language processing. Most existing Chinese natural language processing work has been conducted on modern Chinese datasets. Very few ancient Chinese datasets are publicly available and relevant research is scarce, which makes it very difficult to extract knowledge and build knowledge graphs from unstructured texts such as classical Chinese literature.

To address these problems, natural language processing and knowledge graph technology are used to study knowledge graph construction methods, taking a representative classical Chinese text, The Romance of the Three Kingdoms, as an example.

For word segmentation and part-of-speech tagging, building on existing joint segmentation and part-of-speech tagging models, a bidirectional gated recurrent unit (BiGRU) network model based on a hybrid character embedding approach is proposed. It extracts features by combining pre-trained embeddings and radical embeddings with n-gram embeddings, so as to take full account of the rich semantic information inside Chinese characters. For named entity recognition, building on the existing iterated dilated convolutional network model, an attention-based iterated dilated convolutional network model is proposed. The attention mechanism focuses the model on local contextual information, and a conditional random field then selects the best label sequence to produce the final labeled entities.

For knowledge graph construction and visual query, a workflow for constructing knowledge graphs from unstructured text is proposed, and a Flask-based visual query system for the knowledge graph is designed. The best models from the two tasks above are applied to decode the complete text of The Romance of the Three Kingdoms. The extracted entities and their semantic links become the nodes and relations of the knowledge graph, and a graph database is used for knowledge storage, yielding a visual query system that combines front end and back end.

The improved network models for joint word segmentation and part-of-speech tagging and for named entity recognition were compared experimentally on the ancient Chinese dataset and validated with unified evaluation criteria; all of them improved on the original models. The constructed knowledge graph and visual query system clearly show the association relationships between entities, supporting a better understanding when studying classical Chinese literature. This study demonstrates the effectiveness of the above modeling methods and illustrates the potential of applying natural language processing and knowledge graph techniques to classical Chinese literary texts.
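
The abstract states that a conditional random field selects the best label sequence, but it does not say how decoding is done. For a linear-chain CRF the usual choice is Viterbi decoding, so the following is a minimal sketch under that assumption. The function name `viterbi_decode`, the three-label tag set, and all scores are hypothetical illustration values, not taken from the thesis.

```python
from typing import Dict, List


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    labels: List[str],
) -> List[str]:
    """Return the highest-scoring label sequence for a linear-chain model.

    emissions[t][y]   : score of label y at position t (e.g. from the network)
    transitions[p][y] : score of moving from label p to label y
    """
    if not emissions:
        return []
    # best[t][y] = best total score of any path that ends in label y at t
    best = [{y: emissions[0][y] for y in labels}]
    backptr: List[Dict[str, str]] = []
    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for y in labels:
            prev = max(labels, key=lambda p: best[-1][p] + transitions[p][y])
            scores[y] = best[-1][prev] + transitions[prev][y] + emissions[t][y]
            pointers[y] = prev
        best.append(scores)
        backptr.append(pointers)
    # Follow back-pointers from the best final label to recover the path.
    path = [max(labels, key=lambda y: best[-1][y])]
    for pointers in reversed(backptr):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Hypothetical scores for the three characters of "曹操曰" (made up for illustration).
LABELS = ["B-PER", "I-PER", "O"]
EMISSIONS = [
    {"B-PER": 0.9, "I-PER": 0.1, "O": 1.0},
    {"B-PER": 0.3, "I-PER": 2.0, "O": 0.5},
    {"B-PER": 0.1, "I-PER": 0.1, "O": 2.0},
]
# All transitions score 0 except O -> I-PER, which is invalid in BIO tagging.
TRANSITIONS = {p: {y: 0.0 for y in LABELS} for p in LABELS}
TRANSITIONS["O"]["I-PER"] = -10.0

decoded = viterbi_decode(EMISSIONS, TRANSITIONS, LABELS)
print(decoded)  # ['B-PER', 'I-PER', 'O']
```

Picking the top-scoring label at each position independently would give `O, I-PER, O`, which is not a valid BIO sequence. Scoring whole paths lets the transition penalty rule that out, which is the reason a CRF layer is commonly placed on top of a per-character encoder.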

Keywords/Search Tags:

Chinese classical literature, Knowledge Graph, Joint word segmentation and part-of-speech tagging, Named entity recognition

Related items

1	A Named Entity Recognition Method For Text Of Han Dynasty Paintings
2	Research On Tibetan Word Segmentation And Part-of-speech Tagging Based On Pre-trained Language Models
3	Research On Tibetan Word Segmentation And Part-of-speech Tagging Based On GNN
4	Tibetan Segmentation And POS Tagging Study
5	Research And Application Of Knowledge Graph Oriented To The Field Of History
6	Research On Thai Word Segmentation And Part-of-speech Tagging Based On Multi-granularity Feature
7	Entity Recognition And Part-of-speech Tagging Of Ancient Chinese Chronology
8	Research On The Methods Of Ancient Chinese Word Segmentation And Part-of-speech Tagging
9	A Study On The Mining Of Entity Knowledge In Ancient Chinese Classics
10	Research On Construction Technology Of Knowledge Graph Of Network Film