Research On The Construction Of A Silk Road Themed Knowledge Graph

Posted on: 2024-01-24

Degree: Master

Type: Thesis

Country: China

Candidate: X R Wang

GTID: 2568307061492004

Subject: Software engineering

Abstract/Summary:

Humanities and social sciences research is essential for studying and understanding human society, culture, and history. In recent years, with the rapid development of big data and artificial intelligence technologies, combining AI with internet-scale data for knowledge discovery has become a hot topic in humanities and social sciences research. The Silk Road is a famous ancient trade route that connected China with the rest of the Eurasian continent, influencing world history, culture, and economics. However, the available data related to the Silk Road exists primarily as semi-structured and unstructured text, which makes manual data extraction costly. This research applies natural language processing techniques and deep learning models to automatically extract Silk Road-themed knowledge and construct a Silk Road knowledge graph. The main research contents are as follows:

(1) Research on named entity recognition (NER) methods: Preprocessed Silk Road literature is annotated with named entities using a combination of exact string matching and manual labeling, resulting in a standardized dataset containing 13 types of named entities and 35,810 entities. A BERT-IDCNN-BiLSTM-CRF model is employed, in which the pre-trained BERT model encodes the input text and a hybrid of IDCNN and BiLSTM extracts local and contextual features. Finally, a CRF model imposes global constraints on the contextual information for named entity recognition. Comparative experiments are conducted against the BiLSTM-CRF, IDCNN-CRF, BiLSTM-Attention-CRF, BERT-CRF, BERT-BiLSTM-CRF, and BERT-IDCNN-CRF models on the MSRA, People's Daily, and Silk Road datasets. The F1 scores of the BERT-IDCNN-BiLSTM-CRF model on the three datasets are 94.90%, 94.14%, and 90.98%, respectively, outperforming the other models.

(2) Research on entity relation extraction methods: Entities are labeled with relations using a combination of rule templates and manual annotation, resulting in a standardized dataset containing seven types of relations and 4,095 relation instances. A joint relation extraction framework based on BERT and an attention mechanism is proposed, dividing relation extraction into three modules: entity extraction, relation extraction, and entity semantic recognition. The entity extraction module uses BERT-IDCNN-BiLSTM for encoding and extracts the start and end positions of entities. The relation extraction module encodes a custom relation set with BERT and uses the attention mechanism to compute the potential relations in a sentence. The entity semantic recognition module enumerates the candidate triplets in the sentence and identifies the semantic role (subject or object) of each entity. On the Silk Road dataset, this method achieves precision, recall, and F1 scores of 81.13%, 80.46%, and 80.79%, respectively. In addition, the extraction of overlapping relations and the recognition of triplets in sentences of varying structure are explored in depth.

(3) Construction of the Silk Road knowledge graph: For the relation triplets obtained through relation extraction and manual annotation, entity similarity is computed from Word2Vec cosine similarity and Levenshtein distance so that entities with similar meanings can be matched. This process yields 2,421 entities and 6,157 relation instances, which are stored in a Neo4j database. Finally, a Silk Road knowledge graph question-answering system is built, providing automatic Q&A and knowledge graph query functions.
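
The entity matching step in (3) can be illustrated with a minimal sketch. The abstract does not say how the two signals are combined, so the weighted sum, the `alpha` parameter, the normalization of the edit distance, and all function names below are assumptions made for illustration. Plain Python lists stand in for the Word2Vec embeddings, which in practice would come from a trained model.

```python
import math


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (row-by-row dynamic programming)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string as the row to save memory
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def edit_similarity(a: str, b: str) -> float:
    """Assumed normalization: 1 minus distance over the longer length."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def cosine_similarity(u, v) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def entity_similarity(name_a, name_b, vec_a, vec_b, alpha=0.5) -> float:
    """Hypothetical combination: weighted sum of semantic and surface scores."""
    return (alpha * cosine_similarity(vec_a, vec_b)
            + (1 - alpha) * edit_similarity(name_a, name_b))


if __name__ == "__main__":
    # Toy vectors stand in for real Word2Vec embeddings (assumption).
    score = entity_similarity("Chang'an", "Changan", [1.0, 0.0], [1.0, 0.0])
    print(f"similarity = {score:.4f}")
```

Pairs whose score exceeds a chosen threshold could then be merged into a single node before loading into the graph database; the threshold itself would need tuning on the actual data.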

Keywords/Search Tags:

Silk Road, Named Entity Recognition, Relationship Extraction, BERT

Related items

1	Research On Chinese Named Entity Recognition Based On BERT
2	Joint Extraction Of Named Entity Recognition And Entity Relationship Based On Neural Network
3	Research On Key Techniques Of Named Entity Recognition And Relationship Extraction
4	Research On Joint Extraction Method Of Entity Relationship Based On Deep Learning
5	Research And Implementation Of Open Domain Knowledge Graph Intelligent Question And Answer System Based On BERT Transfer Learning
6	Research On Named Entity Recognition Algorithm In Mathematics Field
7	Research And Implementation Of Entity Recognition And Relationship Extraction Method Based On Deep Learning
8	Research On Bert-based Named Entity Recognition
9	Research On Wetland Named Entity Recognition And Open Relationship Extraction
10	Knowledge Mining Based On Statistical Snowball Models