Research On Archives Knowledge Graph Construction Technology

Posted on: 2020-03-03
Degree: Master
Type: Thesis
Country: China
Candidate: X W Guo
Full Text: PDF
GTID: 2428330620451726
Subject: Computer application technology
Abstract/Summary:
With the development of computer information technology, archival data has diversified beyond purely structured data, and its scale has increased significantly. This thesis focuses on archival knowledge graph construction technology, aiming to offer a new approach to the digitization of archives by changing how archival data is stored and how archival resources are used.

Based on the theoretical standard of the conceptual model of archives, the thesis proposes a seven-step method for constructing the archive ontology, analyzes the domain scope of the archival knowledge graph, and defines the hierarchical relationship between archive entity types and entities.

Following the ontology analysis, the thesis designs an archive entity recognition module and proposes two archive entity recognition algorithms for extracting entity knowledge, one based on rule matching and one based on an LSTM network. Experiments evaluating the two algorithms conclude that the LSTM-based algorithm achieves higher accuracy than the rule-matching algorithm. After entity recognition, the thesis extracts relationships between archive entities with two relation extraction algorithms, one based on entity part of speech and one based on dependency syntax analysis. Experiments evaluating the two algorithms conclude that the algorithm based on dependency syntax analysis is more accurate than the one based on entity part of speech.

To address knowledge duplication in the archival knowledge graph, the thesis designs an archival knowledge fusion module and proposes establishing a partition index to reduce the workload of knowledge fusion. After demonstrating that the similarity of attribute weight vectors correlates with entity similarity, the author proposes two pairwise entity alignment methods and then further analyzes collective knowledge fusion techniques. The four algorithms are evaluated experimentally, and the pairwise algorithms are found to perform better than the collective ones.

In summary, the thesis studies archival knowledge graph construction in depth from three aspects: ontology construction, knowledge extraction, and knowledge fusion. Future work will consider how to perform archival knowledge reasoning in order to further enrich and expand the archival knowledge graph.
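The abstract does not give the details of the partition index or the attribute weighting, so the following is only a minimal Python sketch of the general idea: group entities into partitions so that only entities within the same partition are compared, then score each candidate pair by a weighted sum of per-attribute similarities. The records, the partition key (entity type plus first letter of the name), the attribute weights, and the threshold are all illustrative assumptions, not the thesis's actual choices.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical archive entity records; field names and values are illustrative only.
RECORDS = [
    {"id": "e1", "type": "Person", "name": "Zhang Wei", "unit": "Municipal Archives Bureau"},
    {"id": "e2", "type": "Person", "name": "Zhang  Wei", "unit": "Municipal Archives Bureau"},
    {"id": "e3", "type": "Person", "name": "Li Na", "unit": "Provincial Library"},
    {"id": "e4", "type": "Organization", "name": "Zhang Wei Studio", "unit": ""},
]

# Assumed attribute weights (not taken from the thesis); they sum to 1.
WEIGHTS = {"name": 0.7, "unit": 0.3}


def block_key(record):
    """Assumed partition key: entity type plus first character of the name."""
    return (record["type"], record["name"].strip().lower()[:1])


def build_partition_index(records):
    """Group records by partition key so only same-partition pairs are compared."""
    index = defaultdict(list)
    for record in records:
        index[block_key(record)].append(record)
    return index


def attr_similarity(a, b):
    """String similarity in [0, 1] after case and whitespace normalization."""
    a = " ".join(a.lower().split())
    b = " ".join(b.lower().split())
    if not a or not b:
        return 0.0  # a missing value contributes no evidence
    return SequenceMatcher(None, a, b).ratio()


def entity_similarity(r1, r2, weights=WEIGHTS):
    """Weighted sum of per-attribute similarities."""
    return sum(
        w * attr_similarity(r1.get(attr, ""), r2.get(attr, ""))
        for attr, w in weights.items()
    )


def align(records, threshold=0.9):
    """Pairwise alignment restricted to pairs inside the same partition."""
    matches = []
    for block in build_partition_index(records).values():
        for r1, r2 in combinations(block, 2):
            if entity_similarity(r1, r2) >= threshold:
                matches.append((r1["id"], r2["id"]))
    return matches


if __name__ == "__main__":
    print(align(RECORDS))
```

With these four sample records, the index holds three partitions and only one candidate pair is scored, instead of the six pairs an exhaustive comparison would need. That reduction in comparisons is the kind of workload saving the partition index is meant to provide.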
Keywords/Search Tags: knowledge graph, archival informatization, entity recognition, relationship extraction, knowledge fusion