| Knowledge graphs are an effective way to organize knowledge, and professional fields manage their knowledge by building vertical-domain knowledge graphs. As the scale of knowledge grows, multi-source heterogeneous knowledge has become the main data source for knowledge graphs. Because multi-source heterogeneous knowledge cannot guarantee accuracy and correctness, coreference resolution and entity alignment are used to disambiguate entities during knowledge graph construction. Although both techniques have been studied extensively, many difficulties remain unsolved. First, most neural coreference resolution models focus on document structure information while ignoring the syntactic and semantic information in documents, which lowers knowledge utilization. Second, knowledge graph embedding representations do not fully use entity attributes and the relationships between entities, so modeling information is lost. Third, the structural heterogeneity of equivalent entities in knowledge graphs is rarely addressed, which lowers the accuracy of entity alignment. To address these problems, the main work of this thesis is as follows. To address the problem that most coreference resolution models ignore syntactic and semantic information, resulting in low document utilization, a coreference resolution model based on a semantic embedded network is proposed to integrate the syntactic and semantic information in documents. The model has two stages. The first stage uses a syntactic parser and a semantic role labeling (SRL) parser to extract syntactic and semantic features and constructs a heterogeneous graph according to their feature types. The second stage adopts a graph attention network (GAT) to selectively propagate syntactic and semantic features in the heterogeneous graph and
extract the most relevant information through an attention mechanism. To address the low utilization of knowledge in knowledge graph embedding, which limits the expressiveness of the embeddings, a knowledge graph embedding representation method that combines entity attribute embedding and relation embedding is proposed. The method has two stages. In the first stage, pre-trained word vectors provide the initial embedding of each entity name, the BERT language representation model produces the entity attribute embeddings, and the relation embeddings are computed from the initial embeddings. The second stage first integrates the information generated in the first stage to obtain the entity embedding representation and then refines it iteratively through joint learning. To address the structural heterogeneity of equivalent entities in knowledge graphs, which lowers entity alignment accuracy, an entity alignment method based on a matching graph is proposed. The method has two stages. The first stage samples the one-hop neighborhood of the central entity and keeps the most informative one-hop neighbors. The second stage uses the neighbors sampled in the first stage to construct a new matching graph for the central entity and performs entity alignment on that matching graph. Finally, the proposed methods are experimentally verified on OntoNotes 5.0, DBP15K, and DWY100K. The experiments comprise coreference resolution experiments and entity alignment experiments. The coreference resolution experiment consists of three steps: language model embedding, heterogeneous graph construction, and coreference link calculation. The average F1 score across the evaluation metrics is used as the main evaluation index. The entity alignment experiment consists of three steps: joint embedding representation, neighborhood sampling, and matching graph
alignment. Hits@1 and Hits@10 are used as the main evaluation metrics. The experimental results show that the proposed methods improve substantially on previous studies and benefit the task of knowledge graph construction. |
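
For readers unfamiliar with the entity alignment metrics named above, the following is a minimal sketch of how Hits@k is commonly computed: the fraction of source entities whose correct counterpart appears among the top k ranked candidates. The function name `hits_at_k`, the input layout, and the toy entity identifiers are assumptions made for illustration; they are not taken from the thesis or from the DBP15K or DWY100K datasets.

```python
from typing import Sequence


def hits_at_k(ranked_candidates: Sequence[Sequence[str]],
              gold: Sequence[str], k: int) -> float:
    """Return the fraction of source entities whose gold counterpart
    appears among the first k candidates of its ranked list."""
    if len(ranked_candidates) != len(gold):
        raise ValueError("one ranked list is required per gold entity")
    if not gold:
        return 0.0
    hits = sum(
        1
        for ranking, target in zip(ranked_candidates, gold)
        if target in ranking[:k]
    )
    return hits / len(gold)


# Toy rankings with made-up identifiers (hypothetical data).
rankings = [
    ["e1", "e7", "e3"],  # gold counterpart e1 is ranked first
    ["e9", "e2", "e4"],  # gold counterpart e2 is ranked second
    ["e5", "e6", "e8"],  # gold counterpart e0 is not retrieved
]
gold = ["e1", "e2", "e0"]

hits1 = hits_at_k(rankings, gold, 1)    # only the first entity counts
hits10 = hits_at_k(rankings, gold, 10)  # first and second entities count
```

In this sketch Hits@1 only rewards an exact top-ranked match, while Hits@10 also credits near misses, which is why the two values are usually reported together.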