| Knowledge graphs are semantic networks that store knowledge in the form of graphs, describing the relations between entities as "head entity-relation-tail entity" triples and thereby forming a networked knowledge system. Entity-relation extraction is the core technology for building knowledge graphs: it identifies entities and the relations between them in data sources. However, real-world data is often fragmented and unstructured, which makes it difficult for knowledge graphs to use and limits their application in specific task scenarios. Entity-relation extraction techniques have therefore attracted widespread attention. This dissertation focuses on how to extract entities and relations from unstructured data effectively and automatically by improving existing algorithms. The work has two main parts. (1) To address the potential loss of contextual information and the discrepancy between the two subtasks of entity-relation extraction, this dissertation designs an entity-relation extraction model with multi-feature fusion and task specificity. First, the model obtains character and word embedding vectors through neural network operations and encodes contextual information with a multi-head self-attention mechanism to capture word-to-word correlations. Then, the semantic features at different levels are concatenated to obtain an efficient semantic representation, and bidirectional long short-term memory (BiLSTM) networks capture the long-range dependencies in sentences. In addition, further BiLSTM networks are designed to enhance task specificity by adjusting the number of shared layers and task-specific layers for each dataset. Finally, the method is evaluated on the dataset, and the results show that it works well for both entity and relation extraction. (2) To address the lack of interaction between entity and relation information and the inefficiency of
the complex triple extraction process, an extraction model based on node information fusion and a global correspondence matrix is proposed. The model first represents relations and words as nodes using graph neural networks, and iteratively fuses the vector representations of these nodes through a message-passing mechanism. This makes better use of entity and relation information and yields node representations that are better suited to the relation extraction subtask. Then a subset of potential relations is predicted, which effectively alleviates redundancy in relation extraction. Finally, a global correspondence matrix is designed to align head and tail entities efficiently. Experimental results show that the proposed model outperforms the baseline model on the same dataset and performs well on the overlapping relational triple extraction task. In summary, through the study of entity-relation extraction, this dissertation proposes an extraction algorithm with multi-feature fusion and task specificity and an extraction algorithm based on node information fusion and a global correspondence matrix, establishes an entity-relation extraction system for unstructured data, addresses the problems of existing models with lexical ambiguity, relation diversity and information interaction, and provides technical support for building knowledge graphs. |
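
As a rough illustration of the alignment step, the following Python sketch shows one way a global correspondence matrix could be used to pair head and tail entities. It is not the dissertation's implementation: the function `align_triples`, the span format, the threshold of 0.5, the relation name `capital_of`, and the example scores are all hypothetical, and the sketch assumes the matrix is indexed by the start tokens of the head and tail spans.

```python
from typing import Dict, List, Tuple

Span = Tuple[int, int]  # (start, end) token indices, end inclusive


def align_triples(
    heads_by_rel: Dict[str, List[Span]],
    tails_by_rel: Dict[str, List[Span]],
    correspondence: List[List[float]],
    threshold: float = 0.5,  # assumed cut-off, not taken from the source
) -> List[Tuple[Span, str, Span]]:
    """Keep (head, relation, tail) candidates whose start tokens are linked."""
    triples = []
    for rel, heads in heads_by_rel.items():
        for head in heads:
            for tail in tails_by_rel.get(rel, []):
                # A single lookup in the shared matrix decides the pairing.
                if correspondence[head[0]][tail[0]] >= threshold:
                    triples.append((head, rel, tail))
    return triples


tokens = ["Paris", "is", "the", "capital", "of", "France"]
n = len(tokens)

# Hypothetical scores: only ("Paris", "France") is marked as a valid pair.
correspondence = [[0.0] * n for _ in range(n)]
correspondence[0][5] = 0.9

# Hypothetical candidates for one predicted relation.
heads_by_rel = {"capital_of": [(0, 0)]}
tails_by_rel = {"capital_of": [(3, 3), (5, 5)]}

triples = align_triples(heads_by_rel, tails_by_rel, correspondence)
readable = [
    (" ".join(tokens[h[0]:h[1] + 1]), r, " ".join(tokens[t[0]:t[1] + 1]))
    for h, r, t in triples
]
print(readable)
```

In this sketch the spurious tail candidate "capital" is rejected because its matrix entry is below the threshold, so only the pair ("Paris", "France") survives for the predicted relation.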