
Research On The Key Techniques Of Constructing The Knowledge Graph In Financial Field

Posted on: 2021-04-05
Degree: Master
Type: Thesis
Country: China
Candidate: S Y Xiao
Full Text: PDF
GTID: 2518306113961969
Subject: Computer application technology
Abstract/Summary:
In the traditional financial industry, data of many types are abundant and carry great value. How to use these data effectively and extract useful information from them to help users make decisions is a major problem facing the industry. A knowledge graph for the financial field can serve as a key technology for semantic understanding and search in this field, providing strong support for future text analysis, data mining and decision reasoning.

Knowledge Graph is a concept proposed by Google in 2012. It is a technical method that uses a graph model to describe knowledge and the connections between things in the world. A knowledge graph is composed of nodes and edges, and the triple <entity, relation, entity> is its basic constituent unit. Generally, entities are represented as nodes in the graph and describe things in the real world, such as people, companies and concepts, while relations are represented as edges and express some kind of connection between entities. For example, <Jobs, creation, Apple> states that the relation between Jobs and Apple is creation. In its early stage, the knowledge graph was mainly used to enhance search engines; it has since shown rich application value in intelligent question answering, semantic understanding, recommendation and other areas.

This paper explores the key technologies for constructing a knowledge graph of the financial field, and mainly investigates how to extract entities and relations from the unstructured text of financial textbooks. In our future research plan, we will build on the constructed knowledge graph of the financial sector to carry out work on understanding financial texts, such as financial knowledge question answering and automatic review of financial reports. That work must be based on a graph of financial knowledge. Therefore, studying the key technologies of knowledge graph construction is important groundwork for applying artificial intelligence in finance (fintech). The main work of this paper is as follows.

(1) The key construction techniques of the knowledge graph are introduced, including entity extraction and relation extraction. This paper summarizes the development of named entity recognition and relation extraction, analyzes the advantages and disadvantages of existing models, and discusses the advantages of deep learning models over other models for Chinese entity recognition and relation extraction.

(2) Entity recognition in the financial field is improved on the basis of the traditional Bi-LSTM-CRF model. In the input layer, the single character vector is replaced by an embedding that fuses character vectors with word vectors, and a self-attention mechanism is introduced to learn long-distance dependencies inside the sentence, so as to enhance the feature extraction ability of the model. Compared with traditional character vectors, word vectors contain more semantic information. Character vectors and word vectors are combined by a weighted sum followed by averaging. Using word vectors as input can reduce the cost of manual annotation. The added self-attention mechanism identifies the relatively important content in a sentence and strengthens it by assigning it a larger weight. The experimental results show that the improved Bi-LSTM-CRF model can effectively identify proper-noun entities in the financial field and performs better than the previous model.

(3) This paper changes how the character vectors and the word vectors are obtained. BERT's Chinese pre-trained model is used to obtain the character vector of each character in the corpus. This model is pre-trained as a language model on a massive Wikipedia corpus, which allows it to learn latent semantic information. At the same time, the fastText algorithm is used to train domain-specific word vectors on the textbook corpus of the financial field, and the fusion of the character vectors and the word vectors is fed, as enhanced vectors, into the Embedding layer of the entity recognition model.

(4) This paper designs an unsupervised relation extraction method based on dependency syntax analysis. For the data processed by named entity recognition, the LTP toolkit of Harbin Institute of Technology is used to carry out part-of-speech tagging and dependency parsing, and the syntactic structure is summarized by analyzing the dependency relations between morpheme units. Based on the syntactic structure of the corpus, this paper designs seven semantic forms for relation extraction, such as the modifier structure, the verb structure and the parallel structure, which cover most Chinese grammar rules, and demonstrates their effectiveness in relation extraction through experiments.
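
The triple structure described in the abstract can be illustrated with a minimal sketch. It stores <entity, relation, entity> facts and indexes them as a graph of nodes and edges. Only the Jobs/Apple triple comes from the abstract; the two "type" triples and the helper names `build_graph` and `related` are hypothetical and are added only for illustration.

```python
from collections import defaultdict

# Each fact is an <entity, relation, entity> triple.
# Only the first triple appears in the abstract; the others are illustrative.
triples = [
    ("Jobs", "creation", "Apple"),
    ("Jobs", "type", "person"),
    ("Apple", "type", "company"),
]


def build_graph(facts):
    """Index triples as adjacency lists: head entity -> [(relation, tail entity), ...]."""
    graph = defaultdict(list)
    for head, relation, tail in facts:
        graph[head].append((relation, tail))
    return graph


def related(graph, head, relation):
    """Return every tail entity connected to `head` by an edge labelled `relation`."""
    return [tail for rel, tail in graph.get(head, []) if rel == relation]


graph = build_graph(triples)
print(related(graph, "Jobs", "creation"))
```

Entities become the keys (nodes) and each stored pair is a labelled edge, so a question such as "what did Jobs create?" reduces to a lookup over the edges leaving one node.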
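
The input-layer idea in item (2), fusing a character vector with a word vector and then applying self-attention, might look roughly like the sketch below. It is not the thesis's implementation. The abstract does not give the fusion weights or the exact attention form, so the equal weights, the reading of "weighted sum and then average" as a weighted sum divided by the number of sources, and the use of plain scaled dot-product attention without learned projections are all assumptions. The function names are hypothetical.

```python
import math


def fuse(char_vec, word_vec, w_char=0.5, w_word=0.5):
    """Weighted sum of a character vector and a word vector, averaged over the two sources.

    The weights are assumed values; the abstract does not state them.
    """
    return [(w_char * c + w_word * w) / 2 for c, w in zip(char_vec, word_vec)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(seq):
    """Scaled dot-product self-attention with Q = K = V = seq (no learned projections)."""
    dim = len(seq[0])
    output = []
    for query in seq:
        # Score the query position against every position in the sentence.
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim) for key in seq]
        weights = softmax(scores)
        # Each output vector is a weighted mix of all positions.
        output.append([sum(w * vec[j] for w, vec in zip(weights, seq)) for j in range(dim)])
    return output


# Toy two-position sentence with two-dimensional embeddings.
fused = [fuse([1.0, 0.0], [3.0, 0.0]), fuse([0.0, 1.0], [0.0, 3.0])]
attended = self_attention(fused)
```

Because every position attends to every other position directly, the distance between two tokens does not weaken their interaction, which is the property the abstract relies on for long-distance dependencies.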
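
The rule-based extraction in item (4) can be sketched for a single pattern. The example below assumes a dependency parse is already available as hand-written tuples, so it does not call the LTP toolkit. The labels SBV (subject-verb), VOB (verb-object) and HED (root) follow LTP's dependency label set. The function `extract_verb_structure`, the English toy sentence and the single subject-verb-object rule are illustrative assumptions and do not reproduce the seven semantic forms designed in the thesis.

```python
def extract_verb_structure(tokens, entities):
    """Extract (subject, verb, object) triples from one parsed sentence.

    tokens: list of (index, word, head_index, deprel), 1-based, head 0 = root.
    entities: set of words already recognized as named entities.
    """
    words = {index: word for index, word, _, _ in tokens}
    subjects, objects = {}, {}
    for index, word, head, deprel in tokens:
        if word not in entities:
            continue  # only recognized entities may fill the two entity slots
        if deprel == "SBV":
            subjects.setdefault(head, []).append(word)
        elif deprel == "VOB":
            objects.setdefault(head, []).append(word)

    triples = []
    for verb_index, subject_words in subjects.items():
        for subject in subject_words:
            for obj in objects.get(verb_index, []):
                # The governing verb becomes the relation label.
                triples.append((subject, words[verb_index], obj))
    return triples


# Hand-written parse of "Jobs founded Apple" (not produced by LTP).
parse = [
    (1, "Jobs", 2, "SBV"),
    (2, "founded", 0, "HED"),
    (3, "Apple", 2, "VOB"),
]
print(extract_verb_structure(parse, {"Jobs", "Apple"}))
```

The method needs no labelled relation data: the relation label is read off the parse tree, which is what makes this style of extraction unsupervised.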
Keywords/Search Tags: Knowledge Graph, Named Entity Recognition, Relation Extraction, Self-attention Mechanism, Word Vector