Research On Semantic Role Labeling Technology And Its Application In Financial Information Extraction

Posted on: 2020-10-16
Degree: Master
Type: Thesis
Country: China
Candidate: H B Liu
Full Text: PDF
GTID: 2428330578955251
Subject: Software engineering
Abstract/Summary:
With the rapid development of computer and network communication technology in recent years, demand for natural language processing (NLP) technology has grown steadily. People want to process large volumes of text with NLP techniques so that they can extract useful information more quickly. Chinese information processing is an important branch of NLP and has achieved notable results in basic theoretical research as well as in technology development and application. Semantic role labeling is an NLP task that provides a simple form of shallow semantic analysis. As deep learning has become popular in recent years, applying it to NLP tasks has become a trend. Within NLP, deep learning models based on Long Short-Term Memory (LSTM) are suited to processing long sequences and learning the long-distance dependencies within them. LSTM effectively alleviates the vanishing and exploding gradient problems that can occur in recurrent neural networks (RNNs), which makes it especially suitable for processing text. This thesis uses a bidirectional LSTM (Bi-LSTM) network combined with a conditional random field (CRF) as its semantic role labeling model and applies it to a financial corpus; the best F1 value obtained is 71.65%.

The main work of this thesis is as follows:

First, using the finance-related portion of the University of Pennsylvania's Chinese corpus, 18 semantic role tags were identified and the corpus was preprocessed.

Second, a Bi-LSTM network with word vectors as input and a CRF on top was constructed as the semantic role labeling model. This step takes the word as the basic labeling unit and uses word embedding (Word2Vec) to train a vector representation of each word. The word vectors are fed into the Bi-LSTM layer to obtain feature vector representations, and the CRF then processes those feature vectors to produce the semantic role tags.

Third, a final vector that fuses part-of-speech information was trained as the input to the semantic role labeling model. The part-of-speech tags in the experimental corpus serve as the output targets: the word vectors are fed into a Bi-LSTM layer that is trained to learn a vector representation of the part-of-speech tag, and the resulting part-of-speech vector is combined with the word vector. A Bi-LSTM + CRF model is then constructed and trained on the combined vectors to predict the semantic role tag for each word.

Finally, the model's parameters were tested and analyzed. Experiments show that the features obtained by fusing part-of-speech information help with the recognition and classification of semantic roles, so the model performs better on the corpus.
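A CRF layer on top of a Bi-LSTM chooses one tag sequence for the whole sentence instead of tagging each word independently. The sketch below illustrates that decoding step in plain Python, assuming standard Viterbi decoding, which the abstract does not specify; it is not the thesis's implementation. The function names `viterbi_decode` and `greedy_decode`, the tag names `ARG0`, `REL`, `O`, and every score are hypothetical values chosen for illustration. In a trained model the emission scores would come from the Bi-LSTM layer and the transition scores would be learned by the CRF.

```python
from typing import Dict, List

def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence for one sentence.

    emissions[t][tag]      : score of `tag` at word position t
    transitions[prev][cur] : score of moving from tag `prev` to tag `cur`
    """
    if not emissions:
        return []
    # best[t][tag] holds the best score of any path ending in `tag` at position t.
    best = [{tag: emissions[0][tag] for tag in tags}]
    backptr = []  # backptr[t - 1][tag] is the best previous tag for `tag` at position t
    for t in range(1, len(emissions)):
        scores, pointers = {}, {}
        for cur in tags:
            prev_best = max(tags, key=lambda prev: best[-1][prev] + transitions[prev][cur])
            scores[cur] = best[-1][prev_best] + transitions[prev_best][cur] + emissions[t][cur]
            pointers[cur] = prev_best
        best.append(scores)
        backptr.append(pointers)
    # Start from the best final tag and follow the back-pointers to the first word.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(backptr):
        path.append(pointers[path[-1]])
    path.reverse()
    return path

def greedy_decode(emissions: List[Dict[str, float]], tags: List[str]) -> List[str]:
    """Pick the best tag per word independently, ignoring transitions."""
    return [max(tags, key=lambda tag: scores[tag]) for scores in emissions]

# Hypothetical tag set and scores for a three-word sentence.
tags = ["ARG0", "REL", "O"]
emissions = [
    {"ARG0": 2.0, "REL": 0.5, "O": 1.0},
    {"ARG0": 1.2, "REL": 1.0, "O": 0.2},
    {"ARG0": 0.1, "REL": 0.3, "O": 1.5},
]
transitions = {
    "ARG0": {"ARG0": -2.0, "REL": 1.0, "O": 0.0},
    "REL": {"ARG0": 0.0, "REL": -2.0, "O": 0.5},
    "O": {"ARG0": 0.0, "REL": 0.0, "O": 0.0},
}

print(greedy_decode(emissions, tags))                 # ['ARG0', 'ARG0', 'O']
print(viterbi_decode(emissions, transitions, tags))   # ['ARG0', 'REL', 'O']
```

With these toy scores, per-word tagging labels the second word `ARG0` because its emission score is slightly higher, while the sequence-level decoder prefers `REL` because the `ARG0` to `ARG0` transition is penalized. This is the kind of tag-to-tag constraint a CRF layer can capture.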
Keywords/Search Tags: Bi-LSTM, CRF, finance, semantic role labeling