
Research On Text Causality Extraction Based On Deep Learning And Sequence Labeling

Posted on: 2022-11-22
Degree: Master
Type: Thesis
Country: China
Candidate: C He
Full Text: PDF
GTID: 2518306788494974
Subject: Automation Technology
Abstract/Summary:
Natural language texts generally contain a large number of causal relations, which makes causality mining a crucial task in natural language processing research fields such as information extraction, relational reasoning, and event prediction. With the development of deep learning, causality research has moved from traditional template matching and machine learning methods to training neural networks for extraction. The goal is not only to identify causal sentences but also to obtain causal pairs from text through sequence labeling. Neural network models trained with deep learning techniques can significantly improve extraction accuracy, and the causal pairs obtained through sequence labeling can better support upper-level tasks such as building causality networks. In this thesis, we focus on causality extraction from Chinese and English corpora respectively, using deep learning and sequence labeling technologies.

Firstly, to address the weak delineation of causal event boundaries and the difficulty of capturing deep causal semantic features in current English causality extraction research, we propose a Graph Relative Attention (GRAT) network that combines Relative Attention with the dependency syntactic graph. Building on the Dependency-Guided LSTM-CRF framework, we construct the DGLSTM-GRAT-CRF model, which has a strong ability to capture long-range dependencies, correctly identifies causal event boundaries, and accurately extracts causal pairs from English datasets.

Secondly, to address insufficient semantic representation and the difficulty of capturing local information and long-distance dependency features in current Chinese causality extraction research, we fuse lexical information into the traditional character-based sequence labeling model using Soft Lexicon technology, merging all lexicon words that match a character into the corresponding character embedding. We further design the Soft Lexicon-Star-BiLSTM model, combined with Star-Transformer, which captures both long-distance causal semantic information and local features to achieve causality extraction from Chinese text.

Finally, we collect various publicly available causality corpora in Chinese and English, analyze them, and construct causality datasets accordingly. To address the inconsistent annotation and incomplete semantic expressions in existing English causality datasets, we propose a causal sequence labeling criterion based on dependency syntactic relations, which effectively solves the problem of incomplete semantic expressions in extracted causal pairs. We manually label the experimental datasets according to this criterion and construct Chinese and English causality datasets respectively. The two proposed causality extraction models are compared with benchmark models on the Chinese and English causality datasets. Experimental results show that our proposed models outperform the comparison models.
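As a rough illustration of how sequence labeling recovers causal pairs, the sketch below converts annotated cause and effect spans into BIO tags. The tag set (B-Cause/I-Cause/B-Effect/I-Effect/O), the example sentence, and the helper name spans_to_bio are illustrative assumptions, not the thesis's exact annotation scheme.

```python
# Minimal sketch: turning labeled cause/effect spans into BIO tags,
# the sequence-labeling format used to extract causal pairs.
# Tag names and the example sentence are assumptions for illustration.

def spans_to_bio(tokens, spans):
    """Map labeled spans to BIO tags.

    tokens: list of word tokens.
    spans:  list of (start, end, role) tuples, end exclusive,
            with role in {"Cause", "Effect"}.
    """
    tags = ["O"] * len(tokens)
    for start, end, role in spans:
        tags[start] = f"B-{role}"          # first token of the span
        for i in range(start + 1, end):
            tags[i] = f"I-{role}"          # remaining tokens of the span
    return tags


if __name__ == "__main__":
    tokens = ["Heavy", "rain", "caused", "severe", "flooding", "downstream"]
    # One causal pair: cause = "Heavy rain", effect = "severe flooding downstream"
    spans = [(0, 2, "Cause"), (3, 6, "Effect")]
    for tok, tag in zip(tokens, spans_to_bio(tokens, spans)):
        print(f"{tok}\t{tag}")
```

A tagger such as the DGLSTM-GRAT-CRF or Soft Lexicon-Star-BiLSTM models described above would be trained to predict these per-token tags, from which cause and effect spans are reassembled into causal pairs.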
Keywords/Search Tags: causality extraction, sequence labeling, attention mechanism, neural network, natural language processing