
Research On Document Key Information Discovery And Location Method

Posted on: 2021-05-24
Degree: Master
Type: Thesis
Country: China
Candidate: F J Xu
GTID: 2428330614471287
Subject: Communication and Information System
Abstract/Summary:
With the rapid development of the Internet, network data is growing explosively. Quickly and accurately retrieving the key information that users need from massive text data has become a challenge. Search engines are a convenient tool for finding information, but when the content being sought falls within a range with blurred boundaries, it is difficult to define the query or search conditions accurately with keywords. Defining and identifying such key information is therefore difficult. To address these problems, this thesis proposes defining key information by means of a sample description. Combining deep learning theory with natural language processing technology, it constructs two neural-network-based key information discrimination models to locate key information in documents accurately. This work is supported by the National Key R&D Program project "Internal and External Connected Trial Execution and Litigation Services Collaborative Support Technology Research" (2018YFC0831300). The main work of the thesis is as follows:

(1) For the task of finding key information at the sentence level of a document, a key information recognition method based on the attention mechanism is proposed. The method first extracts features from the key information and the candidate sentences, obtains their semantic representations with a bidirectional Gated Recurrent Unit (GRU), and discriminates candidate sentences by computing the semantic similarity between the texts. For text semantic representation, three attention calculation methods are designed to extract semantic features at different levels of the text. Experimental results show that the proposed method effectively identifies the key information in a document and achieves higher accuracy than the same method without the attention mechanism. This indicates that the attention mechanism better captures the semantic representation of text and improves the effectiveness of key information search.

(2) To address the defect that the above method ignores the contextual information of the document, a key information location method that incorporates context is proposed. The aim of this method is to construct a document representation that combines contextual information with awareness of the key information, and to locate arbitrary segments of the document. In this approach, a bidirectional LSTM network models the document directly, so that the encoded document representation contains contextual information. Two-way interactive attention between the document and the key information is then used to obtain a key-information-aware text representation. To compensate for the information lost during text representation, a self-attention mechanism is integrated to enhance the document encoding. Experimental results show that the accuracy and F-score of this method are improved by 3.3% and 1.6%, respectively, compared with the previous method.
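The sentence-level idea in (1), pooling encoder states with attention and then comparing the sample description with a candidate sentence by semantic similarity, can be sketched roughly as follows. This is a minimal illustration and not the thesis's implementation: the hidden states are assumed to come from some encoder such as a bidirectional GRU and are passed in as plain lists of floats, dot-product attention with a `query` vector stands in for the three attention variants (which the abstract does not specify), cosine similarity is an assumed choice of similarity measure, and all function names are hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(states, query):
    """Collapse per-token hidden states into one vector.

    Each state is weighted by softmax(dot(state, query)). The query is an
    assumed learnable vector; here it is simply supplied by the caller.
    """
    weights = softmax([dot(h, query) for h in states])
    dim = len(states[0])
    return [sum(w * h[i] for w, h in zip(weights, states)) for i in range(dim)]


def cosine_similarity(u, v):
    """Cosine similarity, returning 0.0 if either vector is all zeros."""
    norm = math.sqrt(dot(u, u)) * math.sqrt(dot(v, v))
    return dot(u, v) / norm if norm else 0.0


def score_candidate(sample_states, candidate_states, query):
    """Similarity between the sample description and one candidate sentence."""
    sample_vec = attention_pool(sample_states, query)
    candidate_vec = attention_pool(candidate_states, query)
    return cosine_similarity(sample_vec, candidate_vec)


if __name__ == "__main__":
    # Toy 2-dimensional "hidden states" standing in for encoder outputs.
    sample = [[1.0, 0.0], [0.8, 0.2]]
    candidate = [[0.9, 0.1], [0.7, 0.3]]
    query = [1.0, 0.0]
    print(score_candidate(sample, candidate, query))
```

A candidate sentence would then be accepted as key information when its score exceeds some threshold. How that decision is actually made, and how the query vector is learned, are not stated in the abstract.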
Keywords/Search Tags: Sample Description, Attention Mechanism, Text Representation, Contextual Information, Neural Networks