Research On Text Summarization Technology Based On Abstract Meaning Representation Graph

Posted on: 2019-08-09
Degree: Master
Type: Thesis
Country: China
Candidate: T S Y Ming
GTID: 2428330596959447
Subject: Information and Communication Engineering
Abstract/Summary:
Facing the explosive growth of information on the Internet, text summarization, a key method for extracting useful information from massive amounts of data, has attracted the attention of researchers in many fields. Text summarization is both an active research topic and a hard problem in natural language processing. It aims to condense large volumes of document information into a concise description that is easy to read and understand, which effectively reduces users' information overload. The technology has been widely applied, including news headline generation, scientific literature summarization, search result snippet generation, and product review summarization, and has extensive practical value. A large number of text summarization methods have emerged as the underlying technology has advanced. However, existing text summarization still has shortcomings and limitations:

(1) Most existing research on semantic text summarization uses only superficial semantics to generate the summary and cannot make full use of the complete semantic information at the document level, which results in lower-quality summaries.

(2) Existing deduplication methods remove redundant information only by matching identical words, and no effective method exists for removing semantically duplicated content.

(3) Although existing text summarization technology can effectively extract important words, the semantic coherence between the words in the generated summary still needs to be improved, and a method for improving the semantic structure of the summary is lacking.

An Abstract Meaning Representation (AMR) graph can describe the complete semantic structure of a sentence well, but existing AMR-graph-based summarization methods use only sentence-level semantic information. Based on the AMR graph, this thesis proposes a method that generates summaries using document-level semantic information. The research focuses on three steps: extracting important semantic content, removing semantically redundant information, and improving the semantic structure of the summary. The main contents and innovations are as follows:

1. A semantic summary subgraph algorithm based on a weighted AMR graph is proposed. Building on the AMR graph, this thesis constructs an overall semantic graph of the document from general AMR graphs so as to exploit document-level semantic information, and fuses multiple features with a sparse autoencoder. According to the fused features, weights are assigned to the AMR graph nodes, which originally carry no weights. The semantic summary subgraph is generated by extracting the important semantic content according to these weights, and the text summary is then restored from it. The experimental results show that ROUGE scores improve remarkably and that the proposed method effectively improves the accuracy of text summarization.

2. A semantic redundancy filtering algorithm based on the AMR graph is proposed. To address semantic duplication in existing text summarization, the concept of semantic redundancy is introduced to characterize semantically redundant information, and the WordNet semantic dictionary is used to judge whether information in the AMR graph is semantically redundant. Finally, the redundant information in the summary is filtered out by an AMR graph fusion method. The experimental results on the related summarization datasets show that ROUGE and Smatch scores improve simultaneously, and that the proposed algorithm reduces semantic redundancy more effectively than existing methods that remove redundancy by matching identical words.

3. A semantic summarization algorithm based on Integer Linear Programming (ILP) for AMR graph structure reconstruction is proposed. Because the semantic structure among the nodes of the summary subgraph is incomplete, an ILP method is used to reconstruct the semantic edge relationships among the important semantic nodes by defining an objective function and constraints, which yields a semantically complete text summary. The experimental results on the summarization datasets show a clear improvement in Smatch, the evaluation metric for semantic structure, which indicates that the algorithm significantly improves the coherence and semantic integrity of the summary and effectively improves the readability of the generated summary.
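The first contribution (merging sentence graphs into a document graph, weighting the nodes, and extracting a summary subgraph) can be illustrated with a minimal Python sketch. It is not the thesis's implementation. It assumes that each sentence-level AMR graph is simplified to a list of (source, relation, target) triples, that identical concept labels are merged into a single document node, and that node weights are already given. In the thesis the weights come from features fused by a sparse autoencoder, while here they are hypothetical hand-picked numbers. The function names `merge_sentence_graphs` and `select_summary_subgraph` are likewise illustrative.

```python
def merge_sentence_graphs(sentence_graphs):
    """Merge sentence-level graphs into one document-level graph.

    Assumption: nodes with the same concept label are treated as the
    same document node, so shared concepts link the sentences together.
    """
    nodes = set()
    edges = set()
    for triples in sentence_graphs:
        for src, rel, tgt in triples:
            nodes.update((src, tgt))
            edges.add((src, rel, tgt))
    return nodes, edges


def select_summary_subgraph(nodes, edges, weights, k):
    """Keep the k highest-weighted nodes and the edges between them.

    Ties are broken alphabetically so the result is deterministic.
    Nodes without a weight are treated as having weight 0.0.
    """
    ranked = sorted(nodes, key=lambda n: (-weights.get(n, 0.0), n))
    kept = set(ranked[:k])
    kept_edges = {(s, r, t) for s, r, t in edges if s in kept and t in kept}
    return kept, kept_edges


# Toy input: two sentences that share the concept "cat".
sentence_graphs = [
    [("chase-01", "ARG0", "dog"), ("chase-01", "ARG1", "cat")],
    [("run-02", "ARG0", "cat"), ("run-02", "location", "garden")],
]

# Hypothetical node weights standing in for the fused-feature scores.
weights = {"chase-01": 0.9, "cat": 0.8, "dog": 0.7, "run-02": 0.3, "garden": 0.1}

doc_nodes, doc_edges = merge_sentence_graphs(sentence_graphs)
kept_nodes, kept_edges = select_summary_subgraph(doc_nodes, doc_edges, weights, k=3)
print(sorted(kept_nodes))
print(sorted(kept_edges))
```

With these toy weights the selected subgraph keeps only the "chase" event and its two participants. A real pipeline would still need a generation step to turn the subgraph back into text, which this sketch does not cover.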
Keywords/Search Tags: Text summarization, Abstract Meaning Representation graph (AMR graph), semantic summarization, semantic summary subgraph, semantic redundancy information, Integer Linear Programming (ILP), semantic structure reconstruction