
Research On NLP Generation Task Based On Transformer And Semantic Supervision

Posted on: 2022-03-06
Degree: Master
Type: Thesis
Country: China
Candidate: S Q Hu
GTID: 2518306524980839
Subject: Software engineering
Abstract/Summary:
As a branch of Artificial Intelligence, Natural Language Processing (NLP) has many research directions. Among them, generation tasks are a current research hotspot because they are more challenging; they mainly include two sub-tasks, automatic text summarization and machine translation. Automatic text summarization means that the machine extracts the key content from a given text. It improves the efficiency of extracting key information from large amounts of data and can also be used to generate headlines for short news texts. At present, neural text summarization models use multi-layer encoders. Although multi-layer encoders can mine more content, they easily introduce semantic deviation. The aim of machine translation is to let machines translate source-language text into target-language text. Current machine translation research mostly uses Neural Machine Translation (NMT) models, and most current NMT models are improvements on the Transformer model, but these models do not make use of the key information in the source-language text, even though the translation of this key information directly determines the final translation quality. To solve these problems, the following work was completed:

1. First, because BERT cannot be directly applied to the text summarization task, we modified the attention mask matrix following the idea of the UNILM model so that BERT can complete the summarization task. Second, we proposed a semantic supervision method based on a capsule network to address the semantic deviation introduced when the input is encoded by multiple layers: the outputs of the first and last encoder layers are clustered by the capsule network, and the clustered semantic features are then supervised by a distance measure. Finally, we conducted experiments on the LCSTS and CNN/Daily Mail datasets.

2. We proposed a Transformer-based NMT model that integrates key information. To alleviate the omission and mistranslation of keywords in NMT models, we compared current keyword extraction algorithms and chose TextRank, which performed best, to extract keyword information. For the fusion of key information, we used multi-head attention and controlled the interference of key information with the deep encoded information through a threshold. To verify the effectiveness of our method, we conducted experiments on two WMT datasets, Chinese-English and English-German.

3. To verify the effectiveness of our models, we implemented an online text summarization system. The system follows a B/S architecture and is deployed with separate front-end and back-end. Its main function is text summarization. We also tested the system; the results showed that it is easy to use and provides relatively accurate text summarization.
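The abstract states that the mask matrix was modified following the UNILM idea so that BERT can act as a summarizer, but does not give the construction. The sketch below shows the standard UNILM-style seq2seq mask under that assumption: source tokens attend bidirectionally among themselves, while summary tokens see the whole source plus only the preceding summary tokens. Function and variable names are illustrative, not the thesis's own.

```python
import torch

def unilm_seq2seq_mask(src_len: int, tgt_len: int) -> torch.Tensor:
    """Build a UniLM-style seq2seq attention mask (1 = may attend, 0 = blocked)."""
    total = src_len + tgt_len
    mask = torch.zeros(total, total)
    # Source block: full bidirectional attention among source tokens.
    mask[:src_len, :src_len] = 1
    # Target rows: every target token may see the whole source segment.
    mask[src_len:, :src_len] = 1
    # Target-to-target: lower-triangular (causal) attention.
    mask[src_len:, src_len:] = torch.tril(torch.ones(tgt_len, tgt_len))
    return mask

# Example: 4 source tokens, 3 summary tokens.
print(unilm_seq2seq_mask(4, 3))
```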
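For the semantic supervision idea, the thesis clusters the first-layer and last-layer encoder outputs with a capsule network and supervises the clustered features by distance. A minimal sketch of such an auxiliary loss is given below; the capsule clustering step is simplified here to mean pooling plus a shared projection with the capsule squash nonlinearity, so it is an assumption-laden simplification rather than the thesis's actual routing procedure.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def squash(x: torch.Tensor, dim: int = -1, eps: float = 1e-8) -> torch.Tensor:
    """Capsule-style squash: keeps the direction of x, bounds its norm in [0, 1)."""
    sq_norm = (x ** 2).sum(dim=dim, keepdim=True)
    return (sq_norm / (1.0 + sq_norm)) * x / torch.sqrt(sq_norm + eps)

class SemanticSupervision(nn.Module):
    """Distance loss between semantic features of the first and last encoder layers.

    Simplified stand-in for the capsule clustering described in the abstract:
    the loss pulls the deep representation toward the shallow one to limit
    semantic deviation accumulated across layers.
    """
    def __init__(self, d_model: int, d_semantic: int = 128):
        super().__init__()
        self.proj = nn.Linear(d_model, d_semantic)

    def forward(self, first_layer: torch.Tensor, last_layer: torch.Tensor) -> torch.Tensor:
        # first_layer, last_layer: (batch, seq_len, d_model)
        s1 = squash(self.proj(first_layer.mean(dim=1)))  # (batch, d_semantic)
        s2 = squash(self.proj(last_layer.mean(dim=1)))
        return F.mse_loss(s2, s1)                        # distance-based supervision term

# Usage (hypothetical): total_loss = ce_loss + lambda_sem * sem_sup(h_first, h_last)
```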
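TextRank, chosen in the second contribution for keyword extraction, ranks words with PageRank over a co-occurrence graph. The toy sketch below assumes a plain token list and treats every token as a candidate keyword; real pipelines, presumably including the thesis's, filter candidates by part of speech first.

```python
import networkx as nx

def textrank_keywords(tokens, window=4, top_k=5):
    """Minimal TextRank sketch: co-occurrence graph within a sliding window, ranked by PageRank."""
    graph = nx.Graph()
    for i, w in enumerate(tokens):
        for v in tokens[i + 1 : i + window]:
            if w != v:
                graph.add_edge(w, v)
    scores = nx.pagerank(graph)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

print(textrank_keywords(
    "neural machine translation integrates key information from the source text".split()))
```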
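The abstract also describes fusing keyword information into the deep encoded representation with multi-head attention, with a threshold controlling interference. One plausible reading of that design is sketched below: encoder states query the keyword states, and a sigmoid gate clipped at a fixed threshold limits how much keyword context is mixed back in. Module and parameter names are illustrative assumptions, not the thesis's actual architecture.

```python
import torch
import torch.nn as nn

class KeyInfoFusion(nn.Module):
    """Fuse keyword representations into encoder output via gated multi-head attention."""
    def __init__(self, d_model: int, n_heads: int = 8, threshold: float = 0.5):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.gate = nn.Linear(2 * d_model, d_model)
        self.threshold = threshold

    def forward(self, enc_out: torch.Tensor, key_repr: torch.Tensor) -> torch.Tensor:
        # enc_out: (batch, src_len, d_model); key_repr: (batch, n_keywords, d_model)
        key_ctx, _ = self.attn(query=enc_out, key=key_repr, value=key_repr)
        g = torch.sigmoid(self.gate(torch.cat([enc_out, key_ctx], dim=-1)))
        g = torch.clamp(g, max=self.threshold)  # threshold caps keyword interference
        return enc_out + g * key_ctx
```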
Keywords/Search Tags:Natural Language Processing, Automatic Text Summarization, Semantic Supervision, Neural Machine Translation, Key Information Fusion