
Research On Key Technologies Of Pun Recognition And Generation

Posted on: 2021-03-13; Degree: Doctor; Type: Dissertation
Country: China; Candidate: Y F Diao; Full Text: PDF
GTID: 1485306302461414; Subject: Computer application technology
Abstract/Summary:
Puns are a major means of describing and understanding ambiguity in word meaning. A pun mainly exploits the polysemy and homonymy of words so that, in context, a sentence carries a double meaning, which can make language more implicit, humorous, and memorable. A pun is not only a linguistic phenomenon but also an implicit expression of emotion. In recent years, with the continuous development of the internet and related technologies, social media such as microblogs, Twitter, and forums have become the largest public data source in the world. Puns appear on increasingly diverse social media platforms and have attracted growing attention from researchers, so there is an urgent need to apply natural language processing technology to pun information. Pun research aims to give computers the ability to analyze puns as humans do, which is a challenging research topic. Because existing pun research in China and abroad makes insufficient use of semantic information, this thesis studies in depth the linguistic phenomenon and expressive characteristics of puns, and carries out research on pun recognition, pun word location, and pun generation. The specific work is as follows:

(1) For the task of pun recognition, puns can be classified into homographic puns and heterographic puns, which need to be understood separately. First, to address the insufficient semantic understanding caused by the polysemy of words in homographic puns, this thesis studies the linguistic characteristics of homographic puns and proposes a homographic pun recognition model based on contextual representation and a gated attention mechanism. This approach resolves the polysemy of homographic puns through contextual semantic representations that take different contexts into account. Second, to address the semantic representation problem caused by homonymy in heterographic puns, this thesis studies the linguistic characteristics of heterographic puns and presents a heterographic pun recognition model based on an attention mechanism that integrates pronunciation and spelling; it resolves the ambiguity of heterographic puns through semantic vectors of pronunciation and spelling. Experiments demonstrate that the proposed pun recognition methods outperform existing classification models based on hand-crafted features as well as mainstream deep learning models.

(2) For the task of pun location, existing homographic pun location methods ignore linguistic and pragmatic information. This thesis therefore studies the semantic characteristics of homographic puns, considers the low-dimensional distribution of the semantic space and the synonym information provided by external semantic resources, and proposes a homographic pun location algorithm based on multiple semantic relationships and semantic similarity matching. Existing heterographic pun location methods likewise ignore linguistic and pragmatic information, so this thesis studies the expression of heterographic puns, constructs a fine-grained semantic representation that integrates character, phoneme, part-of-speech, position, and word-level information, and proposes a heterographic pun location model based on this fine-grained semantic representation and BiGRU-CRF. The experimental results show that these pun location methods achieve better results than current state-of-the-art methods and can locate puns effectively.

(3) For the task of pun generation, generated homographic puns tend to lack ambiguity and fluency, which leads to poor text quality. This thesis therefore proposes a homographic pun generation method based on ambiguity and fluency. The model consists of a generator and a discriminator. The generator is built on a hierarchical ON-LSTM attention mechanism. The discriminator judges whether a text is real or generated, using homographic puns and their different meanings. The generator is trained with a hierarchical reward mechanism and reinforcement learning to generate homographic puns with more ambiguity and fluency. In addition, to address the lack of contextual information in generated heterographic puns, this thesis proposes a heterographic pun generation method based on context understanding and semantic modification. The former comprises local context understanding and global context understanding. The latter uses a pre-trained model as the generator to produce heterographic puns. A heterographic pun classifier is constructed to obtain the reward score, and the generated text is optimized through reinforcement learning. Experimental results show that the proposed methods can effectively generate both homographic and heterographic puns, making this thesis an effective attempt at the task of pun generation.
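To make the pun location task in (2) more concrete, the following is a minimal sketch of how a tagger with a CRF output layer picks the pun word, assuming the task is framed as per-token tagging. It is not the thesis's BiGRU-CRF model: the tag set, the example sentence, and all scores are hypothetical and hand-written, whereas in a trained model the emission scores would come from the encoder and the transition scores from the CRF layer. The sketch shows only the generic Viterbi decoding step.

```python
from typing import List, Sequence

# Hypothetical tag set: "O" = ordinary token, "PUN" = the pun word.
TAGS = ["O", "PUN"]


def viterbi(emissions: Sequence[Sequence[float]],
            transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring sequence of tag indices.

    emissions[t][k]   : score of tag k at position t (assumed encoder output)
    transitions[j][k] : score of moving from tag j to tag k (assumed CRF weights)
    """
    n_tags = len(transitions)
    score = list(emissions[0])   # best score of any path ending in each tag
    backptr = []                 # backptr[t-1][k] = best previous tag for k at t
    for t in range(1, len(emissions)):
        new_score, ptr = [], []
        for k in range(n_tags):
            best_j = max(range(n_tags),
                         key=lambda j: score[j] + transitions[j][k])
            new_score.append(score[best_j] + transitions[best_j][k]
                             + emissions[t][k])
            ptr.append(best_j)
        score = new_score
        backptr.append(ptr)

    # Follow the back-pointers from the best final tag.
    best = max(range(n_tags), key=lambda k: score[k])
    path = [best]
    for ptr in reversed(backptr):
        best = ptr[best]
        path.append(best)
    path.reverse()
    return path


def locate_pun(tokens: Sequence[str],
               emissions: Sequence[Sequence[float]],
               transitions: Sequence[Sequence[float]]) -> List[str]:
    """Return the tokens that the decoded path tags as PUN."""
    path = viterbi(emissions, transitions)
    return [tok for tok, k in zip(tokens, path) if TAGS[k] == "PUN"]


# Illustrative sentence and made-up scores (not taken from the thesis).
tokens = "I used to be a banker but I lost interest".split()
emissions = [[1.0, 0.0] for _ in tokens]   # every token prefers "O" ...
emissions[-1] = [0.0, 2.0]                 # ... except the last one
transitions = [[0.0, 0.0],                 # O -> O, O -> PUN
               [0.0, -5.0]]                # PUN -> O, PUN -> PUN (discouraged)

print(locate_pun(tokens, emissions, transitions))  # → ['interest']
```

The transition matrix is where a CRF layer can encode sequence-level constraints that independent per-token classification cannot, such as discouraging two adjacent pun tags in this toy setup.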
Keywords/Search Tags: Sentiment Analysis, Pun Recognition, Pun Location, Pun Generation, Natural Language Processing