
Deep Learning For Text Representation And Application

Posted on: 2023-03-11  Degree: Doctor  Type: Dissertation
Country: China  Candidate: B Ma  Full Text: PDF
GTID: 1528306914976309  Subject: Computer Science and Technology
Abstract/Summary:
Natural language processing (NLP), also known as computational linguistics, is one of the most important research directions in artificial intelligence. With the rapid development of the internet in recent years, text data on the internet has grown explosively, and application demands in the NLP field keep increasing. Text representation uses low-dimensional dense vectors to encode the semantic and grammatical information of text, and is also known as distributed representation. By text granularity, text representation technology can be divided into word representation and sentence representation. Word representation is a basic research direction in deep-learning-based NLP; nowadays, most NLP models take word vectors as their main input. Sentence representation builds on word representation, capturing the dependencies among words in a sentence with convolutional neural networks, recurrent neural networks, or other structures. Word and sentence representation technology have greatly improved the performance of basic tasks such as sentence classification, part-of-speech tagging, and named entity recognition. At the same time, many important NLP tasks, such as automatic text summarization and multi-turn response selection, require an in-depth understanding of long text. Long-text understanding builds on word and sentence representation and needs a deeper grasp of text structure and semantics; it is an important and difficult problem in NLP. This paper is organized according to the understanding of words (word representation), sentences (sentence representation), and paragraphs. The specific research contents are as follows:

1. We propose a word representation method based on the attention mechanism and the semantic structure of Chinese words. Chinese words can be divided into finer elements such as characters and radicals. These elements contain rich
semantic information, and the semantics of a word can be deduced from the elements that compose it. Although some researchers have noticed that fine-grained information such as characters and radicals can be used to improve the quality of Chinese word vectors, no one has deeply explored the semantic structure of Chinese words or how to exploit that structure in designing a word representation model. This work regards a Chinese word as a semantic structure composed of the word, its characters, radicals, and components, and studies how to better combine sub-word information and capture this semantic structure. In addition, an asynchronous training strategy is designed to decouple the updates of the word embedding matrix and the attention weights. Experimental results show that the proposed method outperforms baseline methods on word similarity evaluation, the word analogy task, and text classification.

2. We propose a semi-supervised sentence classification model based on user polarity in social scenarios. This work studies how to use user information and large-scale unlabeled data to address the shortage of labeled data in social scenarios. We use ELMo and a convolutional neural network to extract sentence semantics and obtain sentence representations, propose the concept of user polarity to represent user bias, and design two calculation methods to quantify it. At the same time, the self-training method is used to train on large-scale unsupervised data. Quantified user polarity is used (1) in the training and prediction phases to alleviate data sparsity, and (2) in the self-training phase to reduce the noise of pseudo-labeled data and evaluate the reliability of pseudo-labeled samples. Experimental results show that our method achieves the best performance on the Qatar dataset.

3. We propose a contextual sequential matching network for multi-turn response selection in retrieval-based chatbots. This work studies
how to absorb global context information in the matching stage to improve the performance of retrieval-based multi-turn response selection models. Firstly, the attention-based matching method is extended to a contextual fashion, which can absorb context information dynamically during utterance-response matching. In addition, we design a fusion module composed of a recurrent neural network and a Transformer to capture the short-term and long-term dependencies in the matching vector sequence simultaneously. Experiments on three public large-scale datasets show that the proposed method is significantly better than other baseline methods.

4. We propose an extractive dialogue summarization model based on distantly supervised machine reading comprehension. This work studies how to solve three problems of traditional summarization methods: (1) it is difficult to extract all key points exactly; (2) the segments to be extracted must be annotated in the training set; (3) the speaker information in the dialogue is ignored or not fully utilized. We transform the summarization task into a machine reading comprehension problem and use predefined questions to locate all the key points accurately. A distantly supervised method is proposed to address the lack of labeled data, so the proposed model can be trained without annotated samples. A new task called "solver classification" is proposed to absorb the role information of the speakers. We collected a real-world summarization dataset from the logs of a customer service platform. Experiments show that the proposed method outperforms strong baseline methods by more than 6% on ROUGE-L.
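To make the first contribution concrete, the sketch below is a minimal, hypothetical illustration (the function name and scoring rule are assumptions, not the dissertation's actual model) of attention-weighted composition: a word vector is combined with the vectors of its sub-word units (characters, radicals, components), with attention weights derived from dot-product scores.

```python
import numpy as np

def compose_word_vector(word_vec, subunit_vecs):
    """Attention-weighted composition of a word vector with its
    sub-word unit vectors (characters, radicals, components).

    Each sub-unit is scored by its dot product with the word vector;
    a softmax over the scores yields the attention weights."""
    scores = subunit_vecs @ word_vec                 # (n_units,)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                         # softmax
    context = weights @ subunit_vecs                 # (dim,)
    return 0.5 * (word_vec + context), weights

# Toy demo: a 4-dimensional word vector with three sub-units.
rng = np.random.default_rng(0)
word = rng.normal(size=4)
subunits = rng.normal(size=(3, 4))
composed, attn = compose_word_vector(word, subunits)
```

In the dissertation's setting, the attention weights and the embedding matrix would be updated asynchronously; this sketch only shows the forward composition step.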
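The self-training idea in the second contribution can be sketched as follows. This is a toy illustration under stated assumptions: a nearest-centroid classifier stands in for the ELMo+CNN model, and a confidence threshold stands in for the user-polarity-based reliability filter on pseudo-labels (user polarity itself is omitted).

```python
import numpy as np

def self_train(X_lab, y_lab, X_unlab, threshold=0.8, rounds=3):
    """Toy self-training loop with a nearest-centroid classifier.

    Each round, class centroids are fit on labeled plus previously
    accepted pseudo-labeled data; unlabeled samples are pseudo-labeled,
    and only those whose confidence (softmax over negative centroid
    distances) clears `threshold` are kept, mimicking a reliability
    filter on pseudo-labeled samples."""
    X, y = X_lab, y_lab
    for _ in range(rounds):
        c0 = X[y == 0].mean(axis=0)
        c1 = X[y == 1].mean(axis=0)
        dists = np.stack([np.linalg.norm(X_unlab - c0, axis=1),
                          np.linalg.norm(X_unlab - c1, axis=1)], axis=1)
        probs = np.exp(-dists)
        probs /= probs.sum(axis=1, keepdims=True)
        confident = probs.max(axis=1) >= threshold
        if not confident.any():
            break
        X = np.vstack([X_lab, X_unlab[confident]])
        y = np.concatenate([y_lab, probs.argmax(axis=1)[confident]])
    return X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)

# Two well-separated toy classes plus one ambiguous unlabeled point,
# which the confidence filter should reject.
X_lab = np.array([[0.0, 0.0], [0.2, 0.0], [5.0, 5.0], [5.0, 5.2]])
y_lab = np.array([0, 0, 1, 1])
X_unlab = np.array([[0.1, 0.1], [4.9, 5.1], [2.5, 2.5]])
c0, c1 = self_train(X_lab, y_lab, X_unlab)
```

The point of the sketch is the filtering step: the ambiguous midpoint (2.5, 2.5) never clears the threshold, so it never contaminates the training set.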
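Since the summarization results above are reported in ROUGE-L, a minimal self-contained implementation of the metric over token lists may be useful; this is the standard LCS-based F-score with beta=1, whereas published ROUGE tooling adds options such as stemming and recall-weighted beta.

```python
def lcs_len(a, b):
    """Length of the longest common subsequence of two sequences (DP)."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l(candidate, reference, beta=1.0):
    """ROUGE-L F-score between candidate and reference token lists.

    Precision = LCS/len(candidate), Recall = LCS/len(reference),
    combined with the usual weighted-F formula."""
    lcs = lcs_len(candidate, reference)
    if lcs == 0:
        return 0.0
    p = lcs / len(candidate)
    r = lcs / len(reference)
    return (1 + beta ** 2) * p * r / (r + beta ** 2 * p)

score = rouge_l("the cat sat on mat".split(),
                "the cat is on the mat".split())  # LCS = 4 tokens
```

Because ROUGE-L rewards in-order token overlap rather than contiguous n-grams, it suits extractive summaries whose key points may be interleaved with other dialogue content.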
Keywords/Search Tags: Deep Learning, Word Representation, Sentence Representation, Text Understanding