
Research On Methods Of Text Semantic Analysis Oriented Towards Different Views

Posted on: 2022-05-07
Degree: Doctor
Type: Dissertation
Country: China
Candidate: D Jiang
Full Text: PDF
GTID: 1488306350488874
Subject: Software engineering
Abstract/Summary:
Acquiring text semantics is the ultimate goal of most natural language processing (NLP) tasks. Text semantics refers to the concepts and true meaning carried by textual information. By building analysis models, a computer can simulate human understanding of the deep semantics of natural language and identify the true meaning of the information. Capturing text semantics accurately improves many top-level NLP tasks, such as machine translation, question answering, and dialogue systems. Three problems motivate this work. First, when the data distribution is unbalanced, text semantic feature extraction is particularly important, and avoiding the neglect of small categories is an urgent problem. Second, in semantic relationship recognition, the relationships between Chinese sentences are mostly implicit, without connectives, which makes them the hardest part of semantic relationship analysis. Third, for long text it is important to extract the central semantics and generate highly readable abstracts, and automatic abstract generation still leaves much room for improvement. From the perspective of these different research tasks, this dissertation proposes several text semantic analysis models based on deep neural networks. The research contents are as follows.

1. A text semantic feature extraction model oriented towards the long-text classification view. To address unbalanced text datasets, a long-text semantic feature extraction method is proposed from the perspective of long-text semantic classification. The proposed model is based on a recurrent neural network and an improved loss function. It is the first to apply a residual network to the semantic analysis of long text, and it mitigates data imbalance by adding a Gaussian weight to the loss function. Structurally, the input text is first transformed into a vector representation by word embedding. A residual network with an attention mechanism then learns semantic features from the text representation matrix and reduces the dimension of the feature matrix. Next, the recurrent network learns deeper sequential semantic features. Finally, the improved loss function increases the weights of the small classes. Experiments on four open datasets show that the model outperforms the baseline models on long-text classification; on the CAIL2018 dataset, the CRAFL model reaches 89.0% macro-F1.

2. A discourse semantic analysis model oriented towards the view of discourse relationship analysis. This dissertation proposes a text semantic analysis model from the perspective of recognizing relationships between Chinese implicit discourse units. Chinese implicit discourse contains no explicit connectives, so the model must identify the semantic relationship between sentences through semantic analysis alone. The BERT-Tree model analyzes the semantics of Chinese sentences and identifies the implicit relationship between them; it combines a pre-trained language model with a tree-structured semantic framework. First, the two semantically related sentences are fed into the pre-trained language model to produce text feature matrices. The two feature matrices are then fused into an inter-sentence relationship feature. Finally, the tree-structured semantic framework identifies the discourse relation. The model is trained and tested on a Chinese text corpus and verified on the public CDTB dataset, where BERT-Tree reaches 54.3% macro-F1. The results show that the model can automatically recognize implicit relationships between Chinese sentences, and they confirm the effectiveness of both the pre-trained model and the tree semantic framework for text semantic analysis.

3. A text semantic understanding model oriented towards the view of generation. This dissertation proposes an end-to-end automatic text summarization model, BSSA, which converts input text directly into an abstract without complicated data preprocessing. Unlike previous work, BSSA integrates a text pre-training method into the sequence generation model and adds an attention mechanism. First, before sequence generation, the model uses the pre-training method to learn text features, generating a feature representation that mines the deep features of the text and captures its semantics. Second, an attention mechanism is added to the sequence generation process, increasing the weight of important features and highlighting them in the result. Finally, the summary is produced by the sequence generation model. On the CSL dataset, BSSA reaches 35.1% ROUGE-L. Results on several open datasets show that the model outperforms the baselines and that every component contributes to the overall effect.

The experimental results show that the proposed models succeed in the tasks of text semantic feature extraction, text semantic analysis, and text semantic understanding.
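The "Gaussian weight" in the loss function of item 1 is not spelled out in this abstract. A minimal sketch of one plausible reading, in Python: the weight of a class is a Gaussian of its relative frequency, so majority classes are down-weighted and small classes keep weights near 1. The function names, the exact formula, and the `sigma` parameter are all assumptions for illustration, not the thesis's actual definition.

```python
import math

def gaussian_class_weights(class_counts, sigma=0.5):
    """Per-class weights that grow as a class gets rarer.

    Hypothetical reading of the thesis's "Gaussian weight": the weight of a
    class is a Gaussian of its frequency ratio, so a majority class (ratio
    near 1) is pushed down while rare classes (ratio near 0) stay near 1.
    """
    largest = max(class_counts)
    return [math.exp(-((n / largest) ** 2) / (2 * sigma ** 2))
            for n in class_counts]

def weighted_cross_entropy(probs, label, weights):
    """Cross-entropy for one example, scaled by the label's class weight."""
    return -weights[label] * math.log(probs[label])

counts = [9000, 600, 150]            # heavily imbalanced training set
w = gaussian_class_weights(counts)
assert w[2] > w[1] > w[0]            # rarer classes get larger weights
```

In a full training loop these weights would multiply the per-example loss, so errors on small classes cost more and the optimizer cannot ignore them.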
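Item 2 fuses the two sentences' feature matrices into a relationship feature, but the abstract does not name the fusion operator. The sketch below assumes a common sentence-pair scheme, concatenating the two vectors with their elementwise difference and product; the function name and the toy embedding values are illustrative, not taken from the thesis.

```python
def fuse_sentence_features(u, v):
    """Fuse two sentence vectors into an inter-sentence relation feature.

    Stand-in for the fusion step in the BERT-Tree model; the operator is
    assumed to be the common [u; v; |u - v|; u * v] scheme.
    """
    diff = [abs(a - b) for a, b in zip(u, v)]
    prod = [a * b for a, b in zip(u, v)]
    return u + v + diff + prod

u = [0.2, 0.7]   # toy stand-in for the encoder output of sentence 1
v = [0.5, 0.1]   # toy stand-in for the encoder output of sentence 2
feature = fuse_sentence_features(u, v)
assert len(feature) == 4 * len(u)
```

The fused vector would then be passed to the relation classifier (here, the tree-structured semantic framework) in place of either sentence alone, so the classifier sees both sentences and their interaction.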
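The attention mechanism in item 3 weights encoder states by their relevance to the current decoding step. The abstract does not fix the scoring function, so this sketch assumes plain dot-product attention with a softmax; the function name and the toy vectors are assumptions.

```python
import math

def attention(query, keys, values):
    """Weight encoder states by their relevance to the decoder query.

    Sketch of the attention step in the BSSA model: score each encoder
    key against the query, softmax the scores, and return the weighted
    sum of the value vectors as the context vector.
    """
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    peak = max(scores)                       # numerically stable softmax
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(values[0])                     # context = weighted sum of values
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

# a query aligned with the first encoder state attends almost entirely to it
context = attention([10.0, 0.0], [[1.0, 0.0], [0.0, 1.0]],
                    [[1.0, 0.0], [0.0, 1.0]])
assert context[0] > 0.99
```

During generation, the context vector is recomputed at every output step, which is what lets the summarizer highlight different source features in different parts of the abstract.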
Keywords/Search Tags:deep learning, feature extraction, neural network, semantic analysis, semantic understanding