
A Study On Neural Network-based Natural Language Semantic Representation

Posted on: 2019-07-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Q Chen
Full Text: PDF
GTID: 1318330542497980
Subject: Signal and Information Processing
Abstract
Artificial intelligence consists of three major levels: computational intelligence, perceptual intelligence, and cognitive intelligence. Natural language understanding is an important area of cognitive intelligence research and has been widely regarded as an AI-complete problem. Natural language usually takes words as the basic units that compose sentences and documents, and semantic representation is a core factor in understanding words, sentences, and documents. Word embeddings based on the distributional semantic hypothesis have achieved great success for word representation, but the semantic representation of sentences and documents still faces many challenges. How to represent variable-length sentences and documents as fixed-length, low-dimensional dense vectors within the framework of neural networks remains the core issue in current semantic representation research. Natural language inference and automatic text summarization are typical natural language understanding tasks that rely on sentence- and document-level semantic representations. At present, neural network-based natural language inference methods often ignore inherent syntactic structure and make little use of external semantic knowledge when representing sentence semantics, while automatic text summarization methods mainly target short passages and do not address the redundancy that arises when long documents are represented semantically.

This thesis focuses on neural network-based semantic representation of natural language, studying semantic representation methods, natural language inference methods, and automatic text summarization methods:

First, the thesis studies enhancing sentence embedding with generalized pooling. A vectorized multi-head self-attention mechanism is proposed to obtain a fixed-length sentence vector; it includes the widely used max pooling, mean pooling, and scalar multi-head self-attention as special cases (a sketch of the mechanism appears below), and it improves the semantic representation ability of sentences. The model benefits from properly designed penalization terms that reduce redundancy among the attention heads. The proposed model is evaluated on three tasks: natural language inference, author profiling, and sentiment classification, where it achieves significant improvement over existing sentence-encoding-based methods.

Second, the thesis studies syntactic semantic representation and natural language inference methods that combine syntactic structure with sequence modeling. To exploit the potential of sequential inference networks while alleviating their neglect of syntactic structure, an enhanced sequential inference model (ESIM) is proposed, and syntactic structure information is further integrated into both the local inference modeling and inference composition components to improve performance on natural language inference tasks.

Third, the thesis studies semantic representation and natural language inference models enhanced with external semantic knowledge. Existing neural network-based natural language inference methods rely too heavily on end-to-end training, and their semantic representation models make little use of external semantic knowledge; a neural natural language inference model enhanced with external semantic knowledge is therefore proposed. Building on the ESIM above, the method integrates external knowledge into three components: co-attention calculation, local inference information collection, and inference composition (sketched below). It improves the generalization ability of the original model when training samples are scarce and achieves performance improvements on two natural language inference benchmarks.
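To make the generalized pooling idea concrete, below is a minimal PyTorch sketch of vectorized multi-head self-attention pooling. The module name, layer sizes, and per-head MLP are illustrative assumptions rather than the thesis's exact architecture, and the penalization terms that decorrelate the heads are omitted. Each head assigns every time step a weight vector (one weight per feature dimension): uniform weights recover mean pooling, and one-hot weights along the time axis recover max pooling.

```python
import torch
import torch.nn as nn

class GeneralizedPooling(nn.Module):
    """Vectorized multi-head self-attention pooling (illustrative sketch)."""

    def __init__(self, dim: int, num_heads: int = 4, hidden: int = 64):
        super().__init__()
        assert dim % num_heads == 0
        self.num_heads = num_heads
        head_dim = dim // num_heads
        # One small MLP per head mapping states to per-dimension logits.
        self.w1 = nn.ModuleList([nn.Linear(head_dim, hidden) for _ in range(num_heads)])
        self.w2 = nn.ModuleList([nn.Linear(hidden, head_dim) for _ in range(num_heads)])

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, seq_len, dim); split the feature dimensions among heads.
        heads = []
        for x, w1, w2 in zip(h.chunk(self.num_heads, dim=-1), self.w1, self.w2):
            logits = w2(torch.relu(w1(x)))       # (batch, seq_len, head_dim)
            attn = torch.softmax(logits, dim=1)  # per-dimension weights over time
            heads.append((attn * x).sum(dim=1))  # weighted sum over time
        return torch.cat(heads, dim=-1)          # fixed-length (batch, dim) vector

# Usage: pool variable-length encoder states into a sentence vector.
pool = GeneralizedPooling(dim=512, num_heads=4)
states = torch.randn(8, 30, 512)                 # e.g. BiLSTM outputs
sentence_vec = pool(states)                      # shape: (8, 512)
```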
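The next sketch illustrates the ESIM-style local inference step and one simple way external knowledge could enter the co-attention described above: a lexical relatedness feature biases the alignment scores before the soft alignment and enhancement steps. The feature design (`r`, `lam`) is an assumption for illustration, not the thesis's exact formulation.

```python
import torch
import torch.nn.functional as F

def knowledge_enriched_local_inference(a, b, r, lam=1.0):
    """ESIM-style co-attention with an external-knowledge bias (sketch).

    a: (batch, len_a, dim) encoded premise states (e.g. from a BiLSTM)
    b: (batch, len_b, dim) encoded hypothesis states
    r: (batch, len_a, len_b) lexical relatedness features, e.g. 1.0 for
       word pairs related in a semantic resource and 0.0 otherwise
       (a placeholder assumption, not the thesis's exact features)
    """
    e = torch.bmm(a, b.transpose(1, 2)) + lam * r      # biased alignment scores
    a_tilde = torch.bmm(F.softmax(e, dim=2), b)        # hypothesis-aware premise
    b_tilde = torch.bmm(F.softmax(e, dim=1).transpose(1, 2), a)
    # Local inference enhancement: concatenation, difference, and product.
    m_a = torch.cat([a, a_tilde, a - a_tilde, a * a_tilde], dim=-1)
    m_b = torch.cat([b, b_tilde, b - b_tilde, b * b_tilde], dim=-1)
    return m_a, m_b  # fed to composition layers, pooling, and a classifier
```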
Finally, the thesis studies distraction-based semantic representation of documents and automatic text summarization. Existing neural network-based summarization methods usually use sentence-level sequence-to-sequence models that pay insufficient attention to the global text, which causes information redundancy when summarizing long documents. The thesis proposes a recurrent neural network-based summarization method whose models are trained not only to attend to specific regions and content of the input document, but also to be distracted, traversing between different parts of the document so as to better grasp its overall meaning. Without engineering any features, the models achieve performance improvements on both Chinese and English datasets.
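As a rough illustration of the distraction idea, one simple variant keeps a running history of attention mass and subtracts it from the current scores before normalizing, pushing the decoder toward source content it has not yet covered. This is a minimal sketch under that assumption, not the thesis's exact mechanism.

```python
import torch

def distraction_attention(scores, history):
    """One simple distraction variant over attention weights (sketch).

    scores:  (batch, src_len) raw attention scores at the current decode step
    history: (batch, src_len) attention mass accumulated over previous steps
    """
    attn = torch.softmax(scores - history, dim=-1)  # down-weight covered content
    return attn, history + attn                     # attention and updated history

# Usage inside a decode loop: start with an all-zero history.
history = torch.zeros(8, 100)
scores = torch.randn(8, 100)
attn, history = distraction_attention(scores, history)
```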
Keywords: Natural Language Understanding, Semantic Representation, Natural Language Inference, Textual Entailment, Automatic Text Summarization, External Semantic Knowledge