
Research On Automatic Text Classification Methods Based On Neural Interaction Representation Under The Hierarchical Structure

Posted on: 2019-12-09
Degree: Master
Type: Thesis
Country: China
Candidate: J M Zheng
Full Text: PDF
GTID: 2428330611493240
Subject: Management Science and Engineering
Abstract/Summary:
In this data-driven information age, the diverse methods of military intelligence acquisition and specialized approaches to data analysis have given combatants easier access to massive amounts of information, but have also created serious information overload. Extracting valid knowledge from this information as quickly and efficiently as possible is fundamental to providing better military information system services. As one solution to this challenge, automatic text classification learns from text in advance and assigns a target text to one or several given categories, which saves users considerable reading time and makes information acquisition more effective. This thesis studies text representation in the field of Natural Language Processing (NLP). With deep learning, a computer can summarize and organize the information in a target text quickly and accurately. Research on text representation and the implementation of automatic text classification can satisfy users' information needs and improve the service accuracy of information systems. This thesis focuses on the problem of automatic text classification and makes the following main contributions:

(1) A neural network classification model based on a hierarchical architecture. This thesis takes the hierarchical structure of text as prior knowledge and establishes the overall framework of a hierarchical neural network classification model. Comparing the algorithmic complexity of models with and without the hierarchical architecture shows that including the hierarchical architecture greatly reduces complexity in some specific network structures without affecting it in others. Moreover, on a public dataset, the hierarchical neural network classification model significantly improves text classification performance, and the improvement grows as text length increases.

(2) A text classification model based on a self-interactive attention mechanism. This thesis finds that the standard attention mechanism in text representation requires external prior knowledge as context, so it cannot be applied universally. To address this, the thesis replaces it with a self-interactive attention mechanism (TextSAM). For different information aggregation strategies, three classification models are proposed: TextSAMAVE, TextSAMMAX, and TextSAMATT. Through enumeration, each component of the text is in turn regarded as the context knowledge in the attention mechanism, which both enhances the interaction among components and removes the need to acquire external knowledge. Analysis on a public dataset shows that the text classification model with the self-interactive attention mechanism significantly improves overall performance, especially for short text.

(3) A sentence classification model based on interaction representation among the components of a sentence. Building on the interaction concept introduced above, this thesis combines the syntactic structure of a sentence with the interaction representation of words. From different perspectives of model construction, it proposes interaction representation embedding models at two levels: a local interaction representation model (LIR) and a global interaction representation model (GIR). Combining these two models yields a hybrid interaction representation model (HIR). Analysis on a public dataset shows that the text classification model with interaction representation outperforms state-of-the-art models. In particular, an analysis by sentence length shows that the interaction representation method performs better on short texts.
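The enumeration idea behind the self-interactive attention mechanism can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names, the dot-product scoring, and the use of the mean view as the query in the "att" strategy are all assumptions made for illustration. Each component vector takes a turn as the attention context, and the resulting views are aggregated by averaging, element-wise maximum, or a second attention step, loosely mirroring the three aggregation strategies named in the abstract.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(context, components):
    """Attention-weighted sum of `components`, with `context` as the query.

    Dot-product scoring is an assumption; the abstract does not specify
    the scoring function.
    """
    weights = softmax([dot(context, c) for c in components])
    dim = len(components[0])
    return [sum(w * c[d] for w, c in zip(weights, components)) for d in range(dim)]


def self_interactive_attention(components, aggregate="ave"):
    """Hypothetical sketch: every component serves once as the context.

    `components` is a non-empty list of equal-length vectors (for example,
    word or sentence embeddings). No external context vector is needed.
    """
    # Enumeration step: one attended view per component-as-context.
    views = [attend(ctx, components) for ctx in components]
    dim = len(components[0])
    if aggregate == "ave":
        return [sum(v[d] for v in views) / len(views) for d in range(dim)]
    if aggregate == "max":
        return [max(v[d] for v in views) for d in range(dim)]
    if aggregate == "att":
        # Assumption: the mean view acts as the query for a second
        # attention step over the views; a learned query is also plausible.
        mean_view = [sum(v[d] for v in views) / len(views) for d in range(dim)]
        return attend(mean_view, views)
    raise ValueError("aggregate must be 'ave', 'max', or 'att'")


if __name__ == "__main__":
    toy = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for mode in ("ave", "max", "att"):
        print(mode, self_interactive_attention(toy, mode))
```

Because every view is a convex combination of the input vectors, the aggregated representation stays within the range of the inputs, and no context has to be supplied from outside the text.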
Keywords/Search Tags: text representation, automatic text classification, hierarchical architecture, interaction representation