
Research On Text Classification Of Deep Learning Mixing Model Based On Map Reduce

Posted on: 2020-06-15
Degree: Master
Type: Thesis
Country: China
Candidate: X Pang
Full Text: PDF
GTID: 2428330575987992
Subject: Computer application technology
Abstract/Summary:
With the arrival of the big data era, the "information explosion" has become a serious problem, and text accounts for most of this information. Traditional text categorization methods cannot cope with the large-scale, high-dimensional, and diverse text data found in big data settings. How to meet the technical challenges posed by big data and manage and organize this information efficiently has become an urgent problem. Deep learning uses multi-layer neural networks to build a deep nonlinear mapping. It can approximate complex functions with fewer parameters and learn multi-layer features from text data to improve classification accuracy. MapReduce is a computational model, platform, and framework for high-performance parallel processing of big data. It can address the insufficient storage space and long running time of the text classification process. This thesis uses the MapReduce parallel computing framework together with deep learning algorithms to classify text. The specific research results are as follows:
1. An improved DAE text feature learning method is proposed. The traditional Denoising Auto-encoder (DAE) model converges slowly and takes a long time to train during feature learning. To address this, the model is improved with an additional momentum term and an adaptive learning rate (Mom-Ada-DAE). Text classification experiments then compare the KNN classification algorithm, the traditional DAE, and the Mom-Ada-DAE model. The experiments show that Mom-Ada-DAE effectively reduces the sensitivity of DAE to local details of the error surface, reduces instability in the text feature learning process, and improves the convergence speed and classification accuracy of the model.
2. A deep learning hybrid text classification model, Mom-Ada-DABN, is proposed. The model consists of a 2-layer Mom-Ada-DAE, a 3-layer Deep Belief Nets (DBN) model, and a Softmax regression classification layer. The proposed Mom-Ada-DABN hybrid model is compared with a single DAE, a single DBN, and a KNN classifier. Experiments show that the Mom-Ada-DABN model effectively improves feature extraction efficiency and classification accuracy.
3. The deep learning hybrid model is applied to the MapReduce platform. To address the insufficient storage space and long running time during classification, the training process of the hybrid model is placed on the Hadoop platform for MapReduce distributed processing, and its efficiency in processing text data is compared with that of a single machine. The experimental results show that the MapReduce parallel mode greatly reduces the time and space required for classification while still achieving good classification results.
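The momentum term and adaptive learning rate behind Mom-Ada-DAE can be illustrated with a small sketch. The abstract does not give the exact update rule, so this example assumes a classical momentum update combined with a simple loss-driven heuristic: grow the learning rate after a step that lowers the error, and shrink it (discarding the step) otherwise. The function names, the constants 0.9, 1.05 and 0.7, and the toy quadratic that stands in for the DAE reconstruction error are all illustrative assumptions, not values from the thesis.

```python
def momentum_step(weights, velocity, grad, lr, momentum):
    """One momentum update: v <- momentum * v - lr * grad, then w <- w + v."""
    new_velocity = [momentum * v - lr * g for v, g in zip(velocity, grad)]
    new_weights = [w + v for w, v in zip(weights, new_velocity)]
    return new_weights, new_velocity


def adapt_lr(lr, prev_loss, loss, grow=1.05, shrink=0.7):
    """Assumed heuristic: grow lr after an improving step, shrink it otherwise."""
    return lr * grow if loss < prev_loss else lr * shrink


def train(loss_fn, grad_fn, weights, lr=0.1, momentum=0.9, steps=200):
    """Gradient descent with a momentum term and an adaptive learning rate."""
    velocity = [0.0] * len(weights)
    prev_loss = loss_fn(weights)
    for _ in range(steps):
        candidate, new_velocity = momentum_step(
            weights, velocity, grad_fn(weights), lr, momentum
        )
        loss = loss_fn(candidate)
        lr = adapt_lr(lr, prev_loss, loss)
        if loss < prev_loss:
            # Accept the step and keep the accumulated momentum.
            weights, velocity, prev_loss = candidate, new_velocity, loss
        else:
            # Reject the step and drop the stale momentum.
            velocity = [0.0] * len(weights)
    return weights, prev_loss, lr


# Toy error surface with one steep and one shallow direction
# (a hypothetical stand-in for a DAE reconstruction error).
def toy_loss(w):
    return (w[0] - 3.0) ** 2 + 10.0 * (w[1] + 1.0) ** 2


def toy_grad(w):
    return [2.0 * (w[0] - 3.0), 20.0 * (w[1] + 1.0)]


final_weights, final_loss, final_lr = train(toy_loss, toy_grad, [0.0, 0.0])
print("final loss:", final_loss, "final learning rate:", final_lr)
```

Because a step is only accepted when it lowers the error, the loss in this sketch never increases from one accepted step to the next. That gives one plausible reading of the reduced sensitivity to local details of the error surface mentioned in the abstract. In a real DAE, `loss_fn` and `grad_fn` would be the reconstruction error and its gradient with respect to the encoder and decoder weights.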
Keywords/Search Tags: text classification, deep learning, Denoising Auto-encoder, Deep Belief Nets, MapReduce