Behaviors Modeling And Analysis Of Big Data From Web Apps Using Machine Learning And Deep RNN Techniques

Posted on: 2017-03-14
Degree: Master
Type: Thesis
Country: China
Candidate: WAHIU MICHAEL KAMAU
GTID: 2428330488971876
Subject: Computer Science and Technology
Abstract/Summary:
This thesis proposes an application of a mobile computing big data service: an approach involving contextualized text data collection from web platform users, idea and behavior modeling, and data analysis using data mining and deep machine learning techniques. The study demonstrates how to apply Deep RNN algorithms and ML to pattern recognition on text data, associative memory preservation, and experiment optimization, using Python machine learning libraries. In this work, a Python-based model architecture built around a deep recurrent neural network algorithm is developed to provide concepts that support building text-based behavior models through the identification of contextualized word features, which leads to behavior detection for overall data analysis.

The study first demonstrates how a sample prototype of a web application, a system for obtaining people's text-based item reviews (herein called behavior data), can be used on digital devices for data collection, thus acting as a data source for a behavior analytics project. The main study investigates how Deep RNN algorithms and behavior modeling concepts can be applied to detect behavior patterns in such categorical datasets by observing the underlying context features (word idea attributes in text data), which can be associated with behavior patterns as subsets of a sentence, paragraph, document, or record.

The approach consists of two steps. First, manual data annotation serves as a localized idea modeling and behavior detection method that identifies and tags record sets. Second, an array of vectors is built from the word sets (context-aware features), and a Deep RNN algorithm is applied to implement a supervised machine learning technique that learns by detecting the ordered occurrence of these contextualized feature vectors, building intelligence in the form of a text model. Through an automated clause inference mechanism, the text model should generate acceptable target predictions that show significant similarity to known class categories, namely the behaviors learnt when training the algorithm. Consequently, we are able to define appropriate data representations (behavior modeling), learn from that data, and build models to infer from those representations.

The methodology is explained by conducting experiments with ML algorithms, specifically to demonstrate Deep RNN learning, by training and testing the algorithm(s) on arrays of numerical input vectors that contain context-preserving word features. First, a text corpus is transformed into a matrix of vectors, i.e. an array of sentences with words in numerical format, which places similarly occurring words close together. These vectors enable prediction of a word's context in a document using the distributed word representation (DM) objective. Next, an architecture with a deep neural network algorithm is made to learn the vectors' patterns of occurrence by computing a feature identification function. Here, low-level algebraic computation is adopted to optimize the mathematical expressions that support differentiation on the multi-dimensional matrix in the neural network design. In addition, memory cell elements that act as neuron units are positioned in the network, where they build, remember, and sustain identified signals at each stage of recurrence.

Even though the proposed methodology may be applicable to a wide range of web and mobile computing situations, attention was focused on people's sentiment data. Case study experiments were performed on sentiment analysis (SA), using both synthetic and real-world data, covering feature modeling, algorithm training to obtain text models, analysis, and evaluation. The algorithm's performance was evaluated using reliable metrics and a deployment of the intelligent text model in a real behavior detection scenario, and the results were then discussed analytically. The experimental results showed that my approach achieved an accurate, robust, and reliable solution that could overcome some of the challenges of previous related methods. Further, the efficiency of the research method was tested by comparing it against other popular multi-class classifier algorithms as baselines, including kNN, Random Forest, and Passive-Aggressive classifiers, and the comparative performance was evaluated. Preliminary results showed that this approach can be more effective than the other methods and is as good as any other machine learning method that incorporates context awareness.

Over the course of the study, critical discussion is offered, based on the acquired practical skills and knowledge, on how to identify potential ideas about behaviors, derive behavior models, and report on data analytics work of interest. Further, the study suggests areas of improvement in feature modeling that maintain accuracy standards, and in algorithm design and optimization, while incorporating other data types to help realize a better system without much complexity.

Therefore, the study of behavior modeling is realized in four stages:
(1) Identifying a behavior aspect and the needed context features as text data.
(2) Collecting context data from users via the web and cloud storage, as defined by the web app's functional service requirements.
(3) Using the collected data to perform deep machine learning, analysis, and presentation of facts, using the most appropriate data mining tools.
(4) Evaluation, validation, and benchmarking against other works.

This study contributes to machine learning knowledge by providing a practical, theoretical, and analytical understanding of web-generated data processing, ML and deep learning methods using contextualized data features, behavior modeling using pattern analyzer tools, and big data analytics. The study is meant to aid further research on better applying and designing solutions that could be effective in the deployment of internet services, pervasive computing, and deep ML techniques as business models, especially in developing countries.
Keywords/Search Tags: Deep RNN (Recurrent Neural Networks), ML (Machine Learning), DM (Distributed Memory), SA (Sentiment Analysis)
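
The pipeline described in the abstract (annotated records, words turned into numerical vectors, a recurrent pass that carries state from word to word, and a prediction over known behavior classes) can be illustrated with a minimal sketch. This is not the thesis implementation: the two sample records, the label names, the dimensions, and all function names are hypothetical, the weights are random rather than trained, and a plain recurrent step stands in for the memory cells the abstract mentions. It uses only the Python standard library.

```python
import math
import random

# Hypothetical annotated records: (review text, behavior label).
RECORDS = [
    ("the phone is great and fast", "positive"),
    ("the battery is bad and slow", "negative"),
]
LABELS = ["negative", "positive"]  # assumed class categories


def build_vocab(records):
    """Map each distinct word to an integer index, in order of first appearance."""
    vocab = {}
    for text, _ in records:
        for word in text.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab


def random_matrix(rows, cols, rng):
    """Small random weights; a real model would learn these during training."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rnn_forward(word_ids, embed, w_in, w_rec, w_out):
    """Run a simple recurrent pass over the words and return class probabilities."""
    hidden = [0.0] * len(w_rec)
    for idx in word_ids:
        x = embed[idx]  # numerical vector for the current word
        pre = [a + b for a, b in zip(matvec(w_in, x), matvec(w_rec, hidden))]
        hidden = [math.tanh(p) for p in pre]  # state carried to the next word
    return softmax(matvec(w_out, hidden)), hidden


rng = random.Random(0)  # fixed seed so the sketch is repeatable
EMBED_DIM, HIDDEN_DIM = 4, 3  # illustrative sizes only
vocab = build_vocab(RECORDS)
embed = random_matrix(len(vocab), EMBED_DIM, rng)
w_in = random_matrix(HIDDEN_DIM, EMBED_DIM, rng)
w_rec = random_matrix(HIDDEN_DIM, HIDDEN_DIM, rng)
w_out = random_matrix(len(LABELS), HIDDEN_DIM, rng)

word_ids = [vocab[w] for w in RECORDS[0][0].split()]
probs, hidden = rnn_forward(word_ids, embed, w_in, w_rec, w_out)
print(dict(zip(LABELS, probs)))
```

Because the weights are untrained, the printed probabilities carry no meaning; the sketch only shows how word order enters the prediction through the hidden state that is updated at each step.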