
Sentiment Analysis Of Roman Urdu Sentence With Deep Neural Networks

Posted on: 2021-05-05
Degree: Master
Type: Thesis
Country: China
Candidate: Muhammad Arslan Manzoor
GTID: 2518306512992349
Subject: Computer Science and Technology

Abstract/Summary:
Sentiment analysis is the computational study of reviews, emotions, and sentiments expressed in text. Over the past several years, sentiment analysis has attracted considerable attention from both industry and academia. All over the world, social media has set the trend for people to share their personal views in their native languages. For sentiment analysis of these reviews, machine learning algorithms were long the dominant choice of researchers. With the arrival of more complex learning algorithms and upgraded hardware on which to run experiments, the research community has shifted to deep learning for sentiment analysis tasks. A background study of the last five years confirms that deep neural networks (CNN, RNN, and the extended LSTM) have achieved remarkable scores. Most of these networks, however, are unidirectional and demand substantial time, memory, and hardware resources. Moreover, current methods focus mainly on English-language sentences. The task remains challenging for minority languages such as Roman Urdu, which has a more complex sentence structure and numerous lexical variations of a single word.

In this study, we propose a novel Self-Attention Bidirectional LSTM (SA-BiLSTM) network for sentiment analysis of Roman Urdu, designed to handle its sentence structure and inconsistent text representation. In SA-BiLSTM, self-attention takes charge of the complex structure by correlating words across the whole sentence, while the BiLSTM extracts context representations of the attended embeddings in both the preceding and succeeding directions to tackle lexical variation. Our network thereby also addresses the unidirectional limitation of conventional networks. Moreover, to measure the performance of deep learning language models on Roman Urdu and to compare them with SA-BiLSTM, we preprocessed and normalized Roman Urdu sentences, then trained and evaluated all models on both Roman Urdu datasets. The efficient design of SA-BiLSTM uses fewer computational resources and yields accuracies of 68.4% and 69% on the preprocessed and normalized datasets, respectively. Experimental results show that SA-BiLSTM achieves better accuracy than other state-of-the-art deep language architectures, and the results analysis confirms that normalizing the dataset leads the model to higher accuracy. Furthermore, as supplementary work, we fine-tuned the state-of-the-art BERT (Bidirectional Encoder Representations from Transformers) network on the Roman Urdu datasets. BERT is pretrained on a massive multilingual corpus, and its architecture comprises bidirectional Transformer (self-attention) layers, which makes it consistent with SA-BiLSTM. The results BERT achieves on the test sets of the preprocessed and normalized data are higher than its result on the XNLI dataset translated into Urdu.
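The self-attention component described above, which correlates every word with the whole sentence before the BiLSTM reads the attended embeddings, can be sketched as scaled dot-product attention. This is a minimal illustration, not the thesis implementation: the toy dimensions, random weights, and function names are assumptions made for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over one sentence.

    X          : (seq_len, d_model) word embeddings
    Wq, Wk, Wv : (d_model, d_k) learned projection matrices
    Returns the attended embeddings (seq_len, d_k) and the
    (seq_len, seq_len) attention weights that correlate each
    word with every other word in the sentence.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # pairwise word correlations
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ V, weights

# Toy example: a 5-word sentence with 8-dimensional embeddings.
rng = np.random.default_rng(0)
seq_len, d_model, d_k = 5, 8, 8
X = rng.normal(size=(seq_len, d_model))
Wq, Wk, Wv = (rng.normal(size=(d_model, d_k)) for _ in range(3))
attended, weights = self_attention(X, Wq, Wk, Wv)
```

In the architecture the abstract describes, the attended embeddings would then be fed to a bidirectional LSTM, whose forward and backward passes supply the preceding- and succeeding-direction context.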
Keywords/Search Tags: Sentiment Analysis, Attention Mechanism, Long Short-Term Memory, Text Classification, Roman Urdu