Technology And Application Of Deep Semantic Embedding For User Generated Data

Posted on: 2020-07-10
Degree: Doctor
Type: Dissertation
Country: China
Candidate: G Y Lv
GTID: 1368330575466581
Subject: Computer application technology
Abstract/Summary:
With the development of technology and strong national promotion, the Internet has gradually reached every corner of society and every person. At the same time, various network services have become deeply embedded in all aspects of people's lives. Network users have changed their roles from mere information consumers to producers, generating a large amount of user-generated data. Among such data, real-time comments and product reviews, as two of the most influential types, have attracted much attention in industry. Real-time comments, also known as Danmu, give rise to an emerging interactive mode that allows users to post comments in real time while watching an online video. This mode greatly improves user activity and user experience, and offers attractive development prospects for the online entertainment industry. Product reviews are evaluations that users write after purchasing products online; they have a significant and quantifiable impact on the purchase decisions of other consumers and play an irreplaceable role in improving enterprise competitiveness and in marketing.

As the most representative user-generated data, Danmus and product reviews have very high application value. However, their informal expression, subjectivity, and diversity, as well as the dynamic evolution of domain professionalism, also bring great challenges to their application: how to express the semantics of informal expression, how to model relationships between diverse semantics, and how to continuously model domain data effectively. The fundamental problem behind these challenges is effective semantic representation, i.e., mapping the data to real-valued semantic vectors in a low-dimensional space that can be used effectively in end-to-end models. Building on the progress of deep learning in semantic representation, this thesis starts from the semantic level to explore deep semantic embedding techniques, and then moves on to the space level and the time level. To deal with the challenges of "subjectivity and diversity" and "dynamic evolution of domain professionalism" in user-generated data, embedding space mapping and lifelong learning techniques are studied. Details and the corresponding applications are as follows.

First, to deal with the practical problems that online video platforms face in video management and the challenges brought by the informal expression of Danmus, online video labeling with time stamps based on Danmu deep semantic embedding is proposed. Specifically, to better understand Danmu semantics, this thesis designs the Temporal Deep Semantic Structured Model. To train this model, it takes advantage of the "time-tempora" assumption of Danmus. Then, features of video segments are constructed from the semantic vectors of the corresponding Danmus. Finally, a supervised method is used to extract and label the highlight segments. Experiments on a real-world dataset show the effectiveness of the proposed model in producing temporal labels for videos.

Second, to improve the user experience on online video sharing platforms and to deal with the "gossiping" characteristics of Danmus, this thesis studies a method for generating real-time video comments based on semantic space mapping. The method is divided into two parts: semantic representation and embedding space mapping. For the semantic representation, on the one hand, the semantic vectors must precisely represent the semantics of the data; on the other hand, it must be possible to obtain Danmus with diversified expressions from the semantic vectors. To this end, this thesis designs embedding models based on variational auto-encoders for image and text data, and makes it possible to balance embedding capacity against generation capacity. For the embedding space mapping, to obtain Danmus with diversified semantics, it further proposes an embedding space mapping method based on generative adversarial networks (GANs) to achieve Danmu generation. Finally, the model is evaluated on a real-world dataset with various metrics, including human evaluation.

Third, to deal with the challenges brought by the dynamic evolution of domain professionalism in product reviews, this thesis explores sentiment classification based on lifelong embedding. In the application scenario of lifelong sentiment classification, a learner continuously performs a series of classification tasks over time. The goal is to apply knowledge obtained from previous tasks to a new task so that it performs better than it would without that knowledge. Existing lifelong sentiment classification methods are mainly based on naive Bayes, and the limitations of this base model leave room for improvement in performance. Therefore, this thesis proposes a lifelong sentiment classification method that uses a recurrent neural network (RNN) as the base model and focuses on the fusion of short-term and long-term knowledge. Notably, to deal with the catastrophic forgetting problem in the incremental learning of neural networks, this thesis also designs a partial update mechanism to reduce its impact. Finally, experiments on a real dataset show the effectiveness and stability of the proposed model.
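As a rough illustration of the step in the first contribution where video segment features are constructed from Danmu semantic vectors, the following Python sketch groups Danmus into fixed-length time windows and mean-pools their vectors. The abstract does not state how the thesis aggregates vectors or how segments are delimited, so the mean pooling, the fixed window length, and the names `segment_features` and `segment_seconds` are all assumptions made for illustration only.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def segment_features(
    danmus: List[Tuple[float, List[float]]],
    segment_seconds: float = 10.0,
) -> Dict[int, List[float]]:
    """Mean-pool Danmu vectors per fixed-length video segment.

    `danmus` holds (timestamp_in_seconds, semantic_vector) pairs.
    Mean pooling and fixed-length windows are assumptions; the thesis
    may build segment features differently.
    """
    if segment_seconds <= 0:
        raise ValueError("segment_seconds must be positive")

    # Group vectors by the index of the segment their timestamp falls in.
    buckets: Dict[int, List[List[float]]] = defaultdict(list)
    for timestamp, vector in danmus:
        buckets[int(timestamp // segment_seconds)].append(vector)

    # Average the vectors of each segment dimension by dimension.
    features: Dict[int, List[float]] = {}
    for index, vectors in buckets.items():
        dim = len(vectors[0])
        features[index] = [
            sum(v[d] for v in vectors) / len(vectors) for d in range(dim)
        ]
    return features
```

The resulting per-segment vectors could then be passed to any supervised classifier to decide which segments are highlights; segments that receive no Danmus are simply absent from the returned dictionary.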
Keywords/Search Tags: User Review, Danmu, Product Review, Deep Semantic Embedding, Generative Adversarial, Lifelong Learning, Sentiment Classification