Technologies Research On Text Analysis In Online Social Networks

Posted on: 2018-03-06
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y P Nie
GTID: 1368330569998499
Subject: Computer Science and Technology
Abstract/Summary:
With the development of Web 2.0 technology, social networks have become an essential part of people's daily lives. The information in social networks influences people's thoughts, behaviour, and cognition, and its most common form is text. Text analysis technologies for social networks can help in analysing language expression mechanisms and behaviour characteristics. They can also promote the development of information retrieval and information extraction technologies and improve the service quality of human-computer interaction and automatic question answering. However, text analysis in social networks faces many challenges, including text sparsity, high noise, text heterogeneity, and massive data volume. Building on previous work, this dissertation proposes several text analysis technologies. The main contributions are as follows:

1. Identifying Users Across Social Networks Based on Text Topic Model

With the development of social networks, most users hold accounts on several social network platforms. Matching a user's varying identities across the internet is an important task. Many existing approaches attempt to link users by comparing social structures, mapping users' profiles, and analyzing users' authority, but they fail to consider how users change over time. We introduce human behaviour limitations in social networks and, based on these limitations, propose a dynamic core interests mapping (DCIM) algorithm, which jointly considers users' social network structures and article content to identify users across platforms. The algorithm first models each user's core interests and then calculates the similarity between two target users. Our experiments use real-world datasets from Twitter and BlogCatalog. The results show that the method is effective at mapping users across social networks and significantly outperforms the baseline methods.

2. A Bidirectional LSTM Model for Question Text Analysis and Summarization

Question text analysis helps to understand a user's intention. Community Question Answering (CQA) services have recently become one of the most popular kinds of social network. In most existing CQA services, a question posted by a real user is divided into a title and a body, and measuring the relation between these two parts is of great significance. We propose a deep neural network based method to analyse and quantify the relation between title and body. The proposed method employs two bidirectional LSTMs to read the title and the body and outputs a relevance score that measures the relation between them. In the first experiment, we evaluate our model on the Yahoo! Answers dataset, and the results show that our method can be more effective than existing approaches. In the second experiment, we test our summarization method in the TREC 2016 LiveQA track, where our approach ranked first among all participants.

3. Attention-Based Encoder-Decoder Model for Answer Selection in Question Answering

Question answering is a classic task that builds on information retrieval and text analysis. One of its key challenges is bridging the lexical gap between questions and answers, because a question and its answer may not share any matching word. Machine translation models have been shown to help with the lexical gap problem between question-answer pairs. We introduce an attention-based deep learning model to address the answer selection task for question answering. The proposed model employs a bidirectional long short-term memory (LSTM) encoder-decoder, which has been demonstrated to be effective on machine translation tasks, to bridge the lexical gap between questions and answers. The model also uses a step attention mechanism that allows the question to focus on a certain part of the candidate answer. We evaluate the model on a benchmark dataset, and the results show that our approach outperforms existing approaches. Integrating the model significantly improves the performance of our question answering system in the TREC 2015 LiveQA task.

4. A Dynamic Dropout for Recurrent Neural Networks in Text Classification

Text classification is a classic task in the text analysis domain, and deep neural network based methods obtain state-of-the-art performance in many sentiment classification tasks. Dropout is a powerful regularization method that prevents co-adaptation and over-fitting in neural networks. The original dropout method cannot be applied to recurrent layers, and existing RNN dropout methods drop each neuron independently with the same drop probability p. We investigate whether a dynamic dropout method, which assigns each neuron a different drop probability, can improve a neural network's performance in the semantic domain. We evaluate our method on a classic text classification dataset, and the experimental results show that dynamic dropout is more effective than the original dropout in text classification.
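The idea of a per-neuron drop probability in contribution 4 can be illustrated with a minimal sketch. The abstract does not say how the individual probabilities are chosen or how the mask is shared across time steps in a recurrent layer, so the function below simply takes the probabilities as an input. The name `dynamic_dropout`, its parameters, and the inverted-dropout rescaling are assumptions made for illustration, not the dissertation's implementation.

```python
import random


def dynamic_dropout(activations, drop_probs, rng, training=True):
    """Drop each neuron with its own probability (hypothetical helper).

    activations: list of floats, one per neuron.
    drop_probs:  list of floats in [0, 1), one drop probability per neuron.
    rng:         a random.Random instance, passed in for reproducibility.
    """
    if len(activations) != len(drop_probs):
        raise ValueError("need exactly one drop probability per neuron")
    if not training:
        # At inference time nothing is dropped.
        return list(activations)

    output = []
    for value, p in zip(activations, drop_probs):
        if not 0.0 <= p < 1.0:
            raise ValueError("drop probabilities must lie in [0, 1)")
        if rng.random() >= p:
            # Kept neuron: rescale by 1 / (1 - p) so the expected
            # activation is unchanged (inverted dropout, an assumption).
            output.append(value / (1.0 - p))
        else:
            output.append(0.0)
    return output


if __name__ == "__main__":
    rng = random.Random(0)
    # Standard dropout is the special case where every neuron shares one p.
    print(dynamic_dropout([1.0, 1.0, 1.0], [0.5, 0.5, 0.5], rng))
    # Dynamic dropout: each neuron has its own drop probability.
    print(dynamic_dropout([1.0, 1.0, 1.0], [0.1, 0.5, 0.9], rng))
```

Under this reading, standard dropout is the special case in which all entries of `drop_probs` are equal to the same p.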
Keywords/Search Tags: Social Network, Text Analysis, Machine Learning, Topic Model, Question Analysis, Answer Retrieval, Text Classification