
Representation Learning For Fake Information Detection

Posted on: 2018-02-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L Y Li
GTID: 1318330536481112
Subject: Computer application technology
Abstract/Summary:
Fake text information detection is a topic of great interest in natural language processing. Its aim is to detect and filter untruthful or incorrect information: to recognize wrong information and low-quality sources, and to help people obtain true information and avoid being misled when consuming it. Web information consists of objective information and subjective information. Objective information is an objective statement about events or things, and its truth is unique; information that contradicts the truth is fake. Subjective information concerns an individual's subjective experience, and its truth is not unique; for subjective information, what separates true from fake is whether it comes from real experience. Detecting fake information in objective and in subjective information therefore requires models designed specifically for each problem. When the corpus carries additional information about information sources or user feedback, methods that combine the text with this additional information are a valuable research direction.

In fake information detection, the core issue is how to represent the text and the additional information effectively. Current methods mostly rely on traditional machine learning algorithms and feature engineering, because the performance of these methods depends on the features. Fake information is mostly written by liars who aim to imitate true information, so it is hard to recognize. This deceptive character is a challenge for feature engineering, since the features are chosen based on experts' experience. Representation learning can learn latent patterns from data and can abstract and process the data, which brings opportunities for developing fake information detection methods. On this basis, we study representation learning for text information and additional information in order to improve the performance of fake information detection. This dissertation summarizes our research in the following four aspects.

Firstly, we propose a dependency-analysis-based method and a contradiction-specific word embedding learning approach to detect contradiction. Because fake objective information contradicts the truth, we use contradiction detection to detect fake information. Recognizing contrasting words is difficult in contradiction detection, since WordNet and similar lexical resources cannot easily detect them. For this reason, we learn a contradiction-specific word embedding and incorporate it into a sentence-level contradiction detection model. Experiments show that the model improves the performance of contradiction detection.

Secondly, we present a sentence-weight-based representation learning model to detect fake information. Detecting fake information in subjective information is difficult because evidence is lacking. However, some latent cues exist in lies. We therefore use a document-level representation learning method to mine the regularities hidden in the data. We present a sentence-weight-based neural network that learns the document representation for fake information detection, using sentence representations in place of traditional feature engineering.

Thirdly, we develop a memory network model that incorporates source reliability analysis into fake information detection. People often encounter contradictory information from multiple sources, such as different boarding times for the same flight. This kind of information is simple and, without rich contextual text, carries little information, so the real and the fake versions are hard to tell apart by syntax or semantics. We therefore incorporate an analysis of the reliability of information sources and present a memory-network-based model to detect fake information.

Finally, we apply an attention model that incorporates user feedback to learn the representation of fake information. On social media, people can forward microblogs to comment on the original microblogs, and these forwards carry attitudes such as supporting, denying, or doubting. We treat the forwarding microblogs as a kind of group intelligence and apply an attention-based model to learn the representation of the original microblog and its forwarding microblogs to detect fake information.

In summary, this dissertation systematically investigates the application of representation learning to fake information detection in different settings: objective information, subjective information, the combination with information sources, and the combination with user feedback. We hope that our research will be helpful to researchers in the area of fake information detection.
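
The sentence-weight model and the attention model over forwarding microblogs both build one representation from several weighted parts. The abstract does not say how the weights are computed, so the following is only a minimal sketch under stated assumptions: each weight is a softmax over the dot product between a sentence vector and a query vector, and the names `weighted_document_vector` and `query` are hypothetical, not taken from the dissertation.

```python
import math


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def weighted_document_vector(sentence_vectors, query):
    """Pool sentence vectors into one document vector.

    Assumption (not from the dissertation): the weight of a sentence is
    the softmax of its dot product with a query vector, which a real
    model would learn during training.
    """
    scores = [dot(vec, query) for vec in sentence_vectors]
    weights = softmax(scores)
    dim = len(sentence_vectors[0])
    document = [
        sum(w * vec[i] for w, vec in zip(weights, sentence_vectors))
        for i in range(dim)
    ]
    return document, weights


if __name__ == "__main__":
    sentences = [[1.0, 0.0], [0.0, 1.0]]  # toy sentence vectors
    document, weights = weighted_document_vector(sentences, [2.0, 0.0])
    print(weights)   # the first sentence receives the larger weight
    print(document)
```

In a trained model the sentence vectors and the query would come from a neural encoder; here they are fixed toy values so that only the weighting step is visible. The same pooling applies if the inputs are vectors for an original microblog and its forwards.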
Keywords/Search Tags: fake information detection, contradiction detection, representation learning, task-specific word embedding, reliability analysis of information source