Research On Social Network Construction Based On Self-supervision

Posted on: 2016-05-19 | Degree: Master | Type: Thesis
Country: China | Candidate: S Y Zhu
GTID: 2308330464453285 | Subject: Software engineering
Abstract/Summary:
Social network construction and analysis has become one of the most popular research topics in computer science. Extracting diverse types of relations between people from Web text with high precision is of great significance for social network construction. In this thesis, we investigate methods for constructing personal social networks based on self-supervised learning. The main contributions are as follows:

First, we build a corpus of person entities from Chinese Wikipedia. We select the person entity pages on Wikipedia according to their category information, then pre-process the text and Infobox data on those pages to construct the corpus. This work lays the foundation for our subsequent research on personal relation extraction.

Second, we investigate personal relation extraction based on self-supervised learning over the constructed corpus. We map relation triples onto the free text of Chinese Wikipedia to automatically generate annotated training data. Relations between person entities are then extracted from the text using a feature-based method, which lays the foundation for further research on social network construction.

Finally, we propose a method based on dependency rules to reduce noise in the training data. With the noise reduced, the performance of relation extraction improves. We also carry out a real social network mining experiment to evaluate mining performance.

The experimental results on the Chinese Wikipedia corpus indicate that personal relation extraction based on self-supervised learning can induce a rich set of relation types with promising precision, suggesting that self-supervised learning can effectively construct social networks. The dependency rule-based noise reduction method further improves extraction performance by raising the quality of the automatically generated training data.
Keywords/Search Tags: Self-Supervised Learning, Relation Extraction, Social Network, Dependency Structure
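
The abstract describes mapping relation triples onto free text to generate annotated training data automatically. A minimal sketch of that idea follows. It is not the thesis's implementation: the names `Triple`, `LabeledSentence`, and `label_sentences`, the plain substring matching, and the toy English data are all assumptions made for illustration.

```python
from typing import Iterable, List, NamedTuple


class Triple(NamedTuple):
    """A known relation between two person entities (hypothetical schema)."""
    head: str
    relation: str
    tail: str


class LabeledSentence(NamedTuple):
    """A sentence automatically tagged with a relation label."""
    sentence: str
    head: str
    tail: str
    relation: str


def label_sentences(sentences: Iterable[str],
                    triples: Iterable[Triple]) -> List[LabeledSentence]:
    """Label each sentence that mentions both entities of a known triple.

    Assumption: entity mentions are found by exact substring match. A real
    system would need entity linking and, for Chinese, word segmentation.
    """
    triples = list(triples)
    labeled = []
    for sentence in sentences:
        for t in triples:
            if t.head in sentence and t.tail in sentence:
                labeled.append(
                    LabeledSentence(sentence, t.head, t.tail, t.relation))
    return labeled


# Hypothetical toy data, not taken from the thesis corpus.
TRIPLES = [Triple("Alice Zhang", "spouse", "Bob Li")]
SENTENCES = [
    "Alice Zhang married Bob Li in 1990.",
    "Alice Zhang and Bob Li attended the same conference.",
    "Carl Zhang was born in Suzhou.",
]

TRAINING_DATA = label_sentences(SENTENCES, TRIPLES)
for example in TRAINING_DATA:
    print(example.relation, "|", example.sentence)
```

In this toy run the second sentence is also labeled `spouse` even though it does not express that relation. This kind of false positive is the noise in automatically generated training data that the dependency rule-based filtering in the abstract is meant to reduce.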