Research On Auto-Construction Technology For University Teacher Social Network

Posted on:2012-04-02

Degree:Master

Type:Thesis

Country:China

Candidate:C W Wang

GTID:2218330362450424

Subject:Computer Science and Technology

Abstract/Summary:

With the rapid development of the Internet, the number of web pages has grown explosively, which makes a vast amount of information available on the web. However, acquiring useful information quickly and effectively from this sea of information has become an urgent problem. At the same time, the rise of social networking has promoted communication among people and, to some extent, changed the way people access information. This thesis aims to use machine learning, data mining, and natural language processing technologies to automatically build a social network of university teachers. The goal is not only to provide Internet users with teachers' personal and research information through a direct, highly integrated, multi-angle information platform, but also to create an academic exchange platform for researchers. This thesis focuses on the following issues.

First, this thesis implements a block segmentation model for teacher information extraction. A teacher's personal information includes fields such as name, university, and professional title, which are the basic components of a teacher's profile. We first preprocess teacher introduction web pages and then divide them into discrete information blocks. A conditional random field model is employed to label the information fields in each block. For basic information and contact information, word-level features achieve good results. Expanding the features from word level to block level effectively solves the long-distance dependency problem for education-related information fields.

Second, because published papers best reflect a teacher's research, we design a framework to obtain the set of papers for a teacher. The paper set contains two kinds of errors: inexact name matches and name ambiguity. The first kind can easily be removed with rules, so this thesis focuses on the author name disambiguation problem, using a method based on hierarchical clustering.
Only basic paper information is used as features. The method uses two cluster termination conditions: one based on prior knowledge and one based on a similarity threshold.

Finally, based on teachers' personal and research information, we study the construction of the teacher social network and community detection. Teachers can be related in multiple ways; here we build the teacher network according to teachers' research areas. Two methods are employed. In the first method, a topic model is used to find the topic distribution of each teacher's paper set. We calculate the distance between two teachers from these distributions, and then apply the Markov clustering model to find communities. The second method uses the keyword collections of papers to establish links among teachers, and two complex network clustering algorithms are employed to detect the communities in the network. We then analyze the two methods in terms of community quality and time complexity.
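
To make the author name disambiguation step more concrete, the following is a minimal sketch of agglomerative (hierarchical) clustering over papers that share one author name, with the two kinds of termination condition mentioned in the abstract. It is not the thesis's implementation. The feature set (co-authors, venue, title words), the Jaccard similarity, the average-link merge rule, the default threshold, and all function and field names are assumptions made for illustration. The "prior knowledge" condition is interpreted here, also as an assumption, as a known number of distinct authors.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def paper_features(paper):
    """Build a feature set from basic paper information (assumed fields)."""
    feats = {"coauthor:" + c.lower() for c in paper["coauthors"]}
    feats.add("venue:" + paper["venue"].lower())
    # Keep title words longer than three characters as a crude stop-word filter.
    feats |= {"title:" + w.lower() for w in paper["title"].split() if len(w) > 3}
    return feats


def cluster_similarity(c1, c2, feats):
    """Average-link similarity between two clusters of paper indices."""
    pairs = [(i, j) for i in c1 for j in c2]
    return sum(jaccard(feats[i], feats[j]) for i, j in pairs) / len(pairs)


def disambiguate(papers, threshold=0.2, known_authors=None):
    """Group papers that likely belong to the same real author.

    Stops when the number of clusters reaches known_authors (prior
    knowledge), or, if that is None, when the best merge candidate
    falls below the similarity threshold.
    """
    feats = [paper_features(p) for p in papers]
    clusters = [[i] for i in range(len(papers))]
    while len(clusters) > 1:
        # Termination condition 1: prior knowledge of the author count.
        if known_authors is not None and len(clusters) <= known_authors:
            break
        best_sim, best_pair = -1.0, None
        for a, b in combinations(range(len(clusters)), 2):
            sim = cluster_similarity(clusters[a], clusters[b], feats)
            if sim > best_sim:
                best_sim, best_pair = sim, (a, b)
        # Termination condition 2: similarity threshold.
        if known_authors is None and best_sim < threshold:
            break
        a, b = best_pair  # a < b, so deleting b keeps index a valid
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Hypothetical papers that all list the same ambiguous author name.
papers = [
    {"title": "Conditional random fields for web extraction",
     "coauthors": ["Li Ming", "Zhao Lei"], "venue": "ACL"},
    {"title": "Web information extraction with random fields",
     "coauthors": ["Li Ming"], "venue": "ACL"},
    {"title": "Protein folding simulation methods",
     "coauthors": ["Chen Hua"], "venue": "Bioinformatics"},
    {"title": "Simulation of protein structure",
     "coauthors": ["Chen Hua", "Sun Yu"], "venue": "Bioinformatics"},
]

clusters = disambiguate(papers)
print(clusters)  # [[0, 1], [2, 3]]
```

In this toy data the first two papers share a co-author, a venue, and several title words, so they merge first; the last two merge for the same reason; the remaining pair of clusters has zero similarity, so the threshold condition stops the process with two inferred authors. Passing `known_authors=2` reaches the same grouping through the prior-knowledge condition instead.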

Keywords/Search Tags:

information extraction, name disambiguation, social network, community detection

Related items

1	Research Of Disambiguation Of Internet People Information Technology
2	Research On Community Detection And Influence On Information Dissemination Based On Structural Analysis
3	The Research Of Community Detection Based On Incremental Clustering Algorithm On Dynamic Social Network
4	Research On Community Detection And Resource Sharing Mechanism In Social Networks
5	Research On Community Detection Algorithms Based On The Node Following Relationship
6	Research On Community Detection And Influence Analysis In Social Networks
7	Research On Community Detection Algorithm For Social Information Networks
8	Research On Adaptive Community Detection Algorithms In Social Networks
9	Research On Some Key Technologies Of Multi-element Social Network Extraction And Analysis Based On The Web And The Email
10	A Framework Of Community Evolution Analysis In Social Network