Expert Search System Based on Enterprise Corpora

Posted on: 2010-07-07 | Degree: Master | Type: Thesis
Country: China | Candidate: J Yao
GTID: 2208360275991463 | Subject: Computer application technology
Abstract/Summary:
We live in the information age. More and more companies have adopted information management, so large enterprises and organizations now hold substantial amounts of intranet data. Automatic information retrieval systems that aim to extract useful information from large document repositories have attracted considerable interest in recent years, and expert search is an active research topic among them. A multinational enterprise may have thousands of employees working in branches all over the world, so an expert search system that can find people with specific experience and skills is a valuable management tool. The goal of an expert search system is to find people who have particular knowledge and skills in a field, namely domain experts, from a large text corpus. The key problem is how to model the topic-document-expert relationship. Previous studies have applied a range of information retrieval techniques to expert search, including language models, social networks, and text classification, but these techniques do not take full advantage of the information available about an expert as a web object. In this paper, we propose an improved expert search model that improves expert search results in two ways: role determination and resume mining. Role determination improves the topic-document-expert relationship model by introducing the concept of a "role". Because documents in an intranet corpus have more uniform structures and higher relevance, resume mining increases the accuracy of judging candidates' expertise by analyzing resume pages. We also propose a procedure for modeling web objects using traditional object-oriented analysis methods.
We propose a way of mining the attributes of web objects so that complete expert information can be returned to the user. The main contributions of this paper are as follows:
● We give a brief survey of current research on expert search in enterprise corpora and analyze the limitations of existing methods.
● We introduce the concept of a "role" into our model to better describe the relationship among topic, document, and expert. We propose resume page mining to improve search results by mining specific types of web pages.
● Traditional object-oriented analysis methods are used in the procedure for modeling web objects, which extracts web objects from general web pages. We also propose a web object attribute mining model to mine the attributes of experts.
● An expert search system for enterprise corpora is designed and implemented. Experiments are carried out on the TREC 2007 and 2008 Enterprise Track data sets. The results show that the improved expert search model performs much better overall than general approaches. The experiments on the attribute mining model also show that the model is effective.
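
The topic-document-expert relationship with roles can be illustrated with a minimal sketch. It is not the thesis's actual model, which the abstract does not specify. It assumes a generic document-centric scoring scheme in which each candidate accumulates the topic relevance of the documents they are associated with, weighted by their role in each document. The role names, the weights, and the function `rank_experts` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical role weights: how strongly each role is assumed to tie a
# person to a document. These values are illustrative, not from the thesis.
ROLE_WEIGHTS = {"author": 1.0, "recipient": 0.5, "mentioned": 0.2}


def rank_experts(doc_scores, associations, role_weights=ROLE_WEIGHTS):
    """Rank candidates for one topic.

    doc_scores:   {doc_id: relevance of the document to the topic}
    associations: iterable of (doc_id, person, role) triples
    Returns a list of (person, score) sorted by descending score.
    """
    scores = defaultdict(float)
    for doc_id, person, role in associations:
        relevance = doc_scores.get(doc_id, 0.0)
        # Unknown roles contribute nothing in this sketch.
        scores[person] += relevance * role_weights.get(role, 0.0)
    # Break ties by name so the ordering is deterministic.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy data: two documents retrieved for a topic, with made-up relevance scores.
doc_scores = {"d1": 0.8, "d2": 0.5}
associations = [
    ("d1", "alice", "author"),
    ("d1", "bob", "mentioned"),
    ("d2", "bob", "author"),
    ("d2", "alice", "recipient"),
]
ranked = rank_experts(doc_scores, associations)
```

In this toy data, alice authored the more relevant document, so she ranks above bob even though both appear in both documents. A resume-mining signal could be added as a further per-candidate term, but its form is not described in the abstract.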
Keywords/Search Tags:Expert Search, Enterprise Corpora, Role Determination, Resume Mining, Attribute mining