Research And Implementation Of The Knowledge Search System Based On Wikipedia

Posted on:2013-11-25

Degree:Master

Type:Thesis

Country:China

Candidate:C Z Wu

Full Text:PDF

GTID:2248330374475325

Subject:Computer software and theory

Abstract/Summary:

In recent years, with the rapid development of Internet technology, Wikipedia has become one of the largest open-content knowledge platforms. The knowledge it contains is updated and expanded almost continuously, so Wikipedia can be applied in more and more fields. Natural language processing research that uses Wikipedia as a large-scale natural corpus has achieved a great deal.

As the number of contributors keeps increasing, the scale of Wikipedia and the number of its articles grow steadily, and more and more people use Wikipedia to find what they want. However, Wikipedia's built-in search engine still relies on traditional full-text matching. Although every document contains many internal links to other documents, most of those links carry no semantic relationship to the current document. This paper argues that search should be based on semantics, so how to add semantic functionality to the Wikipedia search process is a priority.

An ordinary way to add semantic functionality to the search procedure is to search the other documents while handling the query and to compute the relatedness between the current document and each of them. However, because of Wikipedia's huge amount of data and the time complexity of the semantic relatedness algorithm, the whole process takes a long time, which hurts retrieval efficiency and user experience. To solve this problem, this paper proposes a method that uses Wikipedia corpus resources to build a semantic knowledge base in order to improve query efficiency.

First, we studied the characteristics of Wikipedia in detail, including its classification structure, page structure, page link structure, and its various data storage formats.
Then we developed a set of processes that effectively extract structured information from Wikipedia's backup data, yielding the basic corpus resources that underpin this research and the semantic relatedness algorithm proposed in this paper. Next, we studied in depth the semantic features of traditional semantic knowledge bases and the ways semantic knowledge is represented, and then built a knowledge base. Finally, on the basis of that knowledge base, we implemented a simple knowledge search system that lets users conveniently find a piece of knowledge together with the knowledge semantically related to it.
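
The idea of moving the relatedness computation out of query time and into a prebuilt knowledge base can be illustrated with a minimal Python sketch. The abstract does not describe the thesis's actual relatedness algorithm, so this sketch substitutes a simple stand-in: the Jaccard overlap of two articles' outgoing link sets. The names `jaccard`, `build_knowledge_base`, `search`, the `threshold` parameter, and the toy link data are all hypothetical and serve only as illustration.

```python
from __future__ import annotations

from itertools import combinations


def jaccard(a: set[str], b: set[str]) -> float:
    """Share of links two articles have in common, in [0, 1]."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_knowledge_base(
    links: dict[str, set[str]], threshold: float = 0.2
) -> dict[str, list[tuple[str, float]]]:
    """Offline step: score every article pair once and keep related pairs.

    ASSUMPTION: Jaccard overlap of outgoing links stands in for the
    thesis's semantic relatedness measure, which the abstract omits.
    """
    kb: dict[str, list[tuple[str, float]]] = {title: [] for title in links}
    for a, b in combinations(links, 2):
        score = jaccard(links[a], links[b])
        if score >= threshold:
            kb[a].append((b, score))
            kb[b].append((a, score))
    # Most related first; ties broken alphabetically for stable output.
    for related in kb.values():
        related.sort(key=lambda item: (-item[1], item[0]))
    return kb


def search(kb: dict[str, list[tuple[str, float]]], title: str) -> list[tuple[str, float]]:
    """Query-time step: a dictionary lookup, with no relatedness computation."""
    return kb.get(title, [])


# Toy outgoing-link sets (made up for illustration, not extracted from a dump).
TOY_LINKS = {
    "Python (programming language)": {"Programming language", "Guido van Rossum", "Interpreter"},
    "Ruby (programming language)": {"Programming language", "Yukihiro Matsumoto", "Interpreter"},
    "Banana": {"Fruit", "Musa"},
}

kb = build_knowledge_base(TOY_LINKS)
print(search(kb, "Python (programming language)"))
```

The pairwise loop is quadratic in the number of articles, which is exactly the cost that is too high to pay per query; paying it once offline leaves each search as a single lookup. A real Wikipedia-scale build would need a cheaper candidate-selection step than comparing all pairs.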

Keywords/Search Tags:

Wikipedia, semantic relatedness, knowledge base, knowledge search

Related items

1	Automatic Construction Method For Domain Concepts Based On Wikipedia Semantic Knowledge Base
2	A Collaborative Method On Association Semantic Knowledge Base Construction
3	Research On Building Wikipedia Semantic Knowledge Base And Its Application In Text Classification
4	Research Of Semantic Relatedness Measure Based On Wikipedia Structure
5	Mining Semantic Knowledge From Chinese Wikipedia
6	Wikipedia Based Conceptual Graph Model And Its Application
7	Enabling Entity Retrieval by Exploiting Wikipedia as a Semantic Knowledge Source
8	Designing And Building A Chinese Knowledge Search System Based On Encyclopedic Knowledge
9	Research On The Methods Of Domain Semantic Knowledge Base Construction And Knowledge Service
10	Study On Constructing The Model Of Knowledge Organization In The Knowledge Base System