Research Of Information Retrieval Technology Based On Semantic Analysis

Posted on:2013-04-10

Degree:Master

Type:Thesis

Country:China

Candidate:F Y Zhu

Full Text:PDF

GTID:2248330377458957

Subject:Computer application technology

Abstract/Summary:

Current information retrieval technologies are mainly based on keyword matching, and most optimization research emphasizes algorithms rather than semantics. As a result, many problems cannot be fundamentally resolved: words with multiple meanings, the diversity of retrieval expressions, the omission of related websites, the removal of irrelevant web pages, unreasonable ordering of web pages, and so on. In view of these problems, a model of information retrieval based on semantic analysis is proposed. The model has four key components: a method for eliminating ambiguity, a method for semantic expansion, a method for keyword matching, and a page-ranking algorithm. With the model, problems such as multiple meanings and relevant pages not being retrieved can be effectively resolved. In addition, more pages that match the retrieval purpose but do not contain the keywords can be retrieved, and the page ranking is improved so that relevant pages appear in top positions.

To eliminate ambiguity, a method based on semantic analysis removes the irrelevant senses of the keywords in the keyword array. Using ontology theory, the method computes the similarity between the concepts of a polysemous word and the concepts of the other keywords in the array, and rules out the irrelevant senses of the polysemous word based on that concept similarity. For semantic expansion, a method based on the ontology tree is used. Many new keywords can be added without changing the retrieval purpose, which addresses the problem of relevant pages not being retrieved and provides a basis for page ranking. A keyword matching method based on the expanded keyword array is then proposed. It distinguishes the original keywords from the expanded keywords and ensures that both contribute effectively to page retrieval and page ranking. Finally, the ranking algorithm based on word frequency and word location is improved using semantic analysis. The algorithm initializes the weights of the keywords so that the final weight of a page reflects the user's retrieval purpose more objectively.

Experiments were carried out with development tools including Protégé 3.4.7 and Nutch 1.2. The evaluation uses a relative accuracy ratio, built on the traditional recall ratio and accuracy ratio, that takes the practical conditions of the development and test environment into account. By analyzing the differences between its retrieval results and those of other models, the model is shown to be effective in reducing the number of relevant pages that are not retrieved and in ordering pages by their importance. The experiments thus demonstrate the feasibility of the approach taken in this dissertation.
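The four components described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the toy ontology (PARENT), the sense lexicon (SENSES), the path-based similarity formula, the 0.5 weight for expanded keywords, and the title boost of 2.0 are all assumptions chosen for illustration, since the abstract does not give the actual formulas or parameters.

```python
from collections import Counter

# Hypothetical toy ontology: child concept -> parent concept (a tree).
PARENT = {
    "fruit": "food",
    "apple_fruit": "fruit",
    "banana": "fruit",
    "company": "organization",
    "apple_company": "company",
}

# Hypothetical lexicon: surface word -> candidate concepts (senses).
SENSES = {"apple": ["apple_fruit", "apple_company"], "banana": ["banana"]}

# Hypothetical mapping from a concept back to the word used for matching.
LABEL = {"apple_fruit": "apple", "apple_company": "apple"}


def ancestors(concept):
    """Return the path from a concept up to its root, concept included."""
    path = [concept]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def similarity(a, b):
    """Assumed measure: 1 / (1 + edges on the tree path), 0 if unconnected."""
    path_a, path_b = ancestors(a), ancestors(b)
    for depth_a, node in enumerate(path_a):
        if node in path_b:
            return 1.0 / (1 + depth_a + path_b.index(node))
    return 0.0


def disambiguate(word, context_words):
    """Keep the sense of `word` most similar to any sense of the other keywords."""
    context = [c for w in context_words for c in SENSES.get(w, [])]
    return max(
        SENSES[word],
        key=lambda s: max((similarity(s, c) for c in context), default=0.0),
    )


def expand(concept, expanded_weight=0.5):
    """Add the parent and siblings of a concept, at a lower weight than the original."""
    weights = {LABEL.get(concept, concept): 1.0}
    parent = PARENT.get(concept)
    if parent is not None:
        weights[LABEL.get(parent, parent)] = expanded_weight
        for child, p in PARENT.items():
            if p == parent and child != concept:
                weights[LABEL.get(child, child)] = expanded_weight
    return weights


def score(page, weights, title_boost=2.0):
    """Sum weight * frequency per keyword; title hits count more than body hits."""
    title, body = Counter(page["title"]), Counter(page["body"])
    return sum(w * (title_boost * title[t] + body[t]) for t, w in weights.items())


# Query "apple banana": the context word "banana" selects the fruit sense.
sense = disambiguate("apple", ["banana"])
weights = expand(sense)
pages = [
    {"title": ["stock", "news"], "body": ["market", "shares"]},
    {"title": ["fruit", "recipes"], "body": ["banana", "fruit", "salad"]},
]
ranked = sorted(pages, key=lambda p: score(p, weights), reverse=True)
```

In this sketch the second page contains none of the original keyword "apple" but is still ranked first through the expanded keywords, which mirrors the abstract's claim that pages matching the retrieval purpose can be retrieved without containing the keywords. Keeping separate weights for original and expanded keywords is one simple way to distinguish the two groups during matching and ranking.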

Keywords/Search Tags:

Semantic analysis, Ontology, Information retrieval, Semantic similarity

Related items

1	Research On Semantic Retrieval & Its Semantic Similarity Based On Ontology Technology
2	Research On The Search Technology Of Geographical Information Based On Semantic Similarity
3	Research On Domain Semantic Retrieval Model Based On Ontology
4	Research Of Information Retrieval Technology Based On Semantic Analysis
5	Research On Information Retrieval Technology Based On Semantic Web
6	Ontology-Based Domain Resource Semantic Retrieval
7	Research On Ontology-Based Semantic Information Retrieval
8	Study And Implementation Of Tourism Information Retrieval System Based On Domain Ontology
9	The Application Of Ontology-based Retrieval In Teaching
10	Research And Implementation Of Method For Component Testing Information Semantic Retrieval System Based On Ontology