Research On Full-featured Text Search In Natural Language Understanding

Posted on:2014-04-29

Degree:Master

Type:Thesis

Country:China

Candidate:C P Huang

GTID:2268330425961359

Subject:Computer application technology

Abstract/Summary:

As network technology develops, the amount of information available on the network keeps growing, and more and more people are concerned with how to retrieve the information they need from this vast sea of data efficiently, quickly, and accurately. Traditional search engines rely only on keyword matching, and there is a growing trend toward combining natural language processing with search engine technology to build what is called the intelligent search engine. In this paper, we introduce and analyze Full-text Retrieval, a technology widely used in search engines. For unstructured text, Full-text Retrieval takes the entire content of a document into account: text processing extracts the indexable plain text, Chinese word segmentation is applied to it, and an index is built over the resulting words, producing the index database together with the stored text information. When a user searches, the search engine processes the keywords typed into the search box, matches the processed words against the index database, and returns the information that meets the user's requirements. Building on Full-text Retrieval, we study the addition of a natural language understanding layer, namely Chinese word segmentation.
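The pipeline described above (segment the text, index the segmented words, then segment the query and match it against the index) can be sketched in a few lines. This is a minimal illustration and not the thesis's implementation: the toy dictionary, the function names, and the AND-style matching are all assumptions, and the segmenter is only a loose reading of omni-segmentation that emits every dictionary word occurring in the text, overlapping ones included.

```python
from collections import defaultdict

# Hypothetical toy dictionary; a real system would load a full Chinese lexicon.
DICTIONARY = {"全文", "检索", "全文检索", "技术", "中文", "分词", "中文分词", "索引"}
MAX_WORD_LEN = max(len(word) for word in DICTIONARY)


def omni_segment(text):
    """Return (word, start) for every dictionary word found in text.

    Overlapping words are all kept, e.g. both "全文" and "全文检索".
    """
    hits = []
    for start in range(len(text)):
        longest_end = min(len(text), start + MAX_WORD_LEN)
        for end in range(start + 1, longest_end + 1):
            word = text[start:end]
            if word in DICTIONARY:
                hits.append((word, start))
    return hits


def build_index(docs):
    """Build an inverted index: word -> set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word, _start in omni_segment(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Segment the query the same way and return documents matching all its words."""
    words = {word for word, _start in omni_segment(query)}
    if not words:
        return set()
    postings = [index.get(word, set()) for word in words]
    return set.intersection(*postings)


# Hypothetical sample documents standing in for extracted plain text.
DOCS = {
    1: "全文检索技术",
    2: "中文分词技术",
    3: "中文全文检索索引",
}
INDEX = build_index(DOCS)
```

With this sample data, `search(INDEX, "全文检索")` looks up the words 全文, 检索, and 全文检索 and returns the documents that contain all of them. Requiring every query word to match is one design choice among several; a ranked or OR-style match would be equally plausible for the system described here.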
The specific research content and results are as follows. First, we study the key principles of Full-text Retrieval and natural language understanding at the level of basic theory and, using the SS (Struts+Spring) framework, build a Full-text Retrieval prototype system based on natural language understanding, specifically Chinese Omni-segmentation. The prototype system works on the entire content of typical unstructured-format documents and performs text preprocessing, Chinese word segmentation, index database construction, and information retrieval against the index database. Second, the prototype system runs with relatively high efficiency when the document database holds only a small amount of document information. It can be expected, however, that when the document database contains a very large amount of information, text preprocessing, Chinese word segmentation, and index construction will incur considerable time and space costs. To address this defect, we propose building the index database for only part of each document's content. Using the prototype system, we further study and compare the two document processing mechanisms, and conclude from testing that indexing only part of a document's content is a direction worth researching in the field of information retrieval.
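The trade-off between indexing a whole document and indexing only part of it can be illustrated with a small comparison. The abstract does not say which part of a document is indexed, so this sketch assumes, purely for illustration, that the local index covers the first `prefix_chars` characters. Character bigrams stand in for real Chinese word segmentation, and all names and sample documents are hypothetical.

```python
from collections import defaultdict


def bigrams(text):
    """Split text into overlapping two-character terms (a stand-in for segmentation)."""
    return [text[i:i + 2] for i in range(len(text) - 1)]


def build_index(docs, prefix_chars=None):
    """Build an inverted index over each document.

    If prefix_chars is given, only the first prefix_chars characters of each
    document are indexed (the assumed "local index"); otherwise the full text is.
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        indexed_part = text if prefix_chars is None else text[:prefix_chars]
        for term in bigrams(indexed_part):
            index[term].add(doc_id)
    return index


def posting_count(index):
    """Total number of (term, document) postings, a rough proxy for index size."""
    return sum(len(doc_ids) for doc_ids in index.values())


# Hypothetical sample documents.
DOCS = {
    1: "全文检索技术研究与实现",
    2: "中文分词与倒排索引的构建",
}
full_index = build_index(DOCS)
local_index = build_index(DOCS, prefix_chars=6)
```

Comparing `posting_count(full_index)` with `posting_count(local_index)` shows the space saving, while a lookup such as `"实现" in local_index` shows the cost: a term that appears only in the unindexed remainder of a document can no longer be found.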

Keywords/Search Tags:

Natural language understanding, Inverted Index, Full-text Retrieval, Chinese word segmentation, Local Index

Related items

1	A Research Of Full-Text Retrieval Based On Inverted Index
2	Research On Full-Text Retrieval Technology For XML Documents Based On Inverted Index
3	The Research Of Full-Text Retrieval And Its Relative Security Technology For Chinese
4	Understanding Of Web-based Document Inverted Row Of Full-text Index Research And Realization
5	Study Of Indexing Techniques For Encrypted Full-Text Retrieval System
6	Research Of Index In Chinese Full-text Retrieval System
7	The Full-text Indexing Technology Index Merge Algorithm Research And Analysis
8	Research And Implementation Of An Open High-Performance Platform Of Full-Text Retrieval
9	Military Retrieval System Design And Implementation
10	Research On Dynamic Indexing Technologies In Full-text Retrieval System