Knowledge Service Platform And Passage Retrieval

Posted on: 2011-01-23
Degree: Master
Type: Thesis
Country: China
Candidate: T Huang
Full Text: PDF
GTID: 2178360332458118
Subject: Computer Science and Technology
Abstract/Summary:
A great deal of information is distributed across the Internet in various forms, and finding comprehensive and accurate information has always been the goal of many network applications. Search engines can satisfy users' information needs to some degree through simple keyword retrieval. However, an information need is often too complicated to be expressed in words, sentences, or even paragraphs. In the real world, means such as classification and comparison help people find what they really want, but these are unavailable in search engines. A traditional search engine takes keywords as the content of the query and so cannot meet users' large-grained requirements. How to meet user requirements as fully as possible has therefore become a hot topic.
The aim of this paper is to build a knowledge and information service platform based on network information integration, and then to discuss passage retrieval, which is one part of the platform's automatic machine service. The work of this paper includes:
1. We propose the concept of a knowledge and information service. Such a service includes not only information service and information retrieval but also increasing the value of knowledge, providing a place for communication, and knowledge innovation. On this basis we build a platform for converting between network information and knowledge, to make information retrieval, learning, and the expression of personal views more convenient.
2. We expound the methods for processing network information. We use natural language processing technology to collect network information, then classify and arrange it. As a result of this processing, the network information becomes orderly and useful, so that it can better serve different user requirements.
3. We expound the algorithms and technology of passage retrieval. We improve on the conventional keyword search engine to meet users' large-grained requirements.
We use fixed-window partitioning to divide each document in the library into passages, use TF-IDF to select several keywords that represent the query, sort them by weight, and then use permutations of them to retrieve from the document library. The user receives result documents that contain similar passages. We tried two passage partitioning methods, fixed-window partitioning without overlap and fixed-window partitioning with overlap, and compared them experimentally. In the experiment, partitioning without overlap takes less retrieval time, though its paragraph accuracy is slightly lower than that of partitioning with overlap. For permuting the keywords we also tried two methods, 5-keyword permutation and 10-keyword permutation by priority, and compared them experimentally. The 5-keyword permutation takes less retrieval time, while the 10-keyword permutation by priority returns more result documents.
4. We give evaluation indicators for passage retrieval. We propose the concepts of paragraph accuracy and paragraph recall. These two indexes evaluate the ability to retrieve similar passages and to detect paragraphs suspected of plagiarism. Because the queries are large-grained, timeliness is also an important evaluation indicator.
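The partitioning and keyword-selection steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the function names (`split_passages`, `top_keywords`, `rank_passages`), the smoothed IDF formula, and the token-list input format are all hypothetical. The abstract does not say how the keyword permutations are turned into queries, so the sketch swaps in a simpler rule and ranks passages by the summed weight of the keywords they contain.

```python
import math
from collections import Counter


def split_passages(tokens, window, overlap=0):
    """Split a token list into fixed-size windows; overlap=0 gives disjoint windows."""
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("need window > 0 and 0 <= overlap < window")
    step = window - overlap
    passages = []
    for start in range(0, len(tokens), step):
        passages.append(tokens[start:start + window])
        if start + window >= len(tokens):
            break  # this window already reaches the end of the document
    return passages


def top_keywords(query_tokens, documents, k):
    """Return the k query terms with the highest TF-IDF weight, heaviest first."""
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))  # count each term once per document
    tf = Counter(query_tokens)
    # Assumed weighting: relative term frequency times a smoothed IDF.
    weights = {
        term: (count / len(query_tokens))
        * math.log((1 + n_docs) / (1 + doc_freq[term]))
        for term, count in tf.items()
    }
    ranked = sorted(weights.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


def rank_passages(keywords, documents, window, overlap=0):
    """Score each passage by the summed weight of the keywords it contains.

    This scoring rule is a stand-in for the permutation-based querying
    mentioned in the abstract, whose details are not given there.
    """
    hits = []
    for doc_id, doc in enumerate(documents):
        for idx, passage in enumerate(split_passages(doc, window, overlap)):
            present = set(passage)
            score = sum(w for term, w in keywords if term in present)
            if score > 0:
                hits.append((score, doc_id, idx))
    return sorted(hits, key=lambda h: (-h[0], h[1], h[2]))
```

Calling `split_passages(tokens, window, overlap=window // 2)` produces the overlapping variant. It yields more passages per document, which is consistent with the longer retrieval time the abstract reports for partitioning with overlap.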
Keywords/Search Tags: knowledge service, knowledge service platform, passage retrieval, passage retrieval evaluation indicators