Study Of An Information Retrieval Technology Based On Improved Vector Space Model

Posted on: 2006-04-18 | Degree: Master | Type: Thesis
Country: China | Candidate: D X Lin | Full Text: PDF
GTID: 2178360182477389 | Subject: Computer technology
Abstract/Summary:
With the rapid development of Internet technology, the Web has become a globally shared information resource and one of the main channels of information dissemination, and the volume of text on the Web is growing exponentially. How to find the information a user needs quickly and conveniently in this vast ocean of information has become an important research topic.
Text information retrieval is the task of finding, in a collection of documents, those most relevant to a user query. In practice, the information that matches the user's need is often missing from the search results, while a large amount of information the user did not ask for takes up a large part of them. Improving the retrieval performance and the retrieval quality of text information retrieval is therefore an urgent problem.
The main research object of this thesis is a factor that is easily neglected but may influence retrieval effectiveness: the position of a feature term, such as its appearance in the title. To this end, the thesis proposes an improved vector space model built on the traditional vector space model. It studies the term-weighting problem in the improved model and proposes a term-weight calculation method that takes into account the position where a term appears. The method can raise the degree of match between the query and the text documents, and thereby raise precision. The thesis also proposes the concept of a multi-layer vector space model. The new model is intended to better address two problems of the traditional vector space model: that it has too many dimensions, and that it cannot distinguish and analyze the semantics carried by a keyword's position. It should help research aimed at improving the retrieval speed and accuracy of retrieval systems.
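The abstract does not give the weighting formula itself, so the following is only a minimal sketch of what a position-aware term weight could look like. It assumes a TF-IDF style weight in which each occurrence of a term is scaled by a factor for the position it appears in. The names `POSITION_FACTOR`, `position_weighted_tf`, and `term_weights`, the factor values, and the smoothed IDF are all assumptions for illustration, not the thesis's method.

```python
import math
from collections import Counter

# Hypothetical position factors (assumed values): a title occurrence
# counts more than a body occurrence.
POSITION_FACTOR = {"title": 3.0, "body": 1.0}

def position_weighted_tf(doc):
    """Sum term occurrences, scaling each one by its position factor.

    `doc` maps a position name ("title", "body") to a list of tokens.
    """
    tf = Counter()
    for position, tokens in doc.items():
        for tok in tokens:
            tf[tok] += POSITION_FACTOR[position]
    return tf

def term_weights(docs):
    """Return one {term: weight} dict per document.

    weight = position-weighted tf * (log(N / df) + 1); the "+ 1" is an
    assumed smoothing so terms present in every document keep some weight.
    """
    n = len(docs)
    tfs = [position_weighted_tf(d) for d in docs]
    df = Counter()
    for tf in tfs:
        df.update(tf.keys())  # each document counts once per term
    return [
        {t: f * (math.log(n / df[t]) + 1.0) for t, f in tf.items()}
        for tf in tfs
    ]

docs = [
    {"title": ["vector", "space"], "body": ["vector", "model", "retrieval"]},
    {"title": ["web"], "body": ["retrieval", "web", "search"]},
]
weights = term_weights(docs)
# "vector" appears in the title and the body of the first document,
# so it outweighs "model", which appears once in the body only.
print(weights[0]["vector"] > weights[0]["model"])
```

With a multiplicative factor like this, a term that appears in the title ends up with a larger vector component than an equally rare term that appears only in the body, which is the effect the abstract attributes to position-aware weighting.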
The thesis also proposes improved query conditions and a method for calculating document similarity: an adjustable parameter η is added to the similarity formula, and η is set to different values according to how well each term position expresses the topic of the document. The thesis further proposes a filtering method with an adjustable threshold (relevance threshold), so that users can choose the filtering precision level as needed to control the number of pages of interest that are returned. This thesis put forward the improved vector space model algorithm on the...
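The similarity formula with the parameter η is not reproduced in the abstract. As a rough sketch under stated assumptions, one way to combine a per-position η with an adjustable relevance threshold is to compute a cosine similarity per position, weight each by its η, and keep only documents whose combined score reaches the threshold. The names `ETA`, `similarity`, and `filter_results`, and the η values, are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# Hypothetical eta values (assumed): the title is taken to express the
# document topic better than the body, so it gets the larger share.
ETA = {"title": 0.7, "body": 0.3}

def similarity(query, doc, eta=ETA):
    """Eta-weighted sum of per-position cosine similarities."""
    return sum(eta[p] * cosine(query, doc.get(p, {})) for p in eta)

def filter_results(query, docs, threshold):
    """Return (name, score) pairs scoring at least `threshold`, best first."""
    scored = [(name, similarity(query, d)) for name, d in docs.items()]
    kept = [(name, s) for name, s in scored if s >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

query = {"vector": 1.0}
docs = {
    "a": {"title": {"vector": 1.0}, "body": {"model": 1.0}},
    "b": {"title": {"web": 1.0}, "body": {"vector": 1.0}},
}
# Document "a" matches in the title, document "b" only in the body.
print(filter_results(query, docs, threshold=0.5))
print(filter_results(query, docs, threshold=0.0))
```

Raising the threshold shortens the result list and lowering it lengthens it, which corresponds to the user-adjustable filtering level the abstract describes.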
Keywords/Search Tags: Information retrieval, Vector space model, Term, Recall, Precision