
Research On Deep Web Oriented Object-level Information Retrieval

Posted on: 2009-09-20
Degree: Master
Type: Thesis
Country: China
Candidate: C Lin
Full Text: PDF
GTID: 2178360245463701
Subject: Computer application technology
Abstract/Summary:
With the development of the Internet, the Web has become a vast repository of information. It contains many kinds of objects, a large percentage of which are hidden in online databases; users must fill out forms and submit queries to obtain the data. This hidden portion is called the Deep Web. If this information could be integrated and exposed through object-level information retrieval, users could find what they want more efficiently.

This thesis studies key technologies for Deep Web oriented object-level information retrieval and proposes related algorithms and models. The main work includes:
(1) Using focused crawling to address Deep Web source discovery, and proposing a framework and algorithm for a focused crawler oriented to search interfaces.
(2) Studying methods based on URL schema and on queries to obtain database content, and introducing methods to extract object information using the DOM and regular expressions.
(3) Modeling how Web objects change, and proposing that the synchronization frequency be decided according to the average change frequency of the object.
(4) Proposing a hybrid object matching model. The model accounts for multi-level errors in data extraction and uses the precision of attribute extraction to balance the structured and unstructured features of objects during matching.
(5) Participating in the design and development of a platform for Deep Web oriented object-level information retrieval.

The thesis also reports experiments on the methods above; the experiments show that these methods are effective.
Keywords/Search Tags: Deep Web, focused crawling, Poisson process, object matching, object ranking, data integration
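
Item (3) of the abstract ties synchronization frequency to an object's average change frequency, and the keywords mention a Poisson process. The abstract does not give the thesis's actual estimator or scheduling policy, so the following is only a minimal sketch under stated assumptions: changes to an object arrive as a Poisson process with rate lambda, the crawler visits at a fixed interval and only sees whether the object changed since the last visit, and the next synchronization is scheduled so that the probability of the stored copy being stale stays below a chosen threshold. The function names and the `max_stale_prob` parameter are hypothetical.

```python
import math


def estimate_change_rate(num_visits, num_changed, visit_interval):
    """Estimate a Poisson change rate from periodic visits (illustrative only).

    Under a Poisson model, the chance of seeing at least one change in an
    interval of length I is 1 - exp(-rate * I). Setting that equal to the
    observed fraction num_changed / num_visits and solving gives
    rate = -ln(1 - num_changed / num_visits) / I.
    """
    if num_visits <= 0 or visit_interval <= 0:
        raise ValueError("num_visits and visit_interval must be positive")
    if not 0 <= num_changed <= num_visits:
        raise ValueError("num_changed must be between 0 and num_visits")
    if num_changed == 0:
        return 0.0          # no change ever observed
    if num_changed == num_visits:
        return math.inf     # changed on every visit; rate is not identifiable
    return -math.log(1.0 - num_changed / num_visits) / visit_interval


def sync_interval(rate, max_stale_prob=0.5):
    """Longest wait t such that P(object changed within t) <= max_stale_prob.

    This threshold policy is an assumption for illustration, not the
    policy described in the thesis.
    """
    if not 0.0 < max_stale_prob < 1.0:
        raise ValueError("max_stale_prob must be in (0, 1)")
    if rate == 0:
        return math.inf     # never observed to change: no deadline
    if math.isinf(rate):
        return 0.0          # changes faster than we can measure
    return -math.log(1.0 - max_stale_prob) / rate


if __name__ == "__main__":
    # Hypothetical object: visited daily for 10 days, changed on 5 visits.
    rate = estimate_change_rate(num_visits=10, num_changed=5, visit_interval=1.0)
    print(rate, sync_interval(rate, max_stale_prob=0.5))
```

In this sketch, objects that change more often get a shorter interval and rarely changing objects a longer one, which matches the direction of the proposal in item (3); the specific formula is one plausible choice among several.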