Research On Deep Web Oriented Object-level Information Retrieval | Posted on: 2009-09-20 | Degree: Master | Type: Thesis | Country: China | Candidate: C Lin | Full Text: PDF | GTID: 2178360245463701 | Subject: Computer application technology

Abstract/Summary:

With the development of the Internet, the Web has become a large repository of information. It contains many kinds of objects, a large percentage of which are hidden in online databases of various kinds. Users must fill out forms and submit queries to obtain the data. This part of the Web is called the Deep Web. If this information can be integrated and offered through object-level information retrieval, users can find what they want more efficiently.

This thesis studies key technologies for Deep Web oriented object-level information retrieval and proposes related algorithms and models. The main work includes:

(1) Using focused crawling to address the problem of Deep Web source discovery, and proposing a framework and algorithm for a focused crawler oriented toward search interfaces.
(2) Studying methods based on URL schemas and queries for obtaining database content, and introducing methods for extracting object information using the DOM and regular expressions.
(3) Modeling how Web objects change, and proposing that the synchronization frequency be decided according to the average change frequency of the object.
(4) Proposing a hybrid object matching model. The model accounts for multi-level errors in data extraction and uses the precision of attribute extraction to balance the structured and unstructured features of objects during object matching.
(5) Participating in the design and development of a platform for Deep Web oriented object-level information retrieval.

The thesis also reports experiments on the methods above. The experiments show that these methods are effective.
Keywords/Search Tags: Deep Web, focused crawling, Poisson process, object matching, object ranking, data integration
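
Item (3) of the abstract ties the synchronization frequency to an object's average change frequency, and the keywords mention a Poisson process. The sketch below illustrates one way such a rule could look; it is not taken from the thesis. It assumes that changes to an object follow a Poisson process with a constant rate, and that the object should be re-synchronized before the probability of a missed change exceeds a chosen threshold. The function names and the `max_stale_prob` parameter are hypothetical.

```python
import math


def estimate_change_rate(num_changes: int, observed_days: float) -> float:
    """Estimate the average change rate (changes per day) of one object.

    Assumption: changes follow a Poisson process, so the estimate of the
    rate is simply the observed change count divided by the observation time.
    """
    if observed_days <= 0:
        raise ValueError("observed_days must be positive")
    return num_changes / observed_days


def sync_interval(rate: float, max_stale_prob: float = 0.5) -> float:
    """Return the longest re-visit interval (in days) that keeps the
    probability of at least one unseen change at or below max_stale_prob.

    Under a Poisson model, P(at least one change within t) = 1 - exp(-rate * t).
    Solving for t gives t = -ln(1 - max_stale_prob) / rate.
    """
    if not 0.0 < max_stale_prob < 1.0:
        raise ValueError("max_stale_prob must be strictly between 0 and 1")
    if rate <= 0:
        # An object never observed to change needs no scheduled re-visit
        # under this model.
        return math.inf
    return -math.log(1.0 - max_stale_prob) / rate


if __name__ == "__main__":
    # Hypothetical object: 10 changes seen over 20 days of monitoring.
    rate = estimate_change_rate(10, 20.0)
    print(f"rate = {rate:.3f} changes/day")
    print(f"re-sync every {sync_interval(rate):.3f} days")
```

With this rule, an object that changes more often gets a shorter interval, which matches the abstract's statement that the synchronization frequency should follow the average change frequency. The threshold is a tuning choice, not a value reported in the source.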