Research On The Evaluation Model Of Spider Detection Based On The Trap

Posted on: 2012-11-25
Degree: Master
Type: Thesis
Country: China
Candidate: Z H Yu
Full Text: PDF
GTID: 2248330371458238
Subject: Computer application technology
Abstract/Summary:
A spider (Web robot) is a program for harvesting internet resources. It speeds up the flow of information but also increases the load on the network, so it is necessary to regulate and monitor the behavior of spiders visiting a website. Currently, the evaluation of spider detection techniques relies mainly on manual analysis of web log data to calculate recall and precision. Finding an effective new method that avoids the subjectivity of manual analysis is therefore of great significance.

This thesis describes the features of spiders and the common spider detection techniques, and analyzes in detail the advantages and disadvantages of the existing methods for evaluating those techniques. To address the shortcomings of traditional evaluation methods, a novel evaluation method for spider detection techniques is proposed, based on traps and combined with binomial probability theory. It gives methods for calculating the relevant parameters and indicators from the layout of trap links and the records of users’ visits to the website.

The trap-based evaluation model is highly accurate and does not rely on manual analysis. It makes full use of the properties of traps, combines user access information with binomial theory, and can evaluate existing spider detection techniques from several angles. The model can also analyze how different time thresholds and trap layout ratios affect the evaluation results.

Experimental results show that this method is consistent with manual evaluation and that trap-based evaluation has clear advantages over manual methods: it is a simple, automatic evaluation method that is accurate and objective. The method is subject to some disturbance, but as long as the variation stays within 10%, the results remain reliable and credible.
Keywords/Search Tags: Trap, Spider detection, Precision, Recall
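
The abstract does not give the model's formulas, so the following is only a minimal sketch of how traps and a binomial assumption could be combined, not the thesis's actual method. It assumes that trap links are invisible to human visitors (so any session that requests one is treated as a confirmed spider), that a spider picks links independently with each link being a trap with probability equal to the trap layout ratio, and that recall can be estimated as the share of trap-confirmed sessions a detector flagged. The function names and the session identifiers are hypothetical.

```python
def prob_hit_trap(trap_ratio: float, links_followed: int) -> float:
    """Probability that a spider requests at least one trap link.

    Assumption: each followed link is a trap independently with
    probability trap_ratio (a binomial model), so the chance of
    zero trap hits in n requests is (1 - trap_ratio) ** n.
    """
    if not 0.0 <= trap_ratio <= 1.0:
        raise ValueError("trap_ratio must be between 0 and 1")
    if links_followed < 0:
        raise ValueError("links_followed must be non-negative")
    return 1.0 - (1.0 - trap_ratio) ** links_followed


def estimate_recall(flagged: set, trapped: set) -> float:
    """Share of trap-confirmed spider sessions that the detector flagged.

    flagged: session ids the detector under evaluation labeled as spiders.
    trapped: session ids that requested at least one trap link.
    Assumption: trapped sessions are a fair sample of all spider sessions.
    """
    if not trapped:
        return 0.0
    return len(flagged & trapped) / len(trapped)


# Hypothetical example: half of the links are traps, spider follows 2 links.
print(prob_hit_trap(0.5, 2))  # 0.75

# Hypothetical sessions: the detector caught 2 of the 4 trapped sessions.
print(estimate_recall({"s1", "s2", "s5"}, {"s1", "s2", "s3", "s4"}))  # 0.5
```

Under these assumptions, raising the trap layout ratio or observing longer sessions increases the chance that a spider is confirmed by a trap, which is one plausible reason the evaluation result would depend on the trap layout ratio.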