
Design and Implementation of Service Crawling and Analyzing Module

Posted on: 2016-09-04
Degree: Master
Type: Thesis
Country: China
Candidate: L Dong
Full Text: PDF
GTID: 2298330467992848
Subject: Computer Science and Technology
Abstract/Summary:
As Web services have matured, RESTful Web services have grown rapidly thanks to advantages such as being lightweight and extensible. However, most RESTful Web service documentation consists of ordinary HTML pages, which makes effectively identifying and crawling RESTful services an important issue in the field of Web service discovery. Meanwhile, the mobile application market is also expanding rapidly, with a large number of applications and wide user coverage. Yet the service information is complex and voluminous, which makes it difficult to extract and analyze application information and user comments.
Against this background, this paper designs and implements a service crawling and analyzing module. It includes a service crawling submodule based on the service crawler engine and a service information analyzing submodule based on web extraction and topic analysis, and it aims to recognize, crawl and analyze RESTful Web services and mobile applications.
This paper first reviews related research on service crawling and analysis at home and abroad, and introduces in detail the technical background on service crawlers, RESTful service recognition, web extraction and user review analysis. It then analyzes the requirements of the service crawling and analyzing system and examines the key issues of RESTful service recognition and review topic analysis. We propose an algorithm for RESTful service identification based on a Naive Bayes classifier and the Vector Space Model, which analyze the page content and page structure respectively and produce a composite recognition result. Experiments show that the method works effectively, with high recall and precision.
For review topic analysis, we adopt a method based on sentiment classification and the LDA topic model to extract topic words for positive reviews and negative reviews separately; experiments show reasonable modeling results. The paper then presents the overall design of the service crawling and analyzing module and describes the functions and processing flows of its submodules in detail. Integration test results show that the module fulfills our requirements. Finally, the paper summarizes the research and looks ahead to future work.
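The composite recognition idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: it assumes that the Naive Bayes classifier scores the visible page text, that the Vector Space Model compares a tag-count vector of the page against known service pages using cosine similarity, and that the two scores are merged by a weighted average. The class name, the `content_weight` and `threshold` parameters, and the sample pages are all hypothetical.

```python
import math
import re
from collections import Counter


def page_text_tokens(html):
    """Lowercased word tokens from the visible text (tags stripped)."""
    text = re.sub(r"<[^>]+>", " ", html)
    return re.findall(r"[a-z]+", text.lower())


def page_tag_vector(html):
    """Structure vector: how often each opening HTML tag occurs."""
    return Counter(re.findall(r"<([a-z][a-z0-9]*)", html.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


class ServicePageRecognizer:
    """Hypothetical combiner: Naive Bayes on content, VSM on structure."""

    def __init__(self, content_weight=0.5, threshold=0.5):
        self.content_weight = content_weight  # assumed weighting scheme
        self.threshold = threshold            # assumed decision cut-off
        self.word_counts = {"service": Counter(), "other": Counter()}
        self.doc_counts = Counter()
        self.vocab = set()
        self.service_structure = Counter()

    def fit(self, service_pages, other_pages):
        """Both lists must be non-empty so that each class has a prior."""
        for label, pages in (("service", service_pages), ("other", other_pages)):
            for html in pages:
                tokens = page_text_tokens(html)
                self.word_counts[label].update(tokens)
                self.doc_counts[label] += 1
                self.vocab.update(tokens)
        for html in service_pages:
            self.service_structure.update(page_tag_vector(html))
        return self

    def content_probability(self, html):
        """P(service | text) from multinomial Naive Bayes, Laplace-smoothed."""
        tokens = [t for t in page_text_tokens(html) if t in self.vocab]
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for label, counts in self.word_counts.items():
            total_words = sum(counts.values())
            score = math.log(self.doc_counts[label] / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / (total_words + len(self.vocab)))
            log_scores[label] = score
        top = max(log_scores.values())
        weights = {label: math.exp(s - top) for label, s in log_scores.items()}
        return weights["service"] / sum(weights.values())

    def structure_similarity(self, html):
        """Cosine similarity to the summed tag vector of known service pages."""
        return cosine(page_tag_vector(html), self.service_structure)

    def score(self, html):
        w = self.content_weight
        return w * self.content_probability(html) + (1 - w) * self.structure_similarity(html)

    def is_service(self, html):
        return self.score(html) >= self.threshold


# Made-up training pages, for illustration only.
service_docs = [
    "<h1>Users API</h1><p>Send a GET request to the endpoint.</p>"
    "<pre><code>GET /v1/users</code></pre>"
    "<table><tr><td>id</td><td>required parameter</td></tr></table>",
    "<h1>Orders API</h1><p>POST a JSON body to the endpoint.</p>"
    "<pre><code>POST /v1/orders</code></pre>"
    "<table><tr><td>token</td><td>required parameter</td></tr></table>",
]
other_docs = [
    "<h1>Our Team</h1><p>We love hiking and coffee.</p><img src='a.png'><img src='b.png'>",
    "<h1>Company News</h1><p>We opened a new office this spring.</p>"
    "<p>Join the party.</p><img src='c.png'>",
]
recognizer = ServicePageRecognizer().fit(service_docs, other_docs)
```

Calling `recognizer.is_service(html)` on a new page returns the combined decision, while `content_probability` and `structure_similarity` expose the two partial scores. A real system would need far more training pages and richer structural features than raw tag counts.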
Keywords/Search Tags: RESTful Web service, Naive Bayes, crawler engine, information extraction, topic analysis