The Research On Key Technology Of Internet Public Service Search Engine

Posted on: 2017-07-01
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Zhang
GTID: 2348330491964082
Subject: Computer application technology
Abstract/Summary:
In the 21st century, service-oriented architecture (SOA) has come to be regarded as an important architectural style and has developed rapidly. At first, SOA was used only as a standard protocol for communication within an enterprise. With the growth of the Internet, many communities and organizations now publish their business functions as Internet public services. Unlike Web Services, these Internet public services do not follow a strict specification; each is described only by an HTML document. This thesis studies how to discover, index, and search such services. The main work and techniques are as follows:

(1) A crawler for discovering Internet public services. The crawler solves the discovery problem by finding services through their documentation. Given the characteristics of Internet public service documents, this thesis proposes a method for filtering the examples in a document, which addresses the inaccuracy of the document content. So that the crawler can recognize Internet public service documents, the thesis investigates a number of classification algorithms and determines the most suitable parameters.

(2) Index technology for Internet public services. This work solves the storage problem and supports the solution of the query problem. A term-document matrix is chosen as the index structure. Because this matrix is sparse in a search engine setting, the thesis designs a compressed structure that saves space while preserving search performance.

(3) Processing and execution of user queries. This work solves the query problem. A user query is processed in three steps: word segmentation, matching, and result merging. In the result-merging step, the thesis proposes a ranking based on the mean and standard deviation of the TF-IDF vector. Compared with a generic search engine, the results are better.

(4) Based on the above research, this thesis designs and implements APISE, a prototype Internet public service search engine. Experiments on data crawled from the Internet show that APISE helps users find services quickly and accurately.
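
Items (2) and (3) can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the abstract gives neither the compressed index layout nor the exact ranking formula. The sketch assumes a dictionary-of-dictionaries that stores only the non-zero cells of the term-document matrix, a smoothed IDF of log(1 + N/df), and a score equal to the mean minus the standard deviation of a document's TF-IDF values over the query terms. All function names and sample documents are hypothetical.

```python
import math
from collections import Counter, defaultdict
from statistics import mean, pstdev


def build_index(docs):
    """Sparse term-document index: term -> {doc_id: term count}.

    Only non-zero cells are stored. This is a stand-in for the
    compressed structure mentioned in the abstract, not that structure.
    """
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(text.lower().split()).items():
            index[term][doc_id] = count
    return index


def tf_idf(index, n_docs, term, doc_id):
    """TF-IDF weight of one term in one document (assumed smoothed IDF)."""
    postings = index.get(term, {})
    tf = postings.get(doc_id, 0)
    if tf == 0:
        return 0.0
    return tf * math.log(1 + n_docs / len(postings))


def search(index, n_docs, query):
    """Split the query, match documents, then merge and rank the results."""
    terms = query.lower().split()            # step 1: word splitting
    candidates = set()
    for term in terms:                       # step 2: matching
        candidates.update(index.get(term, {}))
    scored = []
    for doc_id in candidates:                # step 3: result merging
        vector = [tf_idf(index, n_docs, t, doc_id) for t in terms]
        # Assumed scoring rule: a high mean rewards strong matches, and
        # subtracting the standard deviation penalizes uneven matches.
        scored.append((mean(vector) - pstdev(vector), doc_id))
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


docs = {
    "weather": "weather forecast api returns city weather data",
    "maps": "maps api returns route data",
    "sms": "sms api sends text message",
}
index = build_index(docs)
results = search(index, len(docs), "weather api")
```

Under these assumptions, a document that matches every query term ranks above one that matches only a single term, because the unmatched terms lower the mean and raise the standard deviation.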
Keywords/Search Tags: Internet Public Service, Search Engine, Spider, Text Modeling