
Research And Implementation Of Key Technology Of Data API Retrieval Platform Based On Topic

Posted on: 2018-12-29
Degree: Master
Type: Thesis
Country: China
Candidate: J Feng
Full Text: PDF
GTID: 2348330512483438
Subject: Computer Science and Technology
Abstract/Summary:
In the current Internet age, data is growing rapidly and information is becoming more complex. Finding useful items in this mass of data has become time-consuming and laborious for users. To address this, this paper proposes and designs a distributed, scalable, topic-based data API retrieval platform. The platform collects a large amount of Internet information into a subsystem and classifies the data. Each type of data is provided to the user in a convenient, easy-to-check form, and the user selects the information of interest in order to consume data from the platform. To provide such a platform, the first requirement is the ability to crawl a large number of Web pages in a way that keeps labor costs low while still extracting the information on each page effectively. A semi-automatic method based on template extraction is designed for this purpose. Faced with vast amounts of Internet document data, the platform also needs a reasonable way to categorize the data so that users can select useful data by category. Therefore, a topic-based data classification and retrieval system is designed. The LDA model is used to infer the topic distribution of each document, and the corresponding API-Topic and API-Key are then established from the topics and their distributions. The system provides retrieval methods for these API-Topics and sorts the API-Topic results with a similarity-based method. Finally, the data sets are returned to users. The platform provides a variety of data, presented to the user through a simple API interface, and both scientific research users and business users can consume data from it. Experimental analysis shows that the platform is feasible.
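The similarity-based sorting of API-Topic results can be illustrated with a minimal sketch. It is not the thesis's implementation: it assumes the per-document topic distributions have already been inferred by an LDA model, and it assumes cosine similarity as the similarity measure, which the abstract does not specify. The function names (`cosine`, `build_api_topics`, `rank_api_topics`), the grouping of documents by dominant topic, and the sample data are all hypothetical.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def build_api_topics(doc_topics):
    """Group document ids under the index of their highest-probability topic.

    Assumption: one API-Topic per dominant topic; the thesis may group differently.
    """
    groups = {}
    for doc_id, dist in doc_topics.items():
        dominant = max(range(len(dist)), key=dist.__getitem__)
        groups.setdefault(dominant, []).append(doc_id)
    return groups


def rank_api_topics(query_dist, doc_topics, groups):
    """Sort topic groups by similarity between the query and each group's mean distribution."""
    k = len(query_dist)
    ranked = []
    for topic, doc_ids in groups.items():
        centroid = [
            sum(doc_topics[d][i] for d in doc_ids) / len(doc_ids) for i in range(k)
        ]
        ranked.append((topic, cosine(query_dist, centroid)))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical topic distributions, standing in for LDA inference output.
doc_topics = {
    "doc_a": [0.8, 0.1, 0.1],
    "doc_b": [0.7, 0.2, 0.1],
    "doc_c": [0.1, 0.1, 0.8],
}
groups = build_api_topics(doc_topics)
ranking = rank_api_topics([0.1, 0.2, 0.7], doc_topics, groups)
```

Here `ranking` lists each topic group with its similarity score, most similar first, so a query whose distribution is concentrated on the third topic places the group containing `doc_c` ahead of the group containing `doc_a` and `doc_b`.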
Keywords/Search Tags: Internet document data, data API, web information extraction, topic model, API topic retrieval