Research And Development Of Internet Video Website Oriented Data Crawler And Parser Technology

Posted on: 2018-10-25
Degree: Master
Type: Thesis
Country: China
Candidate: J X Han
Full Text: PDF
GTID: 2428330518996609
Subject: Computer technology
Abstract/Summary:
In recent years, the online video market in China has been expanding, with explosive growth in user data. Video sites publish a variety of charts and graphs showing analyses of their video data. Such analysis requires a large volume of stable and reliable data. Because the sites keep their video data confidential, external teams or individuals who want to analyse it must first collect it themselves, and the most feasible way to do so is with data acquisition and analysis technology. This thesis first introduces the principles of web crawler technology, the mainstream crawler architectures, and data crawler and parser systems. It finds that there is still room to improve the stability of data crawling and the speed of parsing. A data crawler and parser system was designed, and a framework of modular, generic components implementing the key functions was designed and developed. A log module was also developed to provide insight into the running status of the system. In addition, key techniques including HTTP access, multithreading and a URL queue were used to meet the functional and performance requirements of the system. Based on this framework, data crawler and parser systems for the Iqiyi, Tencent, LeTV, Youku and Sohu video websites were developed independently. Long-running experiments were also carried out to verify the function and efficiency of the system, and the collected data was of high quality.
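The abstract names three building blocks (HTTP access, multithreading and a URL queue) plus a log module, but gives no implementation details. The following is only a minimal sketch of how such pieces commonly fit together, not the thesis's actual design. The names `crawl`, `fetch`, `parse` and `num_workers` are hypothetical, and HTTP access is passed in as a plain function so that the sketch runs without a network connection.

```python
import logging
import queue
import threading

log = logging.getLogger("crawler")  # stands in for the thesis's log module


def crawl(seeds, fetch, parse, num_workers=4):
    """Crawl from `seeds` using a shared URL queue and worker threads.

    fetch(url) -> str                 hypothetical HTTP access function
    parse(url, body) -> (record, links)  hypothetical site-specific parser
    Returns a dict mapping each successfully parsed URL to its record.
    """
    pending = queue.Queue()   # the URL queue shared by all workers
    seen = set()              # URLs already queued, to avoid repeats
    results = {}
    lock = threading.Lock()   # guards `seen` and `results`

    def enqueue(url):
        with lock:
            if url in seen:
                return
            seen.add(url)
        pending.put(url)

    def worker():
        while True:
            url = pending.get()
            if url is None:           # sentinel: no more work
                pending.task_done()
                return
            try:
                record, links = parse(url, fetch(url))
                with lock:
                    results[url] = record
                for link in links:
                    enqueue(link)
                log.info("parsed %s", url)
            except Exception:
                # One bad page must not stop the crawl; record it and go on.
                log.exception("failed %s", url)
            finally:
                pending.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for url in seeds:
        enqueue(url)
    pending.join()                    # wait until every queued URL is handled
    for _ in threads:
        pending.put(None)             # then tell each worker to exit
    for t in threads:
        t.join()
    return results
```

In a sketch like this, supporting another video site would mean supplying a different `parse` function while the queue, threading and logging code stays shared, which is one way to read the abstract's "modularized generic components".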
Keywords/Search Tags:video website, crawler, parser, log