
The Video Download Method And Distributed Crawling System Design And Implementation

Posted on: 2013-09-14
Degree: Master
Type: Thesis
Country: China
Candidate: J J Zhu
Full Text: PDF
GTID: 2248330395475443
Subject: Computer technology
Abstract/Summary:
With the development of Internet technology and people's constant pursuit of better television and film technology, video websites have grown rapidly in recent years. P2P live-streaming websites, BT download sites, and local player software have all become driving forces of the video industry, especially the online video industry. Online video has become one of the main applications of the Chinese Internet over the past few years; its output value has increased year by year and is expected to keep growing rapidly in the next few years. Video resources are indispensable to both individuals and businesses, yet obtaining them in bulk is very difficult. It is therefore of theoretical significance and practical value to study download methods for mainstream video sites at home and abroad and to build a system that downloads these video resources.

This research gives a brief introduction to the development of the current mainstream video industry. Building on the theory of web crawlers, server data sharing, and the hiding of video download links, we propose a method for studying and analyzing downloads from mainstream video sites, give an analysis process for each video site, and then build a distributed video capture system based on these download methods. The system's overall framework and detailed design process, intended to meet the needs of businesses and individuals, are described.

Firstly, the Perl modules for distributed crawling that are used frequently in the project are presented, together with the principles of web crawlers as applied to current video site downloads, link hiding, and the different protocols involved. This is followed by a description of data sharing in the distributed system and of how to use open source tools to handle video download and merge operations. Finally, emphasis is placed on how to use httpwatch to analyze the download addresses of mainstream domestic and international video websites.

Secondly, according to the results of the download address analysis, we design a unified data format for the data structures shared across the distributed video capture system for different download tasks. In light of these distributed download tasks, the overall architecture of the distributed system and its data storage are presented. Lastly, the detailed module design process is analyzed according to the overall architecture of the system and the file sharing principle, and the project's results are shown.

The proposed methods for obtaining download addresses from mainstream video websites at home and abroad are applicable to individuals and businesses both domestic and foreign. Where there is a need to add video resources, enlarge a personal video collection, or preserve historical data, these methods can be of great help, provided that no copyright issues arise.
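
The abstract mentions a unified data format for the download tasks shared across the distributed system but does not give its fields. As a rough illustration only, the sketch below shows what such a shared task record might look like. It is written in Python rather than the Perl used in the project, and every name in it (`DownloadTask`, `encode_task`, `decode_task`, and all field names) is a hypothetical assumption, not the thesis's actual format.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class DownloadTask:
    """Hypothetical unified task record; all field names are illustrative."""
    task_id: str                 # identifier assigned by whoever creates the task
    site: str                    # label for the source video site
    page_url: str                # page on which the video was found
    segment_urls: List[str] = field(default_factory=list)  # resolved download addresses
    protocol: str = "http"       # transfer protocol used by the download addresses
    status: str = "pending"      # simple state flag a worker could update


def encode_task(task: DownloadTask) -> str:
    """Serialize a task to a JSON string that any worker could read."""
    return json.dumps(asdict(task), sort_keys=True)


def decode_task(payload: str) -> DownloadTask:
    """Rebuild a task from the JSON string produced by encode_task."""
    return DownloadTask(**json.loads(payload))
```

One record per task, serialized to a neutral text format, would let crawler nodes handle tasks from different sites through the same code path. Whether the thesis uses JSON, a database row, or a shared file is not stated in the abstract.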
Keywords/Search Tags: P2P, BT, video resource, web crawler, distributed crawling, httpwatch