
Implementation Of Advertisement Detection System For Stream Processing

Posted on: 2014-09-24 | Degree: Master | Type: Thesis
Country: China | Candidate: M Nie | Full Text: PDF
GTID: 2208330434970798 | Subject: Software engineering
Abstract/Summary:
With the spread of broadcast television technology, especially IPTV, television now offers a better viewing experience and richer entertainment. At the same time, the growing volume of commercials seriously degrades that audio-visual experience. With the explosive growth of commercials in recent years, traditional manual inspection can no longer keep up with the filtering workload, and detecting harmful or illegal commercials online by technical means has become an urgent problem. This project originates from the National 863 Program project "Triple-play Evolutionary Technology and System Research", which achieved breakthroughs in the key technologies of fine-grained video data analysis and produced a Commercial Detection Demonstration System. However, as the number of commercials rises sharply, the existing demonstration system, which uses a centralized software architecture and a serial processing mode, can no longer meet the real-time demands of commercial detection.

To address the real-time problem in the existing demonstration system, this project works at the level of the system's software computing architecture: it studies two typical distributed computing architectures, batch processing and stream processing, and explores a new computing architecture better suited to massive data processing. The main innovative points are as follows:

(1) Because the number of commercials has risen sharply and the core algorithm of the existing demonstration system computes serially, the system cannot meet the real-time demands of practical applications. This paper analyzes in depth the factors that affect the real-time performance of the Commercial Detection Demonstration System and proposes an improvement based on a new computing and processing architecture, one that requires only small changes to data structures, offers high parallel processing capability, and has a short system response time.

(2) Two main distributed computing and processing architectures are studied: the typical batch processing system Apache Hadoop and the typical stream processing system Twitter Storm. This paper analyzes each architecture in terms of cluster composition, computation model, and ecosystem, and then compares the two in terms of technical route, computing speed, data throughput, system flexibility, fault tolerance, and ecosystem.

(3) A stream processing software architecture is constructed. It uses a distributed processing mode characterized by parallel computation and memory-based distributed stream processing technology, replacing the old centralized processing mode characterized by serial computation. Theoretical analysis and test results show that the architecture effectively improves real-time performance.

(4) Because existing stream processing systems cannot perform in-depth analysis of offline data, this paper proposes a new massive data processing architecture that combines batch processing and stream processing. The architecture supports real-time computation at very large data scale, meets the needs of a wider range of applications, offers a unified computing platform that yields precise and flexible computing results, and is easy to implement and scalable.
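The idea in point (4), combining a batch layer with a stream layer, can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the event shape `(channel, flagged)`, the names `batch_view`, `SpeedView`, and `query`, and the sample data are all hypothetical, and a real deployment would run on systems such as Hadoop and Storm rather than in a single process.

```python
from collections import Counter
from typing import Iterable, Tuple

# Hypothetical record shape: (channel name, whether the clip was flagged as a bad commercial).
Event = Tuple[str, bool]


def batch_view(history: Iterable[Event]) -> Counter:
    """Batch layer: recompute flagged-commercial counts per channel from stored history."""
    view: Counter = Counter()
    for channel, flagged in history:
        if flagged:
            view[channel] += 1
    return view


class SpeedView:
    """Stream layer: keep in-memory counts, updated one event at a time."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def on_event(self, event: Event) -> None:
        channel, flagged = event
        if flagged:
            self.counts[channel] += 1


def query(channel: str, batch: Counter, speed: SpeedView) -> int:
    """Serving step: merge the precomputed batch result with the recent stream result."""
    return batch[channel] + speed.counts[channel]


# Made-up sample data: events already stored offline, and events that arrived afterwards.
history = [("ch1", True), ("ch1", False), ("ch2", True)]
recent = [("ch1", True), ("ch2", True), ("ch2", True)]

batch = batch_view(history)
speed = SpeedView()
for event in recent:
    speed.on_event(event)

print(query("ch1", batch, speed), query("ch2", batch, speed))
```

In this sketch the batch view covers everything stored so far, while the stream view covers only events that arrived after the last batch run, so a query stays current without waiting for the next full recomputation.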
Keywords/Search Tags:Commercial Detection, Distributed data processing, Batch processing, Stream processing, Software architecture