A Near-Real-Time Data Processing Cloud Platform For Video QoE Analyzing

Posted on: 2017-01-26
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Chen
Full Text: PDF
GTID: 2308330491951700
Subject: Signal and Information Processing
Abstract/Summary:
With the development of new technologies, a number of excellent video service providers have emerged, and research focus has shifted from QoS to user engagement (one kind of QoE). At the same time, the growing volume of video data means that users demand high data processing capability and real-time performance. Traditional data analysis approaches cannot satisfy the requirements of practical applications because of their poor timeliness, low accuracy, poor scalability and poor portability. It is therefore urgent to build an integrated video QoE analysis system that can process large-scale real-time data. To build such a system, this thesis describes the choice and improvement of the algorithm, the deployment and modification of the data processing platform, and the improvement of real-time data processing. The details are as follows:
(1) An enhanced AMKNN (Advanced Modified K-Nearest Neighbors) algorithm is proposed. In order to support both offline and online data, this thesis compares several algorithms over different data sets following the general rules of data analysis, and chooses the modified MKNN algorithm for real-time data matching. To decrease the matching delay further, an enhanced AMKNN algorithm is proposed by combining MKNN with K-means, which achieves high accuracy, fast processing and a wide application range. This part has two components. First, the MKNN algorithm is proposed following a modification of the simulated data set during the real-time matching phase; it reduces the error rate by 70% compared with classical KNN and fits both offline and online data. Second, by combining with K-means and introducing cross clustering, the AMKNN algorithm is proposed, which reduces computational complexity by about 40% at the cost of a 1% matching error.
(2) Based on the Lambda architecture, an ‘LKS’ data processing system is enhanced and deployed on DCOS. In order to achieve easy deployment, scalability, portability, fast processing and high throughput, this thesis proposes an enhanced LKS data processing solution built on DCOS. In addition, techniques such as Kafka cache and WAL are introduced to improve fault tolerance. Experimental comparison shows that the proposed LKS+WAL system built on DCOS works much better than the traditional ELK solution in terms of deployment complexity, scalability, portability, throughput and processing ability.
(3) A double-window data preprocessing method and passive positioning are proposed. To improve the reliability of the real-time processing platform and to support diverse data, this thesis proposes a double-window method that can reduce both error and computational complexity. Analysis results show that the double-window data preprocessing technique can reduce the error rate by 80%, and can reduce computational complexity by 10% under some specific circumstances. In addition, the passive positioning technique provides an extra data dimension for the real-time processing system.
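As an illustration of the general idea in item (1), combining K-means with KNN so that each query is matched against only part of the reference data, here is a minimal Python sketch. It is not the thesis's MKNN or AMKNN and does not reproduce its cross clustering; the function names, the `n_clusters` parameter and the toy data are all assumptions made for this example.

```python
import math
import random
from collections import Counter


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means; returns (centroids, cluster index per point)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data

    def nearest(p):
        return min(range(k), key=lambda c: math.dist(p, centroids[c]))

    for _ in range(iters):
        assign = [nearest(p) for p in points]
        # Move each centroid to the mean of its current members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    # Final assignment so that labels match the final centroids.
    return centroids, [nearest(p) for p in points]


def knn_predict(query, points, labels, k):
    """Classical KNN: majority vote among the k nearest labelled points."""
    order = sorted(range(len(points)), key=lambda i: math.dist(query, points[i]))
    return Counter(labels[i] for i in order[:k]).most_common(1)[0][0]


def clustered_knn_predict(query, points, labels, centroids, assign, k, n_clusters=1):
    """KNN restricted to members of the n_clusters centroids nearest the query.

    Hypothetical helper: searching a subset of clusters trades a little
    accuracy for fewer distance computations.
    """
    order = sorted(range(len(centroids)), key=lambda c: math.dist(query, centroids[c]))
    chosen = set(order[:n_clusters])
    idx = [i for i, a in enumerate(assign) if a in chosen]
    if not idx:  # chosen clusters are empty: fall back to a full search
        idx = list(range(len(points)))
    return knn_predict(query, [points[i] for i in idx], [labels[i] for i in idx], k)


if __name__ == "__main__":
    # Toy reference set: two well-separated groups (made-up data).
    pts = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
    lbl = ["low"] * 4 + ["high"] * 4
    cents, asg = kmeans(pts, k=2)
    print(clustered_knn_predict((0.4, 0.6), pts, lbl, cents, asg, k=3))
```

With `n_clusters=1` the query is compared against the members of a single cluster only, which is where the saving in distance computations comes from. Raising `n_clusters` searches neighbouring clusters as well, which lowers the risk of missing true neighbours that lie near a cluster boundary.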
Keywords: Spark, Lambda Architecture, QoE, KNN, K-means