
Design And Implementation Of Internet Content Analyze System

Posted on: 2019-01-14
Degree: Master
Type: Thesis
Country: China
Candidate: M M Hao
Full Text: PDF
GTID: 2428330545954541
Subject: Software engineering
Abstract/Summary:
With the rapid development of the Internet industry, Internet content and related resources have become fragmented and disordered, making it increasingly difficult to capture users' behavior and interests. For basic network operators, the urgent questions are therefore how to guide users correctly and reasonably, and how to observe Internet content effectively.

Internet Content Analyze (ICA) is a system that recognizes and classifies the content behind a given Uniform Resource Locator (URL) and restores the user's online scene. By augmenting deep packet inspection (DPI) data with analysis of the Internet content, it improves the precision and granularity with which users' online behavior is recognized. This provides data support for building a user label system and accurate user portraits, and further supports precise and real-time marketing based on users' preferences and opportunities.

As an intern in the company, I was responsible for developing the Internet content analysis system. The system consists mainly of nine functional modules, of which I developed seven: online scene recognition, scene behavior recognition, scene content recognition, content classification, the input and output modules, active and passive identification, and the rules maintenance platform. The major technologies are the Hadoop distributed processing framework, the MapReduce programming model, URL recognition, secondary recognition of APP traffic, and a high-performance parsing engine. The rules maintenance platform is built on the Spring MVC framework, which improves the efficiency of maintaining the rule library. Concretely, it uses the MyBatis persistence framework for data access, Redis for data caching, and Nginx as a reverse proxy.

Telecom operators need targeted information about users' preferences and browsing habits across business sectors and content in order to keep their own business developing healthily and continuously, slow the trend of being reduced to a pipeline, and effectively emulate Internet business and marketing models. With this information they can provide strong support for business improvement and for updating marketing content. The system has been implemented and every part of it is working well. It has met the operational needs of various departments in the company, and their working efficiency has improved considerably.
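The URL recognition and classification step described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the rule table, the example hostnames, the scene and category labels, and the function names `classify_url`, `map_record`, and `reduce_counts` are all hypothetical, and the map and reduce steps are plain Python functions that only mimic the shape of a MapReduce job rather than running on Hadoop.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical rule table: (host suffix, path prefix, (scene, category)).
# A real rule library would be maintained separately and be far larger.
RULES = [
    ("video.example.com", "/play", ("watch_video", "entertainment")),
    ("shop.example.com", "/item", ("view_product", "shopping")),
    ("shop.example.com", "/cart", ("checkout", "shopping")),
    ("news.example.com", "/", ("read_article", "news")),
]


def classify_url(url):
    """Return (scene, category) for the first matching rule, else None."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    path = parts.path or "/"
    for host_suffix, path_prefix, label in RULES:
        host_matches = host == host_suffix or host.endswith("." + host_suffix)
        if host_matches and path.startswith(path_prefix):
            return label
    return None


def map_record(record):
    """Map step: one (user_id, url) record -> ((user_id, category), 1)."""
    user_id, url = record
    label = classify_url(url)
    if label is not None:
        yield (user_id, label[1]), 1


def reduce_counts(pairs):
    """Reduce step: sum the counts for each (user_id, category) key."""
    totals = Counter()
    for key, count in pairs:
        totals[key] += count
    return dict(totals)
```

Feeding a list of `(user_id, url)` records through `map_record` and then `reduce_counts` yields per-user category counts, which is the kind of aggregate a user label system could be built on. Unmatched URLs are simply dropped here; a production system would more likely route them to a secondary recognition stage.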
Keywords/Search Tags:Mass data processing, URL Recognition, DPI data processing, Redis