
The Research And Application Of Data Mining In The Quality Control Of Network Communication

Posted on: 2018-10-28  Degree: Master  Type: Thesis
Country: China  Candidate: K Shen  Full Text: PDF
GTID: 2348330542459945  Subject: Software engineering
Abstract/Summary:
In recent years, with the development of the Internet, a wide variety of e-commerce enterprises and business models have sprung up, and the products traded online now cover every aspect of social life. The rapid development of e-commerce has, however, brought new problems and contradictions. Product quality problems in the e-commerce market have multiplied, and quality and safety incidents involving fake and shoddy products occur frequently, seriously harming the health and property of consumers. How to improve the quality and safety of e-commerce products, and how to remove product quality problems as a constraint on the healthy and sustainable development of China's e-commerce industry, has become an issue of common concern to the state and the public.

The analysis and supervision of network communication quality information in China are still at an initial stage. The existing management systems for this information are scattered, and there is no unified network communication quality information management system. People who use search engines to look for quality information cannot retrieve it effectively or in a timely manner. The existing systems also lack facilities for analyzing and mining network communication quality information, so the information cannot be mined and analyzed effectively. It is therefore necessary to study the quality of network communication.

The main work of this paper is as follows:

(1) Based on the characteristics of network communication quality information, this paper studies the application of data mining technology to processing that information. It studies preprocessing techniques such as word segmentation of text data; studies representation models for network communication quality information and compares the characteristics of the various models; studies clustering techniques and analyzes the distribution of network communication quality information; and studies classification techniques, comparing the commonly used evaluation methods for quality information text classification and analyzing the characteristics of the various classification techniques. Finally, a comparative test is carried out and its results are analyzed.

(2) A distributed web crawler is designed for the characteristics of network communication quality information. A crawling strategy is proposed for the distributed environment, and a URL task allocation strategy is designed to handle distributed task scheduling.

(3) Based on the characteristics of network communication quality information, data mining technology is applied in a product quality information management system. The open source segmentation tool IKAnalyzer is used for word segmentation of product quality information; the TF-IDF algorithm is implemented in Java to calculate the text similarity of product quality information; the open source clustering software JGibbLDA is used for text clustering of product quality information; and the open source data mining software Weka (Waikato Environment for Knowledge Analysis) is used to implement a Bayesian classifier.

The innovations of this paper are as follows:

(1) Data mining technology is used to process and mine network communication information. The application of the LDA algorithm to clustering network communication information is studied, and the open source tool JGibbLDA is used to implement text clustering of product quality information. The application of classification algorithms to network communication information is studied, and the commonly used methods for evaluating quality information classification are compared. Finally, a comparative test of various classifiers is carried out, the test results are analyzed, and a classifier suited to network communication information is selected.

(2) Based on the characteristics of network communication information, a distributed web crawler suited to that information is designed. A distributed crawling strategy is proposed, and a URL task allocation strategy for the distributed crawler is proposed to handle distributed task scheduling.
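The URL task allocation idea for the distributed crawler can be illustrated with a short Java sketch. The abstract does not say which allocation strategy the thesis uses, so the sketch below assumes a common one: each URL is assigned to a crawler node by hashing its host name, so that all pages of one site go to the same node. The class UrlTaskAllocator and its methods are hypothetical names introduced only for this illustration.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch: host-hash URL allocation, an assumed strategy,
// not necessarily the one designed in the thesis.
class UrlTaskAllocator {
    private final int nodeCount;
    private final Set<String> seen = new HashSet<>();            // URLs already handed out
    private final List<List<String>> queues = new ArrayList<>(); // one task queue per node

    UrlTaskAllocator(int nodeCount) {
        if (nodeCount <= 0) {
            throw new IllegalArgumentException("nodeCount must be positive");
        }
        this.nodeCount = nodeCount;
        for (int i = 0; i < nodeCount; i++) {
            queues.add(new ArrayList<>());
        }
    }

    /** Maps a URL to a node index in [0, nodeCount) based on its host name. */
    int nodeFor(String url) {
        // URI.create throws IllegalArgumentException for malformed input.
        String host = URI.create(url).getHost();
        String key = (host == null) ? url : host.toLowerCase(Locale.ROOT);
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(key.hashCode(), nodeCount);
    }

    /** Queues the URL on its node; returns the node index, or -1 if the URL was seen before. */
    int submit(String url) {
        if (!seen.add(url)) {
            return -1;
        }
        int node = nodeFor(url);
        queues.get(node).add(url);
        return node;
    }

    /** Returns the pending URLs for one node. */
    List<String> queueOf(int node) {
        return queues.get(node);
    }

    public static void main(String[] args) {
        UrlTaskAllocator allocator = new UrlTaskAllocator(3);
        String[] urls = {
            "http://example.com/item/1",
            "http://example.com/item/2",
            "http://example.org/news",
            "http://example.com/item/1" // duplicate, will be skipped
        };
        for (String url : urls) {
            System.out.println(url + " -> node " + allocator.submit(url));
        }
    }
}
```

Hashing on the host rather than the full URL keeps per-site politeness limits local to one node, at the cost of uneven load when a few sites dominate the crawl. A real deployment would also need a shared or partitioned "seen" set, which this single-process sketch does not model.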
Keywords/Search Tags: Network communication quality information, Distributed network crawler, Data mining
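As an illustration of the data mining step that computes text similarity with TF-IDF in Java, here is a minimal sketch. It is not the thesis's implementation: the class TfIdfSimilarity and its methods are hypothetical, the input is assumed to be already segmented into tokens (for example by a tool such as IKAnalyzer), and the weighting is assumed to be the plain tf × ln(N/df) variant compared by cosine similarity, which the abstract does not specify.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of TF-IDF weighting plus cosine similarity.
class TfIdfSimilarity {

    /** Builds one TF-IDF weight map per document; each document is a list of tokens. */
    static List<Map<String, Double>> tfIdf(List<List<String>> docs) {
        // Document frequency: number of documents that contain each term.
        Map<String, Integer> df = new HashMap<>();
        for (List<String> doc : docs) {
            for (String term : new HashSet<>(doc)) {
                df.merge(term, 1, Integer::sum);
            }
        }
        List<Map<String, Double>> vectors = new ArrayList<>();
        for (List<String> doc : docs) {
            Map<String, Double> weights = new HashMap<>();
            for (String term : doc) {
                weights.merge(term, 1.0, Double::sum); // raw term count
            }
            for (Map.Entry<String, Double> e : weights.entrySet()) {
                double tf = e.getValue() / doc.size();
                double idf = Math.log((double) docs.size() / df.get(e.getKey()));
                e.setValue(tf * idf);
            }
            vectors.add(weights);
        }
        return vectors;
    }

    /** Cosine similarity of two sparse vectors; 0.0 if either vector has zero length. */
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0.0);
        }
        double normA = norm(a);
        double normB = norm(b);
        return (normA == 0.0 || normB == 0.0) ? 0.0 : dot / (normA * normB);
    }

    private static double norm(Map<String, Double> v) {
        double sum = 0.0;
        for (double w : v.values()) {
            sum += w * w;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        // Toy, made-up token lists standing in for segmented quality reports.
        List<List<String>> docs = Arrays.asList(
            Arrays.asList("milk", "powder", "quality", "recall"),
            Arrays.asList("milk", "powder", "safety", "recall"),
            Arrays.asList("phone", "battery", "quality", "complaint"));
        List<Map<String, Double>> v = tfIdf(docs);
        System.out.println("sim(0,1) = " + cosine(v.get(0), v.get(1)));
        System.out.println("sim(0,2) = " + cosine(v.get(0), v.get(2)));
        System.out.println("sim(1,2) = " + cosine(v.get(1), v.get(2)));
    }
}
```

With this weighting, a term that appears in every document gets weight zero, so very small corpora can produce all-zero vectors; smoothed IDF variants avoid that but are a further assumption beyond what the abstract describes.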