Application Of Improved K-means Algorithm To Forecast The Sales Of Dried Blueberry In DaXingAnLing Area

Posted on:2018-03-01

Degree:Master

Type:Thesis

Country:China

Candidate:L Liu

GTID:2348330566950401

Subject:Software engineering

Abstract/Summary:

In today's information age, sales data in the commercial sector is growing explosively. People increasingly want to extract information from this mass of data that can guide future sales and avoid the awkward situation in which "high prices hurt buyers, low prices hurt sellers". Mining and forecasting sales data has therefore become a new challenge, and the clustering techniques and prediction models of data mining offer researchers a way through this bottleneck.

The traditional K-means algorithm is widely used for clustering problems. Its high degree of aggregation and strong operability give it obvious advantages in cluster analysis. However, it is limited because it has no mechanism for identifying noise points and because of the algorithm's own sensitivity. In sales forecasting, if clustering at the data-preparation stage is not precise enough to fully reflect the characteristics of each data group, the final predictions suffer and the deviation can exceed the acceptable range. Accurate prediction therefore requires careful processing of the data set.

This thesis addresses both the research and the application of the improved algorithm. It proposes an improved K-means algorithm that uses DBSCAN-based denoising and applies it to mining the 2005-2014 annual sales data of five local companies in the DaXingAnLing area. The result is a highly concentrated sample set, derived from the original data, that serves as the input to the sales prediction model. The main work of the thesis is as follows:

(1) Improving the K-means algorithm by optimizing the denoising process. Noise points are no longer eliminated by human experience and preconception, as in the traditional K-means algorithm. Instead, the DBSCAN method is used to reduce the noise in the original data and improve the precision of the preprocessed data.

(2) Data clustering. The data set obtained after removing the noise points is clustered into three levels: high, medium, and low. The results of the traditional algorithm and the improved algorithm are compared and analyzed.

(3) Forecasting sales. The ARIMA model is used to forecast from the sample set produced by the improved algorithm. The same model is also applied to the clustering results of the traditional algorithm. Both forecasts are compared with the actual sales volume to demonstrate the feasibility and superiority of the improved algorithm.

(4) Forecasting prices. Four models are fitted to the 2005-2014 blueberry market prices in the DaXingAnLing area, the one closest to the actual situation is selected, and the future price trend is analyzed.

Experiments show that, judged against the actual sales results, the noise points identified by the optimized denoising process of the improved K-means algorithm are reasonable. Clustering the denoised samples makes members of the same class more similar and widens the differences between classes, which is the true purpose of clustering. When the clustering result of the improved algorithm is used as the prediction sample set, the predicted values are closer to the actual values. Based on the forecast price trend, the thesis discusses the relationship between sales and production in terms of both sales volume and price.
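The denoise-then-cluster pipeline summarized above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the one-dimensional data, the `eps` and `min_pts` values, the function names, and the deterministic seeding of the K-means centres are all assumptions chosen for illustration, and the sales figures are invented. The sketch marks as noise any point that DBSCAN would leave outside every cluster (neither a core point nor within `eps` of one), then runs standard K-means with three clusters on the remaining points.

```python
def dbscan_noise(values, eps, min_pts):
    """Return the indices of points that DBSCAN would label as noise."""
    n = len(values)
    # eps-neighbourhood of each point (the point itself is included).
    neighbours = [
        [j for j in range(n) if abs(values[i] - values[j]) <= eps]
        for i in range(n)
    ]
    # Core points have at least min_pts points in their neighbourhood.
    core = {i for i in range(n) if len(neighbours[i]) >= min_pts}
    # Noise: not a core point and not within eps of any core point.
    return [
        i for i in range(n)
        if i not in core and not any(j in core for j in neighbours[i])
    ]


def kmeans_1d(values, k=3, max_iter=100):
    """Plain K-means (Lloyd's iterations) on 1-D data; requires k >= 2."""
    ordered = sorted(values)
    # Assumed seeding: spread the initial centres evenly over the sorted data.
    centres = [ordered[(len(ordered) - 1) * c // (k - 1)] for c in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest centre.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centres[c]))
            clusters[nearest].append(v)
        # Update step: each centre moves to the mean of its cluster.
        new_centres = [
            sum(c) / len(c) if c else centres[i]
            for i, c in enumerate(clusters)
        ]
        if new_centres == centres:
            break
        centres = new_centres
    return centres, clusters


# Invented sales figures for illustration only; 400 is a deliberate outlier.
sales = [10, 11, 12, 13, 50, 52, 53, 55, 100, 102, 104, 105, 400]

noise_idx = dbscan_noise(sales, eps=5, min_pts=3)
clean = [v for i, v in enumerate(sales) if i not in noise_idx]
centres, clusters = kmeans_1d(clean, k=3)

for level, centre, members in zip(("low", "medium", "high"), centres, clusters):
    print(level, centre, members)
```

Without the denoising step the outlier would pull the highest centre away from the bulk of the data, which is the sensitivity of K-means to noise that the abstract describes. In practice `eps` and `min_pts` have to be tuned to the data set.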

Keywords/Search Tags:

Data mining, K-means algorithm, DBSCAN algorithm, Sales

Related items

1	Data Mining In The School Computer Room Information Management Application
2	The Study Of Application And Analysis About Clustering Algorithm In Data Mining
3	Data Mining, Cluster Analysis Algorithm Research And Application
4	Research And Implement On Intrusion Detection System Based On Data Mining
5	Study On Data Partition DBSCAN Using Genetic Algorithm
6	Study And Application Of Clustering Analysis In Data Mining
7	Research On Telecom LTE Users Churn Algorithm Based On Data Mining
8	Research On Influencing Factors Of High-income Taxi Drivers' Income Based On GPS Data
9	The Optimization Research On Network Intrusion Anomaly Detection Algorithm Based On DBSCAN And LOF
10	Research On Parallelization Of DBSCAN Clustering Algorithm For Spatial Data Mining Based On Spark Platform