Trend Prediction Of Tobacco Purchase Based On Cluster Analysis

Posted on: 2018-09-18
Degree: Master
Type: Thesis
Country: China
Candidate: J X Zhu
Full Text: PDF
GTID: 2428330518955050
Subject: Probability theory and mathematical statistics
Abstract/Summary:
Field investigation and data analysis have raised considerable concern over the purchasing, transportation, and warehousing of tobacco. To take a few examples: acquisition time and speed usually differ considerably between places of tobacco origin; consequently, the accumulated waiting time of freight transportation is large and unnecessary; and raw tobacco is frequently held up because the processing capacity of the tobacco redrying factory is smaller than the amount of tobacco purchased. These problems cause a huge loss of manpower, material resources, and financial resources, and also degrade the quality of the tobacco leaves. The key to solving them and building a reasonable tobacco supply chain is a good estimate of the daily acquisition amount for each region.

In this paper, several models are constructed to predict the trend of tobacco purchasing in 31 counties. The main work consists of a cluster analysis of the tobacco purchasing trend, a thorough analysis of the clustering results, and model fitting and prediction based on those results.

Specifically, hierarchical clustering is applied to the feature data of the 31 counties. The algorithm is chosen based on the characteristics of the data and the practical situation. The number of clusters is set to 5 according to information criteria; with this choice the intra-cluster variance is significantly lower than the inter-cluster variance.

An elaborate analysis is made to explain the resulting clusters. This paper analyzes the reasons for the differences in the cumulative acquisition ratio among the classes, including climate, elevation, land type, variety, quality of tobacco leaf, the proportion of tobacco leaf, the size of the county company, and the development of the county. The correlation analysis of these factors shows that climate, the quality of the tobacco leaves, and the proportion of leaf parts are significantly correlated with the clustering results.

A logistic regression model is fitted to the cluster center of each class. The fitted results are compared with the actual acquisition trends and the models are revised accordingly. The revised model consists of five logistic regression models, one for each of the five classes (the first through fifth kind).

The main contributions of this paper are:
1. A flexible combination of data mining and statistical knowledge that captures the acquisition trend of 31 counties with a small number of fitted models. This reduces the number of parameters and models, lowers the amount of calculation, and simplifies prediction.
2. A correlation analysis of the clustering results that uses data to show the impact of various factors on the acquisition of tobacco.
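The workflow described in the abstract (cluster the counties' cumulative purchase-ratio curves, then fit one logistic curve to each cluster center) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and not taken from the thesis: the synthetic curves, the average-linkage rule with Euclidean distance, the two-parameter logistic form with an upper limit of 1, and the function names `average_linkage`, `cluster_center`, and `fit_logistic`. The number of clusters is passed in by hand rather than selected by an information criterion.

```python
import math

def logistic(t, k, t0):
    """Cumulative purchase ratio on day t for a logistic curve with limit 1."""
    return 1.0 / (1.0 + math.exp(-k * (t - t0)))

def average_linkage(curves, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters."""
    clusters = [[i] for i in range(len(curves))]

    def dist(a, b):
        # Euclidean distance between two curves sampled on the same days
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(curves[a], curves[b])))

    def linkage(c1, c2):
        # Average pairwise distance between the members of two clusters
        return sum(dist(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        pairs = ((i, j) for i in range(len(clusters))
                 for j in range(i + 1, len(clusters)))
        i, j = min(pairs, key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

def cluster_center(curves, members):
    """Pointwise mean of the member curves."""
    n = len(members)
    return [sum(curves[m][d] for m in members) / n for d in range(len(curves[0]))]

def fit_logistic(days, ratios):
    """Fit logit(ratio) = k * (t - t0) by ordinary least squares; return (k, t0)."""
    pts = [(t, math.log(r / (1.0 - r))) for t, r in zip(days, ratios) if 0.0 < r < 1.0]
    n = len(pts)
    mean_t = sum(t for t, _ in pts) / n
    mean_z = sum(z for _, z in pts) / n
    k = (sum((t - mean_t) * (z - mean_z) for t, z in pts)
         / sum((t - mean_t) ** 2 for t, _ in pts))
    intercept = mean_z - k * mean_t
    return k, -intercept / k

# Synthetic example (hypothetical data): four "counties", two early and two late
days = list(range(0, 60, 5))
true_params = [(0.15, 20), (0.16, 21), (0.30, 40), (0.29, 41)]
curves = [[logistic(t, k, t0) for t in days] for k, t0 in true_params]

clusters = average_linkage(curves, n_clusters=2)
fits = [fit_logistic(days, cluster_center(curves, c)) for c in clusters]
for members, (k, t0) in zip(clusters, fits):
    print(members, round(k, 3), round(t0, 1))
```

Fitting on the logit scale keeps the sketch dependency-free, but it weights the tails of the curve heavily; a nonlinear least-squares fit on the original scale would be a reasonable alternative when the data are noisy.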
Keywords/Search Tags:Cluster analysis, Correlation analysis, Logistic regression model, Prediction