| Mining frequent itemsets is an important research direction in the field of data mining. Its purpose is to find the items that occur with high frequency in a dataset. By studying frequent itemset mining in different business contexts, we can not only find the high-frequency items but also analyze the association rules among items and discover the underlying relationships between them. Studying the mining of frequent itemsets in large-scale datasets is therefore of great significance. This study is based on the Spark distributed computing framework. Using frequent itemset mining and a time-series model, we try to identify the relationships among different pieces of music. The scale of the music dataset and the number of users limit the efficiency of mining frequent itemsets, and periodic changes in play counts affect the fit of the time-series model. To address these problems in mining user play records, the paper proposes improvements in two respects. First, users are classified according to the characteristics of their music play history, such as language, music age, and singer type. The results of mining before and after classification are compared by counting the frequent itemsets and frequent items produced by the FP-Growth algorithm. Under the same support threshold, the improved method yields more frequent itemsets, and the number of frequent items is also larger than before. Second, we analyze the time series of play counts and add a penalty term to sequences that show a rising trend. An ARIMA model is then established for the time series. Based on the modeling results, the change in playback volume is predicted, and the predicted values are compared with the actual values to measure the error. Establishing the time-series model makes it possible to study how the playback volume changes over time. Following this idea of classification, this paper improves the method of mining frequent 
itemsets, improves the mining results, and provides an effective method for personalized music recommendation. In addition, adding a penalty term to irregular sequences improves the fit of the model. Finally, to meet the needs of large-scale data processing, the distributed computing framework Spark is introduced into the frequent itemset mining process. This paper studies the running efficiency and parallel behavior of the FP-Growth algorithm on Spark and improves the efficiency of mining frequent itemsets. |
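
The comparison of mining results before and after user classification can be illustrated with a minimal sketch. It is not the paper's implementation: it uses a brute-force itemset counter in plain Python instead of FP-Growth on Spark, and the play histories, class labels, and function names (`frequent_itemsets`, `mine_by_class`) are hypothetical. It also assumes that "the same support threshold" means the same relative support (a fraction of the transactions being mined), which the abstract does not state explicitly.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Return the set of itemsets whose relative support is >= min_support.

    Brute-force enumeration for illustration only; this is not FP-Growth.
    """
    if not transactions:
        return set()
    counts = Counter()
    for items in transactions:
        unique = sorted(set(items))
        for size in range(1, max_size + 1):
            for combo in combinations(unique, size):
                counts[frozenset(combo)] += 1
    min_count = min_support * len(transactions)
    return {itemset for itemset, c in counts.items() if c >= min_count}


def mine_by_class(records, min_support):
    """Group (user_class, items) records by class, mine each group, and
    return the union of the frequent itemsets found in any group."""
    groups = {}
    for user_class, items in records:
        groups.setdefault(user_class, []).append(items)
    result = set()
    for transactions in groups.values():
        result |= frequent_itemsets(transactions, min_support)
    return result


# Hypothetical play histories: (user class, songs played by one user).
records = [
    ("mandarin", ["a", "b"]),
    ("mandarin", ["a", "b"]),
    ("mandarin", ["a", "c"]),
    ("english", ["x", "y"]),
    ("english", ["x", "y"]),
    ("english", ["y", "z"]),
]
min_support = 0.5  # same relative threshold in both runs (assumption)

# Before classification: mine all users together.
before = frequent_itemsets([items for _, items in records], min_support)
# After classification: mine each user class separately.
after = mine_by_class(records, min_support)

items_before = set().union(*before)  # distinct frequent items, pooled run
items_after = set().union(*after)    # distinct frequent items, per-class run

print(len(before), len(after))
print(len(items_before), len(items_after))
```

On this toy data the pooled run finds 2 frequent itemsets covering 2 items, while the per-class run finds 6 frequent itemsets covering 4 items. The data were constructed to show the effect, so the numbers say nothing about the size of the improvement on a real dataset.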