
An Interactive Data Mining And Visualization System Using Parallel Computing

Posted on: 2018-07-21
Degree: Master
Type: Thesis
Country: China
Candidate: D Wu
Full Text: PDF
GTID: 2348330515973775
Subject: Computer Science and Technology
Abstract/Summary:
With the progress of information technology, the amount of data is growing explosively. Traditional CPU-based data mining technology cannot deal effectively with such a huge amount of data. In addition, the human brain identifies colors and geometric shapes more easily than plain numbers, so data visualization technology can present data mining results in the operating interface in a more natural and visual way, which better meets the needs of users. At present, the traditional data visualization tools used by most data mining users can only draw 2D or 3D graphics and lack interactivity. To address these problems, this paper proposes an interactive, visual data mining system based on parallel computing.

Traditional data stream mining algorithms are optimized in this paper using GPU programming technology. Traditional CPU-based data mining usually processes data serially. It cannot run in parallel across multiple computing resources, and when dealing with big data it requires many iterations, consumes a large amount of memory, and runs slowly and inefficiently. GPU programming, by contrast, processes data in parallel: multiple threads run independently at the same time, so operational efficiency is high and the approach is better suited to processing large data. To handle both data independence and data dependency in large data, the K-Means clustering algorithm and the connectivity detection algorithm (CCL) are each optimized with GPU programming technology. As a result, clustering of big data is performed more effectively.

To realize data visualization, an interactive data visualization method is proposed. The original data sets or data mining results are converted into graphics information such as vertices, lines, faces, and colors by using the DirectX software development kit (SDK). A multidimensional model is then built with the graphics functions provided by the DirectX SDK, and the visual results are finally rendered. In addition, a graphical user interface (GUI) is provided in which users can choose different goals and different data representation methods to obtain a visualization result that matches their needs.

The system is tested on air-conditioning energy consumption data. Optimizing the traditional algorithms with GPU programming not only achieves cluster analysis of large data, but also shows that both speed and efficiency improve when dealing with huge amounts of data. Meanwhile, the abstract data mining results are represented as concrete four-dimensional graphics with DirectX, and users can obtain the visualization result they want by modifying parameters to suit their actual needs.
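The data independence that makes K-Means a good fit for GPU threads can be illustrated with a minimal sketch. It is not the thesis implementation: it is plain Python running on the CPU, and the function names (`nearest_centroid`, `update_centroids`, `kmeans`) are hypothetical, chosen only for this illustration. The point it shows is that the assignment step depends on a single data point and the shared centroids, so in a GPU version each call could plausibly be handled by one thread, while the centroid update is a reduction over all points.

```python
import math


def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to `point`.

    The result depends only on this one point and the shared centroids,
    so calls for different points are independent of each other. In a
    GPU version (an assumption, not shown here) one thread could do this.
    """
    best_index, best_dist = 0, math.inf
    for index, centroid in enumerate(centroids):
        # Squared Euclidean distance; the square root is not needed for ranking.
        dist = sum((p - c) ** 2 for p, c in zip(point, centroid))
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index


def update_centroids(points, labels, centroids):
    """Recompute each centroid as the mean of the points assigned to it."""
    dims = len(centroids[0])
    sums = [[0.0] * dims for _ in centroids]
    counts = [0] * len(centroids)
    for point, label in zip(points, labels):
        counts[label] += 1
        for d in range(dims):
            sums[label][d] += point[d]
    # An empty cluster keeps its previous centroid.
    return [
        [s / counts[k] for s in sums[k]] if counts[k] else list(centroids[k])
        for k in range(len(centroids))
    ]


def kmeans(points, centroids, max_iters=100):
    """Alternate assignment and update until the labels stop changing."""
    labels = None
    for _ in range(max_iters):
        # Independent per-point work: the part that parallelizes naturally.
        new_labels = [nearest_centroid(p, centroids) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        centroids = update_centroids(points, labels, centroids)
    return labels, centroids
```

Calling `kmeans(points, initial_centroids)` with a list of coordinate tuples returns the cluster label of each point and the final centroids. The list comprehension in `kmeans` marks where a parallel implementation would fan the work out across threads.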
Keywords/Search Tags: data mining, data visualization, GPU, parallel computing, graphical user interface