Research On Clustering Algorithm And Its Application Based On Reinforcement Learning

Posted on:2022-01-20

Degree:Master

Type:Thesis

Country:China

Candidate:L Zhang

GTID:2518306542980759

Subject:Electronics and Communications Engineering

Abstract/Summary:

In recent years, advances in computing technology have led industries to produce vast amounts of data every day, and data mining has emerged as a way to use the information hidden in that data. In industrial production, for example, the photovoltaic industry has grown rapidly. Solar cells dominate the photovoltaic market by virtue of their high cost-performance ratio, and polycrystalline silicon (polysilicon) is one of their main materials. Polysilicon production generates a large amount of production data, and how to mine these data for information that can guide production is the research focus of this project. Cluster analysis, a research hotspot in data mining, groups unlabeled data into clusters according to the distribution of the data. In this paper, cluster analysis is used to select raw material allocation schemes from polysilicon production data, in order to find, among hundreds of raw material records, the schemes with low cost and high economic benefit.

The main purpose of cluster analysis is to divide the data into multiple clusters so that samples in the same cluster are as similar as possible and samples in different clusters are as dissimilar as possible. There are many kinds of clustering algorithms. Among them, partition-based clustering algorithms are widely used because they are simple, efficient, easy to implement and extend, and have approximately linear computational complexity. Classic partition-based algorithms include K-means, K-means++, and fuzzy C-means (FCM). These algorithms are flexible, converge quickly, and are easy to extend with new functions, but they still have shortcomings. For example, the choice of initial cluster centers strongly affects their performance. Euclidean distance is the usual criterion for measuring similarity, but it makes little use of the helpful information generated during the iterations of the algorithm, and the cost of computing it slows the algorithm to some extent.

To address these problems in existing partition-based clustering algorithms, this paper proposes a reinforcement learning-based clustering algorithm (RLC). The algorithm does not need to initialize the cluster centers by random selection, and it introduces the reward and punishment mechanism of reinforcement learning into the clustering task. A Q table is established to store the "knowledge" learned by the agent while the algorithm runs. The agent selects actions instead of computing Euclidean distances, which helps reduce the computational complexity of the algorithm. In the experiments, K-means, K-means++, and FCM serve as benchmarks, and ten small-scale UCI data sets are used to verify the performance of the proposed algorithm. The results show that the RLC algorithm achieves higher clustering accuracy and better clustering performance than the benchmarks. However, for most of the small-scale data sets its running time is still longer than that of the benchmark algorithms.

To solve this problem, this paper further proposes a single behavior reinforcement clustering algorithm (SBRC) based on DP_RI (Discretized Pursuit Reward Inaction). The algorithm introduces the discretized reward technique of DP_RI and removes the constraint in the RLC algorithm that the cumulative reward of each behavior must be equal to 1. Only the cumulative rewards corresponding to the actions that the agent selected are updated; those corresponding to actions that were not selected are left unchanged. This further reduces the computational complexity of the algorithm. In the SBRC algorithm, the greedy coefficient is a dynamic value that increases as the iterations proceed, so that a more stable clustering result can be obtained. In addition to the ten small-scale UCI data sets above, two medium-scale data sets and one large-scale data set are used to analyze the performance of the SBRC algorithm. The experimental results show that, compared with the RLC algorithm, the SBRC algorithm achieves higher accuracy and greatly reduced computational complexity on the small-scale data sets, which is beneficial for medium-scale and large-scale data.

The production procedure of polysilicon is complicated: it includes many steps, and product quality is affected by various factors. In industry, physical and chemical methods are generally used to analyze the chemical composition of the production process and the raw materials in order to improve product quality, while production cost is ignored. Polysilicon production involves many types of raw materials and allocation schemes, and it is difficult to find low-cost, high-benefit raw material allocation schemes with traditional analysis methods. In this paper, the proposed clustering algorithms are used to analyze polysilicon production data and to select raw material allocation schemes; applying clustering algorithms to polysilicon production data is an innovative idea. The minority carrier lifetime and other indexes are used to identify, among the different clusters, the raw material allocation schemes with better economic benefit. The DB index and the silhouette coefficient are also used to evaluate clustering performance. The results show that the algorithms proposed in this paper have better clustering performance and can be used to select raw material allocation schemes in polysilicon production.
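
The two internal evaluation measures named at the end of the abstract can be illustrated with a short sketch. It assumes that "DB index" means the standard Davies-Bouldin index and that the second measure is the standard silhouette coefficient, both computed with Euclidean distance. The function names and the toy points are hypothetical and are not taken from the thesis; this is not the author's evaluation code.

```python
import math


def group_by_label(points, labels):
    """Collect points into a dict {label: [points...]}."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)
    return clusters


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return [sum(coords) / len(points) for coords in zip(*points)]


def davies_bouldin(points, labels):
    """Davies-Bouldin index (assumed definition); lower is better.

    Assumes at least two clusters with distinct centroids.
    """
    clusters = group_by_label(points, labels)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")
    cents = {lab: centroid(pts) for lab, pts in clusters.items()}
    # Scatter: mean distance from each cluster's points to its centroid.
    scatter = {
        lab: sum(math.dist(p, cents[lab]) for p in pts) / len(pts)
        for lab, pts in clusters.items()
    }
    total = 0.0
    for i in clusters:
        # Worst-case similarity of cluster i to any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


def silhouette(points, labels):
    """Mean silhouette coefficient (assumed definition); higher is better."""
    clusters = group_by_label(points, labels)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")
    scores = []
    for point, label in zip(points, labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        # (the point's zero distance to itself adds nothing to the sum).
        a = sum(math.dist(point, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(point, q) for q in pts) / len(pts)
            for other, pts in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy data (hypothetical): two well-separated pairs of points.
toy_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good_labels = [0, 0, 1, 1]  # groups nearby points together
bad_labels = [0, 1, 0, 1]   # splits each nearby pair apart
print(davies_bouldin(toy_points, good_labels), silhouette(toy_points, good_labels))
print(davies_bouldin(toy_points, bad_labels), silhouette(toy_points, bad_labels))
```

On the toy data, the labeling that keeps nearby points together yields the lower Davies-Bouldin value and the higher silhouette value, which is the direction in which both measures reward compact, well-separated clusters.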

Keywords/Search Tags:

clustering algorithm, reinforcement learning, greedy strategy, single behavior reinforcement, cumulative reward, polysilicon production data, raw material allocation scheme

Related items

1	Researches Of Robocup’s Local Strategy Based On Multi-Agent Reinforcement Learning
2	Research On Sample Generation And Selection Methods For Deep Reinforcement Learning
3	Research On Reward Optimization In Reinforcement Learning
4	Robot Navigation Algorithm Based On Reinforcement Learning In Unknown Environment
5	Research On The Sparse Reward Problem Based On Hierarchical Reinforcement Learning
6	Intelligent Interference Strategy Generation Based On Reinforcement Learning
7	Research On Deep Reinforcement Learning Algorithm Based On The Combination Of Intrinsic Reward And Auxiliary Tasks
8	Research On Sparse Reward Based On Reinforcement Learning
9	Research On Reinforcement Learning Based On Clustering Algorithm
10	Theory and application of reward shaping in reinforcement learning