Font Size: a A A

Research On Learning Automata Theory And Its Application In Influence Maximization

Posted on:2018-09-10Degree:DoctorType:Dissertation
Country:ChinaCandidate:H GeFull Text:PDF
GTID:1368330590955282Subject:Information and Communication Engineering
Abstract/Summary:PDF Full Text Request
As a type of Machine Learning,also a branch of Artificial intelligence,Reinforcement Learning(RL)has attracted lots of attention from researchers in recent years.Reinforcement learning requires no prior knowledge and model assumptions about the problem at hand.Besides,different from supervised learning,the training of reinforcement learning is done by feedbacks from environment,without any labeled training examples.By taking the environment as a black box,with a closed loop structure and a trial and error manner,the intelligence agent can learn the optimal action with the guidance of environment feedback.Owing to this property,RL has been widely used in intelligence decision,adaptive control and so on.As the earliest non associative RL algorithm,learning automata(LA)has lots of advantages such as simple form,easy implementation,ability of fast stochastic optimization,high tolerance to noise and self-contained convergence.LA has been applied extensive in theoretic problems,such as graph coloring problem,stochastic shortest path problem,stochastic function optimization,etc.,as well in engineering problems,such as wireless network spectrum allocation,image processing,pattern recognition,etc.However,with the increasing complexity of application scenarios and the increasing scale of the problems at hand,conventional learning automata algorithms faced with new difficulties and challenges.Pursuing learning automata algorithms with faster convergence and higher accuracy,expanding the LA theory to make it more flexible to be applied in new scenarios have become the research hot-spot and development trend in the field.Considering this,this dissertation studies the learning automata algorithm comprehensively,and improves the estimator based LA algorithms from several aspects.Besides,based on part of the above theoretic achievements,we applies learning automata to influence maximization problem.The specific research work is summarized as follows:Firstly,based on the summarization of previous studies,this dissertation discusses the mathematical model,learning framework,performance evaluation metrics and evaluation methods of learning automata in detail,and introduces conventional variable structure learning automata algorithms.Secondly,for the fact that maximum likelihood estimator(MLE)based learning automata has a relatively slow rate of convergence,this dissertation studies confidence interval estimator based learning automata.The advantages and limitations of the existing MLE are also analyzed.Compared with the maximum likelihood estimator,the confidence interval estimator can not only give an accurate estimate of the reward probabilities,but also shed lights on the error of the estimation.This provides learning automata with more information,which facilitates faster convergence of learning automata.Based on this philosophy,this dissertation proposes a general technique for all estimator based learning automata algorithms,and improves its convergence rate by replacing its estimator.Extensive experimental simulations confirm the superiority of the proposed method.Thirdly,the interaction between learning automata and environment is costly in some special cases,and the parameter tuning of learning automata in such environments can be very expensive.For this problem,the parameter-free learning automaton algorithm is studied.Based on Bayesian theory,an Bayesian inference based convergence judgement for learning automata is proposed.By using the philosophy of optimizing,a deterministic exploration strategy is proposed to achieve fast convergences.Incorporating the Bayesian inference based convergence judgement and deterministic exploration strategy,a parameter-free learning automata(PFLA)is proposed.It is proved theoretically that the lower bound of the accuracy of PFLA is irrelevant to the characteristics of the external environment,and the ? optimality of PFLA is further proved.It is verified through experimental simulation that a set of general parameters can adapt PFLA to different environments,instead of tuning parameters for each environment respectively.The simulation results also show that the convergence rate of PFLA is faster than that of all deterministic estimator based learning automata.Fourthly,for the fact that most environment feedbacks are generated via computer simulation,in order to take full advantages of the abundant resources(multi-core CPU,even computer cluster)to accelerate the convergence of LA,the conventional parallel algorithm CPPA and decentralized learning based parallel scheme DLPS are combined and improved in this dissertation.Based on the aforementioned confidence interval estimate,an efficient parallel scheme(EPS)that can parallelize most estimator based learning automata is proposed,the simulation demonstrates its superiority over CPPA.What's more,the parallelization of PFLA is also discussed in this dissertation,and the proposed parallel PFLA inherits the parameter-free property as intended.Fifthly,based on the proposed confidence interval estimator based learning automaton algorithm,the problem of influence maximization in social network is studied.First of all,the influence maximization problem is mapped into a learning problem of LA.Besides,considering the characteristic of the mapped problem,confidence interval estimator learning automaton algorithm in P-model environment is extended to S-model environment.Finally,the learning automaton is used to iteratively select the seed nodes,and the estimate is inherited from last iteration in a certain way to utilize the sub-modularity of the spread function.On these basis,IMLA(Influence Maximization Learning Automata),together with two IMLA-based improved approaches,are proposed.The proposed algorithms improve the classic greedy algorithm by avoiding large amount of Monte Carlo simulation and outperform other heuristic algorithms by enlarging the influence spread,therefore the proposed approaches are suitable for large-scale social network.Simulation experiments on three real-world network datasets show that the proposed methods has the advantages of faster computation speed and larger influence spread.In conclusion,this dissertation investigates the learning automata in stationary random environment,and proposes a series of effective learning automata algorithms,which provide theory evidence and experimental reference for the wide application of learning automata based solutions.
Keywords/Search Tags:Learning Automata, Confidence Interval estimator, parameter tuning, Parallelization, Influence Maximization
PDF Full Text Request
Related items