Research On Disambiguation Method Of Repeated Authors Based On Improved PSO Algorithm To Optimize BP Neural Network

Posted on: 2021-05-02
Degree: Master
Type: Thesis
Country: China
Candidate: G H Qiu
Full Text: PDF
GTID: 2518306032467834
Subject: Software engineering
Abstract/Summary:
With the development of science and technology, the number of published documents has increased dramatically. In a large body of literature, duplicate author names not only reduce the efficiency and accuracy of retrieval but also hinder knowledge retrieval and research work, so disambiguating duplicate author names is an urgent task. To improve the accuracy of duplicate-author disambiguation, this thesis proposes a disambiguation algorithm based on an artificial neural network. Neural network models have powerful nonlinear mapping capabilities and can classify and predict in complex multi-dimensional problems. The research covers the following aspects:

(1) Construction of the literature data set and the basis for feature selection. The thesis first describes how the literature records were obtained and how the data set was formed. Then, because different attributes influence duplicate-author disambiguation to different degrees, the attributes with strong disambiguation ability are selected by corresponding algorithms. The test results show that email, co-authors, journals, research directions, organizations, English names, graduating institutions, and postcodes discriminate well between authors. Analyzing the feature attributes helps guide improvements to the disambiguation algorithm and to the accuracy and efficiency of author identification.

(2) A Particle Swarm Optimization (PSO) algorithm with dynamic inertia weights based on the Beta distribution. The traditional PSO algorithm tends to converge prematurely to a local optimum and lacks the ability to escape from it. To address this, the thesis proposes inertia weights based on the Beta distribution, using a random strategy to adjust the size of the weight dynamically and improve the algorithm's global search ability. The experimental results show that the improved PSO algorithm achieves better average convergence results, which lays the foundation for the subsequent optimization of the neural network.

(3) Optimization of the BP (Back Propagation) neural network model with the improved PSO algorithm. The PSO algorithm converges quickly and can provide the BP neural network with a set of initial weights and thresholds that are closer to their optimal values. First, the improved PSO algorithm is used to optimize the initial weights and thresholds of the BP neural network; then the optimized initial weights and thresholds are used to continue training the BP neural network model. After multiple back-propagation iterations, the model with the best performance on the test set is obtained.

(4) Comparison and analysis of experimental results. On the same test set, the BP neural network-based disambiguation method used in this thesis is compared with a text clustering algorithm, a string fuzzy matching algorithm, a sparse feature classification algorithm, and a mean square error adjacency matrix clustering algorithm based on a combination of basic features. The experimental results show that the proposed algorithm improves performance on the duplicate-author disambiguation problem, with the accuracy of the disambiguation results reaching 88.4%, which verifies the effectiveness of the algorithm.
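The abstract does not give the exact update rule for the Beta-distributed inertia weight, so the following is only a minimal sketch of the general idea in item (2), under stated assumptions: one inertia weight is drawn per iteration from a Beta distribution and rescaled to a fixed interval. The function names (`beta_pso`, `sphere`), the interval `w_range`, the shape parameters `beta_params`, the acceleration coefficients and the toy objective are all hypothetical choices for illustration and are not taken from the thesis.

```python
import random


def sphere(x):
    """Toy objective (assumption): sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def beta_pso(objective, dim, n_particles=20, iters=200,
             bounds=(-5.0, 5.0), w_range=(0.4, 0.9),
             beta_params=(2.0, 2.0), c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` with PSO using a randomly drawn inertia weight."""
    rng = random.Random(seed)
    lo, hi = bounds
    w_min, w_max = w_range
    a, b = beta_params

    # Random initial positions inside the bounds, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]

    # Personal bests and the global best.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(iters):
        # Assumed rule: sample Beta(a, b) in [0, 1] and rescale it to
        # [w_min, w_max]; the thesis may use a different scheme.
        w = w_min + (w_max - w_min) * rng.betavariate(a, b)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move the particle and clamp it to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


if __name__ == "__main__":
    best, best_val = beta_pso(sphere, dim=3)
    print(best, best_val)
```

In the setting of item (3), the objective would presumably be the training error of the BP neural network as a function of its flattened weights and thresholds, with the best position found used as the starting point for back propagation; that hand-off is not shown here.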
Keywords/Search Tags: Name disambiguation, PSO algorithm, BP neural network, Beta distribution, Dynamic inertia weight