Research on Software Requirement Clustering Based on Deep Learning

Posted on: 2021-08-13
Degree: Master
Type: Thesis
Country: China
Candidate: G R Cui
GTID: 2518306230978199
Subject: Software engineering
Abstract/Summary:
With the rise of the fourth industrial revolution, big data and artificial intelligence have developed rapidly. The number and scale of software systems have grown at an alarming rate, and their types have diversified. Across these many kinds of software, how to mine requirement features and cluster them has become an important challenge at the intersection of software engineering and artificial intelligence. Software requirements text clustering helps assure software quality, minimizes the risk of requirements analysis, and reduces the cost of software development. Both in China and internationally, however, there is still little research on feature mining for software requirements text, and the clustering algorithms in use are too simple. To address these problems, this thesis proposes a deep clustering model that combines deep learning techniques with classic clustering algorithms and achieves a good clustering effect on software requirements text.

An analysis of software requirements text shows that it is highly dispersed, noisy, and sparse. Existing clustering work extracts traditional features and partitions samples on that basis, and rarely considers the functional semantics of software requirements. This thesis therefore proposes two methods for clustering software requirements text: (1) a text clustering algorithm that combines a self-attention mechanism with multi-channel pyramid convolution, uses a convolutional neural network for feature fusion, and passes the fused features to a traditional clustering algorithm for output; (2) dropout variational embedded clustering, which uses a deep variational autoencoder with dropout to extract text features, clusters on the encoder's features, and combines entropy loss, reconstruction loss, and clustering loss to achieve a good clustering effect.

In the first method (Self-Attention Multi-Channel Pyramid Convolution Network and Self Organization Map, SA-MPCN&SOM), the software requirements text first undergoes text preprocessing and word embedding. The attention mechanism then captures the relations within each sentence, and a multi-channel pyramid network with different convolution windows extracts feature information. During convolution, the span of text perceived by each unit grows in inverse proportion to the sequence length. Finally, a self-organizing map network produces the cluster output. This method overcomes the shortcomings of traditional feature extraction methods, and a comparison with other deep feature extraction methods highlights its effectiveness.

The second method (Dropout Variational Embedding Cluster, DVEC) addresses the problem that two-stage clustering cannot use backpropagation to optimize the cluster centers and the partition. The original software requirements text is converted into sentence embeddings and fed into the regularized variational clustering model. Dropout regularization is incorporated to remove noise, and the autoencoder structure of the variational embedded clustering learns the original data distribution. The reparameterization trick makes the embedding space conform to a normal distribution, and the decoder reconstructs the text. The vectors in the embedding space are divided into clusters to define the target distribution of the clusters, the regularized variational autoencoder loss and the clustering loss are jointly optimized by mini-batch gradient descent, and the clustering result is finally output. This method improves the robustness of the model and avoids the distortion of the feature space caused by the clustering loss, while learning the sample distribution improves the feature quality. A comparison with clustering algorithms from the Chinese and international literature shows that this method achieves a good clustering effect.
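
The embedding and clustering steps described for DVEC can be illustrated with a small sketch. The abstract does not give the exact formulas, so everything below is an assumption: the soft assignment uses a Student's t kernel and the target distribution is the squared-and-renormalized form common in deep embedded clustering work, and the function names (reparameterize, soft_assign, target_distribution, clustering_loss) are hypothetical. It is written in plain Python for readability, not as the thesis's implementation.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), so gradients can
    flow through mu and log_var (assumed form of the sampling step)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def soft_assign(embeddings, centers, alpha=1.0):
    """Soft assignment q[i][j] of embedding i to cluster j using a
    Student's t kernel (an assumption, not stated in the abstract)."""
    q = []
    for point in embeddings:
        row = []
        for center in centers:
            dist2 = sum((p - c) ** 2 for p, c in zip(point, center))
            row.append((1.0 + dist2 / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(row)
        q.append([v / total for v in row])
    return q


def target_distribution(q):
    """Sharpened target p[i][j] proportional to q[i][j]^2 / f_j,
    where f_j is the soft cluster frequency sum_i q[i][j]."""
    n_clusters = len(q[0])
    freq = [sum(row[j] for row in q) for j in range(n_clusters)]
    p = []
    for row in q:
        weights = [row[j] ** 2 / freq[j] for j in range(n_clusters)]
        total = sum(weights)
        p.append([w / total for w in weights])
    return p


def clustering_loss(p, q):
    """KL(P || Q) summed over all samples and clusters."""
    return sum(pij * math.log(pij / qij)
               for p_row, q_row in zip(p, q)
               for pij, qij in zip(p_row, q_row) if pij > 0.0)


# Toy usage: two 2-D embeddings sampled near two hypothetical centers.
rng = random.Random(0)
centers = [[0.0, 0.0], [3.0, 3.0]]
z = [reparameterize(c, [-4.0, -4.0], rng) for c in centers]
q = soft_assign(z, centers)
p = target_distribution(q)
loss = clustering_loss(p, q)
```

In a full model this clustering loss would be added to the variational autoencoder's reconstruction and regularization terms and minimized together over mini-batches; the weighting between the terms is not given in the abstract.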
Keywords/Search Tags:Software Requirements, Self-attention Mechanism, Multi-channel Pyramid Convolution, Variational Embedding Clustering, Self Organization Map