
Software Change Classification Based On Probabilistic Latent Semantic Analysis

Posted on: 2014-04-27
Degree: Master
Type: Thesis
Country: China
Candidate: M Yan
Full Text: PDF
GTID: 2268330392972369
Subject: Software engineering
Abstract/Summary:
A software change is a commit made by a developer. Tracing and monitoring software changes is an important and difficult task in the software lifecycle. Researchers have proposed a variety of keyword-retrieval-based approaches to identify the reasons for software changes. Despite their success, some issues remain unsolved, such as the synonymy and polysemy problems that arise when analyzing change log messages. In this paper, we propose a semi-supervised, topic-model-based solution for identifying the reasons for software changes. Topics are extracted using Probabilistic Latent Semantic Analysis (PLSA) from software change descriptions recovered from source control systems such as SVN. The main contributions are as follows:

1. For extracting and preprocessing software change logs, this paper provides a solution that combines the CVSAnalY, GATE, and WordNet tools. Using their APIs, we implemented a framework for data extraction, storage, and preprocessing.
2. To address the synonymy and polysemy problems, we introduced a PLSA-based approach and implemented it using Eclipse and Matlab.
3. Compared with basic PLSA, we made two improvements when applying the model: first, the vocabulary in our model is constructed to correspond directly to software change domain knowledge; second, in building the semi-supervised model, we added labeled samples carrying software change domain knowledge to initialize the word-frequency matrix.
4. Experiments on five open source projects from different fields show that our semi-supervised, topic-model-based approach produces probabilistic, more appropriate, and more practicable classification results than keyword-retrieval-based approaches.
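The semi-supervised PLSA idea summarized in the abstract can be illustrated with a minimal sketch. This is not the author's implementation (which the abstract says used Eclipse and Matlab); it is a small Python/NumPy version of standard PLSA fitted with EM, where the topic-word distribution P(w|z) is initialized from labeled commit messages. The function names `plsa` and `seed_from_labels`, the smoothing constant, the toy vocabulary, and the two change categories are all assumptions made for illustration, and seeding P(w|z) from labeled word counts is only one plausible reading of "initialize the word-frequency matrix".

```python
import numpy as np

def seed_from_labels(counts, labels, n_topics):
    """Sum word counts of labeled documents per topic (label -1 = unlabeled)."""
    seed = np.zeros((n_topics, counts.shape[1]))
    for row, label in zip(counts, labels):
        if label >= 0:
            seed[label] += row
    return seed

def plsa(counts, n_topics, n_iter=50, seed_word_topic=None, rng=None):
    """Fit PLSA with EM on a (n_docs, n_words) term-count matrix.

    Returns P(w|z) with shape (n_topics, n_words) and
    P(z|d) with shape (n_docs, n_topics).
    """
    rng = np.random.default_rng(0) if rng is None else rng
    n_docs, n_words = counts.shape
    if seed_word_topic is None:
        p_w_z = rng.random((n_topics, n_words))          # unsupervised start
    else:
        # Assumed smoothing so unseen words keep a small nonzero probability.
        p_w_z = np.asarray(seed_word_topic, dtype=float) + 1e-3
    p_w_z = p_w_z / p_w_z.sum(axis=1, keepdims=True)
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d = p_z_d / p_z_d.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # E-step: P(z|d,w) proportional to P(z|d) * P(w|z)
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]
        post = joint / (joint.sum(axis=1, keepdims=True) + 1e-12)
        # M-step: re-estimate both distributions from expected counts
        weighted = counts[:, None, :] * post
        p_w_z = weighted.sum(axis=0)
        p_w_z = p_w_z / (p_w_z.sum(axis=1, keepdims=True) + 1e-12)
        p_z_d = weighted.sum(axis=2)
        p_z_d = p_z_d / (p_z_d.sum(axis=1, keepdims=True) + 1e-12)
    return p_w_z, p_z_d

# Toy data: columns are the hypothetical vocabulary ["fix", "bug", "add", "feature"].
counts = np.array([[2, 1, 0, 0],
                   [0, 0, 1, 2],
                   [1, 1, 0, 0],
                   [0, 0, 2, 1]], dtype=float)
labels = [0, 1, -1, -1]   # 0 = corrective, 1 = new feature, -1 = unlabeled
seed = seed_from_labels(counts, labels, n_topics=2)
p_w_z, p_z_d = plsa(counts, n_topics=2, seed_word_topic=seed)
predicted = p_z_d.argmax(axis=1)   # most probable change category per commit
```

Each row of `p_z_d` is a probability distribution over change categories for one commit message, which is what makes the classification probabilistic rather than a hard keyword match. Because the topics are seeded from labeled samples, topic indices keep their intended meaning instead of being arbitrary.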
Keywords/Search Tags: software engineering, software change, topic model, semi-supervised, LSA