Design Pattern Detection Based on Similarity Scoring, FSM and Machine Learning

Posted on: 2020-02-03
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L Wang
Full Text: PDF
GTID: 1368330572980583
Subject: Computer application technology
Abstract/Summary:
Design patterns are proven design solutions distilled from practice. They help designers build new designs on past work and reuse earlier successful designs. Applying design patterns has greatly improved both the development efficiency and the quality of software systems. However, the design documents of many systems are either incomplete or do not exactly match the source code, and this problem is even more serious for systems built with agile development methods. Moreover, even when a system's design documents are complete and fully consistent with the source code, they may not record detailed information on design pattern usage. Efficient and accurate automatic detection of the design pattern instances contained in a system is therefore of great significance for understanding, maintaining and re-engineering large software projects.

In recent years, many methods for automatic design pattern detection have been proposed in the literature, both in China and abroad. Design pattern detection is a complex problem, however, and these methods have the following deficiencies:
1) Most existing methods match patterns against the whole system, so neither detection accuracy nor time performance is high. Some try to improve time performance by reducing the search space before running the search algorithm, but the reduction is still carried out on the basis of the entire system.
2) Behavioral pattern detection is challenging. Some existing methods can detect only structural and creational patterns, not behavioral patterns. Others use the same matching algorithm to search for structural/creational patterns and behavioral patterns, so detection accuracy is low, especially for design patterns whose structural characteristics are not distinctive or are similar to those of other design patterns. More recently, some researchers first obtain behavioral pattern candidate instances from structural and object-creation characteristics and then validate the candidates by separately matching behavioral characteristics. However, most of these methods do not execute the source code to check whether the method calls that actually occur at run time match a behavioral pattern, so their analysis of behavior is limited and imprecise.
3) The central step of design pattern detection is matching the characteristics of patterns to the characteristics of the system. Recent studies have considered various pattern characteristics, but most of these are structural or behavioral characteristics. To improve the readability and maintainability of code, design patterns often also have their own naming characteristics, which only a small number of studies consider. Naming characteristics make it easy to validate accurately the candidate instances obtained from other characteristics.
4) When design patterns were first proposed, some researchers tried to identify them through software metrics. However, these approaches rely on a few rigid, isolated indicators to decide whether a candidate is a pattern instance, so their accuracy is low, especially for design pattern variants. In addition, software metrics mainly measure the static structural properties of systems, so it is difficult to identify behavioral patterns accurately through software metrics alone. In practice, metric-based detection often requires manual validation to obtain the final instances.
5) The most important step in using machine learning to identify design patterns is preparing training samples. Most current machine-learning-based studies acquire and label training samples manually, which takes a great deal of time and labor.

To meet the accuracy and time-performance requirements that large-scale, highly complex software systems place on design pattern detection, this dissertation proposes a design pattern detection method based on similarity scoring, finite state machines (FSM) and machine learning. The main research work and innovations are as follows:
1) To address the problem that existing methods mostly match patterns against the whole system and achieve insufficient detection accuracy and time performance, a design pattern detection method based on similarity scoring and secondary subsystems is proposed. The method divides the system under detection into several subsystems, further divides each subsystem into secondary subsystems containing as many classes as the pattern to be detected has roles, and then uses a similarity scoring algorithm to match secondary subsystems against patterns in order to detect pattern instances in the system. The experimental results show that this method achieves high detection accuracy and good time performance.
2) To address the problem that existing methods either cannot detect behavioral patterns or detect them with relatively low accuracy, an FSM-based method for validating behavioral pattern candidate instances is proposed. The method uses the unit testing tool JUnit to execute the behavioral pattern candidate instances obtained by the above method based on similarity scoring and secondary subsystems, and matches the method calls that actually occur at run time against the FSM derived from the behavioral pattern, which finally determines whether each candidate is a pattern instance. The experimental results show that this method improves the accuracy of behavioral pattern detection.
3) To address the problems that most existing methods do not consider the naming characteristics of design patterns, and that metric-based design pattern recognition has low accuracy and requires manual validation, a preliminary validation method for behavioral pattern candidate instances based on software metrics, naming characteristics and machine learning is proposed. The method runs six existing design pattern detection algorithms on 102 open source projects to obtain positive and negative samples, and feeds the metric values and class names of these samples to a learning system (an artificial neural network, ANN, in this dissertation). The resulting model holds the acquired knowledge and can perform preliminary validation of behavioral pattern candidate instances, which reduces the search space for FSM-based validation.
4) To address the problem that manually acquiring and labeling training samples for design pattern detection takes a great deal of time and labor, an algorithm for automatically acquiring and labeling training samples is proposed. The algorithm integrates several existing design pattern detection algorithms, takes open source applications containing design pattern instances as input, and automatically determines and labels training samples. It can generate the training samples for the above preliminary validation method based on software metrics, naming characteristics and machine learning.

The dissertation first presents the basic idea of the method and discusses the extraction of source code information. It then proposes a representation of design patterns and systems based on directed graphs and matrices, and a representation of behavioral patterns based on FSM. Next, it discusses in detail the design pattern detection algorithm based on similarity scoring and secondary subsystems, the FSM-based validation algorithm for behavioral pattern candidate instances, and the preliminary validation method for behavioral pattern candidate instances based on software metrics, naming characteristics and machine learning. Finally, experiments are carried out on the open source projects JHotDraw 5.2, JRefactory 2.6.24 and JUnit 3.7, and the results are analyzed and discussed. The experimental results show that the proposed method can detect design patterns with high detection accuracy and good time performance. The effect is especially pronounced for behavioral design patterns whose structural characteristics are not distinctive or are similar to those of other design patterns.
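
The FSM-based validation step described in the abstract can be illustrated with a minimal Python sketch. This is not the dissertation's implementation: the abstract does not give the actual FSMs, so the Observer-like transition table, the state names and the call labels (`Subject.attach`, `Subject.notify`, `Observer.update`) are all hypothetical. The sketch also assumes that the method calls observed while running the tests have already been recorded and mapped to pattern role names.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical FSM for an Observer-like behavioral pattern (an assumption for
# illustration only). Each entry maps (current state, observed call) to the
# next state.
Transitions = Dict[Tuple[str, str], str]

OBSERVER_FSM: Transitions = {
    ("start", "Subject.attach"): "attached",
    ("attached", "Subject.attach"): "attached",
    ("attached", "Subject.notify"): "notifying",
    ("notifying", "Observer.update"): "updated",
    ("updated", "Observer.update"): "updated",
    ("updated", "Subject.notify"): "notifying",
}
ACCEPTING: Set[str] = {"updated"}


def trace_matches(trace: Iterable[str], transitions: Transitions,
                  accepting: Set[str], start: str = "start") -> bool:
    """Return True if the recorded call trace drives the FSM to an accepting state."""
    state = start
    for call in trace:
        key = (state, call)
        if key not in transitions:
            # The pattern does not allow this call in the current state.
            return False
        state = transitions[key]
    return state in accepting


# A candidate whose run-time calls follow the pattern, and one whose calls do not.
good = ["Subject.attach", "Subject.attach", "Subject.notify",
        "Observer.update", "Observer.update"]
bad = ["Subject.attach", "Observer.update"]
print(trace_matches(good, OBSERVER_FSM, ACCEPTING))  # True
print(trace_matches(bad, OBSERVER_FSM, ACCEPTING))   # False
```

In this sketch any call without a matching transition rejects the candidate. A practical validator would probably first filter out calls unrelated to the pattern roles, which is a design choice the abstract does not specify.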
Keywords/Search Tags: design pattern detection, detection accuracy rate, similarity scoring, finite state machine, software re-engineering