
Program Comparison Analysis Technology For Open Source Code Reuse

Posted on: 2020-01-05; Degree: Master; Type: Thesis
Country: China; Candidate: L Hao; Full Text: PDF
GTID: 2428330575997724; Subject: Computer software and theory
Abstract/Summary:
With the development of the software industry and the rise of the open source movement, open source software reuse has become a common practice in software engineering. Because reuse shortens development time, reduces development costs, and improves software quality, a large number of IT companies build their key products on open source software, and open source software has become a new force in the software sector. Although open source resources are abundant, reusing them effectively is not easy. Two challenges are the main obstacles to reuse: license violation and code updating. Program comparison analysis is an effective way to address both.

This thesis proposes a code comparison analysis technique that uses the function as its basic detection unit. It designs an incremental repository analysis method that analyzes only the differences between code snapshots (hereinafter referred to as incremental text). The method consists of four steps: (1) Retrieve the differences between code snapshots. (2) Apply an incremental function extraction algorithm that converts incremental text into complete, parsable functions, then extract features from those functions. (3) Apply an improved Simhash-based algorithm that converts each function into a corresponding binary hash sequence (hereinafter referred to as a function fingerprint). (4) Use big data analysis technology to compare and trace function fingerprints, and use the comparison and tracing results to find high-quality, up-to-date code.

Based on this incremental, function-based code comparison analysis method, a prototype is designed that can analyze the open source ecosystem. A public service platform based on function fingerprints is established for the open source ecosystem; it manages open source reuse more effectively and helps avoid the risk of open source license violation. With this method, code updates in the original projects can be found more easily and open source code of higher quality and stability can be located more effectively, which meets the requirements of license violation detection and code update locating in open source code reuse.
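Steps (3) and (4) can be illustrated with a minimal sketch of a plain Simhash fingerprint over function tokens, compared by Hamming distance. This is not the improved algorithm of the thesis, whose details the abstract does not give. The names `tokenize`, `simhash`, and `hamming_distance`, the regular-expression tokenizer, the MD5-based token hash, and the 64-bit width are all assumptions made for illustration.

```python
import hashlib
import re


def tokenize(function_source: str) -> list[str]:
    """Split source text into identifiers, numbers, and single punctuation marks.

    Hypothetical tokenizer: the thesis's feature extraction is not specified.
    """
    return re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", function_source)


def simhash(tokens: list[str], bits: int = 64) -> int:
    """Plain Simhash: each token votes +1 or -1 on every bit position."""
    weights = [0] * bits
    for token in tokens:
        digest = hashlib.md5(token.encode("utf-8")).digest()
        token_hash = int.from_bytes(digest[: bits // 8], "big")
        for i in range(bits):
            if (token_hash >> i) & 1:
                weights[i] += 1
            else:
                weights[i] -= 1
    # A bit of the fingerprint is set when its accumulated vote is positive.
    fingerprint = 0
    for i, weight in enumerate(weights):
        if weight > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    original = "int add(int a, int b) { return a + b; }"
    modified = "int add(int x, int y) { return x + y; }"
    fp_original = simhash(tokenize(original))
    fp_modified = simhash(tokenize(modified))
    print(f"{fp_original:016x} {fp_modified:016x}")
    print("distance:", hamming_distance(fp_original, fp_modified))
```

In a sketch like this, two functions would be treated as candidates for a reuse match when the distance between their fingerprints falls below some threshold. The threshold and the large-scale comparison and tracing machinery of step (4) are outside the scope of this example.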
Keywords/Search Tags: Open Source Software, Incremental Analysis, Program Comparison, Code Traceability