Research On Program Code Mining System Based On Domain Repository

Posted on: 2010-08-18  Degree: Master  Type: Thesis
Country: China  Candidate: H F Ni
GTID: 2248360275454998  Subject: Computer software and theory
Abstract/Summary:
With the growing popularity of the ACM International Collegiate Programming Contest (ACM/ICPC) in domestic colleges and universities, more and more teams take part so that more students can be involved, which places higher demands on coaches. The challenges fall into three areas. First, how to select team members with comprehensive knowledge: ACM/ICPC covers an increasing number of fields, so members need broad knowledge for a team to achieve the desired results. Second, how to form teams: to achieve the best results, the specific situation of each team member should be considered. Third, how to target practice at the kinds of problems that team members are weak in. The increasing number of teams makes the contest more competitive, so teams need their coaches' advice on the characteristics of each division's contest in order to improve the quality of training.
To address these problems, this thesis proposes a program code mining method based on a domain repository. The method takes the large body of code that team members write during training, or the solution code from each division's contest in each year, and mines the knowledge characteristics hidden in that code to help solve the problems above. The specific work is as follows. First, a domain repository is established. It is the foundation of the research: it supports program code transformation and the statistical analysis of knowledge points. The thesis builds a machine-understandable domain repository for the programming field by combining ontology with domain-ontology modeling methods. Second, program code is transformed, that is, groups of programs are converted into knowledge points in a statistical form. This is necessary because program code is unstructured data to which statistical analysis cannot be applied directly. The transformation is carried out by a language identifier, a lexical analyzer, and a component that identifies and extracts knowledge points. Third, the knowledge points undergo multidimensional statistical analysis. This is the method for mining the knowledge characteristics of team members or divisions, and it consists of principal component analysis followed by cluster analysis: principal component analysis first reduces the variables to a few composite variables and completes the evaluation, and cluster analysis then describes the knowledge characteristics of team members or of each division. Finally, an example illustrates the method in detail, together with an analysis of its output.
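The transformation and analysis steps can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The knowledge-point table, the regular expressions, the member names, and all function names are hypothetical, and simple token matching stands in for the ontology-backed lexical analysis described in the abstract. The sketch converts each member's programs into a vector of knowledge-point counts, then computes the first principal component of those vectors by power iteration. The cluster-analysis step is not shown.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge points and the tokens assumed to signal them.
# The thesis derives knowledge points from a domain ontology; this
# table is invented purely for illustration.
KNOWLEDGE_POINTS = {
    "sorting": re.compile(r"\b(sort|qsort|sorted)\b"),
    "graph_search": re.compile(r"\b(bfs|dfs|queue|stack)\b"),
    "dynamic_programming": re.compile(r"\b(dp|memo)\b"),
}


def extract_knowledge_points(code):
    """Count how often each knowledge point's tokens occur in one program."""
    return Counter({name: len(pattern.findall(code))
                    for name, pattern in KNOWLEDGE_POINTS.items()})


def to_matrix(programs_by_member):
    """Build one row of knowledge-point counts per team member."""
    names = sorted(KNOWLEDGE_POINTS)
    rows = {}
    for member, programs in programs_by_member.items():
        total = Counter()
        for code in programs:
            total.update(extract_knowledge_points(code))
        rows[member] = [total[name] for name in names]
    return names, rows


def first_principal_component(rows, iterations=200):
    """Leading eigenvector of the sample covariance matrix (needs >= 2 rows).

    Uses power iteration; the start vector is deliberately not constant so
    that it is unlikely to be orthogonal to the leading eigenvector.
    Returns the unit-length component and the mean-centered rows.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    v = [1.0 + j for j in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance along v; stop rather than divide by zero
            break
        v = [x / norm for x in w]
    return v, centered


def project(centered, component):
    """Score each centered row along the given component."""
    return [sum(c * v for c, v in zip(row, component)) for row in centered]


if __name__ == "__main__":
    # Hypothetical training code left behind by three team members.
    programs = {
        "member_a": ["sort(a); sort(b);", "qsort(x, n, sizeof(int), cmp);"],
        "member_b": ["queue q; bfs(g, s);", "dfs(u); stack st;"],
        "member_c": ["dp[i] = dp[i-1] + dp[i-2];", "sort(v); memo[k] = 1;"],
    }
    names, rows = to_matrix(programs)
    component, centered = first_principal_component(list(rows.values()))
    for member, score in zip(rows, project(centered, component)):
        print(member, round(score, 3))
```

Members with similar scores on the leading components have similar knowledge profiles, which is the kind of grouping a subsequent cluster analysis would make explicit. The sign of a principal component is arbitrary, so only the relative positions of the scores are meaningful.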
Keywords/Search Tags: program code mining, domain ontology, domain repository, program code transformation, multiple dimension statistical analyses