
Method Based On Identification Of Faceted Classification And Cluster Tree For Component Retrieval

Posted on: 2019-05-14 | Degree: Master | Type: Thesis
Country: China | Candidate: S H Du
GTID: 2428330545453841 | Subject: Computer Science and Technology
Abstract/Summary:
The development of complex, large-scale enterprise application software faces serious challenges, and a variety of techniques have been proposed to improve the efficiency and quality of software development. Software component technology improves the overall benefit of software across the stages of its life cycle and greatly improves product quality, production efficiency, versatility, and openness, so it has attracted wide attention. However, with the continued development of component-based software engineering, the number of software components has steadily grown and the scale of component libraries has expanded dramatically. To address the problem of retrieving a target component quickly and efficiently from a huge software component library, this thesis proposes a component retrieval method based on faceted classification and a cluster tree. The method consists of the following parts:

(1) Parsing the component description. Building on the multi-faceted mechanism of faceted classification and combined with the service requirements of Web Services, domain terms and facet terms together constitute a component identification set, which is used to extract feature words from the component's description. Each component is then represented by a set of feature words.

(2) Building a vectorized component model. Using the feature-word weight calculation formula, a component represented by feature words is converted into a component vector of weights, so that the component matching problem becomes the computation of cosine similarity between component vectors.

(3) Creating a component cluster tree. The components in the component library are divided into layers, with the classification refined from top to bottom. The components under each sub-facet are clustered by semantic similarity, and the component cluster tree is then built, which optimizes the structure of the component library.

(4) Building a component search algorithm model. Feature words are extracted from the search conditions described by the user and converted into a search component vector. Fuzzy matching of components is achieved by computing the cosine similarity between component vectors. The semantic similarity between the search component and each cluster center is calculated, and the clusters with the highest semantic similarity are taken as candidate search results, which effectively reduces the number of component comparisons.

By describing components with component identification sets, the method overcomes the subjective factors that arise when faceted classification alone is used to describe and retrieve components. By introducing the cluster tree, in which components are clustered according to their semantic similarity, it effectively narrows the search scope. Comparison experiments show that the algorithm based on faceted classification and the cluster tree improves component search results, reaching an average precision of 88.3% and an average recall of 93.1%.
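
The retrieval flow described in parts (2) to (4) can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The abstract does not give the feature-word weight formula, so a smoothed TF-IDF weight is assumed here as a stand-in. The cluster tree is flattened to a single level of pre-assigned clusters, and the cluster center is assumed to be the mean of its member vectors. All component names, feature words, and function names are hypothetical.

```python
import math
from collections import Counter


def build_vectors(components):
    """Turn {name: [feature words]} into {name: {word: weight}}.

    Assumption: smoothed TF-IDF stands in for the thesis's
    (unspecified) feature-word weight formula.
    """
    n = len(components)
    doc_freq = Counter(w for words in components.values() for w in set(words))
    idf = {w: math.log((1 + n) / (1 + df)) + 1.0 for w, df in doc_freq.items()}
    vectors = {}
    for name, words in components.items():
        tf = Counter(words)
        vectors[name] = {w: (c / len(words)) * idf[w] for w, c in tf.items()}
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(x * v.get(w, 0.0) for w, x in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def centroid(member_vectors):
    """Cluster center, assumed here to be the mean of the member vectors."""
    total = Counter()
    for vec in member_vectors:
        for w, x in vec.items():
            total[w] += x
    return {w: x / len(member_vectors) for w, x in total.items()}


def search(query_words, vectors, idf, clusters, top_k=3):
    """Pick the cluster whose center is closest to the query, then rank
    only that cluster's members by cosine similarity."""
    tf = Counter(w for w in query_words if w in idf)  # drop unknown words
    total = sum(tf.values())
    if total == 0:
        return []
    query = {w: (c / total) * idf[w] for w, c in tf.items()}
    centers = {cid: centroid([vectors[m] for m in members])
               for cid, members in clusters.items()}
    best = max(centers, key=lambda cid: cosine(query, centers[cid]))
    scored = [(m, cosine(query, vectors[m])) for m in clusters[best]]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Hypothetical component library and cluster assignment.
components = {
    "PdfExporter": ["export", "pdf", "report"],
    "CsvExporter": ["export", "csv", "report"],
    "LoginService": ["auth", "login", "user"],
    "TokenService": ["auth", "token", "user"],
}
clusters = {
    "reporting": ["PdfExporter", "CsvExporter"],
    "security": ["LoginService", "TokenService"],
}

vectors, idf = build_vectors(components)
results = search(["export", "pdf"], vectors, idf, clusters)
for name, score in results:
    print(f"{name}: {score:.3f}")
```

Comparing the query against two cluster centers first means only the members of the winning cluster are scored individually, which is the reduction in component comparisons that part (4) describes. A fuller version would descend through facets and sub-facets before reaching the clusters.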
Keywords/Search Tags: software component, faceted classification, component identification, cluster tree, component retrieval