
Research On Domain Resource Clustering Based On Semantic Field Model And Its Application

Posted on: 2014-01-11; Degree: Doctor; Type: Dissertation
Country: China; Candidate: L J Wu; Full Text: PDF
GTID: 1228330398490336; Subject: Education Technology
Abstract/Summary:
Educational resources are an important part of the teaching process, and high-quality resources can help learners achieve better learning performance. With the development of information technology, the number of educational resources has grown rapidly. How to organize these resources efficiently and clearly so that users can learn from them has therefore become an important problem in digital learning. Resource clustering is one of the most commonly used ways to organize resources automatically, because clustering is unsupervised and does not need a training corpus. Resource clustering is the process of grouping resources that belong to the same topic and have highly similar content into a cluster according to their semantic content, and then assigning a semantic label to each cluster. The key elements of the clustering process are the representation of semantic content, feature selection and reduction, and the generation of cluster labels. Around these key elements, this dissertation focuses on resource clustering technologies based on the semantic field model. The research consists of five parts: (1) a resource clustering framework based on the semantic field model; (2) a construction method for a domain-oriented semantic field model; (3) a method of feature selection and reduction; (4) a resource clustering algorithm based on the semantic field model; (5) the application of the resource clustering algorithm.
The work in this dissertation has been supported by the project "Research on key technologies of knowledge concentration and fusion" (2008AA01Z127) in the National 863 Plan and the project "Research on key technologies of educational resources allocation and remote service in rural areas" (No. 200603110400) in the National Key Technology R&D Program in the 11th Five-Year Plan of China. The contributions of this thesis include: (1) Because the general vector space model does not consider the associations between words, we put forward a resource clustering framework based on the semantic field model. We introduce the idea of a field model into semantics to form the semantic field model, which describes the semantic associations between words. We then study the mathematical model of the semantic field so that semantics can be represented and calculated. On this basis we propose a resource clustering framework that defines how to calculate the semantic location and semantic quality of resources, and from these the semantic field strength and the semantic gravity between resources. Under the action of semantic gravity, resources condense into several clusters. Through this mechanism, the semantic associations between words are brought into resource clustering to improve the precision of resource similarity. (2) Manual construction of a semantic field is time-consuming, so we study its automatic construction. First, according to the characteristics of domain phrases, we propose an atomic word formation algorithm to extract candidate phrases from a domain corpus. In this algorithm, atomic words are used as the basic units for building domain phrases, which improves the precision of domain phrase recognition.
We then define rules to filter the candidate phrases and delete the strings that are not domain phrases, which improves efficiency. The remaining candidate phrases are refined into a collection of domain concepts. After obtaining this collection, we mark the relationships between concepts and use the Interpretive Structural Modeling (ISM) algorithm to form the hierarchy diagram of the concepts. Finally, we estimate the parameters of the semantic field model, such as the quality of the concepts and the adjusting parameter of the potential function. These parameters are used to describe the semantic field of educational technology. (3) Clustering is an unsupervised machine learning algorithm, so there is no training set for feature selection. The semantic field is a domain prior knowledge base that can be used to guide feature selection and reduction. First, we analyze the importance of domain concepts in feature selection and reduction. During preprocessing, we add all the domain concepts into the segmentation dictionary as candidate features. Then we study the methods for mapping semantic features to domain concepts, which include mapping synonym features to a single domain concept and mapping instance features to their category concept. Finally, we put forward a feature reduction method based on the semantic field. Experimental results show that feature selection and reduction based on the semantic field give better clustering results than feature selection based on document frequency. (4) To introduce more semantic information into domain resource clustering, we study a resource clustering algorithm based on the semantic field. Semantic gravity is used to simulate the attraction between resources. Under semantic gravity, resources move towards each other and finally condense into several clusters.
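The condensation mechanism described in (4) can be illustrated with a small sketch. The abstract gives no formulas, so everything below is an assumption made for illustration: an inverse-square, Newton-style attraction, a mass-weighted merge of the two most strongly attracted clusters, and a stopping threshold `min_gravity`. The names `gravity`, `merge`, and `gravity_cluster` are hypothetical and are not taken from the dissertation.

```python
def gravity(a, b, eps=1e-9):
    """Assumed attraction: product of masses over squared distance."""
    d2 = sum((x - y) ** 2 for x, y in zip(a["loc"], b["loc"]))
    return a["mass"] * b["mass"] / (d2 + eps)  # eps avoids division by zero


def merge(a, b):
    """Combine two clusters at their mass-weighted centre."""
    m = a["mass"] + b["mass"]
    loc = [(a["mass"] * x + b["mass"] * y) / m
           for x, y in zip(a["loc"], b["loc"])]
    return {"loc": loc, "mass": m, "members": a["members"] + b["members"]}


def gravity_cluster(points, masses, min_gravity):
    """Repeatedly merge the most strongly attracted pair of clusters.

    points: semantic locations (one coordinate list per resource)
    masses: semantic quality of each resource, used here as a mass
    min_gravity: hypothetical threshold below which merging stops
    """
    clusters = [{"loc": list(p), "mass": m, "members": [i]}
                for i, (p, m) in enumerate(zip(points, masses))]
    while len(clusters) > 1:
        g, i, j = max(
            (gravity(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if g < min_gravity:
            break
        merged = merge(clusters[i], clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return [sorted(c["members"]) for c in clusters]


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
    print(gravity_cluster(pts, [1.0, 1.0, 1.0, 1.0], min_gravity=1.0))
```

In this toy run the two nearby pairs attract each other strongly and merge, while the attraction between the two resulting groups stays below the threshold, so they remain separate clusters.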
After clustering, a semantic label extraction algorithm based on the semantic field is proposed to ensure the readability and representativeness of the category labels. Experimental results show that the clustering algorithm achieves good results on a domain corpus. Integrating the above research results, a domain-oriented resource retrieval system is developed in this dissertation. The main functions of this system include domain resource retrieval, clustering of the search results, and visualization of the clustering results. Analysis of the system in actual use shows that its functions and performance can satisfy the needs of practical application.
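The Interpretive Structural Modeling step mentioned in contribution (2) can be sketched in its generic textbook form: build a reachability matrix from the marked concept relationships, then peel off levels in which every element's reachability set is contained in its antecedent set. This is not the dissertation's own implementation. The function names, the convention that `adj[i][j] = 1` means "concept i is a sub-concept of concept j", and the four-concept example are all assumptions for illustration.

```python
def reachability(adj):
    """Transitive closure (Warshall) of the relation, including self-reachability."""
    n = len(adj)
    r = [[bool(adj[i][j]) or i == j for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(n):
            if r[i][k]:
                for j in range(n):
                    if r[k][j]:
                        r[i][j] = True
    return r


def ism_levels(adj):
    """Partition concepts into hierarchy levels, top level first."""
    r = reachability(adj)
    remaining = set(range(len(adj)))
    levels = []
    while remaining:
        level = []
        for i in remaining:
            reach = {j for j in remaining if r[i][j]}  # what i reaches
            ante = {j for j in remaining if r[j][i]}   # what reaches i
            if reach <= ante:  # reachability set lies inside antecedent set
                level.append(i)
        levels.append(sorted(level))
        remaining -= set(level)
    return levels


if __name__ == "__main__":
    # Hypothetical concepts: 0 is the root, 1 and 2 are its sub-concepts,
    # and 3 is a sub-concept of 2.
    adj = [
        [0, 0, 0, 0],
        [1, 0, 0, 0],
        [1, 0, 0, 0],
        [0, 0, 1, 0],
    ]
    print(ism_levels(adj))  # → [[0], [1, 2], [3]]
```

With the sub-concept-points-to-parent convention assumed here, the first level returned is the most general concept and later levels are progressively more specific.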
Keywords/Search Tags: resource clustering, semantic field model, domain concepts, feature selection and reduction, semantic gravity, semantic label