Research On Method Of Social Tagging For Service

Posted on: 2014-06-15
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Huang
GTID: 1318330398454874
Subject: Computer software and theory
Abstract/Summary:
In the past few years, social tagging systems have become increasingly popular because they make resources easy to classify and let users retrieve content by tag. Social tagging is a novel and useful mechanism introduced by Web 2.0. A growing number of users describe resources through tagging, and the tags a user assigns reveal that user's preferences for the content, so tagging data supports tag-based search and can be used to make recommendations. Users of a social tagging system can define their own classifications, and other users can browse the related resources. However, because the tagging process is not effectively managed and the relationships between tags are not defined, the classifications users create are not always reasonable, which limits resource sharing.

At present there are three main approaches to tag-based recommendation: methods based on tripartite graphs, methods based on probability models, and methods based on collaborative filtering. Tripartite-graph methods treat users, objects, and tags as three different types of node, with edges only between nodes of different types and none between nodes of the same type. This artificial division breaks up the co-occurrence relationships among the three types of node and inevitably loses information. Probability-model methods are based on machine learning and generally use a Gibbs sampler or expectation maximization to obtain the optimal recommendation results, so they demand considerable computing power.
With very large data sets in particular, probability-model algorithms take a long time to compute, and designing a fast, efficient recommendation algorithm based on a probability model remains an open problem. Collaborative-filtering methods compute similarity from users' past behavior, so they need a large amount of historical data to measure similarity well, and computing the similarity is itself very time-consuming on massive data. In addition, many online user behaviors are not very significant (such as browsing and comparing during a purchase), and how to measure similarity from this kind of behavior is still a research problem.

To address these problems, this research concentrates on tag recommendation and resource query expansion. The main contributions are as follows.

(1) Tag recommendation and service query based on random walk. In existing service registration systems, services are tagged arbitrarily and the correlation between tags and services is weak. This dissertation proposes a tag recommendation method that ranks tags by random walk, and then uses the correlation between tags and services to search for services. For API services that have no tags, the method uses the API service description document to find similar documents and recommends their tags, on the premise that documents similar to an API service description document share similar tags. Document similarity is also used to build an API service topology graph, under the assumption that if a tag marks a specific API service in the graph, it is also suitable for that service's neighbors.
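The premise above (similar description documents share tags, and a tag on one API service also suits its neighbors in the topology graph) can be illustrated with a random walk with restart over a document-similarity graph. This is a minimal sketch under assumptions, not the dissertation's algorithm: the bag-of-words cosine similarity, the 0.2 edge threshold, the 0.15 restart probability, and every service name, description, and tag below are hypothetical.

```python
import math
from collections import Counter, defaultdict


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_graph(docs, threshold=0.2):
    """Link two services when their description similarity reaches the threshold (assumed value)."""
    bows = {name: Counter(text.lower().split()) for name, text in docs.items()}
    graph = {name: {} for name in docs}
    names = list(docs)
    for i, u in enumerate(names):
        for v in names[i + 1:]:
            sim = cosine(bows[u], bows[v])
            if sim >= threshold:
                graph[u][v] = sim
                graph[v][u] = sim
    return graph


def random_walk_with_restart(graph, start, restart=0.15, iterations=50):
    """Approximate visiting probabilities of a walker that keeps restarting at `start`."""
    prob = {node: 0.0 for node in graph}
    prob[start] = 1.0
    for _ in range(iterations):
        nxt = {node: 0.0 for node in graph}
        for u, p in prob.items():
            nxt[start] += restart * p  # restart step
            total = sum(graph[u].values())
            if total == 0:
                nxt[start] += (1 - restart) * p  # isolated node: return to start
                continue
            for v, w in graph[u].items():
                nxt[v] += (1 - restart) * p * w / total  # move along a weighted edge
        prob = nxt
    return prob


def recommend_tags(graph, tags, target, top_k=3):
    """Score each tag by the visiting probability of the services that carry it."""
    visit = random_walk_with_restart(graph, target)
    scores = defaultdict(float)
    for service, p in visit.items():
        if service == target:
            continue
        for tag in tags.get(service, []):
            scores[tag] += p
    ranked = sorted((t for t in scores if scores[t] > 0), key=lambda t: (-scores[t], t))
    return ranked[:top_k]


# Hypothetical API service descriptions and existing tags.
docs = {
    "maps_api": "map location geocoding route search",
    "geo_api": "geocoding location address map lookup",
    "photo_api": "photo image upload share album",
    "untagged_api": "map route location search directions",
}
tags = {
    "maps_api": ["mapping", "location"],
    "geo_api": ["location", "geocoding"],
    "photo_api": ["photo"],
}

graph = build_graph(docs)
recommended = recommend_tags(graph, tags, "untagged_api")
print(recommended)
```

In this toy data the photo service shares no words with the untagged service, so it is never reached by the walk and its tag is not recommended. Tags carried by several nearby services accumulate the highest scores.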
Using the random walk, the method takes the position of a given tag in each API service's tag list, together with the size of that list, to compute a relevance score for each API service, and then retrieves the list of services according to tag similarity. Experiments on a real data set verify the feasibility and effectiveness of the proposed method, which offers a new perspective on tag recommendation.

(2) Service clustering based on tag recommendation. Clustering services reveals the types of services and makes services easier to find, but some services have only a small number of tags. This dissertation proposes a tag recommendation approach to improve the performance of Mashup service clustering. The method takes the combination of the similarity of the Mashup description texts and the similarity of their recommended tags as the similarity between Mashup services. Experimental results show that the proposed tag recommendation strategy effectively expands the tags of Mashup services that have few tags, yielding more related tag information and therefore better clustering.

(3) Topic-based tag sorting. In a social tagging system, people's tagging behavior is casual. For tags to identify service data well, tag sorting is very important. This dissertation puts forward a topic-based tag sorting method that extracts the different topics in the tag space and obtains the tag sequence corresponding to each topic. Experiments compare the tag recommendation performance of three methods, LDA, MFTR, and LDA+MFTR, and the results show that LDA+MFTR performs best.

(4) Tag prediction based on active learning. Tagging is either manual or automatic. Automatic annotation methods are not yet satisfactory, while manual annotation is very time-consuming and labor-intensive and requires a lot of practical experience from the user. This dissertation puts forward a tag prediction method based on active learning for predicting the tags of network resource data. The method can be seen as a combination of active learning and annotation, and its two main parts are sample selection and tag prediction. Sample selection picks the most informative samples from the unlabeled samples for manual annotation, and tag prediction is then performed on the remaining unlabeled service data. The dissertation combines three indicators, ambiguity, reference rate, and diversity, into a comprehensive index for sample selection and compares the results of three different tag prediction methods. The experimental results show that the annotated data obtained by this sample selection method contains a large amount of information, so that tag prediction achieves better results.
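To make the sample selection step concrete, the following is a minimal sketch and not the dissertation's implementation. The abstract names the three indicators but does not define them, so everything here is an assumption: ambiguity is taken as closeness of a predicted tag probability to 0.5, diversity as one minus the highest cosine similarity to samples already selected, the reference rate is a placeholder number supplied with each sample, and the three are combined with equal weights. The `Sample` fields and the example pool are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Sample:
    name: str
    features: tuple          # hypothetical feature vector of the service
    tag_probability: float   # hypothetical model estimate that the tag applies
    reference_rate: float    # placeholder in [0, 1]; not defined in the abstract


def ambiguity(sample):
    """1.0 when the model is undecided (p = 0.5), 0.0 when it is certain."""
    return 1.0 - abs(2.0 * sample.tag_probability - 1.0)


def cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def diversity(sample, selected):
    """1 minus the highest similarity to any sample already chosen."""
    if not selected:
        return 1.0
    return 1.0 - max(cosine(sample.features, s.features) for s in selected)


def select_samples(pool, budget, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Greedily pick `budget` samples with the highest combined index (assumed equal weights)."""
    w_amb, w_ref, w_div = weights
    selected = []
    remaining = list(pool)
    while remaining and len(selected) < budget:
        best = max(
            remaining,
            key=lambda s: w_amb * ambiguity(s)
            + w_ref * s.reference_rate
            + w_div * diversity(s, selected),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical pool of unlabeled services.
pool = [
    Sample("svc_a", (1.0, 0.0), 0.50, 0.8),
    Sample("svc_b", (1.0, 0.1), 0.55, 0.7),  # near-duplicate of svc_a
    Sample("svc_c", (0.0, 1.0), 0.60, 0.5),
    Sample("svc_d", (0.5, 0.5), 0.95, 0.9),  # the model is already confident
]

to_annotate = select_samples(pool, budget=2)
print([s.name for s in to_annotate])
```

With this pool, the near-duplicate `svc_b` is passed over in the second round because its diversity term is close to zero once `svc_a` has been chosen. This is the effect a diversity indicator is meant to have: the annotation budget is not spent on samples that carry almost the same information.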
Keywords/Search Tags: tag recommendation, query, service clustering, sorting, prediction