| The quantity of tags affects how well resources can be searched, located, and shared, so efficient and comprehensive tag mining is important. Tag mining research has two focuses: tag recommendation and redundant tag processing. To address the problem that the results of existing recommendation methods are incomplete and miss latent tags, the thesis proposes a multi-threshold continuous conditional random fields model for tag recommendation. Building on continuous conditional random fields, the model uses three thresholds (the co-occurrence rate between tags, the semantic similarity of tag pairs, and the user similarity) to extract tag features and to mine explicit and latent tags at the same time; the model parameters are obtained by iterative calculation with the L-BFGS algorithm, and the resulting model is then used to recommend tags. Tests on the BibSonomy dataset show that the method is feasible. Comparisons with the continuous conditional random fields model and the maximum entropy model show that the tags recommended by this model are more accurate and more comprehensive, and that the model is stable. To improve tag quality and to achieve higher accuracy than traditional clustering methods in processing redundant tags, the thesis adopts the maximum entropy model to process redundant tags. The semantic similarity between the two tags of a pair is used to extract tag features, the model parameters are obtained by iterative calculation with the SCGIS algorithm, and the maximum entropy model is then constructed to process redundant tags. Tests on the BibSonomy dataset verify that the method is feasible, and comparison with the kernel K-Means clustering method indicates that the maximum entropy model processes redundant tags with stable performance and higher accuracy. After redundancy processing, the tag redundancy of the dataset declines significantly.
A further experiment, in which the maximum entropy model processes redundant tags among the tags recommended by the multi-threshold continuous conditional random fields model, shows that the quality of the results improves further. |
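
The abstract names the co-occurrence rate between tags as one of the three threshold features but does not define it. The following is a minimal sketch of how such a feature might be computed and thresholded, assuming a Jaccard-style rate over posts (each post being a set of tags). The function names, the toy posts, and the threshold value are all hypothetical and are not taken from the thesis.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_rates(posts):
    """Return {(tag_a, tag_b): rate} for every tag pair that co-occurs at least once.

    Assumed definition (not taken from the thesis): a Jaccard-style rate,
    count(posts with both tags) / count(posts with either tag).
    """
    tag_counts = Counter()
    pair_counts = Counter()
    for tags in posts:
        unique = sorted(set(tags))  # sorted so each pair has one canonical order
        tag_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    rates = {}
    for (a, b), both in pair_counts.items():
        either = tag_counts[a] + tag_counts[b] - both
        rates[(a, b)] = both / either
    return rates


def candidate_tags(tag, rates, threshold):
    """Tags whose co-occurrence rate with `tag` reaches the (hypothetical) threshold."""
    found = []
    for (a, b), rate in rates.items():
        if rate >= threshold and tag in (a, b):
            found.append(b if a == tag else a)
    return sorted(found)


# Toy posts, invented for illustration only.
posts = [
    {"python", "ml", "tutorial"},
    {"python", "ml"},
    {"python", "web"},
    {"ml", "statistics"},
]
rates = cooccurrence_rates(posts)
print(candidate_tags("python", rates, threshold=0.5))
```

In this sketch a lower threshold admits more candidate tags, which is one plausible reading of how a threshold on co-occurrence could trade completeness against precision; the thesis's actual feature definitions may differ.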