The prevalence of video-sharing websites and other video applications has led to an explosion of web videos in both quantity and content diversity. While this gives audiences more viewing choices, it poses a tough challenge for detecting, organizing, and categorizing web videos.

Clustering is an unsupervised machine learning technique that depends little on high-quality training samples or human involvement. Given its successful use in text retrieval, clustering can be employed to address the organization and categorization of web videos. Existing clustering methods for web videos mainly represent videos and measure the similarity between them using visual features alone, and then apply a clustering algorithm. Because of the "semantic gap" in image analysis, a representation based only on visual features is neither accurate nor comprehensive, which leads to unsatisfactory clustering results.

Taking the abundant information and complex structure of web videos into consideration, this paper proposes a flexible multi-modal clustering method for web videos, consisting of a web video representation scheme, a similarity measurement scheme, and an adopted clustering algorithm. The method makes full use of the information in web videos: it integrates the visual, semantic, and text features extracted from each video to describe the video more accurately, and it designs a similarity measure for each modality. With the combined multi-modal similarity as input, the affinity propagation algorithm is employed for the clustering procedure. The method is evaluated in experiments on a web video dataset and performs better than single-modality clustering. This paper also applies the method to the clustering of video search results, which can improve the user experience of search engines.
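
The pipeline summarized above (per-modality similarities, a combined similarity, then affinity propagation) can be illustrated with a minimal Python sketch. It is not the implementation evaluated in this paper. The normalized weighted-sum fusion rule, the weights, the median preference, the damping value, and the toy similarity values are all assumptions made for illustration, and `toy_modality`, `combine_similarities`, and `affinity_propagation` are hypothetical helper names.

```python
from statistics import median


def toy_modality(near, far, across):
    """Hypothetical 6-video similarity matrix: groups {0,1,2} and {3,4,5},
    with videos 1 and 4 as the central member of each group."""
    n = 6
    m = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if i // 3 != j // 3:
                m[i][j] = across          # different groups
            elif 1 in (i % 3, j % 3):
                m[i][j] = near            # pair involving the central video
            else:
                m[i][j] = far             # the two peripheral videos
    return m


def combine_similarities(modalities, weights):
    """Fuse per-modality matrices by a normalized weighted sum (assumed rule)."""
    n = len(modalities[0])
    total = sum(weights)
    return [[sum(w * m[i][j] for m, w in zip(modalities, weights)) / total
             for j in range(n)] for i in range(n)]


def affinity_propagation(sim, preference, damping=0.7, iterations=200):
    """Plain affinity propagation on a precomputed similarity matrix (n >= 2)."""
    n = len(sim)
    # The diagonal holds the preference: how readily a point becomes an exemplar.
    S = [[preference if i == k else sim[i][k] for k in range(n)] for i in range(n)]
    R = [[0.0] * n for _ in range(n)]     # responsibilities
    A = [[0.0] * n for _ in range(n)]     # availabilities
    for _ in range(iterations):
        for i in range(n):
            scores = [A[i][k] + S[i][k] for k in range(n)]
            for k in range(n):
                best_other = max(scores[j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - best_other)
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            col_sum = sum(pos) - pos[k]   # positive support from points other than k
            for i in range(n):
                if i == k:
                    new = col_sum
                else:
                    new = min(0.0, R[k][k] + col_sum - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:                     # no exemplar emerged
        return [], [-1] * n
    labels = [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
              for i in range(n)]
    return exemplars, labels


# Toy data: visual similarity separates the groups weakly,
# semantic and text similarity separate them more clearly.
visual = toy_modality(near=0.6, far=0.5, across=0.4)
semantic = toy_modality(near=0.9, far=0.7, across=0.1)
text = toy_modality(near=0.9, far=0.6, across=0.1)

combined = combine_similarities([visual, semantic, text], weights=[0.2, 0.4, 0.4])
off_diagonal = [combined[i][j] for i in range(6) for j in range(i + 1, 6)]
exemplars, labels = affinity_propagation(combined, preference=median(off_diagonal))
print(exemplars, labels)
```

Each entry of `labels` is the index of the exemplar video that represents the corresponding video's cluster. Affinity propagation does not require the number of clusters in advance; the preference value controls how many exemplars emerge, which suits search-result clustering where the number of topics is unknown.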