
Research And Application Of Video Search Result Analysis And Visualization Method

Posted on: 2011-12-25
Degree: Master
Type: Thesis
Country: China
Candidate: N Wu
Full Text: PDF
GTID: 2178360302980190
Subject: Computer software and theory
Abstract/Summary:
With the rapid development and wide application of the Internet, search engines such as Google and Baidu have gradually become an important tool for web users to query and obtain information. Because of the large number of documents on the web and the ambiguity of query words, the long, cluttered linear result list of a traditional search engine does not let users locate the information they are interested in quickly and efficiently. To guide the search process and reduce the burden on the user, one approach is to generate keyword-related categories by clustering the search results.

Against the background of a video system, we study and apply the core technology of search result clustering analysis and the key issue of clustering result visualization. This thesis introduces the theoretical background of rough sets, the vector space model, and genetic algorithms. It also describes an algorithm for learning the K value in K-means clustering and an improved method of initial centroid selection. Finally, a new way of visualizing clustering results is proposed.

Combining the theory of rough sets with the vector space model, we introduce a new document representation method based on tolerance rough sets. Because a poor document representation may degrade the clustering result, the new method enriches the description of a document by exploiting the tolerance relation between terms and the upper approximation of the document. It can therefore provide good input data for the clustering algorithm and help ensure the effectiveness of the clustering result.

K-means is a common partition-based clustering algorithm that can generate good clusters and is scalable and efficient for large data sets. Because the K value and the initial centroids in K-means are chosen at random, the quality of its results is often poor. Based on the global optimization of the genetic algorithm and the local optimization of the K-means algorithm, we introduce a genetic algorithm for learning the K value and propose an improved method of initial centroid selection. In addition, we use a K-means algorithm that can form overlapping clusters for document clustering.

Finally, building on the research on rough sets, the vector space model, the genetic algorithm, and the improved K-means method, a new way to visualize the clustering result is proposed. It breaks the limitation of the linear list in traditional search engines and allows web users to understand and obtain the required information quickly from massive data sets.
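The abstract does not give the definitions behind the tolerance rough set representation, so the following is only a minimal sketch under a commonly used formulation, assumed here rather than taken from the thesis: two terms are tolerant of each other when they co-occur in at least `theta` documents, and the upper approximation of a document is the set of all terms whose tolerance class shares at least one term with it. The function names, the `theta` parameter, and the toy data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def tolerance_classes(docs, theta):
    """Map each term to its tolerance class.

    Assumption: a term's class contains the term itself plus every term
    that co-occurs with it in at least `theta` documents.
    """
    cooccur = defaultdict(int)
    vocab = set()
    for doc in docs:
        terms = set(doc)
        vocab |= terms
        # Count each unordered pair of distinct terms once per document.
        for a, b in combinations(sorted(terms), 2):
            cooccur[(a, b)] += 1

    classes = {t: {t} for t in vocab}
    for (a, b), count in cooccur.items():
        if count >= theta:
            classes[a].add(b)
            classes[b].add(a)
    return classes


def upper_approximation(doc, classes):
    """Return every term whose tolerance class intersects the document."""
    terms = set(doc)
    return {t for t, cls in classes.items() if cls & terms}


# Toy corpus (hypothetical): each document is a list of index terms.
docs = [
    ["video", "search", "cluster"],
    ["video", "search"],
    ["cluster", "kmeans"],
    ["cluster", "kmeans", "genetic"],
]
classes = tolerance_classes(docs, theta=2)
enriched = upper_approximation(["video"], classes)
```

In this toy corpus a document containing only "video" is enriched with "search", because the two terms co-occur in two documents. This illustrates how the upper approximation can enrich a sparse document description before clustering.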
Keywords/Search Tags: Rough Sets, Vector Space Model, Genetic Algorithm, K-means, Search Result Clustering, Data Visualization