
Research On Technologies Of Automatic Summarization For Large Image Collections

Posted on: 2015-01-06
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y Zhao
GTID: 1228330467986988
Subject: Signal and Information Processing
Abstract/Summary:
With the boom in network communication and the popularity of photo-sharing websites, the number of images available on the web is increasing dramatically. Internet users upload their personal photos and share them with the public, and these images let us perceive the world through the eyes of others. However, such user-contributed collections are often unorganized, noisy and redundant. Retrieving and browsing images quickly and accurately in these large-scale collections has become a challenging and urgent problem. Automatic summarization of image collections helps users query and browse a collection efficiently by automatically selecting a small set of the most representative images from the original large-scale collection.

From the perspectives of representativeness, diversity and automaticity, this dissertation studies in depth several key technologies in the summary generation process, including image feature description, feature matching and automatic clustering. The main innovations of the dissertation are as follows:

1. To address the high dimensionality of the feature description matrix and the heavy computation incurred by the Scale Invariant Feature Transform (SIFT), this dissertation proposes a simplified feature description matrix that uses a weighted concentric region in place of the square region. Geometric verification based on Random Sample Consensus (RANSAC) is introduced to eliminate wrong matches. To cut the high time cost of RANSAC, the optimal matching points are used to construct a small sample set from which the transformation matrix is fitted. Experiments indicate that the proposed method reduces the time cost of fitting, improves computational efficiency, and effectively filters out mismatched pairs.

2. An optimization algorithm for Speeded Up Robust Features (SURF) based on a spatial constraint relation is proposed. The optimal matched points are used to form a rotary coordinate system from which a spatial matrix is created, and a simplified RANSAC performs a geometric verification check on the matched points. Experiments demonstrate that the proposed algorithm achieves high speed while maintaining high matching accuracy.

3. Because the traditional ant colony algorithm may become trapped in a local optimum, a bin-based ATTA automatic clustering method is proposed. The ATTA algorithm is first used to self-organize and clump the data, and the resulting preliminary regional clusters are stacked into bins. The bins are then merged and split according to the value of the objective function. Finally, the global optimum is found.

4. An Affinity Propagation (AP) clustering algorithm based on a validity index is proposed. According to the definition of a cluster and the classification consistency of similar objects, a new clustering validity index based on AP is designed. To resolve the heavy computation and high memory consumption of automatic AP clustering on large-scale data sets, an AP-based method that rapidly searches for the optimal number of clusters is proposed. The geometrically densest data are extracted from the original large-scale data set to constitute a representative data set. Because in AP the bias parameters determine the number of clusters, AP is used to search over the bias parameters on the representative data set, which yields the optimal number of clusters for the original collection. The method is robust enough to be combined with various validity indexes to determine the optimal number of clusters of a large-scale data set.
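The geometric check described in items 1 and 2 can be illustrated with a minimal sketch. It is not the dissertation's algorithm: it assumes that "optimal" matches are those with the lowest descriptor distance, that the transformation is a 2-D similarity (scale, rotation, translation), and that points are stored as complex numbers. Because the sample set is small, the sketch tries every pair in it rather than drawing pairs at random as classical RANSAC does. The names `fit_similarity`, `filter_matches`, `top_k` and `tol` are hypothetical.

```python
import itertools


def fit_similarity(p1, q1, p2, q2):
    """Fit q = a * p + b (scale, rotation, translation) from two point pairs.

    Points are complex numbers x + y*1j. Returns None for a degenerate pair.
    """
    if p1 == p2:
        return None
    a = (q1 - q2) / (p1 - p2)
    b = q1 - a * p1
    return a, b


def filter_matches(matches, top_k=4, tol=2.0):
    """Flag geometrically consistent matches.

    matches: list of (p, q, distance) with p, q complex image coordinates and
    distance the descriptor distance (lower = better; an assumption here).
    Returns (model, flags); model is (a, b) or None if no hypothesis was valid.
    """
    # Small sample set: indices of the top_k best-scoring matches.
    ranked = sorted(range(len(matches)), key=lambda i: matches[i][2])[:top_k]
    best_model, best_flags, best_count = None, [False] * len(matches), 0
    # The set is small, so every pair is tried instead of drawing at random.
    for i, j in itertools.combinations(ranked, 2):
        model = fit_similarity(matches[i][0], matches[i][1],
                               matches[j][0], matches[j][1])
        if model is None:
            continue
        a, b = model
        # Consensus step: score the hypothesis against all matches.
        flags = [abs(a * p + b - q) <= tol for p, q, _ in matches]
        count = sum(flags)
        if count > best_count:
            best_model, best_flags, best_count = model, flags, count
    return best_model, best_flags


# Synthetic example: five correct matches under q = 2j * p + (1 + 1j),
# plus two wrong matches with poor descriptor distances.
true_a, true_b = 2j, 1 + 1j
good = [0j, 1 + 0j, 1j, 3 + 2j, 5 + 0j]
matches = [(p, true_a * p + true_b, 0.1 * (n + 1)) for n, p in enumerate(good)]
matches += [(2 + 2j, 100 + 100j, 0.9), (4 + 0j, -50j, 0.8)]
model, flags = filter_matches(matches)
```

Restricting hypotheses to the best-ranked matches keeps the number of fits at `top_k` choose 2, which is the kind of saving the abstract attributes to the small sample set. The trade-off is that a wrong match ranked among the best can still produce a bad hypothesis.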
By combining the SURF method based on the spatial constraint with the AP method based on the new validity index, a model that automatically generates a summary of the whole data collection is designed.
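The property of AP that the abstract relies on, namely that the bias parameter (usually called the preference in the AP literature) controls how many clusters emerge, can be seen in a small pure-Python sketch of standard affinity propagation. This is a generic textbook formulation, not the dissertation's validity-index method. The similarity measure (negative squared Euclidean distance), the damping value, the toy feature vectors and the function name `affinity_propagation` are all assumptions made for illustration. The input needs at least two points.

```python
def affinity_propagation(points, preference, damping=0.5, iters=200):
    """Plain affinity propagation; returns the indices of the exemplars.

    points: list of equal-length coordinate tuples (e.g. image features).
    preference: value placed on the diagonal of the similarity matrix.
    """
    n = len(points)
    # Similarity = negative squared Euclidean distance (a common choice).
    S = [[-sum((a - b) ** 2 for a, b in zip(points[i], points[k]))
          for k in range(n)] for i in range(n)]
    for k in range(n):
        S[k][k] = preference
    R = [[0.0] * n for _ in range(n)]  # responsibilities
    A = [[0.0] * n for _ in range(n)]  # availabilities
    for _ in range(iters):
        # Responsibility update: how well k suits i compared with rivals.
        for i in range(n):
            v = [A[i][k] + S[i][k] for k in range(n)]
            for k in range(n):
                rival = max(v[j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)
        # Availability update: accumulated support for k as an exemplar.
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            pos[k] = R[k][k]  # self-responsibility is not clipped
            total = sum(pos)
            for i in range(n):
                new = total - pos[i]
                if i != k:
                    new = min(0.0, new)
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    return [k for k in range(n) if R[k][k] + A[k][k] > 0]


# Toy 2-D "features": two loose groups of three points each.
features = [(0.0, 0.0), (1.0, 0.2), (0.3, 1.1),
            (9.0, 9.5), (10.2, 9.1), (9.4, 10.6)]
for pref in (0.0, -20.0, -500.0):
    print(pref, affinity_propagation(features, pref))
```

With a preference of 0, the largest similarity possible here, every point becomes its own exemplar. More negative values generally yield fewer exemplars. In a summarization setting the returned exemplars could serve as the representative images, and a search over the preference, scored by a validity index on a reduced representative set, would follow the idea described in item 4.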
Keywords/Search Tags: summarization for image collection, large scale collections, image matching, automatic clustering, Random Sample Consensus (RANSAC), optimal number of clusters