Research On Information Fusion Technology And Its Application In Video Content Analysis

Posted on: 2008-09-30
Degree: Doctor
Type: Dissertation
Country: China
Candidate: D Y Ding
GTID: 1228330362972563
Subject: Computer Science and Technology
Abstract/Summary:
Three types of information fusion technology are studied against the background of video content analysis: rank aggregation (RA), data normalization, and temporal clustering analysis of video concepts.

Rank aggregation is widely applied as a robust and workable decision fusion method. However, why it is effective for two-class decision problems is not well understood. Consequently, state-of-the-art RA methods are mostly heuristic and their principles remain unclear. We study the probabilistic meaning of RA for two-class decision problems and set up a theoretical framework, Probabilistic Model Supported Rank Aggregation (PMSRA), based on the theory of order statistics. This framework provides a sound probabilistic model of RA for two-class decision problems and serves as a theoretical foundation for future studies in this field. Furthermore, we propose the Bayesian PMSRA method, which ranks items according to their posterior probabilities given all the lists. This method is tested and compared with peer methods on the task of semantic concept detection in video. Experimental results show a clear advantage of this method over the others, which supports both the appropriateness of the theoretical framework and the effectiveness of the method. Finally, we provide a novel view of the PMSRA framework from the quotient space theory of cross-modal learning. The PMSRA methods can be viewed as instances of the combination technology of quotient spaces. This view reveals that PMSRA is essentially a cross-modal learning technology.

Data normalization is usually the starting point of information fusion. Traditional data normalization methods suffer from low efficiency and low accuracy in estimating the range and position of the data because they depend on the data distribution. We propose a Truncation-Limited Density-maximized (TLDM) data normalization method to address these problems. In the TLDM method, the position and range of the data are estimated on the basis of order statistics. Both theoretical analysis and experiments on four years of TRECVID data indicate that the TLDM method is accurate, efficient, and adaptive to data distributions.

Lastly, we study the phenomenon of temporal clustering of video concepts and investigate how to use it for video concept detection. Semantically related video shots commonly appear near one another in a video, yet this property had received little discussion before this study. We propose a measure of the degree to which the shots related to a given concept are clustered, and devise a method, called temporal clustering analysis, that boosts the detection of semantic concepts in video by taking advantage of this property. Experiments show that temporal clustering analysis is more effective than direct fusion of time features or of time-based detection results.
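For readers unfamiliar with rank aggregation, the following is a minimal sketch of a generic heuristic baseline, a Borda-style sum of rank positions. It is not the PMSRA or Bayesian PMSRA method, whose details the abstract does not give. The function name `borda_aggregate`, the shot identifiers, and the rule for items missing from a list are all assumptions made for illustration.

```python
from typing import Dict, List


def borda_aggregate(ranked_lists: List[List[str]]) -> List[str]:
    """Fuse several ranked lists by summing rank positions (Borda-style).

    Illustrative baseline only; this is NOT the PMSRA method of the dissertation.
    """
    # Collect every item that appears in at least one list.
    items = sorted({item for lst in ranked_lists for item in lst})
    totals: Dict[str, float] = {item: 0.0 for item in items}
    for lst in ranked_lists:
        position = {item: rank for rank, item in enumerate(lst)}
        worst = len(lst)  # assumption: an item missing from a list gets the worst rank
        for item in items:
            totals[item] += position.get(item, worst)
    # A lower total means a better consensus rank; ties are broken by item name.
    return sorted(items, key=lambda item: (totals[item], item))


# Three hypothetical detectors, each ranking the same three video shots.
lists = [
    ["shot3", "shot1", "shot2"],
    ["shot1", "shot3", "shot2"],
    ["shot3", "shot2", "shot1"],
]
fused = borda_aggregate(lists)
print(fused)
```

A heuristic of this kind uses only rank positions and makes no probabilistic statement about class membership, which is the gap the PMSRA framework is described as addressing.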
Keywords/Search Tags: rank aggregation, data normalization, temporal clustering analysis, information fusion, video content analysis