Study Of Extracting Semantic Information From Videos And Images

Posted on: 2010-09-02 | Degree: Doctor | Type: Dissertation
Country: China | Candidate: S L Zhang | Full Text: PDF
GTID: 1118360275991234 | Subject: Computer application technology
Abstract/Summary:
Extracting semantic information from videos and images enables semantics-based retrieval. Existing extraction methods leave several issues open: (1) how to construct a sound semantic hierarchy; (2) how to express the semantics effectively; (3) how to automatically discover and utilize the relevance between semantics; (4) how to dynamically fuse the semantics; (5) how to mine and utilize the temporal dependency of videos. Three methods are proposed to address these issues.

First, a bottom-up hierarchical semantics extraction framework is proposed, which organizes the low-level features of shots, object concepts, and scene concepts into a bottom-up three-level hierarchy. A support vector machine (SVM) is trained for each given low-level feature and each given object concept. The proposed boosting fusion combines the SVMs trained on the individual features into a salient object detector. The confidence outputs of these detectors are used by two proposed model vectors to express the semantics of shots. Scene concepts are then learned on these model vectors.

Secondly, a semi-automatic image annotation method based on auxiliary tags is proposed, which measures the relevance between tags by normalized mutual information and improves tag classification with a dynamic mixture model. This method can be easily combined with relevance feedback on tags to speed up the interaction between the human and the computer.

Finally, to improve object detection in video shots, spatial and temporal information is further mined to discover and locate helpful objects in the same shot or in the adjacent shots. This spatial and temporal auxiliary information is integrated in a dynamic mixture model and effectively improves the original detection results.
Keywords/Search Tags: hierarchical semantics, salient object, model vector, multi-label learning, normalized mutual information, dynamic mixture model, relevance feedback, temporal dependency
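
The abstract states that relevance between tags is measured by normalized mutual information, but does not give the formula or the normalization used. The following is a minimal sketch only, assuming each tag is represented as a binary presence vector over a set of images and that the mutual information is normalized by the geometric mean of the two tag entropies. The function names, the normalization choice, and the sample tag data are illustrative assumptions, not the dissertation's implementation.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in nats) of a sequence of discrete values."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def mutual_information(x, y):
    """Mutual information (in nats) between two equal-length discrete sequences."""
    n = len(x)
    joint = Counter(zip(x, y))
    px, py = Counter(x), Counter(y)
    mi = 0.0
    for (a, b), c in joint.items():
        # p(a,b) / (p(a) * p(b)) simplifies to c * n / (count(a) * count(b))
        mi += (c / n) * math.log(c * n / (px[a] * py[b]))
    return mi


def normalized_mutual_information(x, y):
    """MI divided by sqrt(H(x) * H(y)); assumed normalization, see lead-in.

    Returns 0.0 when either sequence is constant (zero entropy).
    """
    hx, hy = entropy(x), entropy(y)
    if hx == 0.0 or hy == 0.0:
        return 0.0
    return mutual_information(x, y) / math.sqrt(hx * hy)


# Hypothetical data: 1 means the tag is attached to the image, 0 means it is not.
tag_sky = [1, 1, 1, 0, 0, 1, 0, 0]
tag_cloud = [1, 1, 0, 0, 0, 1, 0, 0]
tag_car = [0, 1, 0, 1, 0, 1, 0, 1]

if __name__ == "__main__":
    print("sky vs cloud:", normalized_mutual_information(tag_sky, tag_cloud))
    print("sky vs car:  ", normalized_mutual_information(tag_sky, tag_car))
```

Under this sketch, tags that tend to co-occur (or to be jointly absent) score closer to 1, while statistically independent tags score 0. Such a score could serve as the tag-relevance weight that the annotation method feeds into later steps, though how the dissertation actually uses it is not specified in the abstract.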