Knowledge analysis & application to multimedia content recognition problems

Posted on: 2009-10-19
Degree: Ph.D.
Type: Dissertation
University: The University of Texas at Dallas
Candidate: Chin, Yohan
Full Text: PDF
GTID: 1448390005958342
Subject: Computer Science
Abstract/Summary:
We live surrounded by many kinds of multimedia content, such as personal images, web pages and their images, personal videos (YouTube), and movies. 3D human motion capture techniques are already used in the movie and game industries. As the hardware for acquiring multimedia data has advanced, many researchers have developed new algorithms for automatic multimedia content analysis. However, there remains a semantic gap: the distance between the low-level feature values of multimedia and the high level of human perception. This well-known semantic gap prevents existing multimedia content recognition algorithms from being used in commercial-level applications. To reduce this gap, in this dissertation we analyze various multimedia data types and find relations between them. To apply knowledge bases to multimedia content recognition, we propose methods for extracting a knowledge base and methods for applying the newly extracted knowledge base effectively to multimedia content recognition problems.

First, we analyze the knowledge aspect using 3D human motion capture data, since it carries more inherent knowledge than video human motion data. We extract semantic characteristic values from 3D human motion capture data while reducing its dimensionality from 62 to a single quantized value. A second source of knowledge is WordNet, a hierarchical lexical dictionary. We show that semantic similarity derived from WordNet can be very useful for removing noisy keywords from the keywords annotated for an image.

Second, building on this extracted knowledge, we propose several methods for applying knowledge bases effectively to multimedia content recognition problems. For instance, using the semantic feature values from 3D human motion capture data, we demonstrate that such data can be transformed into a document-like textual representation called HMDoc (Human Motion Document), since we already have the semantic quantization sequence for each piece of 3D human motion capture data. Moreover, the 3D human motion semantic feature values can be merged with low-level video feature values, building on traditional machine learning algorithms such as Hidden Markov Models (HMMs). We propose an HMM-based framework that uses these 3D human motion semantic feature values as the hidden state sequences. It increases the accuracy of video human gesture recognition and decreases the time needed to learn an HMM's parameters. Another application of the extracted knowledge is image annotation. We propose a new research problem, graph-approximating image annotation refinement, which applies semantic similarity to identify unrelated keywords within polynomial time. Thus, across several quite different multimedia data types, we show how a knowledge base can serve as a powerful aid for improving multimedia content recognition.
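
As an illustration only, and not the dissertation's actual framework, the following minimal sketch shows one plausible reason why supplying hidden state sequences can shorten HMM training. When every training sequence comes paired with its state labels (here standing in for the quantized 3D motion values), the start, transition, and emission probabilities can be estimated by simple counting instead of iterative re-estimation. The function name `estimate_hmm`, the discrete observation symbols, and the add-`alpha` smoothing are all assumptions made for this example.

```python
def estimate_hmm(state_seqs, obs_seqs, n_states, n_symbols, alpha=1.0):
    """Estimate discrete-HMM parameters by counting labelled sequences.

    state_seqs: list of state-index sequences (hypothetical stand-in for
                quantized 3D motion values used as hidden states).
    obs_seqs:   list of observation-symbol sequences of matching lengths
                (hypothetical stand-in for quantized video features).
    alpha:      additive smoothing so unseen events keep nonzero probability.
    """
    # Initialise every count with the smoothing constant.
    start = [alpha] * n_states
    trans = [[alpha] * n_states for _ in range(n_states)]
    emit = [[alpha] * n_symbols for _ in range(n_states)]

    for states, obs in zip(state_seqs, obs_seqs):
        if len(states) != len(obs):
            raise ValueError("state and observation sequences must align")
        if not states:
            continue
        start[states[0]] += 1
        # Count which symbol each state emitted.
        for s, o in zip(states, obs):
            emit[s][o] += 1
        # Count transitions between consecutive states.
        for prev, nxt in zip(states, states[1:]):
            trans[prev][nxt] += 1

    def normalize(row):
        total = sum(row)
        return [value / total for value in row]

    return (
        normalize(start),
        [normalize(row) for row in trans],
        [normalize(row) for row in emit],
    )
```

A single pass over the labelled data yields all three parameter sets, whereas training with unknown hidden states typically needs an iterative procedure such as Baum-Welch. Whether the dissertation's framework estimates its parameters this way is not stated in the abstract.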
Keywords/Search Tags:Multimedia content, 3D human motion capture data, Feature values, Application