
Investigating Scene Decoding Based On The Semantic Relationship Between The Scene And The Object Sounds Within The Scene

Posted on: 2018-02-07
Degree: Master
Type: Thesis
Country: China
Candidate: X J Wang
Full Text: PDF
GTID: 2334330542477873
Subject: Pattern Recognition and Intelligent Systems

Abstract/Summary:
Humans perceive the outside world mainly through vision and hearing, and can quickly and efficiently identify surrounding contexts and objects by integrating visual and auditory stimuli. Previous studies have suggested that the human brain can represent the semantic relationship between a scene and the objects within it. In real life, the human brain understands the external world by integrating visual and auditory stimuli. Computers, however, face great challenges in exploring and understanding the outside world, especially in cross-modal scene recognition, and no study has yet investigated the semantic relationship between visual scenes and auditory sounds using machine learning methods.

Functional magnetic resonance imaging (fMRI) is noninvasive and offers high spatial resolution with adequate temporal resolution, based on blood-oxygen-level-dependent (BOLD) contrast. It is therefore well suited to analyzing the temporal and spatial characteristics of brain activity and higher-level functional connectivity, and it is widely used in clinical practice, cognitive research, and other fields. Understanding how the human brain decodes a scene from sound, as measured with fMRI, would benefit the development of artificial intelligence for scene recognition and could guide computers to identify complex scenes more efficiently.

This thesis focused on the semantic relationship between a scene and the sounds of the objects within it. fMRI was used to acquire BOLD data while participants viewed four categories of scenes and listened to eight categories of sounds. Multivariate pattern analysis (MVPA) with a support vector machine classifier was then conducted to assess whether the human brain could decode the patterns of the scenes from the averaged patterns of the sounds of the objects the scenes contained. Our findings suggest that the multivoxel patterns evoked by the scenes could be predicted from the averaged patterns evoked by the sounds of objects within the scenes in the posterior fusiform area (pF), the lateral occipital area (LO), and the superior temporal sulcus (STS). Furthermore, we performed a functional connectivity analysis among four regions of interest under the scene and sound conditions separately; no significant positive correlations were observed between STS and the other three regions. We then explored the brain areas positively correlated with STS in the two tasks using a seed-to-voxel analysis and found distinct networks for processing scenes and sounds. Finally, we examined the information flow among the brain areas involved in the experimental tasks with dynamic causal modeling and found that the information flow between LO and pF was modulated by the scene condition.
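To make the cross-modal MVPA step concrete, the sketch below trains a linear support vector machine on the averaged sound-evoked patterns and tests it on the scene-evoked patterns. It is a minimal illustration, not the thesis's actual pipeline: the random placeholder data, the variable names, the pairing of two sound categories per scene, and the use of scikit-learn's LinearSVC are all assumptions.

    # Cross-modal MVPA sketch: train a linear SVM on the averaged
    # multivoxel patterns evoked by object sounds, then test whether it
    # predicts the scene category from scene-evoked patterns.
    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import LinearSVC

    rng = np.random.default_rng(0)
    n_voxels = 200  # voxels in one ROI (e.g., pF, LO, or STS)

    # Placeholder data standing in for real per-trial pattern estimates.
    sound_patterns = rng.standard_normal((8 * 10, n_voxels))  # 8 sound categories x 10 trials
    sound_labels = np.repeat(np.arange(8), 10)
    scene_patterns = rng.standard_normal((4 * 10, n_voxels))  # 4 scene categories x 10 trials
    scene_labels = np.repeat(np.arange(4), 10)

    # Assumption: sounds 2k and 2k+1 are the object sounds of scene k;
    # average them to obtain one training pattern per scene category.
    train_X = np.stack([
        sound_patterns[np.isin(sound_labels, [2 * k, 2 * k + 1])].mean(axis=0)
        for k in range(4)
    ])
    train_y = np.arange(4)

    clf = make_pipeline(StandardScaler(), LinearSVC())
    clf.fit(train_X, train_y)
    print(f"cross-modal accuracy: {clf.score(scene_patterns, scene_labels):.2f} (chance = 0.25)")

With real data, the placeholder arrays would be replaced by trial-wise response estimates extracted from each ROI, and accuracy would be assessed against chance with a permutation test.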
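The ROI-to-ROI functional connectivity analysis can likewise be sketched as pairwise Pearson correlations between mean ROI time courses under a single condition. Only pF, LO, and STS are named in the abstract; the fourth region's label and the random time series below are placeholders.

    # ROI-to-ROI functional connectivity sketch: Pearson correlations
    # between mean BOLD time courses under one task condition.
    import numpy as np

    rois = ["pF", "LO", "STS", "ROI4"]  # fourth region not named in the abstract
    rng = np.random.default_rng(1)
    timecourses = rng.standard_normal((len(rois), 240))  # placeholder mean BOLD signals

    conn = np.corrcoef(timecourses)  # 4 x 4 correlation matrix
    for i in range(len(rois)):
        for j in range(i + 1, len(rois)):
            print(f"{rois[i]}-{rois[j]}: r = {conn[i, j]:+.2f}")

Computing this matrix once under the scene condition and once under the sound condition would allow the ROI-pair correlations to be compared across conditions, as the abstract describes.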
Keywords/Search Tags:Cross modality, Natural scene decoding, Functional connectivity, Multivariate pattern analysis, Functional magnetic resonance imaging