| In daily life, humans must interact flexibly with the external environment even in simple tasks, such as picking up a bag of milk in a crowded supermarket or passing the ball accurately to a teammate in a basketball game. Such interaction rests on the visual system's ability to perceive, understand, and predict motion. This remarkable ability might seem to require massive computation supported by rich cognitive resources. However, cognitive science research over the past decades has shown that human cognitive resources are extremely limited. This means that visual intelligence cannot be achieved purely through large numbers of operations on large amounts of data; instead, the visual system must process visual information with simple operations. Visual processing often depends critically on the particular representation that is employed (Marr, 1982). Therefore, constructing an efficient representation of information is the key to achieving human visual intelligence. In motion scenes, an object does not exist in isolation but is connected to other objects or to the environment. The visual processing of motion often relies on integrating interconnected visual objects. Therefore, the visual system needs to represent motion as an integrated structure. Among the possible integrated representations, a hierarchical structure makes it possible to describe information at different levels and thus to express rich information in a simple form. This property of hierarchical structure closely matches the core requirements of visual processing (Xu, Tang, Zhou, Shen, & Gao, 2017). On this basis, we propose the hypothesis that motion is represented as a hierarchical structure in visual processing. The current study combines psychophysical methods and computational modeling. Human performance on motions with different latent hierarchical structures was measured, and simulation results from different models were compared with human performance. The results showed that: (1) The 
change in the latent hierarchical structure of motion influences participants' performance, which indicates that motion is represented hierarchically in vision. (2) The visual hierarchical representation of motion is stable and is not affected by information length, task cues, or other factors. (3) The visual hierarchical representation of motion also exists in scenes that contain social information. (4) The visual hierarchical representation of motion is a kind of causal structure: it describes not only the form of movements but also the process that generates them. (5) Based on the constructed hierarchical representation, the visual system recognizes, understands, and predicts motion scenes through a reverse-engineering process. These results not only provide solid evidence for the existence of a visual hierarchical representation of motion but also reveal its stability and universality. In addition, the computational model in the current study simulates how subsequent processing is carried out using the hierarchical representation, offering a useful attempt at bringing artificial intelligence systems closer to human intelligence. |