
Research On Semantic Analysis Method Of Market Stall Monitoring Video Based On Multi-scale Feature Fusion

Posted on: 2022-03-07 | Degree: Master | Type: Thesis
Country: China | Candidate: S Jiang | Full Text: PDF
GTID: 2518306353484634 | Subject: Computer Science and Technology
Abstract/Summary:
At present, market stalls are managed through manual supervision, which is inefficient and wastes considerable manpower and material resources. Video surveillance can instead be applied to market stall management. As a current research hotspot, image semantic description has broad application prospects in industrial video surveillance. In recent years, research on generating image semantic descriptions has achieved some results, but many shortcomings remain. For example, image attribute information is extracted incompletely, the relationships between attributes are described inaccurately, part of the image information is lost while the semantic description is generated, and the generated sentences are not fluent. In response to these problems, this thesis proposes a semantic analysis method for market stall surveillance video based on multi-scale feature fusion. The main research contents are as follows:

(1) An image feature fusion method based on multi-scale features is proposed. Because market stall images contain diverse targets and complex backgrounds, a convolutional neural network is used to extract visual features of the image at different scales, and the features from these scales are then fused. The fused feature vector therefore contains more detailed image information, which alleviates the problem of image information loss.

(2) An image semantic description model based on multi-scale feature fusion and adaptive attention is built. In the encoder, image features are fused at multiple scales to extract more accurate information from stall images. In the decoder, an adaptive attention mechanism lets the model adaptively adjust the weight matrix over the image visual features and the generated word information while it generates the description, which improves the generated sentences and makes their grammar more fluent.

(3) A semantic description generation model for market stall surveillance video, based on transfer learning, is built. The model from (2) is trained on a stall image data set to obtain a semantic description model for market stall images. For a segment of stall surveillance video, frames are captured at regular intervals, a description sentence is generated for each captured frame, and the descriptions are analyzed for illegal stall overflow and for stalls occupying the road, to assist managers in managing the market stalls and improve management efficiency.

The experimental results show that the proposed model produces better sentences on the public data set, and that its BLEU and CIDEr scores improve on those of other typical image semantic description models. For images of market stalls, the model generates corresponding description sentences that contain the correct image information. It can be used to assist managers in managing market stalls intelligently and to improve management efficiency.
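
The video analysis step in (3) can be illustrated with a minimal sketch. This is not the thesis implementation: the abstract does not state the sampling interval, how the captions are analyzed, or any function names. The names `sample_frame_indices`, `flag_violations`, and `VIOLATION_KEYWORDS`, the keyword-matching rule, and the `describe` callback (a stand-in for the trained captioning model) are all assumptions made for illustration.

```python
from typing import Callable, Iterable, List, Tuple


def sample_frame_indices(total_frames: int, fps: float, interval_s: float) -> List[int]:
    """Return indices of frames taken every `interval_s` seconds, starting at frame 0."""
    if total_frames < 0 or fps <= 0 or interval_s <= 0:
        raise ValueError("total_frames must be >= 0; fps and interval_s must be > 0")
    # Convert the time interval to a frame step; never step by less than one frame.
    step = max(1, round(fps * interval_s))
    return list(range(0, total_frames, step))


# Hypothetical phrases; the abstract does not say how violations are detected.
VIOLATION_KEYWORDS = ("beyond the stall", "on the road")


def flag_violations(
    frame_indices: Iterable[int],
    describe: Callable[[int], str],
    keywords: Tuple[str, ...] = VIOLATION_KEYWORDS,
) -> List[Tuple[int, str]]:
    """Caption each sampled frame and keep frames whose caption contains a keyword.

    `describe` stands in for the captioning model: it maps a frame index
    to a description sentence. Matching is case-insensitive substring search.
    """
    flagged = []
    for idx in frame_indices:
        caption = describe(idx)
        if any(keyword in caption.lower() for keyword in keywords):
            flagged.append((idx, caption))
    return flagged
```

In practice, `describe` would decode the frame at the given index and run it through the captioning model; a simple keyword rule like the one above would likely be replaced by whatever analysis the thesis actually applies to the generated sentences.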
Keywords/Search Tags: Semantic description of surveillance video, Multi-scale feature fusion, Adaptive attention mechanism, Convolutional neural network, Transfer learning