A Lightweight Semantic Segmentation Network For Video Scene Comprehension Method On Edge Devices

Posted on: 2021-04-22
Degree: Master
Type: Thesis
Country: China
Candidate: Y C Yang
Full Text: PDF
GTID: 2428330611999322
Subject: Electronics and Communications Engineering
Abstract/Summary:
Videos pervade every aspect of human life, and recognizing the abundant information they contain is valuable across many domains. To this end, various video comprehension approaches have been proposed with promising results. However, these state-of-the-art techniques still face two problems. First, without temporal information, the semantic information of objects across video sequences is hard to extract, which limits comprehension of the video content. Second, the redundant data in video content leads to large neural networks that cannot be deployed on edge devices with limited hardware resources. To address these problems, this dissertation proposes a lightweight semantic segmentation network for video scene comprehension on edge devices. It quickly locates and identifies the objects contained in the scene and extracts time-series semantic features, which are fed into an LSTM-based spatio-temporal model. To further optimize the entire model, we also simplify the backbone of the original segmentation network and apply tensorization- and quantization-based compression methods. To evaluate the proposed network, we conduct experiments on the Cityscapes, UCF11 and MOMENTS datasets. The results show that the proposed network achieves an accuracy improvement of 9% over the traditional CNN+RNN approach on UCF11, a storage reduction of 143.1×, and an inference speed of 101 FPS on MOMENTS with a GTX 1080 Ti GPU. The proposed network is further implemented on a Huawei Atlas 200 DK device, with an 18.89× speedup after optimization. In conclusion, the experiments show that the proposed network comprehends video content well by incorporating semantic information and temporal correlations. The lightweight structure greatly reduces the redundancy of the original model and raises inference speed, providing a lightweight, high-speed and high-precision solution for video comprehension on edge devices.
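The abstract names tensorization and quantization as the compression methods but does not give their exact form. The sketch below illustrates how the two ideas combine to reduce storage. It rests on assumptions not taken from the thesis: a tensor-train (TT) factorization of one fully connected layer, uniform 8-bit quantization, and hypothetical layer shapes and ranks. The function names are illustrative only, and the numbers it produces are not the thesis's reported 143.1× figure.

```python
from math import prod


def dense_params(in_modes, out_modes):
    """Parameter count of a dense layer whose input/output sizes factor into the given modes."""
    return prod(in_modes) * prod(out_modes)


def tt_params(in_modes, out_modes, ranks):
    """Parameter count of the same layer stored as tensor-train cores.

    Core k has shape (ranks[k], in_modes[k], out_modes[k], ranks[k + 1]).
    """
    d = len(in_modes)
    if len(out_modes) != d or len(ranks) != d + 1:
        raise ValueError("need d input modes, d output modes and d + 1 ranks")
    if ranks[0] != 1 or ranks[-1] != 1:
        raise ValueError("boundary TT ranks must be 1")
    return sum(ranks[k] * in_modes[k] * out_modes[k] * ranks[k + 1] for k in range(d))


def quantize_uint8(weights):
    """Uniform affine quantization of a list of floats to integer codes in [0, 255]."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255 or 1.0  # avoid a zero scale when all weights are equal
    codes = [min(255, max(0, round((w - lo) / scale))) for w in weights]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Map integer codes back to approximate float weights."""
    return [lo + c * scale for c in codes]


def storage_reduction(in_modes, out_modes, ranks, dense_bits=32, compressed_bits=8):
    """Ratio of dense float storage to TT-core storage at a lower bit width."""
    dense_bits_total = dense_params(in_modes, out_modes) * dense_bits
    tt_bits_total = tt_params(in_modes, out_modes, ranks) * compressed_bits
    return dense_bits_total / tt_bits_total


if __name__ == "__main__":
    # Hypothetical 256 x 256 layer factored as (4, 8, 8) x (4, 8, 8) with TT ranks (1, 4, 4, 1).
    modes, ranks = (4, 8, 8), (1, 4, 4, 1)
    print(dense_params(modes, modes), tt_params(modes, modes, ranks))
    print(storage_reduction(modes, modes, ranks))
```

Under these assumptions the two savings multiply: the factorization shrinks the number of stored values, and quantization shrinks the bits per value. Smaller ranks compress more, but they also constrain what the layer can represent, so any accuracy impact has to be measured on the target dataset.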
Keywords/Search Tags: video comprehension, semantic segmentation, tensorization, edge computing