| With the development of technologies such as smart homes and virtual reality, the demand for digital modelling and visualization of indoor spaces is increasing, and fast and accurate estimation of indoor spatial layouts has become a topic of great interest. Indoor spatial layout estimation provides a prior for tasks such as augmented reality, robot navigation and scene understanding, and has a wide range of application scenarios and important research implications. The task is challenging because indoor spaces are often occluded by clutter, environmental conditions are complex and scene topologies vary, and existing methods based on traditional geometry or single-task supervised learning still fall short in extracting semantic features. To address this, the work in this paper is as follows. (1) To perceive the layout relationships of indoor scenes more effectively and further improve the segmentation accuracy of indoor scene images, an indoor spatial layout estimation model based on multi-task supervised learning is proposed. An improved encoder-decoder network structure is designed for the segmentation characteristics of indoor layout images: the lightweight MobileNetV2 network is used as the backbone to greatly reduce the training difficulty of the model, and depthwise separable convolution is introduced in ASPP to capture multi-scale information effectively and further reduce the number of model parameters. Multi-task supervised learning is introduced to infer the indoor spatial layout and the semantic edges of each local region, with the tasks sharing some common feature representations, which improves the generalization capability and efficiency of the model. The semantic edge output for the local regions is used to post-process the semantic segmentation map through feature fusion, optimizing the edge positions between regions. During the training
process of the network, a joint loss function is designed in which a smoothing loss and an edge loss assist the training of the network model, improving the geometric plausibility of the semantic segmentation results. (2) The segmentation results of each model are evaluated using metrics commonly used in semantic segmentation, such as IoU, mIoU, PA, MPA and PE. 1) Five network models are constructed using several methods and backbone networks commonly used in semantic segmentation and are compared with the proposed model on the LSUN and Hedau datasets. Compared with the FCN and PSPNet models, the DeepLabV3+ model performs better overall in the indoor spatial layout estimation task. The global metrics mIoU, MPA and PE of the proposed model are 78.06%, 87.51% and 7.54% on LSUN and 78.77%, 87.54% and 7.08% on Hedau, respectively, which is better overall than the other methods. For the individual surface regions, the per-class IoU and PA metrics show that the proposed model segments each region better, with a particularly significant improvement for the middle wall. 2) The proposed model is compared quantitatively, in terms of PE, with other work on indoor spatial layout estimation, including both non-end-to-end traditional methods and end-to-end deep learning methods. Compared with indoor layout estimation methods of recent years, the PE of the proposed model is lower than that of the other methods on both the LSUN and Hedau datasets, giving it an advantage in layout estimation accuracy. 3) The number of parameters and the inference time are chosen as evaluation metrics to further compare the size of each network model and its speed of indoor layout estimation, and the fluctuations of mIoU, MPA and PE of the various network models are analysed during the
training process. The comparative analysis shows that the proposed model achieves good segmentation efficiency while maintaining segmentation accuracy. During training, it converges faster than the other models and leads in mIoU, MPA and PE. 4) To verify the necessity of improving the encoder structure and the effectiveness of the multi-task supervised learning module and the feature fusion module, five different schemes were set up on the LSUN dataset for ablation experiments. The experimental results show that the improved encoder-decoder module, the multi-task supervised learning module and the feature fusion post-processing module each provide some improvement in the segmentation of indoor scene images. This is mainly because the multi-task supervised learning and the joint loss function in the proposed model are more sensitive to semantic information, which enhances the model's ability to segment indoor spatial layouts and improves the geometric plausibility of the segmentation results. 5) The results of the proposed model on the LSUN and Hedau datasets are visualized and analysed qualitatively. The visualizations show that the proposed model has good robustness and generalization ability: even under complex environmental conditions, heavy clutter occlusion and multiple topologies, it still performs well and correctly produces the layout segmentation map of the indoor scene. (3) A system for indoor layout estimation is designed and developed, combining deep-learning semantic segmentation algorithms with software development, so that non-technical users can observe the model's segmentation results for indoor layout estimation. The system is analysed in terms of its requirements, structure and functionality, and its architecture and required technical framework are given; the PyQt
framework is chosen to build the system's user interface, and MySQL, Python and PyTorch are combined to develop and implement the system's functionality. The system is then designed and implemented as two functional modules, user and layout estimation, and each interface and functional operations such as user login, registration and segmentation are described in detail. Finally, the layout estimation system is tested in terms of both functionality and performance to further assess its feasibility. The test results show that the system is reliable, has a clean interface, runs stably, and can effectively infer indoor layout results while remaining efficient, giving it practical value for indoor spatial layout estimation. |
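
The evaluation metrics named in the abstract (IoU, mIoU, PA, MPA and PE) can all be derived from a pixel-level confusion matrix. The following is a minimal sketch under stated assumptions, not the paper's evaluation code: it uses the standard semantic segmentation definitions of IoU and pixel accuracy, and it assumes PE is the fraction of misclassified pixels with predicted labels already aligned to the ground-truth labels. The function names `confusion_matrix` and `segmentation_metrics` are illustrative and do not come from the paper.

```python
def confusion_matrix(truth, pred, num_classes):
    """Return cm where cm[t][p] counts pixels of true class t predicted as p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(truth, pred):
        cm[t][p] += 1
    return cm


def segmentation_metrics(cm):
    """Compute per-class IoU and PA plus global mIoU, MPA and PE from cm."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    correct = sum(cm[c][c] for c in range(n))
    iou, pa = {}, {}
    for c in range(n):
        tp = cm[c][c]
        truth_c = sum(cm[c])                      # pixels whose true label is c
        pred_c = sum(cm[r][c] for r in range(n))  # pixels predicted as c
        if truth_c == 0:
            continue  # class absent from the ground truth: leave it out
        iou[c] = tp / (truth_c + pred_c - tp)     # intersection over union
        pa[c] = tp / truth_c                      # per-class pixel accuracy
    return {
        "IoU": iou,
        "PA": pa,
        "mIoU": sum(iou.values()) / len(iou),
        "MPA": sum(pa.values()) / len(pa),
        # Assumption: PE is the share of misclassified pixels.
        "PE": 1 - correct / total,
    }


# Toy example: six flattened pixels, three layout classes.
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
m = segmentation_metrics(confusion_matrix(truth, pred, 3))
print(m["mIoU"], m["MPA"], m["PE"])
```

Under these definitions mIoU and MPA are better when higher, while PE is better when lower, which matches how the abstract compares the models.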