| Graphic design combines visual elements to convey information efficiently. Previous research on graphic design has mainly focused on visual-textual layout, floorplan generation, and similar tasks. However, as visualization has grown in popularity, visualizations are increasingly placed in scene images: they provide rich information and attractive visual representations through combinations of geometric shapes and color palettes, which enhance the visual expression of the scene image. To meet this need, this paper studies the automatic generation of visualizations over 2D scene images, defined as vis-over-image layout. Since both images and visualizations contain complex visual elements, integrating them is not easy. This paper presents a study on vis-over-image layout based on three aesthetic principles summarized by visual design experts: readability, visual balance, and consistency. First, this paper proposes an encoder-decoder convolutional neural network (CNN) to perceive the visual saliency of the input image. Readability requires that rendered visualizations do not overlap salient regions of the scene. Hence, the CNN captures the deep visual features of the input image, which are then converted into a saliency map. The saliency map gives the visual saliency of each pixel, whereas our goal is to locate the non-salient regions in the scene. Thus, an image segmentation algorithm is proposed. Specifically, it first performs blue noise sampling over the saliency map to select a representative set of pixels, and then performs Delaunay triangulation over the sampled pixels to connect them into contiguous triangular regions whose areas are highly consistent with the visual saliency, thus facilitating the location of non-salient regions. Second, this paper proposes a graph convolutional network with an attention mechanism to construct spatial relationships between image subjects and visualizations in the graph domain. Visual balance requires that visualizations keep a uniform visual distance from the salient regions and boundaries of the scene, and consistency requires that the visualization color scheme be coordinated with that of the input image. To achieve these two goals, this paper represents the image as a graph, treats the center of the visualization as a graph node, and proposes a graph attention model to learn the spatial relationships between them. The trained model can therefore output visually uniform vis-over-image layouts. Next, based on a clustering algorithm and color contrast calculation, color palettes are extracted from images and mapped to the corresponding visualization color space. The visualizations are then colored according to data features to form aesthetically attractive vis-over-image layouts. To demonstrate the effectiveness of the proposed approach, we perform both quantitative and qualitative evaluations with several comparison and ablation experiments. Finally, to show the practicability of the approach, we design and implement an intelligent system that provides vis-over-image layouts in real time. The system allows users to customize images, data, and visualization types, and then generates refined vis-over-image layouts. In addition, it lets users interact with the generated visualizations to explore more information. With the system, this paper reports three interesting application scenarios: Scene Data Visualization, Augmented Visualization, and 360 Panoramic Visualization, in each of which specific vis-over-image layout examples are presented to illustrate its potential applications. |
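
The color step described above (a clustering algorithm plus a color contrast calculation) can be illustrated with a small sketch. This is not the method used in this paper, only one plausible reading of it: it assumes plain k-means in RGB space and the WCAG 2.x contrast-ratio formula, and the names `relative_luminance`, `contrast_ratio`, `kmeans_palette`, and `order_by_contrast` are hypothetical helpers introduced here for illustration.

```python
# Illustrative sketch only: k-means and the WCAG contrast ratio are assumptions,
# not the clustering or contrast measure confirmed by the paper.

def relative_luminance(rgb):
    """WCAG 2.x relative luminance of an sRGB color with 0-255 channels."""
    def linearize(c):
        c = c / 255.0
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (linearize(c) for c in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(a, b):
    """WCAG contrast ratio between two colors, ranging from 1 to 21."""
    la, lb = relative_luminance(a), relative_luminance(b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)

def kmeans_palette(pixels, k, iters=20):
    """Cluster RGB pixels with plain k-means and return the k centroid colors."""
    # Deterministic initialization: evenly spaced picks from the
    # luminance-sorted pixels (a simplification of random seeding).
    ordered = sorted(pixels, key=relative_luminance)
    n = len(ordered)
    centers = [ordered[int((i + 0.5) * n / k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in pixels:
            # Assign each pixel to the nearest center (squared RGB distance).
            nearest = min(
                range(k),
                key=lambda i: sum((x - y) ** 2 for x, y in zip(p, centers[i])),
            )
            groups[nearest].append(p)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [
            tuple(sum(ch) / len(g) for ch in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers

def order_by_contrast(palette, background):
    """Sort palette colors from highest to lowest contrast against a background."""
    return sorted(palette, key=lambda c: contrast_ratio(c, background), reverse=True)

# Usage: extract two colors from a tiny "image" and rank them against white.
demo_pixels = [(10, 10, 10), (12, 8, 11), (9, 11, 10),
               (240, 240, 238), (250, 245, 247), (245, 250, 249)]
demo_palette = kmeans_palette(demo_pixels, k=2)
demo_ranked = order_by_contrast(demo_palette, background=(255, 255, 255))
```

In this sketch the palette colors with the highest contrast against the local background come first, which is one simple way a visualization could stay legible while still drawing its colors from the image.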