
Research On The Generation Of GUI Tree From High Fidelity UI Design Image Based On Deep Learning

Posted on: 2022-02-26
Degree: Master
Type: Thesis
Country: China
Candidate: Y Wu
Full Text: PDF
GTID: 2518306551971139
Subject: Master of Engineering

Abstract/Summary:
With the rapid development of communication technology and the widespread adoption of mobile devices, people increasingly rely on them for work and daily life, and mobile applications have become correspondingly important. For a mobile application (app) to stand out in a market filled with similar products, it must offer a rich graphical user interface (GUI) and a pleasant user experience (UX). How to design and develop a good GUI is therefore a crucial issue for mobile applications. Developing an application GUI generally involves two stages: designing the UI and implementing it. Designers use software such as Sketch and Photoshop, together with domain knowledge of UI design, to produce design images; developers then convert these images into code. After several iterations, the application is released to the market for users to download and use.

A high-fidelity UI design image describes the required interface elements and their spatial layout in pixel form. Because converting it (in whole or in part) into code requires component refinement and layout construction, the work incurs an inevitable trial-and-error cost. If an interface tree representing the controls of the GUI implementation (divided into containers and components) could be generated directly from the high-fidelity design image, this process could be greatly simplified and development efficiency improved. The ideal interface tree takes the class names of the real controls used in actual development as its nodes and the nesting relationships between controls as its parent-child relationships; it can serve as the skeleton of the code and speed up GUI implementation.

However, few studies in academia generate interface trees directly. Most first detect the interface components (the leaf nodes of the interface tree) and then determine the container each one belongs to until the interface tree is finally assembled (e.g., REMAUI and REDRAW). This approach struggles to guarantee the quality of the generated tree because of frequent missed and duplicate detections and the difficulty of determining which container a component belongs to. Other studies use a DSL (domain-specific language) as a coarse-grained interface abstraction (e.g., pix2code): they first generate DSL from the high-fidelity interface image and then convert it into an interface tree. This type of approach requires defining and maintaining the DSL and performing two conversions to obtain the interface tree, and the results are unsatisfactory. UI2code is similar to this research in that it generates the interface tree directly; however, it suffers from long-distance dependencies in the feature information and loss of spatial location information, which degrade the generated trees.

To address these problems, this thesis proposes a neural-network translator based on an improved Transformer that automatically learns how to convert high-fidelity UI design images into interface trees composed of real interface nodes. The generated tree helps developers understand the interface image and reduces the difficulty and cost of development; it can be regarded as an initial "guidance instruction" for the subsequent concrete GUI implementation (such as filling in colours, referencing images, and entering text). This thesis uses the exact match rate, BLEU, and edit distance to evaluate generation quality: the exact match rate measures the proportion of generated interface trees that are completely consistent with the real interface structure, while BLEU and edit distance measure the degree of similarity for generated trees that do not exactly match.

The main work of this thesis is as follows:

(1) Existing mobile-application data sets are of low quality or contain incomplete or mixed information, making them difficult to apply directly to this research. This thesis therefore first formulates screening rules to filter usable data from the existing data sets. In addition, a crawler collects mobile-application APK files from Google Play by application category; automated testing tools then run the APK files on an emulator to capture application interface screenshots and runtime data automatically. Finally, the data are processed to construct the interface-tree image data set.

(2) When encoding image features into context vectors for decoding the predicted text, existing research suffers from long-distance dependencies and loss of spatial position information. This thesis therefore combines a Transformer with a CNN to generate interface trees: the Transformer's self-attention resolves the long-dependency problem of feature encoding, while spatial position encoding mitigates the loss of spatial position information. Evaluated with the exact match rate, BLEU, and edit distance, the generated GUI interface trees reach an exact match rate of 71.16% against the real interface structure, an increase of 8.3% over existing research; the BLEU score reaches 93.36%, an increase of 5.05% over existing research; the average Levenshtein edit distance is 2.9, and the average tree edit distance is 2.14.

(3) The interface tree has a strong hierarchical structure, yet self-attention treats all input feature information as elements of the same level, which makes it difficult to learn the hierarchical relationships between controls. To further improve the generated interface trees, this thesis proposes prior-memory self-attention: a persistent memory module is added to each encoder layer to learn prior knowledge of the features. Attention is computed over this memory to obtain the layer's prior information, which is passed to all subsequent higher layers and participates in their attention computations, strengthening the model's ability to understand image features. With the improved model, all metrics improve significantly: the exact match rate reaches 72.22%, the BLEU score reaches 93.8%, the average Levenshtein edit distance is 2.75, and the average tree edit distance is 2.
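To make the evaluation metrics concrete, the following is a minimal sketch of how the exact match rate and the Levenshtein edit distance described above could be computed on interface trees serialized as token sequences. The serialization scheme and the Android control names are illustrative assumptions, not the thesis's actual data format.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two token sequences."""
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i  # delete all of a[:i]
    for j in range(n + 1):
        dp[0][j] = j  # insert all of b[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution
    return dp[m][n]

# Hypothetical interface trees serialized as token sequences, with parentheses
# marking container nesting (containers as internal nodes, components as leaves).
predicted = ["LinearLayout", "(", "ImageView", "TextView", ")"]
reference = ["LinearLayout", "(", "ImageView", "Button", "TextView", ")"]

exact_match = predicted == reference          # False: the trees differ
distance = levenshtein(predicted, reference)  # 1: one missing "Button" node
```

Averaging `exact_match` over a test set gives the exact match rate, while averaging `distance` gives the average Levenshtein edit distance reported above; the tree edit distance additionally accounts for the nesting structure rather than the flat token sequence.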
Keywords/Search Tags:Deep Learning, Interface Tree, GUI Hierarchy, User Interface, Self-Attention