
Research On Construction And Minimization Algorithm Of Fuzzy Tree Automata

Posted on: 2015-02-19
Degree: Master
Type: Thesis
Country: China
Candidate: D D Sun
GTID: 2298330422984675
Subject: Software engineering
Abstract/Summary:
As economic conditions improve and science and technology advance, the information on the Web is growing ever larger and more complex. How to extract the required information from the mass of information in Web pages has become a research hotspot. However, Web page data exhibit semantic overlap and semantic fuzziness, so traditional information extraction techniques cannot meet users' needs. To address this problem, this thesis studies how to construct fuzzy tree automata for Web information extraction, and proposes a construction method for fuzzy tree automata together with minimization algorithms for tree automata and fuzzy tree automata.

The main work and techniques of the thesis are as follows:

(1) Construction of the unranked tree automaton model. Based on the tree structure of a Web page, the page is passed through an HTML and XML DOM parser to generate a set of unranked DOM trees. To handle the uncertainty in the number of nodes of an unranked tree, a (k,l)-contextual tree is constructed from the DOM tree set, which controls the height and width of the tree. A two-way transition function is then used to construct the unranked tree automaton.

(2) Processing fuzzy information with rough set techniques and constructing fuzzy tree automata. A tolerance relation model suited to processing the fuzzy information in Web pages is built on rough set theory. Combining this tolerance relation model with the upper approximation of rough set theory expands the “main information”, handles fuzzy information better, and increases the accuracy of information extraction. On this basis, the thesis presents the construction process for fuzzy tree automata, and experiments verify the effectiveness of the fuzzy tree automaton model for information extraction.

(3) Minimization of tree automata and fuzzy tree automata. The difficulty in minimizing tree automata is that classifying the tree states may produce new state strings that must in turn be classified. The thesis addresses this with definitions based on tree operators and with a tracking and marking method. A fuzzy equivalence relation is constructed on the state set. Using forward bisimulation techniques, the maximal forward bisimulation is obtained, and a forward bisimulation algorithm for fuzzy tree automata is proposed. An example shows that the algorithm yields fewer states than the original fuzzy tree automaton.
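
The step in (2), expanding the “main information” through the upper approximation under a tolerance relation, can be illustrated with a minimal sketch. It is not the thesis's implementation. The tolerance relation used here (`shares_word`, two text fragments are tolerant when they share a word), the sample fragments, and all function names are assumptions made only for illustration. The code uses the standard rough set definitions: the upper approximation of a set collects every element whose tolerance class intersects the set, and the lower approximation collects every element whose tolerance class is contained in it.

```python
from typing import Callable, Set

Relation = Callable[[str, str], bool]


def shares_word(a: str, b: str) -> bool:
    """Hypothetical tolerance relation: the two fragments share a word.

    It is reflexive and symmetric, but in general not transitive.
    """
    return bool(set(a.split()) & set(b.split()))


def tolerance_class(x: str, universe: Set[str], tolerant: Relation) -> Set[str]:
    """All elements of the universe that are tolerant with x."""
    return {y for y in universe if tolerant(x, y)}


def upper_approximation(target: Set[str], universe: Set[str],
                        tolerant: Relation) -> Set[str]:
    """Elements whose tolerance class intersects the target set."""
    return {x for x in universe
            if tolerance_class(x, universe, tolerant) & target}


def lower_approximation(target: Set[str], universe: Set[str],
                        tolerant: Relation) -> Set[str]:
    """Elements whose tolerance class lies entirely inside the target set."""
    return {x for x in universe
            if tolerance_class(x, universe, tolerant) <= target}


# Illustrative data only: text fragments that might label DOM nodes.
UNIVERSE = {"price", "sale price", "list price",
            "shipping", "free shipping", "title"}
MAIN_INFO = {"price"}

if __name__ == "__main__":
    # prints ['list price', 'price', 'sale price']
    print(sorted(upper_approximation(MAIN_INFO, UNIVERSE, shares_word)))
    # prints []
    print(sorted(lower_approximation(MAIN_INFO, UNIVERSE, shares_word)))
```

In this toy example the upper approximation grows the single fragment "price" to include "sale price" and "list price", which is the kind of expansion the abstract describes. How the thesis actually defines its tolerance relation over Web page content is not stated in the abstract.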
Keywords/Search Tags: tree automata, fuzzy tree automata, automata minimization, rough set, information extraction