Identify And Extract Web Information And The Emergence Mode

Posted on:2006-06-16

Degree:Master

Type:Thesis

Country:China

Candidate:Q Lei

Full Text:PDF

GTID:2208360152491714

Subject:Computer application technology

Abstract/Summary:

With the rapid growth and openness of the Internet, the amount of available information has increased greatly, and the Web has become an indispensable information source. A large amount of information on the Web describes relations between entities, and much valuable information is hidden in those relations. However, today's search engines, which rely on keyword matching, lack the ability to manipulate and understand knowledge, so they cannot discern relations on the Web.

In this paper, we take XML, a new standard for publishing and exchanging information on the Web, as the object of our research, and propose a method for mining relations and patterns in XML documents on the Web. The method first collects XML documents according to the user's requirements. It then identifies the target XML files, those containing the relations the user requires, by calculating the similarity between XML documents. Finally, it establishes the user's search pattern and uses a pattern-matching algorithm to extract all relation occurrences from the target documents.

Experimental results show that the similarity calculation method identifies target XML documents with good performance. In addition, the way we represent the user's requirements, together with the pattern-matching algorithm we adopt, extracts most of the target relations from the given XML documents accurately.
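The two core steps described in the abstract, similarity-based identification of target documents and pattern-based extraction of relation occurrences, can be illustrated with a minimal Python sketch. The abstract does not specify the similarity measure or the pattern representation, so everything here is an assumption made for illustration: similarity is taken to be the Jaccard overlap of root-to-node tag paths, a user pattern is taken to be a record tag plus the child tags it must contain, and the function names and sample documents are hypothetical.

```python
import xml.etree.ElementTree as ET


def tag_paths(xml_text):
    """Return the set of root-to-node tag paths in an XML document."""
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(node, prefix):
        path = prefix + "/" + node.tag
        paths.add(path)
        for child in node:
            walk(child, path)

    walk(root, "")
    return paths


def path_similarity(xml_a, xml_b):
    """Jaccard overlap of tag-path sets (an assumed measure, not the thesis's)."""
    a, b = tag_paths(xml_a), tag_paths(xml_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def extract_relations(xml_text, record_tag, fields):
    """Return one tuple per record_tag element that has every child in fields."""
    root = ET.fromstring(xml_text)
    found = []
    for record in root.iter(record_tag):
        values = [record.findtext(field) for field in fields]
        # Skip records that are missing any field required by the pattern.
        if all(value is not None for value in values):
            found.append(tuple(value.strip() for value in values))
    return found


# Hypothetical documents: a user-supplied sample and two collected candidates.
SAMPLE = (
    "<library><book><title>Dune</title><author>Herbert</author></book></library>"
)
CANDIDATE = (
    "<library>"
    "<book><title>Ulysses</title><author>Joyce</author><year>1922</year></book>"
    "<book><title>Untitled</title></book>"
    "</library>"
)
UNRELATED = "<feed><entry><id>1</id></entry></feed>"

if __name__ == "__main__":
    print(path_similarity(SAMPLE, CANDIDATE))
    print(path_similarity(SAMPLE, UNRELATED))
    print(extract_relations(CANDIDATE, "book", ("title", "author")))
```

In this sketch a collected document would be treated as a target when its similarity to the user's sample exceeds some chosen threshold, after which the extraction step is run on it. Records that lack a required field, such as the second book in the candidate document, are skipped.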

Keywords/Search Tags:

relations, XML similarity, pattern matching, data extracting

Related items

1	The Database Schema Research And Matching Method
2	Extracting relations from large text collections
3	Automatic Extraction Of Conceptual Relations For Constructing Domain-Specific Ontology
4	Similarity Query And Pattern Mining On Data Streams
5	Study On The Calculation Methods Of Similarity For Spacial Scenes Consisted Of Vector Area Objects
6	Study On Template Matching Algorithm Based On Extracting Of Image Contour In Robot Vision
7	Research Of Web Service Discovery Based On Semantics
8	A Method For Extracting Three Kinds Of Implicit Quantity Relations From Algebra Problems
9	A simple and dynamic data structure for pattern matching in texts
10	Research On XML Pattern Matching Based On Data Stream Environment