
The Research And Design On Information Extraction And Gathering Model Based On XML

Posted on: 2008-12-18
Degree: Master
Type: Thesis
Country: China
Candidate: L Qin
Full Text: PDF
GTID: 2178360215988055
Subject: Computer application technology
Abstract/Summary:
With the rapid development of the Internet in recent years, the Web has grown enormously and become a platform for sharing information. How to obtain information from the Web quickly and efficiently is a problem that has long troubled Internet users. Against this background, Web information extraction emerged. It inherits and develops key techniques from the field of Information Extraction, which has been established for many years. Meanwhile, since its introduction, XML has become the de facto standard for representing information on the Internet. Web information extraction therefore plays a double role: it combines traditional information extraction techniques with XML techniques when extracting information.

The thesis first studies information extraction and XML techniques and derives a general tree-structure rule that fits the XML structure well. With this rule, Web data can be extracted into XML documents following a given pattern. Extracted Web information is of little value if users cannot make use of it freely, so data integration is also an important sub-process. One topic the author studies is how to map the extracted data accurately to a target database. The author also presents a Web query model based on XML. In sum, the work combines Web information extraction with XML storage and access techniques in order to maximize the reuse of Web data.

The contribution of the thesis is a prototype system design for information extraction together with a workable implementation scheme. Because the system adopts an XML-based extraction method, it can meet the demands of information extraction in various fields. Finally, the author evaluates the system's extraction targets through several examples and obtains the expected results.
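
The pipeline summarized above (extract Web data into an XML tree, map it to a target database, then query it) can be illustrated with a minimal Python sketch. This is not the thesis's system: the sample HTML, the `BookExtractor` class, the `book` table, and the fixed `title`/`price` fields are all hypothetical, and a hard-coded class-name match stands in for the general tree-structure rule, which the abstract does not specify.

```python
import sqlite3
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical Web page fragment; the thesis gives no sample data.
HTML = """
<ul>
  <li class="book"><span class="title">XML Basics</span><span class="price">25</span></li>
  <li class="book"><span class="title">Web Mining</span><span class="price">40</span></li>
</ul>
"""

FIELDS = ("title", "price")  # assumed extraction pattern


class BookExtractor(HTMLParser):
    """Walks the HTML tree and copies labelled spans into an XML tree."""

    def __init__(self):
        super().__init__()
        self.root = ET.Element("books")
        self.current = None  # <book> element currently being filled
        self.field = None    # name of the <span> currently open

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "li" and cls == "book":
            self.current = ET.SubElement(self.root, "book")
        elif tag == "span" and self.current is not None and cls in FIELDS:
            self.field = cls

    def handle_data(self, data):
        # Text outside a recognised span (e.g. whitespace) is ignored.
        if self.field is not None:
            ET.SubElement(self.current, self.field).text = data.strip()
            self.field = None

    def handle_endtag(self, tag):
        if tag == "span":
            self.field = None
        elif tag == "li":
            self.current = None


def extract_to_xml(html):
    """Extraction step: HTML text in, XML element tree out."""
    parser = BookExtractor()
    parser.feed(html)
    parser.close()
    return parser.root


def load_into_db(root, conn):
    """Mapping step: each <book> element becomes one row of the target table."""
    conn.execute("CREATE TABLE book (title TEXT, price REAL)")
    rows = [(b.findtext("title"), float(b.findtext("price")))
            for b in root.findall("book")]
    conn.executemany("INSERT INTO book VALUES (?, ?)", rows)


root = extract_to_xml(HTML)
conn = sqlite3.connect(":memory:")
load_into_db(root, conn)

# Query step: the same question asked of the XML tree and of the database.
xml_titles = [b.findtext("title") for b in root.findall("book[price='40']")]
db_titles = conn.execute("SELECT title FROM book WHERE price > 30").fetchall()
```

Keeping the XML tree as the intermediate form means the extraction step and the database mapping step can change independently, which is one plausible reading of why the abstract pairs extraction with XML storage and access.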
Keywords/Search Tags: Web information extraction, XML, data integration, mapping