Web Table Search Engine Construction

Posted on: 2014-08-06
Degree: Master
Type: Thesis
Country: China
Candidate: F Zhou
Full Text: PDF
GTID: 2268330392473387
Subject: Computer Science and Technology
Abstract/Summary:
With the rapid development of the web, search engines are becoming more and more important. To handle and mine the growing volume of information on the web, search engines need new sources of support beyond the plain text of web pages.

The Web Table Search Engine System is a search engine that extracts, analyzes, stores, and retrieves the information in web tables according to defined rules, with the aim of providing such a new source of support. Tables are ubiquitous, readable, and well structured. They are well suited to reflecting the structure and content of the web as a whole, can serve as a unit of search, and are easy for computers to read.

Because the Web Table Search Engine System is a kind of search engine, it includes the basic data-processing steps and methods of a traditional search engine. Unlike a traditional search engine, however, it has three additional components: Table Extraction, Table Header Detection, and Table Rank. Each component has its own processes and algorithms designed around the characteristics of tables. This study investigates and implements the following aspects:

1. Table Extraction. An algorithm is implemented that extracts meaningful tables from the web by examining their structure. It is compared with an algorithm that examines table content.

2. Table Header Detection. A new algorithm is designed to determine whether a table has a header and which row or column is the header. The algorithm inspects the rows and columns of a table in terms of structure, content, style, and other aspects, and improves accuracy by combining this inspection with a machine learning algorithm.

3. Table Rank. A new query-independent ranking method, named TableRank and similar to PageRank, is designed. It addresses the problem of ranking objects that lack links.

4. Building the Web Table Search Engine System. Based on the operating conditions of the system, the results of the three components above are statistically analyzed and the designed algorithms are evaluated. The results of running the system show that it can effectively help users find the tables they need better and faster, and learn more about the page and the web as a whole.

Through the construction of the system and the design of the corresponding algorithms, this work explores some particularities of table search and offers a meaningful exploration for related research and tool development.
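The Table Extraction and Table Header Detection steps can be illustrated with a minimal sketch. It is not the algorithm from the thesis, whose details the abstract does not give. The names `TableCollector`, `is_data_table`, `header_row_index`, and `extract_tables` are hypothetical, and the thresholds `min_rows` and `min_cols` are assumptions. The header check uses only one structural cue (a first row made entirely of `<th>` cells), whereas the thesis also inspects content and style and combines the result with machine learning. Cells with `colspan` or `rowspan` are not handled.

```python
from html.parser import HTMLParser


class TableCollector(HTMLParser):
    """Collect top-level <table> elements as rows of (tag, text) cells."""

    def __init__(self):
        super().__init__()
        self.tables = []    # finished top-level tables
        self._rows = None   # rows of the table being read
        self._row = None    # cells of the row being read
        self._cell = None   # [tag, text parts] of the cell being read
        self._depth = 0     # nesting depth of <table> tags

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._depth += 1
            if self._depth == 1:
                self._rows = []
        elif self._depth == 1:
            if tag == "tr":
                self._row = []
            elif tag in ("td", "th") and self._row is not None:
                self._cell = [tag, []]

    def handle_data(self, data):
        # Text inside nested tables (depth > 1) is ignored.
        if self._depth == 1 and self._cell is not None:
            self._cell[1].append(data)

    def handle_endtag(self, tag):
        if tag == "table":
            if self._depth == 1:
                self.tables.append(self._rows)
                self._rows = self._row = self._cell = None
            self._depth = max(0, self._depth - 1)
        elif self._depth == 1:
            if tag in ("td", "th") and self._cell is not None and self._row is not None:
                text = "".join(self._cell[1]).strip()
                self._row.append((self._cell[0], text))
                self._cell = None
            elif tag == "tr" and self._row is not None:
                self._rows.append(self._row)
                self._row = self._cell = None


def is_data_table(rows, min_rows=2, min_cols=2):
    """Structure-only test: enough rows and columns, all rows equally wide."""
    if len(rows) < min_rows:
        return False
    widths = {len(row) for row in rows}
    return len(widths) == 1 and widths.pop() >= min_cols


def header_row_index(rows):
    """Return 0 if the first row consists only of <th> cells, else None."""
    first = rows[0]
    if first and all(tag == "th" for tag, _ in first):
        return 0
    return None


def extract_tables(html):
    """Parse HTML and keep only tables that pass the structural test."""
    parser = TableCollector()
    parser.feed(html)
    parser.close()
    return [rows for rows in parser.tables if is_data_table(rows)]
```

Calling `extract_tables` on a page that contains a one-cell layout table and a two-column data table would keep only the latter, and `header_row_index` would then report whether its first row is marked up as a header.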
Keywords/Search Tags: Search Engine System, Table, Table Rank