In this thesis, we focus on a typical kind of structured web information, commonly called ranking lists, and on its automatic extraction. Compared with other web structures, ranking lists often contain more information, cover a richer variety of topics, and are of higher quality. Ranking lists can serve as an important data source for general-domain knowledge bases and for question answering systems. We propose an efficient, end-to-end ranking list extraction algorithm that is based on both visual and semantic information. Using this algorithm, we extract more than 1.7 million ranking lists from 1.7 billion web pages, with 92.0% accuracy and a 72.3% recall rate.