| Web data has become a major source of information with great potential value in many areas. Current research and applications such as recommender systems, opinion mining and analysis, and data integration systems are built on structured web data. Finding and obtaining the required information from the massive, complex web quickly and efficiently, so that it can be mined in depth for its potential value, has become a challenging technical problem as well as a practical and meaningful research topic. To meet these needs, accurate web data extraction has become an active research field. Web data extraction is the process of obtaining information from semi-structured or unstructured web pages and converting it into structured data for mining and use. The main contents of this paper are as follows. Current information extraction methods and extraction models are analyzed and compared. Extraction templates and the generation of extraction rules based on user interaction are studied. Several types of web elements and navigation elements for extraction are designed, and an XPath-based approach is used to locate and identify these elements. A web data extraction system is designed and implemented using the Qt development framework, Python, and JavaScript; the system embeds a WebKit browser engine for web content rendering, dynamic Ajax loading, and user interaction. The experimental results show that the accurate web data extraction system implemented in this study can meet the demands of various types of sites, such as news sites, e-commerce sites, and Weibo (microblog) sites. The extracted data can be saved in a variety of structured formats (databases, Excel, formatted text files), and the system achieves high extraction efficiency and accuracy. |
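
The XPath-based location of page elements and the conversion to structured output described above can be illustrated with a minimal sketch. It is not the system's implementation: it uses only the Python standard library rather than Qt and WebKit, and the page fragment, the template layout, and the names `PAGE`, `TEMPLATE`, `extract`, and `to_csv` are all hypothetical. Note that `xml.etree.ElementTree` supports only a subset of XPath and requires well-formed markup, whereas a real extractor would query the DOM rendered by the browser engine.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment standing in for a rendered news page.
PAGE = """
<html>
  <body>
    <div class="news">
      <h2 class="title">First headline</h2>
      <span class="date">2013-05-01</span>
    </div>
    <div class="news">
      <h2 class="title">Second headline</h2>
      <span class="date">2013-05-02</span>
    </div>
  </body>
</html>
"""

# Hypothetical extraction template: one XPath for the repeating record,
# plus one relative XPath per field inside each record.
TEMPLATE = {
    "record": ".//div[@class='news']",
    "fields": {
        "title": "./h2[@class='title']",
        "date": "./span[@class='date']",
    },
}


def extract(page, template):
    """Apply the template to the page and return a list of field dicts."""
    root = ET.fromstring(page)
    rows = []
    for node in root.findall(template["record"]):
        row = {}
        for name, path in template["fields"].items():
            found = node.find(path)
            # Missing elements yield an empty string instead of raising.
            row[name] = found.text.strip() if found is not None and found.text else ""
        rows.append(row)
    return rows


def to_csv(rows, fieldnames):
    """Serialize extracted rows as CSV text, one possible structured format."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = extract(PAGE, TEMPLATE)
csv_text = to_csv(rows, ["title", "date"])
```

Separating the record path from the per-field paths keeps the template independent of the extraction code, so in principle a rule generated from user interaction would only need to change the template.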