This paper studies how to automatically extract posting information from the topic pages of bulletin board systems (BBS). Traditional solutions to this problem rely primarily on analyzing the HTML DOM trees and tag structures of pages, and thus depend heavily on the HTML standard. Their extraction accuracy is strongly influenced by whether the page is well formatted, and the approach may have to change whenever the version of the script language evolves. Here, a language-independent technique is proposed. Our solution performs the extraction based solely on the visual information of topic pages. We summarize the visual features of BBS postings, which guide the entire extraction process. The extraction is carried out in three steps: first, construct the visual block tree of the topic page; then, locate the posting region in the tree; and finally, extract each posting from the posting region. Experimental results indicate that the visual-feature-based approach can achieve high extraction accuracy. The study combines BBS data mining with visual-feature parsing of web pages, and is significant for the resource integration and social administration of BBS.
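To make the three-step pipeline concrete, the following is a minimal Python sketch rather than the implementation evaluated in this paper. It assumes the visual block tree has already been produced by a rendering engine, and it stands in for the visual features of postings with one assumed cue: a posting region is a block whose children share a left edge and width and are stacked vertically. The names `Block`, `looks_like_posting_list`, `locate_posting_region`, `extract_postings`, and `build_demo_page`, as well as the pixel tolerance, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Block:
    """A rendered rectangle in the visual block tree (hypothetical structure)."""
    x: int
    y: int
    width: int
    height: int
    text: str = ""
    children: List["Block"] = field(default_factory=list)


def looks_like_posting_list(block: Block, tol: int = 5, min_posts: int = 2) -> bool:
    """Assumed cue: children share a left edge and width and stack vertically."""
    kids = sorted(block.children, key=lambda b: b.y)
    if len(kids) < min_posts:
        return False
    first = kids[0]
    aligned = all(
        abs(k.x - first.x) <= tol and abs(k.width - first.width) <= tol
        for k in kids
    )
    stacked = all(a.y + a.height <= b.y + tol for a, b in zip(kids, kids[1:]))
    return aligned and stacked


def locate_posting_region(root: Block) -> Optional[Block]:
    """Step 2: among candidate blocks, pick the one with the most children."""
    best: Optional[Block] = None
    stack = [root]
    while stack:
        node = stack.pop()
        if looks_like_posting_list(node):
            if best is None or len(node.children) > len(best.children):
                best = node
        stack.extend(node.children)
    return best


def collect_text(block: Block) -> str:
    """Join the text of a block and its descendants in reading order."""
    parts = [block.text] if block.text else []
    for child in sorted(block.children, key=lambda b: (b.y, b.x)):
        parts.append(collect_text(child))
    return " ".join(p for p in parts if p)


def extract_postings(root: Block) -> List[str]:
    """Step 3: return one text string per posting, from top to bottom."""
    region = locate_posting_region(root)
    if region is None:
        return []
    return [collect_text(p) for p in sorted(region.children, key=lambda b: b.y)]


def build_demo_page() -> Block:
    """Step 1 stand-in: a hand-built block tree for a tiny two-post page."""
    post1 = Block(100, 100, 800, 200, children=[
        Block(100, 100, 150, 200, "alice"),
        Block(250, 100, 650, 200, "First post"),
    ])
    post2 = Block(100, 300, 800, 200, children=[
        Block(100, 300, 150, 200, "bob"),
        Block(250, 300, 650, 200, "Reply"),
    ])
    region = Block(100, 100, 800, 600, children=[post1, post2])
    header = Block(0, 0, 1000, 100, "Forum header")
    footer = Block(0, 700, 1000, 100, "Footer")
    return Block(0, 0, 1000, 800, children=[header, region, footer])
```

Calling `extract_postings(build_demo_page())` yields one string per posting. Because the sketch inspects only block geometry and text, it never reads tag names, which is the sense in which a vision-based approach is independent of the markup language; a real system would need richer visual features than this single alignment cue.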