| With the rapid development of the Internet, it has become an important way to access news and information. How to reach relevant information more conveniently, more comprehensively, and more accurately has become an open issue. As traditional online media with dedicated websites no longer satisfy the needs of users, the notion of a news search engine has emerged. With the growing popularity of mobile phones and the continual improvement of their usability, mobile news search is becoming a trend. This paper designs and implements a text extraction algorithm for HTML news pages based on the characteristics of human vision. The algorithm is based on judging which content is text: factors including the count of Chinese characters, the count of hot words, and the count of hyperlinks are used to determine certain paragraph patterns of text. The text of a news page is then extracted by using the relationships between HTML nodes. The paper analyzes and researches in depth a number of key technologies of a mobile news search engine and realizes a prototype system. It also presents the design of a mobile news search engine and its actual implementation, together with a prototype to which a number of improvements were made in the second work phase according to users' experience. |
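
To make the text-judgment step more concrete, the following is a minimal sketch of how a block of HTML might be scored by the three factors named above: Chinese character count, hot-word count, and hyperlink count. It is not the paper's implementation. The hot-word list, the weights, the thresholds, and the names `BlockCollector`, `looks_like_text`, and `extract_text` are all assumptions introduced for illustration, and the later step that uses the relationships between HTML nodes is omitted.

```python
import re
from html.parser import HTMLParser

# Hypothetical hot words; the paper does not list the ones it uses.
HOT_WORDS = ("记者", "报道", "本报讯")
# Matches one character in the basic CJK Unified Ideographs range.
CJK = re.compile(r"[\u4e00-\u9fff]")


class BlockCollector(HTMLParser):
    """Collect the text and hyperlink count of each block-level element."""

    BLOCK_TAGS = {"p", "div", "td"}

    def __init__(self):
        super().__init__()
        self.blocks = []  # finished blocks: {"text": str, "links": int}
        self._open = []   # stack of blocks still being filled

    def handle_starttag(self, tag, attrs):
        if tag in self.BLOCK_TAGS:
            self._open.append({"text": "", "links": 0})
        elif tag == "a" and self._open:
            self._open[-1]["links"] += 1

    def handle_endtag(self, tag):
        if tag in self.BLOCK_TAGS and self._open:
            self.blocks.append(self._open.pop())

    def handle_data(self, data):
        # Text is attributed to the innermost open block only.
        if self._open:
            self._open[-1]["text"] += data.strip()


def looks_like_text(block, min_score=20, max_links=1):
    """Assumed rule: many Chinese characters or hot words, few hyperlinks."""
    text = block["text"]
    chinese = len(CJK.findall(text))
    hot = sum(text.count(word) for word in HOT_WORDS)
    return chinese + 5 * hot >= min_score and block["links"] <= max_links


def extract_text(html):
    """Return the text of every block that passes the assumed rule."""
    parser = BlockCollector()
    parser.feed(html)
    parser.close()
    return "\n".join(b["text"] for b in parser.blocks if looks_like_text(b))
```

Under these assumed thresholds, a navigation bar made of short links is rejected for its link count and low character count, while a long paragraph of Chinese prose is kept. A real system would need thresholds tuned on actual news pages.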