In the field of computer analysis of document images, the problems of physical and logical layout analysis have been approached through a variety of heuristic, rule-based, and grammar-based techniques. In this paper we investigate the effectiveness of statistical pattern recognition algorithms for solving these two problems. Using a new software environment for manual page image segmentation and labeling, we created a dataset of 932 page images from academic journals. Several physical layout analysis algorithms were implemented, including a new algorithm based on a logistic regression classifier. Three statistical classifiers were applied to the logical layout analysis problem, with encouraging results. A new model of how ink is laid out on a page was used to develop a prototype system that combines segmentation and labeling. Finally, several applications were investigated and demonstrated with rudimentary implementations. The results indicate that statistical pattern recognition is a promising approach to both problems.