
Design And Implementation Of An Accessible Summary Mobile News System Based On Automatic Summarization

Posted on: 2018-11-19 | Degree: Master | Type: Thesis
Country: China | Candidate: X L Wang | Full Text: PDF
GTID: 2358330515459789 | Subject: Computer Science and Technology
Abstract/Summary:
With the development of the mobile Internet and the growing popularity of mobile devices, people obtain news through increasingly diverse channels. Visually impaired people, however, rely mainly on hearing and touch to get information from the outside world, and cannot use touchscreen phones and tablets as easily as sighted users. Their access to news is therefore limited and offers little choice, and very few news applications target or are optimized for them. Developing accessible news software is thus of great significance. In this context, our laboratory cooperates with China Braille Press to develop a news system accessible to visually impaired people.

First, we propose a news crawler that supports crawling dynamic pages, to ensure the comprehensiveness and diversity of the news data. For URL crawling, the crawler uses the headless (GUI-less) browser HtmlUnit to trigger script events by simulating clicks and scrolling, which solves the problem that URLs in dynamic pages are difficult to obtain. For URL denoising, this paper designs a method based on regular expressions that effectively removes non-news links, improving crawler efficiency and avoiding wasted resources. In the final stage, news data extraction, we propose and implement a template-based news extraction method with high accuracy.

Second, to improve users' reading efficiency, we propose a new automatic summarization method based on TextRank. The classical TextRank algorithm is implemented with BM25 as the sentence similarity measure. We then use the similarity between each sentence and the news headline and subtitles to adjust the sentence scores computed by TextRank, which takes the structural effect of the headline and subtitles into account. The experimental results show that the improved TextRank method performs better than the classical one. Finally, we implement the news system based on automatic summarization as an accessible Android application.
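
As an illustration of the regular-expression URL denoising step described in the abstract, a minimal Python sketch might look like the following. The abstract does not list the actual rules, so both patterns, the function names `is_news_url` and `denoise`, and the assumption that article URLs embed a date and end in `.html` or `.shtml` are hypothetical.

```python
import re

# Hypothetical patterns: the thesis abstract does not specify its actual rules.
# Assumption: news article URLs embed a date and end in .htm/.html/.shtml.
NEWS_URL = re.compile(r"/\d{4}[-/]?\d{2}[-/]?\d{2}/[\w-]+\.s?html?$")
# Assumption: links containing these path segments are not news articles.
NOISE_URL = re.compile(r"/(ads?|login|register|about|video|live)(/|$)", re.IGNORECASE)


def is_news_url(url: str) -> bool:
    """Return True if the URL looks like a news article under the assumed rules."""
    path = url.split("?", 1)[0].split("#", 1)[0]  # ignore query string and fragment
    return bool(NEWS_URL.search(path)) and not NOISE_URL.search(path)


def denoise(urls):
    """Keep likely news links, dropping duplicates while preserving order."""
    seen = set()
    kept = []
    for url in urls:
        if url not in seen and is_news_url(url):
            seen.add(url)
            kept.append(url)
    return kept
```

Filtering links before they are fetched is what saves crawler resources: a rejected URL never costs a page download or a parse.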
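
The summarization idea in the abstract (TextRank over a BM25 sentence graph, with scores adjusted by similarity to the headline and subtitles) can be sketched in Python as below. This is not the thesis's implementation: the abstract gives no formulas, so the smoothed IDF variant, the Jaccard overlap used for title similarity, the multiplicative boost, and the parameters `k1`, `b`, `d`, and `weight` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokenizer (assumption: whitespace-delimited text)."""
    return re.findall(r"\w+", text.lower())


def bm25_matrix(docs, k1=1.5, b=0.75):
    """sim[i][j] = BM25 score of sentence j when sentence i is the query."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n or 1.0
    df = Counter(term for d in docs for term in set(d))
    # Smoothed, always-positive IDF variant (a choice made for this sketch).
    idf = {t: math.log(1 + (n - f + 0.5) / (f + 0.5)) for t, f in df.items()}
    tfs = [Counter(d) for d in docs]
    sim = [[0.0] * n for _ in range(n)]
    for i, query in enumerate(docs):
        for j in range(n):
            if i == j:
                continue  # no self-loops in the sentence graph
            norm = k1 * (1 - b + b * len(docs[j]) / avgdl)
            sim[i][j] = sum(
                idf[t] * tfs[j][t] * (k1 + 1) / (tfs[j][t] + norm)
                for t in set(query) if tfs[j][t]
            )
    return sim


def textrank(sim, d=0.85, iterations=50):
    """Fixed-count power iteration over the weighted sentence graph."""
    n = len(sim)
    out_weight = [sum(row) for row in sim]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - d) + d * sum(
                sim[j][i] / out_weight[j] * scores[j]
                for j in range(n) if out_weight[j] > 0
            )
            for i in range(n)
        ]
    return scores


def overlap(a, b):
    """Jaccard overlap between two token lists (an assumed similarity measure)."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa and sb else 0.0


def summarize(sentences, headline, subtitles=(), top_k=2, weight=0.5):
    """Rank sentences by TextRank, boosted by similarity to headline and subtitles."""
    if not sentences:
        return []
    docs = [tokenize(s) for s in sentences]
    base = textrank(bm25_matrix(docs))
    titles = [tokenize(t) for t in (headline, *subtitles)]
    # Hypothetical adjustment: scale each score by its best title similarity.
    adjusted = [
        score * (1 + weight * max(overlap(doc, t) for t in titles))
        for score, doc in zip(base, docs)
    ]
    top = sorted(range(len(sentences)), key=lambda i: adjusted[i], reverse=True)[:top_k]
    return [sentences[i] for i in sorted(top)]  # keep original sentence order
```

Calling `summarize(sentences, headline, subtitles, top_k=3)` returns the three highest-scoring sentences in their original order. Setting `weight=0` disables the title boost and ranks by plain BM25-based TextRank, which gives a baseline to compare against.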
Keywords/Search Tags: news, web crawler, automatic summarization, accessibility