Research On Public Opinion Hot Words Extraction And Classification Based On Campus Network Traffic

Posted on: 2019-01-26
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhen
GTID: 2428330545457139
Subject: Systems analysis and integration
Abstract/Summary:
With the rapid development of campus networks, Internet public opinion in colleges and universities has gradually become a hot topic. Ordinary public opinion systems focus on general public attitudes and opinions on social events, so it is hard to tie the opinions they find to a specific physical area such as a university. Extracting public opinion hot words from campus network traffic can reveal the hotspot issues of a campus. Analysis of this massive network information can also reveal the emotions and attitudes of teachers and students toward public events. Public opinion analysis within a university is therefore of great significance for improving campus management and maintaining a harmonious campus. Public opinion hot words extraction based on campus network traffic needs to address the following challenges: network traffic collection and archiving, restoration and reconstruction of public opinion data from network traffic, and public opinion hot words extraction and classification. The main contents of this thesis are as follows:
(1) We use the open source network traffic monitoring tool Bro to capture and archive the campus network traffic of Hubei University. We solved the challenging problem of high-speed, real-time acquisition of IPv4 and IPv6 network traffic. The captured data was transformed into log files in order to store and archive large-scale network traffic.
(2) We resolve the problem of public opinion data restoration and reconstruction in two ways: first, we use Bro to obtain the HTTP pages directly; second, the HTTPS pages were obtained using a common crawler application framework named Scrapy. The HTTPS pages and the HTTP pages were then consolidated and rebuilt.
(3) The main content of each web page was extracted, and then duplicate removal, word segmentation, keyword extraction, classification and other processing were carried out. The text extraction algorithm was improved, and the analysis results were presented in the campus network traffic analysis system.
(4) I also participated in the design and implementation of a campus network traffic analysis system. Based on the classification of the collected network traffic, the public opinion data reconstruction and public opinion hot words extraction were integrated into the network traffic analysis system as a subsystem.
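As a rough illustration of the duplicate removal and keyword extraction steps in item (3), the following is a minimal sketch only. The abstract does not say which algorithms the thesis uses, so the hash-based deduplication and the TF-IDF scoring here are assumptions, and the names `dedupe`, `tokenize` and `hot_words` are hypothetical. The whitespace tokenizer is a placeholder: Chinese page text would need a real word segmenter in its place.

```python
import hashlib
import math
from collections import Counter


def dedupe(pages):
    """Drop pages whose whitespace-normalized text has already been seen."""
    seen, unique = set(), []
    for text in pages:
        normalized = " ".join(text.split())
        digest = hashlib.sha1(normalized.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(text)
    return unique


def tokenize(text):
    """Placeholder segmentation (assumption): lowercase whitespace split."""
    return [w.lower() for w in text.split() if len(w) > 1]


def hot_words(pages, top_k=5):
    """Rank words by summed TF-IDF over the deduplicated pages."""
    docs = [tokenize(p) for p in dedupe(pages)]
    n = len(docs)

    # Document frequency: number of pages in which each word appears.
    df = Counter()
    for doc in docs:
        df.update(set(doc))

    scores = Counter()
    for doc in docs:
        tf = Counter(doc)
        for word, count in tf.items():
            # Smoothed inverse document frequency.
            idf = math.log((1 + n) / (1 + df[word])) + 1
            scores[word] += (count / len(doc)) * idf
    return [word for word, _ in scores.most_common(top_k)]


pages = [
    "exam schedule released",
    "exam  schedule released",   # duplicate after whitespace normalization
    "library exam hours",
    "cafeteria menu",
]
print(hot_words(pages, top_k=3))
```

Summing TF-IDF across pages favors words that are both frequent and spread over several pages, which is one plausible reading of "hot"; a production system would likely also weight by time window and by page popularity in the traffic logs.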
Keywords/Search Tags:network traffic, capture and classification, public opinion hot words extraction