
Research On The Detection Method Of Network Text Target Words Based On Time-space Scanning

Posted on: 2020-01-14
Degree: Master
Type: Thesis
Country: China
Candidate: L Y Z Gao
Full Text: PDF
GTID: 2438330596997517
Subject: Electronic and communication engineering
Abstract/Summary:
With the rapid advance of modernization and urbanization, new Internet media have become a new channel for disseminating information. Online public opinion now spreads quickly, in large volumes and in complex data types, and we-media text has become a prominent carrier. At present, the analysis of online public opinion is mostly limited to word-frequency statistics or to multi-source joint analysis of a single, specific website. It does not analyze the time and space dimensions jointly, which makes accurate localization and early warning difficult. The space-time scan statistic, by contrast, scans both dimensions at the same time, and by varying the scanning window dynamically it can support early-warning analysis. It therefore has considerable potential for analyzing textual online public opinion. Many recent studies have also shown that the space-time scan statistic has stronger data-analysis ability than other methods.
In this thesis, the space-time scan statistic is used to construct a scanning algorithm for online text public opinion. Space-time scanning has traditionally been applied to diseases in the medical field. In view of the characteristics of online text public opinion, the scanning structure and the data simulation are improved during application. The thesis focuses on combining online text public opinion with space-time scanning and mainly does the following work:
(1) A web crawler is used to extract text information from newspaper network platforms, and the characteristics of the network text are analyzed. A database is built from these characteristics, and the text is then split up and segmented into words. The statistics of each phrase form a target-word thesaurus, from which phrases are extracted as the query words for scanning.
(2) The advantages and disadvantages of the space-time scan statistic and of other models are studied. In light of the characteristics of network text, the space-time scan statistic, which combines time and space, is chosen to build the scanning model of online public opinion. The modeling covers the data source, the spatial distance, the use of the generalized likelihood ratio function and the parameter calculation method. The validity of the scanning model is confirmed by a worked example that runs the whole model on assumed data.
(3) The whole experimental system was built and the experimental code was written. A data extraction module and a data matrix calculation module were constructed, a data sorting and cleaning module was added according to the actual situation, and the scanning hierarchy was defined as a triple loop over time, space and scanning range. After studying methods for judging cluster significance, the significance of the data was determined by Monte Carlo simulation. To address the problem of reorganizing the simulated data, two data-matrix rearrangement mechanisms were constructed: full random rearrangement and relevance rearrangement. The experimental system was then tested on crawled real data, which gave the actual test results of the space-time-scan system for online text public opinion.
The experiments show that the proposed space-time-scan model is effective for analyzing online public opinion text on Internet platforms and clearly reveals abnormal phrases in network text. The improvements made to the actual scanning process markedly improve the efficiency of the experimental system and achieve the goal of real-time analysis of online public opinion text.
Keywords/Search Tags:Network Public Opinion, Spatiotemporal Scanning Statistics, Data Modeling, Monte Carlo
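The abstract does not give the thesis's formulas or code, so the following is only a minimal sketch of the general technique it names: a triple loop over time window, spatial centre and scan range, scored with a Poisson generalized likelihood ratio, with significance estimated by Monte Carlo using a full random rearrangement of the data matrix. Every name here (`llr`, `scan`, `monte_carlo_p`, `max_len`, `max_k`) is hypothetical, and two simplifications are assumed: each time-place cell has the same expected count, and the scan range is the number of nearest sites rather than a physical radius. The thesis's relevance rearrangement is not reproduced.

```python
import math
import random


def llr(c, e, total):
    """Poisson log-likelihood ratio for a window with observed count c
    and expected count e, out of `total` occurrences overall."""
    if e <= 0 or c <= e:
        return 0.0  # only windows with an excess of occurrences score
    inside = c * math.log(c / e)
    outside = 0.0 if c == total else (total - c) * math.log((total - c) / (total - e))
    return inside + outside


def scan(counts, coords, max_len=3, max_k=3):
    """counts[t][s] = occurrences of one target word at time t and site s.
    Returns (best_score, (t_start, t_end, centre, k)) over all cylinders."""
    n_t, n_s = len(counts), len(counts[0])
    total = sum(map(sum, counts))
    if total == 0:
        return 0.0, None
    # For each site, all sites ordered by distance (the site itself first).
    order = [sorted(range(n_s), key=lambda j: math.dist(coords[i], coords[j]))
             for i in range(n_s)]
    best, best_win = 0.0, None
    for t0 in range(n_t):                                  # loop 1: time window
        for t1 in range(t0, min(n_t, t0 + max_len)):
            for centre in range(n_s):                      # loop 2: spatial centre
                for k in range(1, min(max_k, n_s) + 1):    # loop 3: scan range
                    sites = order[centre][:k]
                    c = sum(counts[t][s] for t in range(t0, t1 + 1) for s in sites)
                    # Assumption: uniform baseline, so the expected count is
                    # proportional to the number of cells in the cylinder.
                    e = total * (t1 - t0 + 1) * k / (n_t * n_s)
                    score = llr(c, e, total)
                    if score > best:
                        best, best_win = score, (t0, t1, centre, k)
    return best, best_win


def monte_carlo_p(counts, coords, n_sim=99, seed=0, **kw):
    """Estimate a p-value by shuffling all cells of the count matrix
    (full random rearrangement) and rescanning each shuffled matrix."""
    rng = random.Random(seed)
    observed, window = scan(counts, coords, **kw)
    n_s = len(counts[0])
    flat = [v for row in counts for v in row]
    exceed = 0
    for _ in range(n_sim):
        rng.shuffle(flat)
        sim = [flat[i:i + n_s] for i in range(0, len(flat), n_s)]
        if scan(sim, coords, **kw)[0] >= observed:
            exceed += 1
    return observed, window, (exceed + 1) / (n_sim + 1)


# Made-up data: 5 time steps, 4 sites, a burst at sites 0 and 1 in steps 3-4.
coords = [(0, 0), (1, 0), (10, 0), (11, 0)]
counts = [[2, 2, 2, 2] for _ in range(5)]
for t in (3, 4):
    counts[t][0] = counts[t][1] = 20
score, window, p_value = monte_carlo_p(counts, coords)
print(window, round(score, 2), p_value)
```

Shuffling cells is a valid null model only under the uniform-baseline assumption above. With uneven document volumes per site or day, the expected count `e` would need to reflect those volumes, and the simulated matrices would be drawn accordingly.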