Research On Statistical Methods For COVID-19 Epidemic Data Based On Text Analysis

Posted on:2022-09-10

Degree:Master

Type:Thesis

Country:China

Candidate:J Sun

Full Text:PDF

GTID:2514306320468234

Subject:Applied Statistics

Abstract/Summary:

As the novel coronavirus pneumonia (COVID-19) epidemic continues to develop, more and more information becomes available for statistical analysis. The trajectory information of confirmed cases makes up the main body of text that can be studied, and it offers a direction for text mining and exploration. Taking Harbin as an example, this paper performs a cluster analysis of the tracks of confirmed cases and asymptomatic infected persons in the Harbin area as of June 2020. Text clustering can help the region identify existing cases and suspected cases whose tracks overlap with those of newly diagnosed cases; the author aims to provide a scientific method for tracing the virus and quickly identifying suspected cases. The paper adopts a text clustering method based on the vector space model (VSM) and the k-means algorithm. Because the feature vector space obtained after segmenting the track text has too many dimensions, which makes the algorithm too costly, a variance-based feature selection method is used to reduce the feature vectors, which lowers the algorithm complexity and improves the clustering of the text. The effects of Euclidean distance and cosine distance on the clustering of trajectory text are also compared. The results of clustering the case track text show that nearly 70% of the clustering results are interpretable, which indicates that k-means clustering on a vector space model has practical and reference value for clustering case track text. The paper also applies center-of-gravity trajectory analysis to the spatial track of the epidemic in Harbin. The results have research value for tracking the spread of the virus and for giving early warning so that protective measures can be taken in time in areas where the virus is spreading.
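The pipeline named in the abstract (VSM, variance-based feature selection, k-means with Euclidean or cosine distance) can be illustrated with a minimal sketch. This is not the thesis's implementation: the four toy tracks, the place tokens, all function names, the `keep` parameter, and the deterministic farthest-first initialization are assumptions made for illustration, and the track text is assumed to be already segmented into tokens. Averaging members as the center under cosine distance is a common heuristic, not something the abstract specifies.

```python
import math
from collections import Counter


def build_vsm(docs):
    """Represent each tokenized track as a term-frequency vector (a plain VSM)."""
    vocab = sorted({tok for doc in docs for tok in doc})
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append([float(counts[t]) for t in vocab])
    return vocab, vectors


def select_by_variance(vocab, vectors, keep):
    """Keep the `keep` features whose values vary most across documents."""
    n = len(vectors)
    variances = []
    for j in range(len(vocab)):
        col = [v[j] for v in vectors]
        mean = sum(col) / n
        variances.append(sum((x - mean) ** 2 for x in col) / n)
    # Highest variance first; ties are broken by feature index.
    top = sorted(range(len(vocab)), key=lambda j: (-variances[j], j))[:keep]
    top.sort()
    return [vocab[j] for j in top], [[v[j] for j in top] for v in vectors]


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0 or nb == 0:
        return 1.0  # treat an all-zero vector as maximally dissimilar
    return 1.0 - dot / (na * nb)


def kmeans(vectors, k, distance, iters=50):
    """Basic k-means with a pluggable distance function."""
    # Assumption: deterministic farthest-first initialization, chosen only
    # so that the sketch gives the same result on every run.
    centers = [list(vectors[0])]
    while len(centers) < k:
        far = max(vectors, key=lambda v: min(distance(v, c) for c in centers))
        centers.append(list(far))
    labels = None
    for _ in range(iters):
        new_labels = [
            min(range(k), key=lambda c: distance(v, centers[c])) for v in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical, already-segmented case tracks (not data from the thesis).
tracks = [
    ["hospital", "market", "bus_12"],
    ["hospital", "market", "pharmacy"],
    ["airport", "hotel", "metro_1"],
    ["airport", "hotel", "restaurant"],
]

vocab, vectors = build_vsm(tracks)
kept_vocab, reduced = select_by_variance(vocab, vectors, keep=4)
labels_euclidean = kmeans(reduced, k=2, distance=euclidean)
labels_cosine = kmeans(reduced, k=2, distance=cosine_distance)
print(kept_vocab, labels_euclidean, labels_cosine)
```

In this toy case the places visited by only one track have the lowest variance and are dropped, so the two distance measures agree. On real track text with documents of very different lengths, the two measures can diverge, which is the kind of comparison the abstract describes.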

Keywords/Search Tags:

COVID-19, Vector space model, Feature selection, Text clustering

Related items

1	COVID-19 Public Opinion Analysis Based On Crawler Technology And Text Clustering
2	Research On SNP-based Feature Selection And Diagnosis Model For Schizophrenia
3	Support Vector Data Description-based Feature Selection Method And Its Application
4	Interpretable Feature Selection In The Analyses Of COVID-19 And Postoperative Analgesia
5	Multi-label Feature Selection Algorithm Based On Sample Differences
6	Multi-task Feature Selection Algorithm And Its Application For Multimodal Neuro Image
7	Research On Colorectal Cancer Prediction Model Based On Feature Selection
8	Research On Multi-feature Ambiguity Resolution Method For Traditional Chinese Medicine Text Segmentation
9	Research On Coronary Heart Disease Screening Model Based On Ensemble Features Selection
10	Research On COVID-19 Detection Algorithm Based On Convolutional Neural Networ