Research Of Network Hotspot Content Classification Based On Improved Singular Value Decomposition And Cosine Theorem

Posted on: 2018-08-19
Degree: Master
Type: Thesis
Country: China
Candidate: W Liu
GTID: 2428330542976972
Subject: Information management and information systems
Abstract/Summary:
With the rapid development of networks and the explosion of information, people's demand for accessible, available and readable information has driven the emergence and development of hotspot discovery. Network hotspots are regarded as a foundation of mass media and as a barometer of social development and public opinion, and they exert a certain guiding influence on people's ideas and consciousness. How to improve the efficiency and effectiveness of network hotspot detection and classification in terms of coverage, accuracy and timeliness has therefore become both a focus and a difficulty in the field of text clustering and classification.

Singular value decomposition (SVD) and the cosine theorem are important mathematical models that have been widely used in text clustering and classification, information retrieval, image retrieval and face recognition. In text clustering and classification in particular, SVD can effectively address the problem of excessive text processing and highlight semantic structure, so that the processed documents come closer to the results of human processing. Because it is simple, easy to understand and effective at distinguishing angles, the cosine theorem is often used as a similarity measure.

Based on SVD and the cosine theorem, we improve the method for selecting SVD's dimensionality reduction factor K and correct the numerical insensitivity of the cosine similarity calculation, then combine the two into a hierarchical hybrid clustering algorithm. We then construct a network hotspot content classification system based on SVD and the cosine theorem. The system is intended to enhance the accuracy of text clustering and classification and to improve the reliability of hotspot extraction at the semantic level through four stages: document acquisition and processing, construction of the term-document model, hot content extraction and hot content classification. In addition, targeting the properties of hotspots, the study introduces the factors of new words and phrases and of word length at the keyword extraction stage, so that keywords represent hotspots better. At the same time, the test text is converted into a new semantic space through the dual use of SVD to improve overall system performance. The research enriches and develops the application of SVD and the cosine theorem in hotspot classification, and it achieves the transition from syntax-based hotspot clustering to semantics-based hotspot clustering, in order to extend the sources of text and provide people with accurate and efficient information resources.

Finally, we conduct training and test experiments on the network hotspot content classification system using a constructed data set, and evaluate the results with the comprehensive clustering evaluation index, the C-F value. The experimental results show that SVD can, to a certain degree, extract the semantic relations between words and thereby bring out the semantic structure of the text. Compared with a single algorithm, the hybrid clustering method shows a significant improvement in performance. Research on network hotspot content classification based on SVD and the cosine theorem is therefore of great significance for improving the efficiency and effectiveness of traditional hotspot extraction and for promoting the development and improvement of the theoretical system of text clustering, text classification and hotspot discovery.
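
The pipeline the abstract describes (reduce a term-document matrix with SVD, then compare documents by cosine similarity in the reduced space) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy matrix, the function names, and in particular the cumulative-energy rule used to pick K, which is a common heuristic and not the improved selection method of the thesis (the abstract does not specify it). The cosine used here is the plain, unmodified one.

```python
import numpy as np

def choose_rank(singular_values, energy=0.9):
    """Smallest k whose leading squared singular values reach `energy` of the total.

    Assumed heuristic for illustration; not the thesis's K-selection method.
    """
    squared = singular_values ** 2
    cumulative = np.cumsum(squared) / np.sum(squared)
    k = int(np.searchsorted(cumulative, energy)) + 1
    return min(k, len(singular_values))

def lsa_project(term_doc, energy=0.9):
    """Truncated SVD of a (terms x documents) matrix.

    Returns U_k, the kept singular values, and one k-dimensional row per document.
    """
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    k = choose_rank(s, energy)
    doc_coords = Vt[:k, :].T * s[:k]  # row j equals U_k^T applied to column j
    return U[:, :k], s[:k], doc_coords

def fold_in(term_vector, U_k):
    """Map an unseen document's term vector into the same reduced space."""
    return term_vector @ U_k

def cosine(a, b):
    """Plain cosine similarity; returns 0.0 if either vector is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Toy term-document matrix: rows are terms, columns are documents.
# Documents 0-1 share one vocabulary, documents 2-3 share another.
A = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, 1.0],
    [0.0, 0.0, 1.0, 2.0],
])

U_k, s_k, docs = lsa_project(A, energy=0.9)
same_topic = cosine(docs[0], docs[1])
cross_topic = cosine(docs[0], docs[2])

# An unseen document that uses only the first two terms.
new_doc = fold_in(np.array([1.0, 1.0, 0.0, 0.0, 0.0]), U_k)
new_vs_doc0 = cosine(new_doc, docs[0])
new_vs_doc2 = cosine(new_doc, docs[2])
```

On this toy matrix the 0.9 energy threshold keeps two dimensions, and documents with the same vocabulary collapse onto the same direction, so their cosine similarity is 1 even though their raw term counts differ. Folding in a new document with `fold_in` is one simple reading of converting the test text into the semantic space; the thesis's actual procedure may differ.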
Keywords/Search Tags: singular value decomposition (SVD), cosine theorem, hotspot detection, text classification, text clustering