
Research On Spatial And Temporal Information Extraction In Unstructured Text

Posted on: 2018-12-24
Degree: Master
Type: Thesis
Country: China
Candidate: Q X Du
GTID: 2348330518968283
Subject: Computer software and theory
Abstract/Summary:
With the rapid development of the networked information age, the volume of text on the network keeps growing at an incalculable rate. Faced with this volume, how users can obtain useful information from it is a topic of wide current interest. To help users quickly obtain the information they need from a large number of sources, a variety of methods for accessing information are gradually being explored.

Information extraction is usually discussed alongside information retrieval; the two are interrelated and complementary. Information retrieval covers a relatively broad range of techniques, including document search, identification, and clustering, which help users find the documents they need in a large text collection. Information extraction is different: it is intended to help users find more detailed information within a relevant document, such as named entities, event information, and time information. Demand for such fine-grained information keeps rising, and because it is explicit and formatted, it greatly facilitates research and application by experts and scholars in a given field. Information extraction takes the unordered information in natural-language text and, through certain techniques and methods, outputs it in a defined format. In recent years the scope of information extraction has been expanding, and the extraction of event information has received more and more attention. Technically, statistics-based techniques and machine learning methods play an important role in information extraction.

This thesis studies the hybrid bidirectional hidden Markov model (HMM) and the main algorithms associated with it: the forward algorithm for evaluation; the maximum likelihood algorithm used in learning the model from marked training samples, together with the method used to mark those samples; and the Viterbi algorithm for decoding. It focuses on the application of HMMs to information extraction from unstructured text and establishes a spatiotemporal extraction model based on the hybrid bidirectional HMM. A comparison and analysis of the extracted data in closed and open tests shows that the improved HMM method is effective. The main research directions and purposes of the thesis cover the following four aspects:

1) Structured expression of event spatiotemporal information. After analyzing the linguistic features and semantic components of spatiotemporal information in Chinese text, a labeling system for temporal and spatial information and an identification model for events are established. Taking research on bird distribution characteristics as an example, with literature metadata from CNKI as the main data source, a method for marking spatiotemporal information in unstructured text is established, which provides relatively standardized training and test text for the study of temporal and spatial information.

2) Spatiotemporal information extraction. By analyzing the general characteristics of time expressions in Chinese text, temporal entities and custom rules are combined to infer time entities and represent them in normalized form. The temporal and spatial information of a specific event is then identified using an annotation method based on the hybrid hidden Markov model.

3) Matching and visualization of event spatiotemporal information. Taking the identified spatiotemporal information as the object of study, a method for matching the temporal and spatial information of specific events is discussed, and the resulting time-space pairs are expressed intuitively. The spatiotemporal process of specific events is reconstructed by cluster analysis, and the temporal and spatial information of events is shown on a map in an integrated and intuitive way.

4) Application of spatiotemporal information. The distribution of birds and the characteristics of its spatiotemporal change are displayed on the map, providing valuable information for bird-related work, supporting scientific prediction, and giving the community strong support for bird information. Research on spatiotemporal information can also be applied in other areas, such as cadastral management, intelligent transportation, and defense.
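The abstract names the Viterbi algorithm as the decoding step of the HMM. As a rough illustration only, the sketch below decodes a tag sequence with a plain first-order HMM; it is not the thesis's hybrid bidirectional model. The label set (`O`, `TIME`, `LOC`), the English toy tokens, and every probability are assumptions made up for the example, as are the names `viterbi`, `floor`, and the `toy_*` variables.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p, floor=1e-12):
    """Return the most likely state sequence under a first-order HMM.

    Probabilities are handled in log space; unseen emissions are
    replaced by a small floor value (an assumption of this sketch).
    """
    if not observations:
        return []

    def lp(p):
        return math.log(max(p, floor))

    # best[t][s]: log-probability of the best path that ends in state s at step t
    best = [{s: lp(start_p[s]) + lp(emit_p[s].get(observations[0], 0.0))
             for s in states}]
    # back[t][s]: predecessor of s on that best path
    back = [{}]

    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((r, best[t - 1][r] + lp(trans_p[r][s])) for r in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + lp(emit_p[s].get(observations[t], 0.0))
            back[t][s] = prev

    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Hypothetical toy model: all labels and numbers are illustrative only.
toy_states = ["O", "TIME", "LOC"]
toy_start = {"O": 0.8, "TIME": 0.1, "LOC": 0.1}
toy_trans = {
    "O": {"O": 0.6, "TIME": 0.2, "LOC": 0.2},
    "TIME": {"O": 0.7, "TIME": 0.2, "LOC": 0.1},
    "LOC": {"O": 0.7, "TIME": 0.1, "LOC": 0.2},
}
toy_emit = {
    "O": {"in": 0.3, "cranes": 0.3, "wintered": 0.3, "at": 0.1},
    "TIME": {"2015": 0.9, "winter": 0.1},
    "LOC": {"Poyang": 0.9, "lake": 0.1},
}
toy_tokens = ["in", "2015", "cranes", "wintered", "at", "Poyang"]

toy_tags = viterbi(toy_tokens, toy_states, toy_start, toy_trans, toy_emit)
print(list(zip(toy_tokens, toy_tags)))
```

In this toy setup each token is emitted by only one label, so the decoded tags simply mark "2015" as `TIME` and "Poyang" as `LOC`. In a real tagger the start, transition, and emission tables would be estimated from marked training text rather than written by hand.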
Keywords/Search Tags:Mixed Hidden Markov Model, Information extraction, Viterbi, Time and space information