
Research On Extraction System Of Crop Based On HMM

Posted on: 2007-03-14 | Degree: Master | Type: Thesis
Country: China | Candidate: X Y Jian | Full Text: PDF
GTID: 2178360185951009 | Subject: Computer software and theory
Abstract/Summary:
In recent years, with the rapid development of computer and Internet technologies, the amount of domain-specific information has grown exponentially. How to find useful information and use it effectively and efficiently has become an important problem. Information Extraction (IE) is the process of identifying specified types of events or relations in natural language texts and recording this information in a structured form, thereby avoiding cumbersome manual search and improving efficiency. Many methods are now applied in this field, such as clustering-based methods, statistical methods, and pattern-matching methods. Applied to crop seed texts, these methods can extract useful information about crop cultivation without tedious manual reading.

This thesis presents an information extraction model based on the Hidden Markov Model (HMM). A segment delimited by punctuation (comma, full stop, exclamation mark, etc.) is defined as a small sentence, and the meaning of a small sentence is defined as its topic. The model rests on two observations. First, the text describing a crop seed is in effect a collection of topics. Second, after reading a large corpus of crop seed texts, we find that the descriptions of the seeds are generally consistent in structure. Compared with other methods, this approach has the following advantages: it does not require much domain knowledge; it can be used in different domains; and it eliminates the noise introduced by the clustering algorithm.

The main work of the thesis is as follows: (1) comparing different sentence similarity algorithms and, drawing on their strengths, proposing a similarity method extended to crop texts; (2) clustering the small sentences to generate the training corpus and the topic set; (3) training on the relations between topics to obtain the HMM for this domain; (4) assigning a single topic to each small sentence using the Viterbi algorithm. We applied the method to Chinese agricultural texts and obtained good performance.
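The final step, labelling each small sentence with one topic by Viterbi decoding, can be illustrated with a minimal sketch. This is not the thesis's implementation: the topic names (`origin`, `traits`, `cultivation`), the observation symbols (`bred`, `height`, `sow`), and all probabilities are hypothetical values chosen for illustration. It assumes each small sentence has already been reduced to one discrete observation symbol, and in practice the parameters would be estimated from the clustered training corpus.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most likely hidden-state (topic) sequence for the observations."""
    if not observations:
        return []

    # Work in log space so long sequences do not underflow.
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log-probability of the best path that ends in state s at step t.
    best = [{s: log(start_p[s]) + log(emit_p[s][observations[0]]) for s in states}]
    # back[t][s]: predecessor of s on that best path.
    back = [{}]

    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((r, best[t - 1][r] + log(trans_p[r][s])) for r in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + log(emit_p[s][observations[t]])
            back[t][s] = prev

    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Hypothetical toy model: three topics and one keyword symbol per small sentence.
states = ["origin", "traits", "cultivation"]
start_p = {"origin": 0.8, "traits": 0.1, "cultivation": 0.1}
trans_p = {
    "origin": {"origin": 0.3, "traits": 0.6, "cultivation": 0.1},
    "traits": {"origin": 0.1, "traits": 0.4, "cultivation": 0.5},
    "cultivation": {"origin": 0.1, "traits": 0.2, "cultivation": 0.7},
}
emit_p = {
    "origin": {"bred": 0.8, "height": 0.1, "sow": 0.1},
    "traits": {"bred": 0.1, "height": 0.8, "sow": 0.1},
    "cultivation": {"bred": 0.1, "height": 0.1, "sow": 0.8},
}

path = viterbi(["bred", "height", "height", "sow"], states, start_p, trans_p, emit_p)
print(path)  # ['origin', 'traits', 'traits', 'cultivation']
```

With these toy parameters, each of the four small sentences receives exactly one topic, and the transition probabilities encode the assumption that seed descriptions tend to move from origin to traits to cultivation.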
Keywords/Search Tags: Information extraction, HMM, topics, clustering