Research On Temporal Information Recognition And Normalization

Posted on: 2009-03-24
Degree: Master
Type: Thesis
Country: China
Candidate: Y Q Pan
Full Text: PDF
GTID: 2178360278464448
Subject: Computer Science and Technology
Abstract/Summary:
In natural language processing, temporal information is an important carrier of semantics. In everyday language, time information marks how things change: people grasp the whole course of an event by knowing when it starts, proceeds, and ends. Time expression recognition plays an important role in information extraction, question answering, summary generation, and topic detection and tracking.

This thesis gives a brief introduction to and analysis of the current state of research and the available methods, along with the annotation guidelines. Rule-based and statistical methods are explored separately for Chinese time expression recognition, and an effective method is explored for English time expression extraction and normalization.

For rule-based Chinese time expression recognition, a method based on the dependency tree is used, following the syntactic guidelines for recognizing the extent of a time expression. An error-driven method is then combined with the dependency tree method, which improves the result greatly; the final result exceeds 76%.

For machine learning based time expression recognition, Support Vector Machine (SVM), Conditional Random Field (CRF), and improved CRF methods are each applied. This is the first use of a CRF model for the time expression recognition problem. A set of effective features is selected and expanded by templates. The system is evaluated with the ACE evaluation tool, and the final result exceeds 90%. The evaluation shows that the machine learning methods outperform the rule-based method. Among the machine learning methods, the CRF model achieves better results than the SVM model, and the improved CRF method raises recognition efficiency while also improving the result.

For English time expression recognition and normalization, an SVM model is first used to recognize time expressions and then to classify them into several classes. For each class of time expression, rules are used to normalize it. Introducing machine learning into English time recognition and normalization greatly improves the result over a purely rule-based method and also saves much of the work of writing rules.

In summary, this thesis explores Chinese time expression extraction and English time expression recognition and normalization, and reaches good results and useful conclusions.
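
The classify-then-normalize pipeline described for English time expressions can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the two class labels, the function names `classify` and `normalize`, and the ISO 8601 output format are hypothetical, and a simple pattern check stands in for the SVM classifier.

```python
import re
from datetime import date, timedelta

# Hypothetical lookup tables; the thesis does not list its classes or rules.
MONTHS = {name: number for number, name in enumerate(
    ["january", "february", "march", "april", "may", "june", "july",
     "august", "september", "october", "november", "december"], start=1)}
RELATIVE_DAYS = {"yesterday": -1, "today": 0, "tomorrow": 1}

# Matches lowercase expressions such as "march 24, 2009".
ABSOLUTE_RE = re.compile(r"^([a-z]+) (\d{1,2}), (\d{4})$")


def classify(expr):
    """Assign a class label to a time expression.

    Stand-in for the SVM classifier: a pattern check is used here
    only to keep the sketch self-contained.
    """
    text = expr.strip().lower()
    if text in RELATIVE_DAYS:
        return "relative_day"
    match = ABSOLUTE_RE.match(text)
    if match and match.group(1) in MONTHS:
        return "absolute_date"
    return "unknown"


def normalize(expr, reference):
    """Apply the normalization rule for the expression's class.

    Returns an ISO 8601 date string, or None if no rule applies.
    `reference` is the date against which relative expressions resolve.
    """
    text = expr.strip().lower()
    label = classify(text)
    if label == "relative_day":
        return (reference + timedelta(days=RELATIVE_DAYS[text])).isoformat()
    if label == "absolute_date":
        month, day, year = ABSOLUTE_RE.match(text).groups()
        return date(int(year), MONTHS[month], int(day)).isoformat()
    return None
```

Under these assumptions, `normalize("yesterday", date(2009, 3, 24))` resolves the expression against the reference date, while an expression that fits no class yields `None`. Keeping one rule per class is what lets a learned classifier replace the hand-written dispatch without touching the normalization rules.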
Keywords/Search Tags:Time expression recognition, Time expression normalization, Information extraction, Conditional Random Field