A Study On Time Expression Recognition And Normalization From Text

Posted on: 2021-01-25
Degree: Master
Type: Thesis
Country: China
Candidate: G J Gao
GTID: 2428330647951041
Subject: Computer Science and Technology
Abstract/Summary:
Text contains many time expressions. Recognizing these expressions and making use of them helps many natural language processing tasks, such as question answering and reading comprehension. TimeML is an annotation specification for times and events in text. For time, it defines the extent and the normalized value of each time expression, which gives time expressions a precise, standardized interpretation. In this thesis, we follow the TimeML annotation and explore the recognition and normalization of time expressions in text using a combination of manual and automatic methods. The main work and contributions are as follows:

1. For time expression recognition, we model the task as a pattern matching problem and propose a pattern-based method named TR. The method first constructs a token type system manually, then abstracts patterns of time expressions from token types, and finally matches candidate time expressions against the generated patterns. Because the patterns are generated automatically, TR needs less manual effort than classic rule-based methods while remaining interpretable. In the evaluation, TR achieves good recall, but its precision is not ideal.

2. We propose a second method, TR*, based on TR. TR* adds a selection step after pattern generation, which keeps good patterns and drops poor ones. We model pattern selection as an EBMC problem and solve it with a greedy algorithm. TR* achieves satisfactory results in the evaluation.

3. For time expression normalization, we propose a rule-based method named TN. After the time functions are designed, normalization rules are assigned to tokens by hand. A heuristic algorithm then combines these rules into the required function form, and the resulting functions are executed in turn. TN does not require expression-level rules, which makes it more flexible and convenient. In the evaluation, TN achieves good results.
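
The recognition pipeline described above (token types, pattern abstraction, matching, then greedy pattern selection) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the token types in `TOKEN_TYPES`, the function names, the toy data, and in particular the scoring rule and false-positive budget in `select_patterns`, which are a simple greedy stand-in and not the thesis's actual EBMC formulation.

```python
import re

# Hypothetical token type system; the thesis constructs its own manually.
MONTHS = ("january|february|march|april|may|june|july|august|"
          "september|october|november|december")
TOKEN_TYPES = [
    ("YEAR", re.compile(r"^(19|20)\d{2}$")),
    ("NUM", re.compile(r"^\d+$")),
    ("MONTH", re.compile(rf"^({MONTHS})$", re.I)),
    ("UNIT", re.compile(r"^(day|week|month|year)s?$", re.I)),
    ("MOD", re.compile(r"^(last|next|this)$", re.I)),
]

def token_type(token):
    """Return the first matching token type, or None for untyped tokens."""
    for name, rx in TOKEN_TYPES:
        if rx.match(token):
            return name
    return None

def generate_patterns(expressions):
    """Abstract each annotated time expression into a tuple of token types."""
    patterns = set()
    for expr in expressions:
        pattern = tuple(token_type(t) for t in expr.split())
        if None not in pattern:
            patterns.add(pattern)
    return patterns

def match(tokens, patterns):
    """Return (start, end) spans matching a pattern, longest match first."""
    types = [token_type(t) for t in tokens]
    max_len = max((len(p) for p in patterns), default=0)
    spans, i = [], 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if tuple(types[i:i + n]) in patterns:
                spans.append((i, i + n))
                i += n
                break
        else:
            i += 1
    return spans

def select_patterns(patterns, sentences, gold, fp_budget):
    """Greedily keep patterns that cover new gold spans within an error budget.

    Simplification: each pattern is scored on its own matches, and the
    score (new gold spans per new false positive) is an assumed heuristic.
    """
    stats = {}
    for p in patterns:
        found = {(i, s, e) for i, toks in enumerate(sentences)
                 for s, e in match(toks, {p})}
        stats[p] = (found & gold, found - gold)
    selected, covered, errors = set(), set(), set()
    while True:
        best, best_score = None, 0.0
        for p in sorted(patterns - selected):
            tp, fp = stats[p]
            gain = len(tp - covered)
            if gain == 0 or len(errors | fp) > fp_budget:
                continue
            score = gain / (1 + len(fp - errors))
            if score > best_score:
                best, best_score = p, score
        if best is None:
            return selected
        selected.add(best)
        covered |= stats[best][0]
        errors |= stats[best][1]

# Toy training data: annotated expressions, sentences, and gold spans
# given as (sentence index, start token, end token).
patterns = generate_patterns(["March 2020", "last week", "3 days", "10"])
sentences = [
    "She left in March 2020 with 3 bags".split(),
    "He waited 3 days last week".split(),
    "Call me at 10".split(),
]
gold = {(0, 3, 5), (1, 2, 4), (1, 4, 6), (2, 3, 4)}
selected = select_patterns(patterns, sentences, gold, fp_budget=0)
```

In this toy data the single-token pattern `("NUM",)` recovers one gold expression but also fires on plain numbers, so a zero false-positive budget drops it. This mirrors, in a much reduced form, the trade-off the abstract reports: generated patterns alone favor recall, and a selection step recovers precision.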
Keywords/Search Tags: Time Expression, Timex Recognition, Timex Normalization, Pattern Matching