Study On CRF-based Chinese Named Entity Recognition

Posted on: 2011-12-17
Degree: Master
Type: Thesis
Country: China
Candidate: H F Shi
Full Text: PDF
GTID: 2178360305977558
Subject: Computer application technology
Abstract/Summary:
Named entity recognition (NER) is an essential task in natural language processing research. Its main job is to classify every word in a document as the name of a person, location, or organization, as a date, time, or number, or as another kind of named entity. NER plays a significant role in natural language processing applications such as information retrieval, information extraction, and machine translation. In short, identifying and classifying named entities has great theoretical and practical significance. In this thesis, we summarize the methods used for NER and introduce the evaluation strategy for NER. We also describe the CRF model in detail. CRF is a statistical machine learning approach that outperforms other approaches in segmenting and labeling sequence data. Using lexicons, we introduce external features into the training process. The experimental results show that the external features reduce the amount of training data needed and markedly improve recognition performance. Taking CRF as the basic model, we design and build an experimental system that recognizes Chinese person, location, and organization names at the word level. The experimental system is readily extensible. Finally, we recognize temporal expressions and number expressions using rule-based methods. The experimental results show that NER using CRF is feasible. In the future, we will focus our research on the CRF model, especially on feature selection and parameter training.
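The idea of introducing lexicon-based external features at the word level can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy lexicons, the feature names, and the helper functions `word_features` and `sentence_features` are hypothetical and are not taken from the thesis, which does not list its actual feature templates here. The sketch only builds one feature dictionary per word, a format that many CRF toolkits accept as input; it does not train a model.

```python
from typing import Dict, List, Set

# Hypothetical toy lexicons (assumption): the thesis's real lexicons are not given.
SURNAMES: Set[str] = {"王", "李", "张"}
LOCATION_SUFFIXES: Set[str] = {"市", "省", "县"}
ORG_SUFFIXES: Set[str] = {"公司", "大学", "银行"}


def word_features(words: List[str], i: int) -> Dict[str, object]:
    """Build features for the i-th word of an already segmented sentence."""
    word = words[i]
    feats: Dict[str, object] = {
        "bias": 1.0,
        "word": word,
        "length": len(word),
        # External (lexicon) features: these come from word lists,
        # not from the labeled training corpus.
        "in_surname_lexicon": word[:1] in SURNAMES,
        "has_location_suffix": any(word.endswith(s) for s in LOCATION_SUFFIXES),
        "has_org_suffix": any(word.endswith(s) for s in ORG_SUFFIXES),
    }
    # Context features from the neighbouring words.
    if i > 0:
        feats["prev_word"] = words[i - 1]
    else:
        feats["BOS"] = True  # beginning of sentence
    if i < len(words) - 1:
        feats["next_word"] = words[i + 1]
    else:
        feats["EOS"] = True  # end of sentence
    return feats


def sentence_features(words: List[str]) -> List[Dict[str, object]]:
    """Return one feature dictionary per word in the sentence."""
    return [word_features(words, i) for i in range(len(words))]
```

For a segmented sentence such as `["王小明", "在", "北京市", "工作"]`, `sentence_features` would mark the first word as starting with a lexicon surname and the third as carrying a location suffix. A CRF trained on such features can rely on the lexicon evidence even when a particular name never occurs in the training data, which is one plausible reading of why external features reduce the amount of training data needed.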
Keywords/Search Tags: Named Entity Recognition, CRF, NLP, feature