| Chinese Named Entity Recognition (NER) is the task of finding entities, such as people, locations, and organizations, in Chinese text. NER is an important subtask in information extraction, question answering, parsing, machine translation, and other applications, and it has received considerable attention from the natural language processing community in recent years. However, few studies have addressed the recognition of Chinese nested named entities. Compared with a basic named entity, a Chinese nested named entity (NNE) has a more complex structure, consisting of one or more basic entities. This paper presents an in-depth study of nested named entity recognition based on statistical learning methods. First, it analyzes and summarizes a variety of learning algorithms for nested named entity recognition. Then it proposes a novel approach to Chinese nested named entity recognition based on a joint model, to address the problems of traditional sequence labeling methods. The main work and contributions of this paper are as follows: 1. We investigate applying sequence labeling methods to Chinese nested named entity recognition. Two distinct sequence labeling approaches, i.e., a hierarchical labeling scheme and a dual-layer model, are adopted to recognize Chinese nested named entities, and the two corresponding baseline systems are implemented using conditional random fields. 2. We propose and design a novel approach to Chinese nested named entity recognition using a joint model. We first formulate Chinese NER as a joint task of boundary identification and entity categorization, together with segmentation, all performed simultaneously. We then combine the joint process with the "BIE" label set used in traditional sequence labeling approaches to recognize Chinese nested named entities. 
Thus, with the joint model and the extended labeling scheme, we can not only identify the boundary and category of the whole Chinese NNE but also find the basic named entities and ordinary words within it. In particular, we use the averaged perceptron algorithm and the k-best MIRA algorithm for training, and the beam search algorithm improved by max-violation updates for decoding. Within this framework, we explore a variety of effective feature representations for Chinese nested named entity recognition. A set of 10-fold cross-validation experiments was conducted on the PKU corpus. The experimental results show that the joint model achieves an F1-score of 80.85%, much better than the two traditional baseline systems using sequence labeling models, demonstrating the effectiveness of the joint model for the nested named entity recognition task. |
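
The abstract does not spell out how the "BIE" labels are laid out for a nested entity, so the following is only a minimal sketch of one plausible dual-layer encoding, not the paper's actual scheme. It assumes character-level tags, end-exclusive spans, and at most two nesting levels. The helper names `spans_to_bie` and `split_layers` are hypothetical, and the example sentence is illustrative rather than taken from the PKU corpus.

```python
from typing import List, Tuple

# (start, end_exclusive, entity_type); character offsets into the sentence.
Span = Tuple[int, int, str]


def spans_to_bie(length: int, spans: List[Span]) -> List[str]:
    """Turn non-overlapping spans into per-character B/I/E tags ("O" elsewhere).

    Assumption: a single-character entity receives only a B- tag, since the
    BIE label set named in the abstract has no dedicated single-character label.
    """
    tags = ["O"] * length
    for start, end, etype in spans:
        tags[start] = f"B-{etype}"
        for i in range(start + 1, end - 1):
            tags[i] = f"I-{etype}"
        if end - start > 1:
            tags[end - 1] = f"E-{etype}"
    return tags


def split_layers(spans: List[Span]) -> Tuple[List[Span], List[Span]]:
    """Split spans into outer ones and those contained in another span."""
    outer: List[Span] = []
    inner: List[Span] = []
    for s in spans:
        contained = any(
            o != s and o[0] <= s[0] and s[1] <= o[1] for o in spans
        )
        (inner if contained else outer).append(s)
    return outer, inner


# Illustrative sentence: "He studies at Peking University".
# "北京大学" (ORG) contains the basic entity "北京" (LOC).
text = "他在北京大学读书"
spans = [(2, 6, "ORG"), (2, 4, "LOC")]

outer, inner = split_layers(spans)
outer_tags = spans_to_bie(len(text), outer)
inner_tags = spans_to_bie(len(text), inner)

for ch, o_tag, i_tag in zip(text, outer_tags, inner_tags):
    print(ch, o_tag, i_tag)
```

In a dual-layer setup of this kind, one sequence labeler could be trained on the inner tags and another on the outer tags. The joint model described above instead predicts boundaries, categories, and segmentation together in one pass.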