Chinese Named Entity Recognition Based On Neural Network

Posted on: 2018-07-18  Degree: Master  Type: Thesis
Country: China  Candidate: L Wang  Full Text: PDF
GTID: 2348330518492591  Subject: Computer software and theory
Abstract/Summary:
Named entity recognition (NER) is the task of extracting and categorizing specific entities in text, such as person names, location names and organization names. NER is a fundamental technology for many natural language processing applications, such as information extraction, question answering and machine translation. Traditional statistical machine learning methods often require large amounts of knowledge and a set of manually designed feature templates to achieve high performance, and the choice of feature templates strongly influences that performance. However, designing a good set of feature templates depends heavily on human creativity, prior knowledge and linguistic intuition, which makes it laborious and costly. This thesis presents several neural architectures for Chinese named entity recognition that eliminate the need for most feature engineering. Specifically, the main work of this thesis is as follows:

(1) The technologies and models relevant to Chinese NER and deep learning are analyzed. First, the main difficulties of Chinese NER and the existing methods for it are examined. Then, the relevant deep learning models are summarized, including feed-forward neural networks, recurrent neural networks, word embeddings and neural language models.

(2) A neural architecture based on bi-directional long short-term memory (Bi-LSTM) is implemented as the baseline system, treating Chinese named entity recognition as a sequence labeling problem. A sequence of character embeddings is fed to the model, which uses the full context of the sentence, learns character-level features automatically and assigns a label to every character of the given sequence.

(3) Novel segment-level models based on neural networks are proposed for Chinese NER. Chinese sentences contain no delimiters between words.
Thus, Chinese NER can be treated as a segmentation problem in which entity boundaries are identified and entities are categorized simultaneously. Compared with the traditional sequence labeling model, a model that represents segments directly is attractive because it is not bound by local tag dependencies and can exploit segment-level information. Segment-level Chinese NER methods based on neural networks are presented for the first time in this thesis, using two neural architectures. To solve the segment-level Chinese NER problem, these methods combine neural networks with a semi-CRF (semi-Markov conditional random field), compose multiple segment-level representations and assign tags to segments.

A series of experiments is conducted, and the results show that the proposed neural segment-level models for Chinese NER achieve higher performance than the baseline system.
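
The contrast between character-level sequence labeling and segment-level labeling can be made concrete with a small sketch. The code below is illustrative only and is not the thesis implementation. It assumes a BIO tagging scheme for the character-level view (the abstract does not name a scheme), and it uses a simplified zeroth-order semi-Markov Viterbi search with no label-transition term. The names `segments_to_bio`, `semi_markov_decode` and `toy_score` are hypothetical, and `toy_score` is a hand-written stand-in for the segment scores that a neural network would produce in the segment-level models.

```python
from typing import Callable, List, Tuple

# (start, end_exclusive, label); the label "O" marks non-entity text.
Segment = Tuple[int, int, str]


def segments_to_bio(segments: List[Segment]) -> List[str]:
    """Expand labeled segments into one BIO tag per character."""
    tags: List[str] = []
    for start, end, label in segments:
        for i in range(start, end):
            if label == "O":
                tags.append("O")
            else:
                tags.append(("B-" if i == start else "I-") + label)
    return tags


def semi_markov_decode(
    n: int,
    labels: List[str],
    max_len: int,
    score: Callable[[int, int, str], float],
) -> List[Segment]:
    """Return the highest-scoring segmentation of n characters.

    Simplification: each segment is scored independently, so there is
    no transition score between adjacent segment labels.
    """
    # best[j] is the best total score of any segmentation of chars [0, j).
    best = [float("-inf")] * (n + 1)
    back: List[Segment] = [(0, 0, "")] * (n + 1)
    best[0] = 0.0
    for j in range(1, n + 1):
        for i in range(max(0, j - max_len), j):
            for label in labels:
                s = best[i] + score(i, j, label)
                if s > best[j]:
                    best[j] = s
                    back[j] = (i, j, label)
    # Follow back-pointers from the end of the sentence to the start.
    segments: List[Segment] = []
    j = n
    while j > 0:
        seg = back[j]
        segments.append(seg)
        j = seg[0]
    return segments[::-1]


# Toy sentence: "王小明在北京" ("Wang Xiaoming is in Beijing"), 6 characters.
sentence = "王小明在北京"
preferred = {(0, 3, "PER"): 3.0, (4, 6, "LOC"): 2.0}  # made-up scores


def toy_score(i: int, j: int, label: str) -> float:
    """Hand-written stand-in for a learned segment scorer."""
    if (i, j, label) in preferred:
        return preferred[(i, j, label)]
    if label == "O" and j - i == 1:
        return 0.1
    return -1.0


decoded = semi_markov_decode(len(sentence), ["PER", "LOC", "O"], 3, toy_score)
tags = segments_to_bio(decoded)
print(decoded)
print(tags)
```

With these toy scores the decoder selects the segments `(0, 3, "PER")`, `(3, 4, "O")` and `(4, 6, "LOC")`, which expand to the character tags `B-PER I-PER I-PER O B-LOC I-LOC`. The point of the sketch is that the segment view scores a whole candidate entity at once, whereas the character view must recover the same entity from a chain of per-character decisions.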
Keywords/Search Tags: Chinese named entity recognition, Deep learning, Bi-directional long short-term memory (Bi-LSTM), Segment-level Chinese named entity recognition