Research On Named Entity Recognition Method In Telecommunications Field

Posted on: 2019-07-26
Degree: Master
Type: Thesis
Country: China
Candidate: G Zhang
Full Text: PDF
GTID: 2428330566496015
Subject: Computer application technology
Abstract/Summary:
With the continuous integration of telecom services and Internet technology, the telecommunications field has entered the digital age. Named Entity Recognition (NER) is the basis of intelligent applications in this field, so research on NER for telecommunications has practical significance. However, traditional NER methods are not fully applicable to the telecommunications field because its texts have distinctive characteristics and labeled datasets are scarce. This thesis improves the conditional random field (CRF) model based on the characteristics of telecommunications texts. Digit features, alphabetic features, keyword features, and geographical features are added to the feature items to enhance the learning ability of the CRF model. Different context window sizes are selected for different feature items when building the feature templates, so that context information is used appropriately. An incremental learning strategy is used to select the optimal feature set, which makes constructing that set more efficient. To address the lack of labeled datasets in the telecommunications field, a corpus tagging method based on the Tri-Training algorithm is proposed. With the help of structured texts in the telecommunications field, part of the corpus is labeled by constructing a basic lexicon for the field, and the entire text corpus is tagged with the Tri-Training algorithm. The thesis combines the improved CRF model with the Tri-Training-based corpus tagging method to build a complete NER system for the telecommunications field, and the proposed methods are verified using this system. The experimental results show that both methods are effective and feasible and achieve satisfactory results for NER in the telecommunications field.
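The idea of giving each feature item its own context window can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the lexicons `KEYWORDS` and `PLACES`, the window sizes in `FEATURES`, and the function names are all hypothetical, chosen only to show how digit, alphabetic, keyword, and geographical features might be collected per token.

```python
import re

# Hypothetical lexicons; the abstract does not list the actual entries.
KEYWORDS = {"broadband", "4G", "roaming"}
PLACES = {"Beijing", "Harbin"}

# Feature name -> (function applied to a token, context window half-width).
# The window sizes here are illustrative assumptions, not the thesis's values.
FEATURES = {
    "word": (lambda t: t.lower(), 2),
    "digit": (lambda t: any(c.isdigit() for c in t), 1),
    "alpha": (lambda t: re.fullmatch(r"[A-Za-z]+", t) is not None, 1),
    "keyword": (lambda t: t in KEYWORDS, 1),
    "geo": (lambda t: t in PLACES, 0),
}


def token_features(tokens, i):
    """Build the feature dict for tokens[i], using a separate window per feature."""
    feats = {}
    for name, (fn, window) in FEATURES.items():
        for offset in range(-window, window + 1):
            j = i + offset
            if 0 <= j < len(tokens):  # skip positions outside the sentence
                feats[f"{name}[{offset}]"] = fn(tokens[j])
    return feats


def sentence_features(tokens):
    """Feature dicts for every token in one sentence."""
    return [token_features(tokens, i) for i in range(len(tokens))]
```

Per-token dictionaries of this kind are the input format that CRF toolkits such as sklearn-crfsuite accept, so a feature could be added, removed, or given a different window by editing a single entry in `FEATURES`.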
Keywords/Search Tags:NER, CRF, Optimal Feature Subset, Corpus Tagging, Tri-Training Algorithm
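The Tri-Training-based corpus tagging mentioned in the abstract rests on one core step: an unlabeled example receives a label for one classifier when the other two classifiers agree on it. The sketch below shows only that agreement step in Python, under simplifying assumptions: the bootstrap sampling and error-rate checks of the full Tri-Training algorithm are omitted, and the function name `tri_training_round` and its interface are hypothetical, not taken from the thesis.

```python
def tri_training_round(predictors, unlabeled):
    """One simplified labeling round of Tri-Training.

    predictors: three callables, each mapping an example to a label.
    unlabeled:  iterable of unlabeled examples.
    Returns three lists; list i holds the (example, label) pairs that
    classifier i would receive because the other two classifiers agreed.
    """
    if len(predictors) != 3:
        raise ValueError("Tri-Training uses exactly three classifiers")
    new_data = [[], [], []]
    for x in unlabeled:
        labels = [predict(x) for predict in predictors]
        for i in range(3):
            j, k = [m for m in range(3) if m != i]  # the two other classifiers
            if labels[j] == labels[k]:
                new_data[i].append((x, labels[j]))
    return new_data
```

In a full loop, each classifier would be retrained on its original labeled data plus its new pairs, and the round would repeat until no classifier changes.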