Design And Implementation Of Word Segmentation System For The Non-standard English Text

Posted on: 2016-09-20
Degree: Master
Type: Thesis
Country: China
Candidate: W Han
GTID: 2308330461979006
Subject: Software engineering
Abstract/Summary:
Northern Heavy Industry Group Co., Ltd is a large transnational, state-owned enterprise whose main product is the tunnel boring machine (TBM). Since the acquisition of NFM Technologies SAS in 2007, communication between our front-line team and the foreign service team in TBM manufacturing has become more and more frequent. Meanwhile, as our market keeps developing and expanding, the enterprise exports more products every day, which gives our staff more opportunities to provide service at foreign sites, and Chinese and foreign staff frequently study TBMs and other products together. Effective transfer of information between English and Chinese has therefore become critical. To make communication between Chinese staff and the foreign service team easier, and to reduce obstacles such as hard-to-read English phrases written by foreign staff, we chose word segmentation as the solution. This will also advance the enterprise's operation and production and the rapid development of its site service.

Based on the actual development needs of our enterprise, the author designed and implemented a word segmentation system for non-standard English text. The core function of the system is to provide word segmentation services for non-standard or input-limited English text and to output English documents with word segmentation tags.

In this thesis, the author first researches and analyzes the technologies related to both English and Chinese word segmentation. After comparing the similarities and differences between Chinese and English word segmentation techniques, the author proposes an effective method for English word segmentation based on maximum (longest) string matching. A systematic requirements analysis and feasibility analysis are then carried out. Next, the author designs the framework of the system and describes in detail the design and implementation of the English word segmentation function, the core module of the system. The thesis also describes the design and implementation of the other functional modules and the database.

After design and implementation, the author tested the system on the initial version of the framework and functions, evaluated its functionality and performance, and made some improvements. Finally, the author summarizes the work done, points out the characteristics and shortcomings of the system, and gives some recommendations for improvement. Testing shows that the system can satisfy the enterprise's requirements for highly effective communication.
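
The abstract names segmentation by maximum string matching but gives no details of the algorithm. As a rough illustration only, the following Python sketch shows the generic forward maximum matching technique applied to English text whose spaces are missing. The function name `forward_max_match`, the toy `DICTIONARY`, and the rule that unknown characters become single-character tokens are all assumptions made for this example, not the thesis's actual design.

```python
def forward_max_match(text, dictionary, max_word_len=None):
    """Greedy forward maximum matching (illustrative sketch).

    Scans left to right and, at each position, takes the longest
    substring found in `dictionary`. A character that starts no
    dictionary word is emitted as a one-character token (an assumption
    of this sketch, chosen so the loop always makes progress).
    """
    if max_word_len is None:
        max_word_len = max((len(w) for w in dictionary), default=1)

    tokens = []
    i = 0
    while i < len(text):
        longest = min(max_word_len, len(text) - i)
        # Try the longest candidate first, then shrink by one character.
        for size in range(longest, 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate.lower() in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical toy dictionary; a real system would load a full lexicon.
DICTIONARY = {"the", "cutter", "head", "cutterhead", "is", "worn"}

print(forward_max_match("thecutterheadisworn", DICTIONARY))
```

Because the match is greedy, the sketch prefers "cutterhead" over "cutter" followed by "head" when both readings are in the dictionary. Whether that preference is appropriate depends on the lexicon and the domain.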
Keywords/Search Tags: English word segmentation, Automatic Segmentation, Algorithm