Research Of Chinese Word Segmentation Technology Applied In Police Information System

Posted on: 2008-07-10 | Degree: Master | Type: Thesis
Country: China | Candidate: J Wei | Full Text: PDF
GTID: 2178360242472339 | Subject: Computer application technology
Abstract/Summary:
The main objective of this research is to design a Chinese word segmentation system for police information systems (WSSPIS) that segments information effectively and accurately, especially the words that matter for police information analysis. Police information has four characteristics: (1) it includes many unlisted names and place names; (2) it includes many professional terms; (3) it is submitted in a fixed format; (4) it includes much similar or continuous information. First, we designed the SAFM segmentation dictionary mechanism. This mechanism can construct diversified dictionaries and add words conveniently, so we can exploit characteristic 4 to improve the speed and precision of segmentation. Based on the SAFM dictionary, we implemented the Omni-Segmentation algorithm. Based on the Omni-Segmentation algorithm, we designed an ambiguity-checking mechanism called SDOS, which can detect all ambiguities in a sentence. To reduce the workload of ambiguity processing, we designed the SDOSD strategy. We then designed the ambiguity-handling strategy of WSSPIS, which can recover important words when they fall within ambiguity fields. For important unlisted names and place names, WSSPIS uses repeated-string extraction together with characteristic 3 of police information to ensure that these words are obtained effectively. Experimental results show that WSSPIS can segment police information effectively and accurately and can satisfy the requirements of a police information system.
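The abstract does not give the internals of SAFM, Omni-Segmentation, SDOS, or SDOSD, so the following is only a minimal Python sketch of the general ideas it names, under stated assumptions: the dictionary is a plain Python set (not the SAFM structure), "omni-segmentation" is taken to mean enumerating every dictionary word that occurs in the sentence, ambiguity checking is reduced to finding pairs of crossing (overlapping) words, and repeated-string extraction is a simple substring count. All function names, parameters, and sample strings are hypothetical.

```python
from collections import Counter


def omni_segment(sentence, dictionary, max_len=4):
    """List every (start, end, word) span of `sentence` that is in `dictionary`.

    Assumption: a plain set stands in for the SAFM dictionary.
    Spans come out sorted by start index, then by end index.
    """
    spans = []
    for start in range(len(sentence)):
        longest_end = min(start + max_len, len(sentence))
        for end in range(start + 1, longest_end + 1):
            word = sentence[start:end]
            if word in dictionary:
                spans.append((start, end, word))
    return spans


def crossing_pairs(spans):
    """Return pairs of words whose spans cross, i.e. overlap without nesting.

    Assumption: this simple overlap test stands in for SDOS-style checking.
    Relies on `spans` being sorted by start index, as omni_segment returns them.
    """
    pairs = []
    for i, (s1, e1, w1) in enumerate(spans):
        for s2, e2, w2 in spans[i + 1:]:
            if s1 < s2 < e1 < e2:
                pairs.append((w1, w2))
    return pairs


def repeated_strings(text, min_len=2, max_len=4, min_count=2):
    """Count substrings of length min_len..max_len that occur at least min_count times.

    Assumption: raw substring counting stands in for repeated-string extraction
    of unlisted names; no filtering of punctuation or known words is done.
    """
    counts = Counter(
        text[i:i + n]
        for n in range(min_len, max_len + 1)
        for i in range(len(text) - n + 1)
    )
    return {s: c for s, c in counts.items() if c >= min_count}


# Hypothetical sample data, not taken from the thesis.
sample_dictionary = {"结合", "合成", "成分", "分子"}
sample_spans = omni_segment("结合成分子", sample_dictionary)
sample_pairs = crossing_pairs(sample_spans)
sample_repeats = repeated_strings("张三丰报案。张三丰称")
```

In this sketch, `sample_pairs` lists the crossing word pairs that a later disambiguation step would have to resolve, and `sample_repeats` contains candidate strings such as a name that recurs in a report but is absent from the dictionary. A real system would need to rank or filter these candidates, for example by using the fixed report format the abstract mentions.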
Keywords/Search Tags: Word segmentation, Police information system, SAFM dictionary, SDOS mechanism of checking ambiguity, SDOSD strategy of ambiguity processing, Identification of unlisted word