Research Of Domain-oriented Extraction Method Of Text Information

Posted on:2015-11-14

Degree:Master

Type:Thesis

Country:China

Candidate:F K Zhou

Full Text:PDF

GTID:2298330467977029

Subject:Natural language processing

Abstract/Summary:

With computers widely used across many domains and the Internet developing rapidly, more and more event information is stored and processed on computers in the form of electronic documents. The Internet is gradually becoming the main carrier of information and the main communication platform, and it has become the largest collection of information of all kinds. In the era of big data, 80% of information is stored on the network as unstructured data (natural language, images, videos, etc.).

Because Chinese text is unstructured, unstandardized and uncertain, this thesis adopts the technical roadmap of "text description - normalized expression - structured extraction - pattern mining" and focuses on temporal attribute information extraction and on the classification, resolution and extraction methods for the emergency event domain. The work lays a solid theoretical foundation for the study of event information extraction and provides a viable solution for building national geographic information services.

First, building on a study of the structured expression of emergencies, several methods were proposed for extracting event attribute information from Chinese text so that the information can be extracted accurately. For Chinese text classification, an SVM model was applied and achieved good results. For the non-temporal attribute information of emergencies, a rule model and a statistical model were proposed and applied. Because rule models and statistical models yield different results in natural language processing, combining the two can be effective for extracting event information from Chinese text in specific domains. The thesis ultimately used a combination of an HMM model and a syntax analysis model for text attribute extraction, and experiments showed that this method gives better results. Finally, the feasibility of the method was demonstrated through the implementation of a prototype system.
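
To make the idea of combining a rule model with a statistical model concrete, the following is a minimal Python sketch, not the thesis's implementation. It pairs a regular-expression rule for one common Chinese date format with a tiny HMM decoded by the Viterbi algorithm. The label set (O, LOC, TIME), the probabilities, the example tokens and the function names viterbi and tag are all hypothetical, and the syntax analysis component described in the abstract is not modeled here; a simple regex rule is used in its place.

```python
import math
import re

# Hypothetical label set and toy probabilities, chosen only for illustration;
# the abstract does not give the thesis's actual model parameters.
STATES = ["O", "LOC"]
START = {"O": 0.8, "LOC": 0.2}
TRANS = {"O": {"O": 0.7, "LOC": 0.3}, "LOC": {"O": 0.6, "LOC": 0.4}}
EMIT = {
    "O": {"发生": 0.5, "地震": 0.4},
    "LOC": {"四川": 0.6, "雅安": 0.4},
}
UNSEEN = 1e-6  # floor probability for tokens a state has no emission entry for

# Rule model (an assumption of this sketch): one regex for dates like 2013年4月20日.
DATE_RULE = re.compile(r"\d{4}年\d{1,2}月\d{1,2}日")


def viterbi(tokens):
    """Return the most probable state sequence for tokens, using log-space Viterbi."""
    if not tokens:
        return []

    def emit(state, token):
        return math.log(EMIT[state].get(token, UNSEEN))

    # score[t][s]: best log-probability of any path ending in state s at position t
    score = [{s: math.log(START[s]) + emit(s, tokens[0]) for s in STATES}]
    back = []  # back[t-1][s]: best predecessor of state s at position t
    for token in tokens[1:]:
        prev = score[-1]
        cur, ptr = {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: prev[p] + math.log(TRANS[p][s]))
            cur[s] = prev[best] + math.log(TRANS[best][s]) + emit(s, token)
            ptr[s] = best
        score.append(cur)
        back.append(ptr)

    # Follow the back-pointers from the best final state.
    path = [max(STATES, key=lambda s: score[-1][s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


def tag(tokens):
    """Combine both models: a rule match yields TIME, the HMM labels the rest."""
    labels = viterbi(tokens)
    return ["TIME" if DATE_RULE.fullmatch(t) else label
            for t, label in zip(tokens, labels)]


if __name__ == "__main__":
    tokens = ["2013年4月20日", "四川", "雅安", "发生", "地震"]
    print(list(zip(tokens, tag(tokens))))
```

In this sketch the rule takes precedence wherever it matches and the HMM labels every other token. This is only one simple way to combine the two kinds of model; the abstract does not say how the thesis combines them.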

Keywords/Search Tags:

Domain-Oriented Extraction Method of Text Information, HMM Model, SVM Model, Natural Language Processing, Syntax Analysis

Related items

1	Narrative Information Extraction with Non-Linear Natural Language Processing Pipeline
2	Research On Topic - Oriented Keyword Extraction Method
3	Research Of Non Domain Knowledge Dependent Text Summarization Method
4	Learning text analysis rules for domain-specific natural language processing
5	Research On Text Representation Model And Application In Text Classification And Natural Language Inference
6	In View Of The Short Carrier Natural Language Text Information Hiding Technology Research And Implementation
7	The Application Of Natural Language Processing In Mining The Characteristics Of Concept Convey
8	Research And Implementation Of Text-oriented Entity Relation Extraction Technology
9	Research On Machine Learning For Natural Language Processing And Transmission
10	Natural Language Processing Aiming To The Core Texts Of Scientific Literature