Research and Applications of Text Feature Extraction from Scientific and Technical Literature

Posted on: 2010-11-16 | Degree: Master | Type: Thesis
Country: China | Candidate: L Yu | Full Text: PDF
GTID: 2178360278966405 | Subject: Control theory and control engineering
Abstract/Summary:
Scientific and technical literature is an important manifestation of scientific research, and an important reference and study material. When massive volumes of scientific and technical literature must be processed automatically, extracting the important information is significant for computer-based literature retrieval, library construction and management, and knowledge discovery.

At present, the processing of scientific and technical literature mainly includes keyword extraction, text clustering, and knowledge discovery. Within this processing, extracting keywords and representing the text are the basis of the later stages. However, there is currently no unified, widely accepted method for text feature extraction, especially for scientific and technical literature.

Considering the characteristics of scientific and technical literature and the ways of representing text, this thesis completed the following work:
(1) The characteristics of scientific and technical literature and the methods of text representation were analyzed in detail, and their advantages and limitations were analyzed on the basis of this summary.
(2) The structure of scientific and technical literature was analyzed in detail. The distribution of information across the various blocks of a document was also analyzed, taking literature in the field of license plate recognition as an example.
(3) The text representation of scientific and technical literature was analyzed in detail, focusing on the factors that affect the importance of candidate features. These factors were quantified, and a conditional random field (CRF) model was established to process them.
(4) The CRF model was applied to text block tagging and to feature extraction from each block of a document, with a focus on the method for extracting features from the title and on sorting candidate features with the CRF model.
(5) An improved method for the experiments and applications of text feature extraction from scientific and technical literature was presented.
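
To make the idea in item (3) concrete, the following is a minimal Python sketch of one way that the block a candidate feature appears in could be quantified as an importance factor. It is not the thesis's CRF model: it swaps in a plain weighted-frequency heuristic, and the block names, the weights in `BLOCK_WEIGHTS`, the function `score_candidates`, and the sample terms are all assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical block weights, assumed for illustration only;
# the abstract does not state which factors or values were used.
BLOCK_WEIGHTS = {"title": 3.0, "abstract": 2.0, "body": 1.0}


def score_candidates(blocks):
    """Rank candidate terms by block-weighted frequency.

    `blocks` maps a block name to the list of candidate terms found in it.
    Each occurrence adds the weight of its block to the term's score.
    """
    scores = Counter()
    for block, terms in blocks.items():
        weight = BLOCK_WEIGHTS.get(block, 1.0)  # unknown blocks count as body text
        for term in terms:
            scores[term] += weight
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Made-up sample document in the license plate recognition field.
doc = {
    "title": ["license plate recognition"],
    "abstract": ["license plate recognition", "character segmentation"],
    "body": ["character segmentation", "edge detection", "character segmentation"],
}
ranked = score_candidates(doc)
for term, score in ranked:
    print(f"{score:4.1f}  {term}")
```

A term that appears in the title outranks one that appears more often in the body alone, which is the kind of position-dependent importance the abstract describes. A CRF-based approach would learn such weights from labeled data instead of fixing them by hand.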
Keywords/Search Tags: Information Extraction, Science and Technical Literature, Text Features, Conditional Random Fields