Research On Key Technology Of Feature Information Extraction Based On Domain Website

Posted on: 2022-10-28
Degree: Master
Type: Thesis
Country: China
Candidate: W Zou
GTID: 2491306317997149
Subject: Traffic Information Engineering & Control
Abstract/Summary:
Civil aviation websites publish a large amount of important navigation-related information. Flight personnel who download and read this content promptly help ensure flight safety and reduce the rate of aviation accidents. However, navigation information is currently downloaded, and its key information extracted, by hand, which wastes human resources and has a high error rate. Converting unstructured navigation information into structured data that computers can process is also key to building intelligent civil aviation information services. It is therefore of great significance to study the key technologies for extracting feature information from civil aviation websites. Navigation notices are an important part of navigation information, and this thesis studies the extraction of navigation notice information published on civil aviation websites. The specific contents are as follows:

(1) How to obtain the navigation notice text published by civil aviation websites in many countries and regions. First, the structure of civil aviation websites is analyzed and existing crawler capture strategies are summarized. A denoising breadth-first algorithm with priority weights is proposed to capture the links of a website, and the captured link set is taken as the original data set. Then, a naive Bayes algorithm is used to classify the links to information published on civil aviation websites, and the web pages that publish navigation notices are crawled according to the classification results. Finally, a multi-segment text extraction algorithm is proposed, which extracts the multi-segment navigation notice text and stores it in text documents, providing the data basis for the next stage of text information extraction.

(2) How to extract the key information from the navigation notice text. The thesis proposes and builds an information extraction model based on deep learning. First, the notice text data is preprocessed. Recurrent neural networks, conditional random fields, and their combinations are each used as the encoding and decoding layer of the model, and each variant is evaluated on the model's prediction metrics. Based on the experimental results, Bi-GRU-CRF performs best and is selected as the encoding and decoding layer of the final model. On this basis, the model is further optimized by pre-training the word vectors with ERNIE. The experimental results show that, compared with the pre-optimization model, the final optimized navigation notice extraction model learns significantly faster, and its accuracy, recall, and F1 score increase by 8%, 7%, and 7%, respectively.
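The link classification step in (1) can be illustrated with a minimal sketch. The abstract does not state which features the thesis uses, so this example assumes each link is represented by the tokens of its URL and anchor text, and it uses a plain multinomial naive Bayes classifier with add-one smoothing. The class name, the labels "notice" and "other", and the `caa.example` links are all hypothetical and are not taken from the thesis.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a URL or anchor text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayesLinkClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing.

    Hypothetical helper for illustration; not the thesis implementation.
    """

    def __init__(self):
        self.class_counts = Counter()             # training links per class
        self.token_counts = defaultdict(Counter)  # token frequencies per class
        self.vocabulary = set()

    def fit(self, samples):
        for text, label in samples:
            tokens = tokenize(text)
            self.class_counts[label] += 1
            self.token_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_links = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, link_count in self.class_counts.items():
            counts = self.token_counts[label]
            total_tokens = sum(counts.values())
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(link_count / total_links)
            for token in tokens:
                score += math.log((counts[token] + 1) / (total_tokens + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical labeled links: "URL anchor-text" paired with a class label.
TRAINING_LINKS = [
    ("https://caa.example/notices/navigation-notice-2021-014 Navigation notice", "notice"),
    ("https://caa.example/notices/runway-closure-notice Notice: runway closure", "notice"),
    ("https://caa.example/notices/airspace-restriction Navigation notice airspace", "notice"),
    ("https://caa.example/about/contact Contact us", "other"),
    ("https://caa.example/news/annual-report Annual report published", "other"),
    ("https://caa.example/careers/vacancies Job vacancies", "other"),
]

classifier = NaiveBayesLinkClassifier().fit(TRAINING_LINKS)
predicted = classifier.predict(
    "https://caa.example/notices/taxiway-closure Navigation notice"
)
print(predicted)
```

In a crawler built along these lines, only links predicted as "notice" would be queued for page download, which is one way to read "crawls the web pages that publish the navigation notices according to the classification results".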
Keywords/Search Tags:crawler, information extraction, neural network, notice of navigation