Research On Adaptive Extraction Methods For Bamboo Germplasm Information

Posted on: 2021-03-31; Degree: Master; Type: Thesis
Country: China; Candidate: H Li; Full Text: PDF
GTID: 2428330602996864; Subject: Computer application technology
Bamboo is an important forest resource with economic value. Making use of it depends on obtaining large amounts of bamboo species information quickly and extracting the required information accurately. At present, however, most bamboo resource data are collected manually and screened step by step, which consumes considerable manpower and time. How to extract the required information quickly and accurately from a large volume of bamboo resource data is therefore an urgent problem.

This thesis takes information extraction for bamboo germplasm resources as its research object and addresses three key problems: building an ontology of bamboo germplasm resources, automatically identifying bamboo terms based on that ontology, and building a web-based adaptive extraction system for bamboo resource data. It proposes constructing a bamboo germplasm resource ontology and, guided by the ontology information and with word vector features added, using the conditional random field (CRF) algorithm from machine learning for recognition. This allows the required bamboo species terms to be identified quickly and accurately in large amounts of data, and it provides a basis for the automatic extraction of bamboo species term information in the system. The main research work is as follows:

1. An ontology construction method for bamboo germplasm resources based on the Web Ontology Language (OWL) was studied. The ontology draws on the bamboo germplasm data of the “Flora of China” website and refers to the bamboo table structure of the bamboo and rattan germplasm resources management platform. The thesis adopts manual construction and a top-down technical route. Following the five principles proposed by Gruber in 1995, and combining the seven-step method with the skeleton method, a domain ontology of bamboo species was built by hand. OWL is used to describe the domain ontology, and Protégé is used to develop and visualize it.

2. An automatic identification method for bamboo domain terms based on the ontology was studied. Bamboo information crawled from the “Flora of China” website serves as the research data. The word itself, its part of speech and indicators are chosen as the basic feature set; the domain dictionary derived from the bamboo species ontology is introduced to guide recognition; word vector features are added to improve the recognition result; and BIO tags mark term boundaries. Together these form the input feature set of a CRF model. The results show that the CRF model with this feature set recognizes terms better than the model with the common feature set alone, which provides a basis for the subsequent rapid and accurate extraction of bamboo species data.

3. A web-based adaptive extraction system for bamboo germplasm resource data was developed. The system was designed and implemented in Java, using the popular Java framework Spring Boot and the front-end framework Layui. It consists of five functional modules: data processing, model recognition, data extraction, user retrieval and user management. The system automatically identifies and extracts bamboo species data, so that users can quickly extract and query information about bamboo species.

The research in this thesis provides a theoretical basis and technical support for the rapid extraction of information from large amounts of bamboo germplasm data and for the construction of a bamboo and rattan resource database. It is of significance for the automatic recognition and extraction of bamboo information.
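The feature set described for the term recognition step can be illustrated with a minimal Python sketch. It is not the thesis implementation: the dictionary entries, the example sentence, the part-of-speech tags, the toy word vectors and the function names `dictionary_bio` and `token_features` are all assumptions made for illustration. The sketch marks dictionary matches with BIO tags and then assembles, for each token, a feature dictionary of the kind that CRF toolkits commonly accept (word, part of speech, dictionary match, neighbouring words and word vector components).

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical domain dictionary; in the thesis it is derived from the bamboo ontology.
DOMAIN_TERMS: List[Tuple[str, ...]] = [("Phyllostachys", "edulis"), ("culm", "sheath")]


def dictionary_bio(tokens: Sequence[str], terms: Sequence[Tuple[str, ...]]) -> List[str]:
    """Mark dictionary matches with BIO tags: B-TERM, I-TERM or O."""
    tags = ["O"] * len(tokens)
    # Try longer terms first so that they take priority over shorter ones.
    for term in sorted(terms, key=len, reverse=True):
        n = len(term)
        for i in range(len(tokens) - n + 1):
            window_free = all(t == "O" for t in tags[i:i + n])
            if tuple(tokens[i:i + n]) == term and window_free:
                tags[i] = "B-TERM"
                for j in range(i + 1, i + n):
                    tags[j] = "I-TERM"
    return tags


def token_features(
    tokens: Sequence[str],
    pos_tags: Sequence[str],
    dict_tags: Sequence[str],
    i: int,
    vectors: Dict[str, List[float]],
) -> Dict[str, object]:
    """Build the feature dictionary for token i."""
    feats: Dict[str, object] = {
        "word": tokens[i].lower(),
        "pos": pos_tags[i],
        "dict_tag": dict_tags[i],  # guidance from the domain dictionary
    }
    # Context features from the neighbouring tokens.
    if i > 0:
        feats["prev_word"] = tokens[i - 1].lower()
    else:
        feats["BOS"] = True  # beginning of sentence
    if i < len(tokens) - 1:
        feats["next_word"] = tokens[i + 1].lower()
    else:
        feats["EOS"] = True  # end of sentence
    # Word vector components as real-valued features (toy vectors, assumed).
    for k, value in enumerate(vectors.get(tokens[i].lower(), [])):
        feats[f"vec_{k}"] = value
    return feats


# Hypothetical example sentence with hand-assigned part-of-speech tags.
tokens = ["Phyllostachys", "edulis", "has", "a", "hairy", "culm", "sheath"]
pos_tags = ["NNP", "NNP", "VBZ", "DT", "JJ", "NN", "NN"]
toy_vectors = {"culm": [0.12, -0.40], "sheath": [0.08, -0.35]}

dict_tags = dictionary_bio(tokens, DOMAIN_TERMS)
features = [
    token_features(tokens, pos_tags, dict_tags, i, toy_vectors)
    for i in range(len(tokens))
]
print(dict_tags)
print(features[5])
```

In this sketch the dictionary match is only an input feature. The gold BIO labels used to train a CRF would come from annotated data, and a CRF library would then be trained on the per-sentence lists of feature dictionaries and labels.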
Keywords/Search Tags: Bamboo germplasm resources, Conditional random field, Ontology domain dictionary, Term vectors, Adaptive extraction
