An Improved Text-oriented Algorithm For Sieving The Domain-specific Concepts

Posted on:2014-03-12

Degree:Master

Type:Thesis

Country:China

Candidate:L Q Huang

GTID:2268330392472495

Subject:Applied Mathematics

Abstract/Summary:

Ontology learning is an active research topic in semantic technology, and both the technique and its applications have attracted many scholars at home and abroad. Concept acquisition is an important part of ontology learning, and the quality of concept filtering determines the quality of the resulting ontology. Text has become the mainstream data source for ontology learning, so this thesis focuses on concept filtering for text.

Sieving domain-specific concepts takes two steps: first obtain the candidate concepts, then filter the non-domain concepts out of the candidate set to form the set of domain concepts. Existing domain-specific concept sieving algorithms omit some important low-frequency candidate concepts that stand in synonymy or part-whole relationships, and they also select a large number of high-frequency redundant concepts that are unrelated to the domain. Both problems lower the precision and recall of concept sieving.

To address these inaccuracies, this thesis presents an improved domain-specific concept sieving algorithm. The algorithm uses the contextual information of the candidate concepts to calculate the degree of similarity between them. Based on the similarity values, it identifies the sets of low-frequency words that stand in synonymy or part-whole relationships, and it filters out some of the redundant concepts. Finally, the thesis gives improved formulas and the improved domain concept sieving algorithm, so that low-frequency but very important domain words are better retained.

To validate the proposed method, the thesis runs comparative experiments between the improved sieving method and currently popular algorithms on the same data sets, using precision, recall, and F-measure as the comparison metrics. The experimental results show that the improved algorithm is markedly effective for low-frequency domain concepts, including words in synonymy relationships, words in part-whole relationships, and words in both. The improved algorithm also avoids some of the omissions caused by low frequency, greatly improving the precision and recall of domain concept extraction.
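
The abstract does not give the thesis's formulas, so the following is only a minimal sketch of the two ideas it names: comparing candidate concepts by their contexts, and scoring a sieved concept set with precision, recall, and F-measure. Everything specific here is an assumption made for illustration, not the author's method: single-token terms, a fixed word window as the "context", cosine similarity as the similarity measure, the balanced F1 as the F-measure, and the hypothetical function names `context_vector`, `cosine`, and `precision_recall_f1`.

```python
from collections import Counter
from math import sqrt


def context_vector(term, sentences, window=2):
    """Count the words found within `window` tokens of each occurrence of `term`.

    Assumption: terms are single lowercase tokens and sentences are
    whitespace-tokenized; a real system would use a proper tokenizer.
    """
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, tok in enumerate(tokens):
            if tok == term:
                lo, hi = max(0, i - window), i + window + 1
                # Add every neighbour in the window except the term itself.
                counts.update(
                    t for j, t in enumerate(tokens[lo:hi], lo) if j != i
                )
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def precision_recall_f1(selected, gold):
    """Precision, recall, and balanced F-measure of a selected concept set."""
    selected, gold = set(selected), set(gold)
    true_positives = len(selected & gold)
    precision = true_positives / len(selected) if selected else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Toy corpus, invented for this example only.
    corpus = [
        "the ontology describes domain concepts",
        "the taxonomy describes domain concepts",
        "a crawler downloads web pages",
    ]
    onto = context_vector("ontology", corpus)
    taxo = context_vector("taxonomy", corpus)
    crawl = context_vector("crawler", corpus)
    print("ontology vs taxonomy:", cosine(onto, taxo))
    print("ontology vs crawler:", cosine(onto, crawl))
    print(precision_recall_f1(
        ["ontology", "taxonomy", "page"],
        ["ontology", "taxonomy", "concept", "lattice"],
    ))
```

In this sketch, a rare candidate whose context vector is highly similar to that of an accepted domain concept could be kept despite its low frequency, which is the intuition the abstract describes. The similarity threshold and the way similarity is combined with frequency are left open, because the source does not state them.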

Keywords/Search Tags:

Ontology learning, Candidate concept, Context, Field concept, Sieving algorithm

Related items

1	Study On The Semi-automatic Construction Of Domain Ontology Based On Concept Lattice
2	Research On Ontology-based Concept Distinction In NLU And Its Application In Artificial Intelligence Instrument Design
3	Focused Crawler Based On Domain Ontology And Similarity Concept Context Graph
4	Research On Multi-source-oriented Domain Ontology Construction Based On Formal Concept Analysis
5	Research On Concept Similarity Of Web Information Retrieval
6	Research On Domain Ontology Concept Extraction And Relation Extraction
7	A Method For Building Semantic Web Rough Ontology
8	The Research On The Ontology Module And Operation Based On Concept Lattice
9	Research Of Methods For Similarity Among Domain Ontology Concept Based On Concept Lattice
10	Study On GPU-based Union Algorithm Of Concept Lattices