Research Of Skyline Preference Query Based On Incomplete Dataset

Posted on:2019-08-11

Degree:Master

Type:Thesis

Country:China

Candidate:Z Shi

Full Text:PDF

GTID:2428330545454774

Subject:Computer software and theory

Abstract/Summary:

In recent years, with the rapid development of information technologies such as the Internet and the Internet of Things, the ways in which data are produced have become increasingly diverse. An important aspect of data availability is completeness. Unfortunately, incomplete datasets are common because of machine faults, privacy restrictions, human error, and the widespread use of automated information extraction and aggregation. How to efficiently obtain the information a user needs from incomplete data has therefore become an important problem. Skyline queries can provide effective decision analysis and preference query results that meet users' needs, so they are widely applied in fields such as multi-objective decision making, environmental monitoring, market analysis, and data mining. A common way to handle incomplete data is to pre-process it by cleaning and repairing it, and then to run the query on the cleaned and repaired data. These methods not only incur a large time cost but may also introduce new 'noise', which biases the results so that they no longer meet the user's needs. At present, there is no efficient and accurate strategy for obtaining personalized information from incomplete datasets.

This thesis proposes a Skyline preference query algorithm for incomplete datasets, SPQ-I, which extracts personalized information from incomplete data according to user preferences and improves Skyline query efficiency. First, the dataset is partitioned into sub-datasets of different importance, and these are clustered using different strategies. During clustering, the Skyline query space is shrunk by pruning tuples that are dominated by others. Then two algorithms are run on the two query subspaces produced by clustering: a Skyline query algorithm based on tuple sorting, which ensures accuracy, and a Skyline query algorithm based on domination degree, which simplifies processing and is very efficient. These two algorithms yield two local Skyline results. Finally, the global Skyline result is selected according to whether the intersection of the two local results is empty. If the intersection is not empty, it is returned to the user as the global optimal solution. If the intersection is empty, generalization center classification is applied to the union of the two results to obtain a suboptimal solution.

Extensive experiments show that SPQ-I returns results that meet user needs under different user preferences. Its accuracy is high, and its efficiency on high-dimensional incomplete data is notably higher than that of SIDS and CDSkyline.
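
The dominance test and the final selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's SPQ-I implementation. It assumes a dominance definition often used for incomplete data (two tuples are compared only on the dimensions where both have values, and smaller values are preferred), it uses a naive pairwise scan in place of the tuple-sorting and domination-degree algorithms, and it leaves out the generalization center classification step, whose details the abstract does not give. The names `dominates`, `local_skyline`, and `select_global` are hypothetical.

```python
from typing import List, Optional, Set, Tuple

# A tuple with missing values; None marks an unobserved dimension.
Row = Tuple[Optional[float], ...]


def dominates(p: Row, q: Row) -> bool:
    """Return True if p dominates q on their commonly observed dimensions.

    Assumption: smaller is better. p must be no worse than q on every
    dimension where both values are present, and strictly better on at
    least one. Tuples with no common dimension never dominate each other.
    """
    strictly_better = False
    for a, b in zip(p, q):
        if a is None or b is None:
            continue  # skip dimensions missing in either tuple
        if a > b:
            return False
        if a < b:
            strictly_better = True
    return strictly_better


def local_skyline(rows: List[Row]) -> Set[int]:
    """Naive O(n^2) skyline: indices of rows not dominated by any other row."""
    return {
        i
        for i, p in enumerate(rows)
        if not any(dominates(q, p) for j, q in enumerate(rows) if j != i)
    }


def select_global(first: Set[int], second: Set[int]) -> Tuple[Set[int], str]:
    """Pick the global result from two local skyline results.

    A non-empty intersection is returned as the result. Otherwise the union
    is returned as the candidate set; the thesis then applies generalization
    center classification to it, which is not reproduced here.
    """
    common = first & second
    if common:
        return common, "intersection"
    return first | second, "union"
```

For example, `select_global({0, 2}, {2, 3})` yields the shared index `2`, while `select_global({0}, {1})` falls back to the union `{0, 1}` as candidates for further classification. Note that dominance restricted to common dimensions is not transitive, so a plain pairwise scan like this can discard every tuple on cyclic inputs; handling that case efficiently is part of what specialized algorithms for incomplete data address.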

Keywords/Search Tags:

incomplete data, Skyline query, user preference, clustering, dataset partition

Related items

1	Research On SKYLINE Preference Query Technology Over Incomplete Data
2	Skyline Query Processing Of Massive Incomplete Data Based On Space Partition
3	Research Of Dynamic Skyline Query Processing Approach In MapReduce
4	Research On Skyline-Join Query Processing Of Incomplete Datasets With Crowdsourcing
5	Research Of Skyline Query Processing On Uncertain Dataset Of WSNs
6	User Preference Query Processing Over Data Stream
7	Research On Skyline Query Algorithms Over Uncertain Dataset
8	Answering Skyline Queries Over Incomplete Data With Crowdsourcing
9	Research And Implementation Of Skyline Query Algorithm In LBSN Environment
10	Research On Top-k Skyline Query In Multiple Environments