Research Of Image Labeling And Retrieval Based On Preference

Posted on: 2011-06-05
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Chen
Full Text: PDF
GTID: 1228360305983570
Subject: Computer applications
Abstract/Summary:
Society today is under the tremendous impact of fast-growing information technology, and the volume of information is increasing geometrically. Ordinary people are inundated with information and need data mining technology to guide their decision-making and to extract what is helpful. Semantic labeling of multimedia is necessary for retrieval engines to improve their information extraction performance. Image labeling emerged as a natural development of semantic labeling for the various media on the Internet, and it is fundamental to many individual applications. Baidu and Google have introduced search tools that retrieve video files matching a user's keywords, guided by the user's profile and browsing history. The input to these tools can also be pictures, videos, and audio rather than simple keywords. Microsoft has designed an online song retrieval system whose input can be the user's own voice singing the song. Well-known Chinese video websites such as Xunlei and Youku have all introduced censorship systems to scrutinize the huge volume of uploaded videos and images for compliance with the relevant laws, which requires automatic machine labeling and scrutiny to reduce manual intervention and errors. To increase the recall and precision of hybrid media semantic labeling and retrieval, the user's privacy and preference information is used in e-shopping, e-book reading, online news recommendation, and other Internet activities. Protecting such privacy and preference data against abuse and fraud while it is being used has become a sensitive subject that attracts public attention, especially after the "erotic picture leak", the precise retrieval of personal information, and other sensational incidents involving privacy.
To address this problem, this thesis builds a robust model for image semantic labeling that uses the acquired user profile, privacy, and preference data, and that guarantees the security of this data during collection, transmission, and storage. It proposes a comprehensive system to increase the efficiency of image semantic labeling and retrieval. After a thorough survey of the image semantic labeling, retrieval, and image-ranking technology of current search engines, this thesis presents a hierarchical scheme for collecting privacy and preference information under the user's control. The system gathers the user's privacy and preference data, stores the part that does not touch on sensitive privacy on the search engine server, and stores the part that contains sensitive private information on the user's client. The first round of image ranking is carried out on the search engine, based on coarse data extracted from the private data after some processing. The result of the first round is returned to the user's computer, where the second round of image ranking is carried out on the sensitive private data locally, without any risk of privacy leakage. This two-round image-ranking algorithm uses privacy and preference data in a controlled and secure manner, in contrast to current algorithms that store all privacy data on the search engine server. This thesis also presents the concept of a privacy and preference semantics web, which is essential to modeling and storing privacy data. In the data structure of the privacy and preference semantics web, the keywords of image semantic labels are connected by weighted directed edges, which help to depict the latent semantics of privacy and to increase the efficiency of the image labeling algorithm.
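The two-round image ranking described above can be sketched roughly as follows. This is a minimal illustration, not the thesis's algorithm: the `Image` type, the keyword-weight dictionaries, the additive `score` function, and the `top_k` cut-off are all assumptions introduced here, since the abstract does not specify how scores are computed or how many results the server returns.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Image:
    image_id: str
    labels: frozenset  # semantic keywords attached to the image


def score(image, weights):
    """Sum the preference weights of the labels the image carries (assumed scoring rule)."""
    return sum(weights.get(label, 0.0) for label in image.labels)


def server_round(images, coarse_weights, top_k):
    """Round 1, on the search engine: rank with coarse, non-sensitive weights only."""
    ranked = sorted(images, key=lambda im: score(im, coarse_weights), reverse=True)
    return ranked[:top_k]


def client_round(candidates, private_weights):
    """Round 2, on the user's machine: re-rank with weights that never leave the client."""
    return sorted(candidates, key=lambda im: score(im, private_weights), reverse=True)


def two_round_rank(images, coarse_weights, private_weights, top_k=3):
    """Run both rounds and return the final order of image ids."""
    candidates = server_round(images, coarse_weights, top_k)
    return [im.image_id for im in client_round(candidates, private_weights)]
```

In this sketch the server only ever sees `coarse_weights`; `private_weights` is passed to `client_round` alone, which mirrors the split between server-side and client-side storage that the abstract describes.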
The privacy and preference semantics web has three levels: open semantics, which involve no privacy leakage; buffer semantics, which need perturbation or some other processing measure; and absolute privacy, which must be stored on the local computer. This three-level architecture is the foundation of privacy protection for data of differing sensitivity. The thesis proposes that the image semantic labeling algorithm for hybrid media can be enhanced by privacy and preference data, combined with the low-level features of the hybrid media and the context semantics. After machine learning, the algorithm can calculate the weight of each group of features for a particular individual application. The algorithm builds a bridge that narrows the large gap between high-level semantics and low-level features. The relationship between the labeling words and the low-level features of images is established precisely, which helps to solve the problem that image matching algorithms are constrained to a particular ontology and body of knowledge. Finally, the thesis discusses how to enhance the overall performance of image semantic labeling and retrieval through cross-validation between the low-level features of images and the privacy and preference semantics: similarity in low-level features indicates, with high probability, similarity in privacy and preference semantics, and vice versa. The cross-validation turns one-directional machine training into two-directional training, and turns fully supervised learning into semi-supervised learning.
This is a compromise between the complexity of a cross-media retrieval system and the current performance of image semantic labeling and retrieval. Future research will mainly aim to build a full-fledged preference-based image semantic labeling and retrieval system that supports interaction among the system, the administrator, and the users. At the same time, the preference rules should be adjustable according to the user and the application context. Preference protection should include cryptography, public and private keys, and digital signatures. Privacy should be characterized by the user's behaviour on the Internet, and changes in that behaviour can be factored into the precise measurement of preference and privacy in the matching algorithm for image ranking. This is important for the user-centred design of the semantic labeling and retrieval system and for improving the performance of the search engine.
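As an illustration of the three-level privacy and preference semantics web described in the abstract, the data structure could be represented along the following lines. This is a sketch under stated assumptions: the class and method names (`Level`, `PreferenceWeb`, `server_view`) are hypothetical, and the rule that only edges between two open keywords are shared with the server unchanged is one possible reading of the abstract. The perturbation of buffer-level data is not implemented here.

```python
from enum import Enum


class Level(Enum):
    OPEN = "open"        # may be stored on the search engine server
    BUFFER = "buffer"    # needs perturbation or other processing before sharing
    PRIVATE = "private"  # must stay on the user's local computer


class PreferenceWeb:
    """Labeling keywords as nodes, connected by weighted directed edges."""

    def __init__(self):
        self.levels = {}  # keyword -> Level
        self.edges = {}   # source keyword -> {target keyword: weight}

    def add_keyword(self, keyword, level):
        self.levels[keyword] = level
        self.edges.setdefault(keyword, {})

    def add_edge(self, source, target, weight):
        if source not in self.levels or target not in self.levels:
            raise KeyError("both keywords must be added before linking them")
        self.edges[source][target] = weight

    def server_view(self):
        """Edges whose endpoints are both OPEN (assumed rule for what is shared as-is)."""
        return {
            (s, t): w
            for s, targets in self.edges.items()
            for t, w in targets.items()
            if self.levels[s] is Level.OPEN and self.levels[t] is Level.OPEN
        }
```

For example, after adding open keywords `travel` and `beach` and a private keyword `clinic`, an edge from `travel` to `clinic` would be kept in the local structure but left out of `server_view()`.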
Keywords/Search Tags: Image Labeling, Image Retrieval, Hierarchical Information Extraction, Preference Semantic Web