Exploring Social Collaborative Data For Information Retrieval

Posted on: 2012-05-20
Degree: Master
Type: Thesis
Country: China
Candidate: L C Yang
Full Text: PDF
GTID: 2178330338984152
Subject: Computer application technology
Abstract/Summary:
Information retrieval is a key technology for helping users find what they need among huge amounts of information. Information retrieval techniques have been developed over several decades, and they work quite well on traditional document collections. However, because new types of data keep emerging, new retrieval techniques are still much needed and attract considerable research interest. In recent years, with the rapid growth in the number of internet users and the rise of the Web 2.0 concept, Web users have become the leading producers of information. On the one hand, Web users surf the Web to get information; on the other hand, they leave behind valuable information through their online activities. For example, search engine users leave click-through data; delicious.com users leave their favorite Web URLs together with corresponding tags; Web developers leave anchor text for the Web pages they link to. The information generated by a single user may not be very useful, but when millions of users are actively generating information, it starts to form a huge knowledge base. We call this type of user-generated descriptive data Social Collaborative Data. We find that Social Collaborative Data are, overall, of high quality and form good summaries of some Web resources. They have great potential to help computers understand data and to enhance information retrieval.

Depending on what type of data is available in an information retrieval task, this thesis considers two scenarios and proposes a customized solution for each.

First, for the scenario in which only social collaborative data are available, we derive information retrieval models based on the generation features of the data. Research on information retrieval over traditional documents has a long history, but social collaborative data are quite different, so some traditional models are not suitable for direct application. Specifically, we model the generation features of this new type of data, relate them to information retrieval, and propose new retrieval models. Experiments show that, for information retrieval tasks on social collaborative data, our new models substantially outperform traditional models.

Second, for the scenario in which both social collaborative data and other text data are available, we propose to integrate the two kinds of data for retrieval. Some traditional information retrieval applications have had great success while considering only traditional document information. Social collaborative data, as a new high-quality information source, should have much potential to further improve retrieval results. Based on the mutually complementary character of the two kinds of data, we propose a mutual reinforcement framework for data enhancement and perform information retrieval tasks on the enhanced data. Our experiments show that, compared with a simple combination of the data, the enhanced data produced by our model give much better retrieval results.
Keywords/Search Tags: Social Collaborative Data, Information Retrieval, Generation Feature, Data Integration