Research And Implementation Of Data Placement Strategy In Online Social Networks

Posted on: 2017-03-09
Degree: Master
Type: Thesis
Country: China
Candidate: Q Y Wang
Full Text: PDF
GTID: 2308330491951596
Subject: Computer technology
Abstract/Summary:
With the rapid development of information technology, the number of users of online social networks is increasing rapidly, and communication among millions of users generates data at a large scale. Social network data has a complex structure, and it has taken on new features as social applications continue to develop. Storage environments are also evolving. To support different data storage architectures, this thesis studies placement strategies for social network data. The main work is as follows:
(1) The characteristics and development trends of online social networks are analyzed, and the practical significance of the research is clarified. The data placement problems of centralized and distributed storage architectures in social networks are introduced, together with the current state of research.
(2) Data placement in centralized online social networks requires solving the data partitioning and replication problems. Existing partitioning and replication algorithms are usually designed to balance load, reduce the cost of access among friends, and improve replication efficiency, without considering newer characteristics of social network data such as location aggregation. In this thesis, a double-layer social graph model is designed to extract location information, and a dynamic partitioning and replication algorithm, MSDPR, is proposed. The algorithm uses an improved K-Means algorithm to cluster the location information; the data are then partitioned according to the clustering results and replicated according to social relations. Simulation results show that MSDPR effectively improves the efficiency of local access and reduces access latency in social networks. It also adapts better when data are added dynamically.
(3) Data placement in distributed online social networks requires solving the problem of selecting replica storage nodes. In existing P2P storage node selection algorithms for distributed online social networks, every user is assigned the same number of storage nodes. Because data access frequency and user behavior differ widely in social networks, a location-based double-layer ego network model is designed to extract the information of a single user. On this basis, a new storage node selection algorithm is proposed. The algorithm first measures the social influence of users according to the community structure of the social network. It then calculates the number of replicas for each user based on that influence. Finally, the storage nodes are selected according to the availability of user nodes, node location information, and the strength of the relationships between users. Simulation results show that the proposed algorithm effectively improves the access efficiency of hot-spot data in distributed online social networks.
(4) Based on the above results, a prototype data storage system for online social networks is designed and implemented. The system provides storage of social network data for both the centralized and the distributed storage architecture. It verifies the practicability of the location-based dynamic partitioning and replication algorithm and of the storage node selection algorithm based on the double-layer ego network.
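The partition-then-replicate idea in item (2) can be illustrated with a minimal sketch. This is not MSDPR itself: the abstract does not specify the improved K-Means variant or the double-layer graph model, so the code below assumes plain Lloyd's k-means on 2-D coordinates, and the names `kmeans`, `place`, `locations`, and `friends` are hypothetical. Users are clustered by location, each cluster becomes a partition, and a user's data is copied to any other partition that holds one of their friends.

```python
import math
from collections import defaultdict


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means on 2-D points (assumption: not the thesis's improved variant).

    Centers are initialised from the first k points so the result is deterministic.
    """
    centers = [points[i] for i in range(k)]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        assign = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centers[c] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return assign


def place(locations, friends, k):
    """Partition users by location cluster, then replicate across friendship edges.

    Returns (primary, replicas): primary maps user -> partition id, and
    replicas maps partition id -> set of users whose data is copied there.
    """
    users = sorted(locations)
    assign = kmeans([locations[u] for u in users], k)
    primary = dict(zip(users, assign))
    replicas = defaultdict(set)
    for u, v in friends:
        # Friends in different partitions get a copy of each other's data,
        # so that reads of a friend's data stay local.
        if primary[u] != primary[v]:
            replicas[primary[u]].add(v)
            replicas[primary[v]].add(u)
    return primary, dict(replicas)


# Hypothetical toy data: two users near the origin, two near (10, 10).
locations = {"ann": (0.0, 0.0), "bob": (10.0, 10.0), "cat": (0.5, 0.2), "dan": (9.5, 10.1)}
friends = [("ann", "cat"), ("ann", "bob")]
primary, replicas = place(locations, friends, k=2)
print(primary)
print(replicas)
```

In this toy input, only the friendship that crosses partitions triggers replication; the one inside a partition needs no copy. A real system would also have to bound replica storage and rebalance partitions as users are added, which this sketch does not attempt.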
Keywords/Search Tags: online social networks, data placement, data partitioning and replication, storage nodes selection