
Research On Microblog Data Storage Based On Relational Database And NoSQL

Posted on: 2016-01-11
Degree: Master
Type: Thesis
Country: China
Candidate: J Zhou
Full Text: PDF
GTID: 2308330479484892
Subject: Computer technology
Abstract/Summary:
In the era of data explosion, Internet websites such as e-commerce, social networking, and audio and video services generate data at the terabyte level every day. All of these websites face the problem of mass data storage. Microblogging, which has emerged as a new form of electronic media, has become more popular than the websites above and suffers from the same storage problem. On the one hand, Chinese microblog companies such as Sina Weibo and Tencent Weibo adopted relational databases to store data when they first went into production. Adopting relational databases leads to two challenges. First, it is hard to manage mass data in a relational database, since capacity cannot be expanded simply by adding storage devices. Second, querying the database is very inefficient. Even with a cache miss rate of only 5%, the resulting traffic is too heavy for a relational database because of the sheer volume of user requests. On the other hand, foreign companies such as Facebook and Twitter use NoSQL to store their mass data. NoSQL addresses the storage problem by allowing storage devices to be added and by improving data access efficiency. However, NoSQL is weak at providing high security and strong transactional guarantees. For mass microblog data storage, it is very important to improve access efficiency without giving up high security and strong transactional ability. This has become a hot research topic. However, research on the issue lags far behind practice. Although some studies exist, most of them remain at the conceptual or model level, and most are impractical because they lack specific strategies.

For these reasons, this thesis takes the microblogging business as its basis and combines the advantages of relational databases and NoSQL. The microblogging data are stored in different databases. Specifically, user-related information is stored in a relational database (MySQL), and microblog-related information is stored in NoSQL (Cassandra). We design a data storage architecture that combines a relational database with NoSQL, and then present the storage and access policies in detail. Lastly, a large number of experiments on microblogging data sets show that the presented architecture and access policies work well. The contributions of this thesis include:

① Ensuring high security and strong transactional guarantees for part of the business while solving the mass data storage problem of microblogging. NoSQL stores the massive microblogging data, and a relational database stores the user information that requires high security and strong transactions.

② Achieving high access efficiency and [...] when serving user requests. After the architecture in this thesis has been in operation for some time, most database queries go to the Cassandra database. Under high load, Cassandra offers stronger parallel processing and therefore higher query efficiency than MySQL.

③ Explaining the whole architecture and its strategies in detail through extensive experiments and analysis, and offering a solution for other areas that also face mass data storage.
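
The split described in the abstract, with user records in a transactional relational store and microblog posts in a partitioned NoSQL store, can be illustrated with a minimal sketch. This is not the thesis's implementation: `sqlite3` stands in for MySQL, an in-memory dictionary keyed by user id stands in for a Cassandra partition, and the class, table, and method names (`HybridStore`, `users`, `create_user`, `add_post`, `timeline`) are hypothetical.

```python
import sqlite3


class HybridStore:
    """Routes user data to a relational store and posts to a key-value store.

    Assumptions (not from the thesis): sqlite3 stands in for MySQL, a dict
    stands in for Cassandra, and every name here is hypothetical.
    """

    def __init__(self):
        # Relational side: user accounts need uniqueness and transactions.
        self.sql = sqlite3.connect(":memory:")
        self.sql.execute(
            "CREATE TABLE users ("
            "id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL)"
        )
        # NoSQL side: user_id acts as a partition key; each value is an
        # append-only list of (post_id, text), loosely like a wide row.
        self.posts = {}
        self._next_post_id = 0

    def create_user(self, name):
        # The with-block commits on success and rolls back on an exception.
        with self.sql:
            cur = self.sql.execute(
                "INSERT INTO users (name) VALUES (?)", (name,)
            )
        return cur.lastrowid

    def add_post(self, user_id, text):
        # Check the author against the relational store before writing.
        row = self.sql.execute(
            "SELECT 1 FROM users WHERE id = ?", (user_id,)
        ).fetchone()
        if row is None:
            raise KeyError(f"unknown user {user_id}")
        self._next_post_id += 1
        self.posts.setdefault(user_id, []).append((self._next_post_id, text))
        return self._next_post_id

    def timeline(self, user_id):
        # Reads touch only the NoSQL side; newest post first.
        return [text for _, text in reversed(self.posts.get(user_id, []))]
```

In this sketch, reading a timeline never touches the relational store, which mirrors the abstract's observation that most queries end up going to Cassandra once the system has been running for a while.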
Keywords/Search Tags:Relational Database, NoSQL, Data Storage, Microblog