{"id":48232,"date":"2015-10-28T15:21:50","date_gmt":"2015-10-28T13:21:50","guid":{"rendered":"http:\/\/www.javacodegeeks.com\/?p=48232"},"modified":"2023-12-08T09:46:23","modified_gmt":"2023-12-08T07:46:23","slug":"nosql-vs-sql","status":"publish","type":"post","link":"https:\/\/www.javacodegeeks.com\/2015\/10\/nosql-vs-sql.html","title":{"rendered":"NoSQL vs. SQL: Choosing a Data Management Solution"},"content":{"rendered":"<div class=\"toc\">\n<h3>Table Of Contents<\/h3>\n<dl>\n<dt><a href=\"#introduction\">1. Introduction<\/a><\/dt>\n<dt><a href=\"#cap\">2. Distributed systems: the CAP theorem<\/a><\/dt>\n<dt><a href=\"#data_stores\">3. Relational data stores<\/a><\/dt>\n<dd>\n<dl>\n<dt><a href=\"#mysql_mariadb\">3.1. MySQL \/ MariaDB<\/a><\/dt>\n<dt><a href=\"#postgresql\">3.2. PostgreSQL<\/a><\/dt>\n<dt><a href=\"#others\">3.3. Others<\/a><\/dt>\n<\/dl>\n<\/dd>\n<dt><a href=\"#why_nosql\">4. Why NoSQL?<\/a><\/dt>\n<dt><a href=\"#key_value_data_stores\">5. Key\/Value data stores<\/a><\/dt>\n<dd>\n<dl>\n<dt><a href=\"#dynamodb\">5.1. DynamoDB<\/a><\/dt>\n<dt><a href=\"#memcached\">5.2. Memcached<\/a><\/dt>\n<dt><a href=\"#redis\">5.3. Redis<\/a><\/dt>\n<dt><a href=\"#riak\">5.4. Riak<\/a><\/dt>\n<dt><a href=\"#aerospike\">5.5. Aerospike<\/a><\/dt>\n<dt><a href=\"#foundationdb\">5.6. FoundationDB<\/a><\/dt>\n<\/dl>\n<\/dd>\n<dt><a href=\"#Columnar\">6. Columnar data stores<\/a><\/dt>\n<dd>\n<dl>\n<dt><a href=\"#accumulo\">6.1. Accumulo<\/a><\/dt>\n<dt><a href=\"#cassandra\">6.2. Cassandra<\/a><\/dt>\n<dt><a href=\"#hbase\">6.3. HBase<\/a><\/dt>\n<\/dl>\n<\/dd>\n<dt><a href=\"#graph\">7. Graph data stores<\/a><\/dt>\n<dd>\n<dl>\n<dt><a href=\"#neo4j\">7.1. Neo4J<\/a><\/dt>\n<dt><a href=\"#titan\">7.2. Titan<\/a><\/dt>\n<dt><a href=\"#flockdb\">7.3. FlockDB<\/a><\/dt>\n<dt><a href=\"#graphbase\">7.4. GraphBase<\/a><\/dt>\n<dt><a href=\"#infinitegraph\">7.5. InfiniteGraph<\/a><\/dt>\n<\/dl>\n<\/dd>\n<dt><a href=\"#document\">8. 
Document data stores<\/a><\/dt>\n<dd>\n<dl>\n<dt><a href=\"#couchbase1\">8.1. Couchbase<\/a><\/dt>\n<dt><a href=\"#couchdb1\">8.2. CouchDB<\/a><\/dt>\n<dt><a href=\"#mongodb1\">8.3. MongoDB<\/a><\/dt>\n<\/dl>\n<\/dd>\n<dt><a href=\"#time\">9. Time series data stores<\/a><\/dt>\n<dd>\n<dl>\n<dt><a href=\"#influxdb\">9.1. InfluxDB<\/a><\/dt>\n<dt><a href=\"#opentsdb\">9.2. OpenTSDB<\/a><\/dt>\n<dt><a href=\"#druid\">9.3. Druid<\/a><\/dt>\n<dt><a href=\"#prometheus\">9.4. Prometheus<\/a><\/dt>\n<\/dl>\n<\/dd>\n<dt><a href=\"#hybrid\">10. Hybrid data stores<\/a><\/dt>\n<dd>\n<dl>\n<dt><a href=\"#arangodb1\">10.1. ArangoDB<\/a><\/dt>\n<dt><a href=\"#orientdb1\">10.2. OrientDB<\/a><\/dt>\n<dt><a href=\"#hyperdex\">10.3. HyperDex<\/a><\/dt>\n<\/dl>\n<\/dd>\n<dt><a href=\"#specialized\">11. Specialized data stores<\/a><\/dt>\n<dd>\n<dl>\n<dt><a href=\"#rethinkdb2\">11.1. RethinkDB<\/a><\/dt>\n<dt><a href=\"#crate2\">11.2. Crate<\/a><\/dt>\n<\/dl>\n<\/dd>\n<dt><a href=\"#conclusions\">12. Conclusions<\/a><\/dt>\n<\/dl>\n<\/div>\n<h2><a name=\"introduction\"><\/a>1. Introduction<\/h2>\n<p>As software developers, we are living in an exciting era of innovation, particularly in the areas of data storage and data processing. The landscape of available solutions, historically dominated by <a href=\"https:\/\/en.wikipedia.org\/wiki\/Relational_database_management_system\">relational database management systems<\/a>, has been changed dramatically by the so-called <a href=\"https:\/\/en.wikipedia.org\/wiki\/NoSQL\">NoSQL<\/a> and, later, <a href=\"https:\/\/en.wikipedia.org\/wiki\/NewSQL\">NewSQL<\/a> movements.<\/p>\n<p>Those new kinds of data stores were born to address the needs of modern distributed application architectures, which required unprecedented (at that time) scalability and availability guarantees. 
<a href=\"https:\/\/en.wikipedia.org\/wiki\/Scalability#Horizontal_and_vertical_scaling\">Vertical scalability<\/a> reached the point where the cost of hardware and\/or supporting infrastructure became unreasonably high, so the switch to <a href=\"https:\/\/en.wikipedia.org\/wiki\/Scalability#Horizontal_and_vertical_scaling\">horizontal scalability<\/a> was inevitable.<\/p>\n<p>In this article we are going to look into the general problem of managing data in distributed architectures, briefly discuss the traditional <a href=\"https:\/\/en.wikipedia.org\/wiki\/Relational_database_management_system\">relational database management systems<\/a> and then switch gears to <a href=\"https:\/\/en.wikipedia.org\/wiki\/NoSQL\">NoSQL<\/a> \/ <a href=\"https:\/\/en.wikipedia.org\/wiki\/NewSQL\">NewSQL<\/a> solutions.<\/p>\n<p>The intention of this article is by no means to pick winners or to emphasize how good or bad a particular data storage solution is. Rather, it is to outline the options that could help you build complex distributed systems and pick the right tools for the job, knowing up front that no universal solution exists.<\/p>\n<h2><a name=\"cap\"><\/a>2. Distributed systems: the CAP theorem<\/h2>\n<p>In 2000, Eric Brewer came up with a theorem which is known these days as the <a href=\"https:\/\/en.wikipedia.org\/wiki\/CAP_theorem\">CAP theorem<\/a>. It asserts that a distributed system cannot provide all three of the following desirable characteristics at the same time:<\/p>\n<ul>\n<li><strong>Consistency<\/strong>: A read operation sees all previously completed write operations. In the context of a distributed system, it means that all nodes see the same data at the same time.<\/li>\n<li><strong>Availability<\/strong>: Read and write operations always succeed. 
In the context of a distributed system, it means that every request receives a response about whether it succeeded or failed.<\/li>\n<li><strong>Partition tolerance<\/strong>: The system continues to operate despite arbitrary partitioning due to network failures which prevent some nodes from communicating with others.<\/li>\n<\/ul>\n<p>We are not going to discuss the <a href=\"https:\/\/en.wikipedia.org\/wiki\/CAP_theorem\">CAP theorem<\/a> in detail, as it is worth its own article, but rather emphasize its importance in distributed systems design. Although it has caused quite a lot of debate over the years (eventually leading to a follow-up article from Eric Brewer in 2012 titled <a href=\"http:\/\/www.infoq.com\/articles\/cap-twelve-years-later-how-the-rules-have-changed\">CAP Twelve Years Later<\/a>), most if not all <a href=\"https:\/\/en.wikipedia.org\/wiki\/NoSQL\">NoSQL<\/a> \/ <a href=\"https:\/\/en.wikipedia.org\/wiki\/NewSQL\">NewSQL<\/a> solutions are built on the trade-offs outlined by the <a href=\"https:\/\/en.wikipedia.org\/wiki\/CAP_theorem\">CAP theorem<\/a>.<\/p>\n<p>Throughout this article we are going to highlight how each data store accounts for the <a href=\"https:\/\/en.wikipedia.org\/wiki\/CAP_theorem\">CAP theorem<\/a>, whenever it is appropriate and relevant details are available.<\/p>\n<h2><a name=\"data_stores\"><\/a>3. Relational data stores<\/h2>\n<p>For quite a long time <a href=\"https:\/\/en.wikipedia.org\/wiki\/Relational_database_management_system\">relational database management systems<\/a> were the de facto standard with respect to data store solutions. From time to time new players tried to challenge that (for example, <a href=\"https:\/\/en.wikipedia.org\/wiki\/Object_database\">object databases<\/a>); however, none of them was able to really disrupt the market. 
<a href=\"https:\/\/en.wikipedia.org\/?title=SQL\">SQL<\/a> (and its vendor-specific dialects) was the de facto standard for querying relational data stores and essentially was a must-know language for any software developer out there.<\/p>\n<p>But things started to change radically a few years ago, when the growth of data volumes became exponential, hitting the design limits of <a href=\"https:\/\/en.wikipedia.org\/wiki\/Relational_database_management_system\">relational database management systems<\/a>.<\/p>\n<h3><a name=\"mysql_mariadb\"><\/a>3.1. MySQL \/ MariaDB<\/h3>\n<p><a href=\"http:\/\/dev.mysql.com\/\">MySQL<\/a> is one of the oldest open-source <a href=\"https:\/\/en.wikipedia.org\/wiki\/Relational_database_management_system\">relational database management systems<\/a> and is also one of the most widely used data stores at the moment. <a href=\"http:\/\/dev.mysql.com\/\">MySQL<\/a> was and remains a great choice of data storage engine for most modern applications due to its stability and reliability.<\/p>\n<p>However, the road of <a href=\"http:\/\/dev.mysql.com\/\">MySQL<\/a> took a steep turn in 2009, when Oracle Corporation entered into an agreement to purchase Sun Microsystems, the owner of the <a href=\"http:\/\/dev.mysql.com\/\">MySQL<\/a> copyright and trademark. This acquisition raised a lot of concerns about the future of <a href=\"http:\/\/dev.mysql.com\/\">MySQL<\/a> and, as a result, it was forked into what we know today as <a href=\"https:\/\/mariadb.org\/\">MariaDB<\/a>.<\/p>\n<p>To address the scalability and availability demands of modern distributed systems, both <a href=\"http:\/\/dev.mysql.com\/\">MySQL<\/a> and <a href=\"https:\/\/mariadb.org\/\">MariaDB<\/a> offer clustered editions, <a href=\"http:\/\/dev.mysql.com\/downloads\/cluster\/\">MySQL Cluster<\/a> and <a href=\"https:\/\/mariadb.com\/kb\/en\/mariadb\/what-is-mariadb-galera-cluster\/\">MariaDB Galera Cluster<\/a> respectively. 
In terms of the <a href=\"https:\/\/en.wikipedia.org\/wiki\/CAP_theorem\">CAP theorem<\/a>, a single <a href=\"http:\/\/dev.mysql.com\/downloads\/cluster\/\">MySQL Cluster<\/a> or <a href=\"https:\/\/mariadb.com\/kb\/en\/mariadb\/what-is-mariadb-galera-cluster\/\">MariaDB Galera Cluster<\/a> is a <strong>CP<\/strong> system.<\/p>\n<p>To finish up with <a href=\"http:\/\/dev.mysql.com\/\">MySQL<\/a>, it is worth mentioning a few case studies:<\/p>\n<ul>\n<li><a href=\"https:\/\/blog.twitter.com\/2012\/mysql-twitter\">MySQL at Twitter<\/a>, <a href=\"https:\/\/blog.twitter.com\/2015\/another-look-at-mysql-at-twitter-and-incubating-mysos\">Another look at MySQL at Twitter and incubating Mysos<\/a><\/li>\n<li><a href=\"http:\/\/highscalability.com\/blog\/2011\/12\/19\/how-twitter-stores-250-million-tweets-a-day-using-mysql.html\">How Twitter Stores 250 Million Tweets a Day Using MySQL<\/a><\/li>\n<li><a href=\"https:\/\/code.facebook.com\/posts\/1474977139392436\/webscalesql-a-collaboration-to-build-upon-the-mysql-upstream\/\">WebScaleSQL: A collaboration to build upon the MySQL upstream<\/a><\/li>\n<\/ul>\n<h3><a name=\"postgresql\"><\/a>3.2. PostgreSQL<\/h3>\n<p><a href=\"http:\/\/www.postgresql.org\/\">PostgreSQL<\/a> is as widely used as <a href=\"http:\/\/dev.mysql.com\/\">MySQL<\/a>, and essentially these two data stores are the most popular open source <a href=\"https:\/\/en.wikipedia.org\/wiki\/Relational_database_management_system\">relational database management systems<\/a>. 
While sharing many similarities with <a href=\"http:\/\/dev.mysql.com\/\">MySQL<\/a>, <a href=\"http:\/\/www.postgresql.org\/\">PostgreSQL<\/a> has traditionally been a few steps ahead, offering better extensibility and a lot of advanced features.<\/p>\n<p>One of the features which <a href=\"http:\/\/www.postgresql.org\/\">PostgreSQL<\/a> lacks at the moment is out-of-the-box clustering support, although there are <a href=\"https:\/\/wiki.postgresql.org\/wiki\/Replication,_Clustering,_and_Connection_Pooling#Clustering\">a couple of options available<\/a>.<\/p>\n<h3><a name=\"others\"><\/a>3.3. Others<\/h3>\n<p>Although we are going to talk about open source solutions throughout this article, it is worth mentioning the big players in the space of commercial <a href=\"https:\/\/en.wikipedia.org\/wiki\/Relational_database_management_system\">relational database management systems<\/a>, led by <a href=\"https:\/\/www.oracle.com\/database\/index.html\">Oracle Database<\/a>, <a href=\"http:\/\/www.microsoft.com\/en-ca\/server-cloud\/products\/sql-server\/default.aspx\">Microsoft SQL Server<\/a> and <a href=\"http:\/\/www-01.ibm.com\/software\/data\/db2\/\">IBM DB2<\/a>. Depending on the price you are willing to pay, those data stores offer quite decent performance, scalability and reliability characteristics, satisfying the requirements of almost any distributed system.<\/p>\n<h2><a name=\"why_nosql\"><\/a>4. Why NoSQL?<\/h2>\n<p>I think it is fair to say that the <a href=\"https:\/\/en.wikipedia.org\/wiki\/NoSQL\">NoSQL<\/a> \/ <a href=\"https:\/\/en.wikipedia.org\/wiki\/NewSQL\">NewSQL<\/a> movement emerged from an urgent need to fill the gaps in the architecture of modern, highly scalable, responsive and available distributed systems, which deal with massive data volumes. 
At some point, the &#8220;one size fits all&#8221; approach backed by <a href=\"https:\/\/en.wikipedia.org\/wiki\/Relational_database_management_system\">relational database management systems<\/a> became a limiting factor (or even a blocker) for the systems to meet their functional requirements.<\/p>\n<p>Despite the huge differences among <a href=\"https:\/\/en.wikipedia.org\/wiki\/NoSQL\">NoSQL<\/a> \/ <a href=\"https:\/\/en.wikipedia.org\/wiki\/NewSQL\">NewSQL<\/a> solutions, there are a few things they have in common: simpler design, awareness of the <a href=\"https:\/\/en.wikipedia.org\/wiki\/CAP_theorem\">CAP theorem<\/a> trade-offs and support for <a href=\"https:\/\/en.wikipedia.org\/wiki\/Scalability#Horizontal_and_vertical_scaling\">horizontal scalability<\/a>. This comes at a price: it is tough or nearly impossible to find a general-purpose <a href=\"https:\/\/en.wikipedia.org\/wiki\/NoSQL\">NoSQL<\/a> \/ <a href=\"https:\/\/en.wikipedia.org\/wiki\/NewSQL\">NewSQL<\/a> data store which perfectly fits all the needs of your applications. The choice instead depends on the problem the <a href=\"https:\/\/en.wikipedia.org\/wiki\/NoSQL\">NoSQL<\/a> \/ <a href=\"https:\/\/en.wikipedia.org\/wiki\/NewSQL\">NewSQL<\/a> data store must solve. 
That is why the combination of <a href=\"https:\/\/en.wikipedia.org\/wiki\/Relational_database_management_system\">relational database management systems<\/a> and one or more <a href=\"https:\/\/en.wikipedia.org\/wiki\/NoSQL\">NoSQL<\/a> \/ <a href=\"https:\/\/en.wikipedia.org\/wiki\/NewSQL\">NewSQL<\/a> data stores is quite common these days.<\/p>\n<p>Over the rest of the article we are going to take a look at many of the most popular <a href=\"https:\/\/en.wikipedia.org\/wiki\/NoSQL\">NoSQL<\/a> \/ <a href=\"https:\/\/en.wikipedia.org\/wiki\/NewSQL\">NewSQL<\/a> solutions, split into seven different categories: <a href=\"#key_value_data_stores\">Key\/Value data stores<\/a>, <a href=\"#Columnar\">Columnar data stores<\/a>, <a href=\"#graph\">Graph data stores<\/a>, <a href=\"#document\">Document data stores<\/a>, <a href=\"#time\">Time series data stores<\/a>, <a href=\"#hybrid\">Hybrid data stores<\/a> and <a href=\"#specialized\">Specialized data stores<\/a>. Each of those data stores is worth at least one dedicated book, so we are going to focus on key design decisions and use cases.<\/p>\n<h2><a name=\"key_value_data_stores\"><\/a>5. Key\/Value data stores<\/h2>\n<p><a href=\"https:\/\/en.wikipedia.org\/wiki\/Key-value_database\">Key\/Value data stores<\/a> are designed for storing, querying and managing data as associative arrays (also known as dictionaries or hashes). They are a perfect fit for distributed (or replicated) caching solutions; however, many of those data stores go beyond simple key\/value manipulations.<\/p>\n<h3><a name=\"dynamodb\"><\/a>5.1. 
DynamoDB<\/h3>\n<p>The intention of this article is to cover only open source solutions in the <a href=\"https:\/\/en.wikipedia.org\/wiki\/NoSQL\">NoSQL<\/a> \/ <a href=\"https:\/\/en.wikipedia.org\/wiki\/NewSQL\">NewSQL<\/a> space, but <a href=\"http:\/\/aws.amazon.com\/documentation\/dynamodb\/\">DynamoDB<\/a> is going to be an exception. The reason for that is the great influence of its underlying principles (described in the paper <a href=\"http:\/\/www.allthingsdistributed.com\/files\/amazon-dynamo-sosp2007.pdf\">Dynamo: Amazon\u2019s Highly Available Key-value Store<\/a>), which have been an inspiration for many open source alternatives.<\/p>\n<p>Amazon <a href=\"http:\/\/aws.amazon.com\/documentation\/dynamodb\/\">DynamoDB<\/a> is a fully managed <a href=\"https:\/\/en.wikipedia.org\/wiki\/NoSQL\">NoSQL<\/a> data store service that provides fast and predictable performance with seamless scalability. Although it is most widely known as a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Key-value_database\">key\/value data store<\/a>, it is also able to manage data in document formats (JSON, XML and HTML). In terms of the <a href=\"https:\/\/en.wikipedia.org\/wiki\/CAP_theorem\">CAP theorem<\/a>, <a href=\"http:\/\/aws.amazon.com\/documentation\/dynamodb\/\">DynamoDB<\/a> is an <strong>AP<\/strong> system.<\/p>\n<p>Recommendations on when to use:<\/p>\n<ul>\n<li>High volume special events<\/li>\n<li>Social media applications<\/li>\n<li>Batch-oriented processing<\/li>\n<li>Reporting<\/li>\n<li>Real-time analytics<\/li>\n<\/ul>\n<p>To finish up with <a href=\"http:\/\/aws.amazon.com\/documentation\/dynamodb\/\">DynamoDB<\/a>, it is worth mentioning a case study:<\/p>\n<ul>\n<li><a href=\"http:\/\/blog.sendwithus.com\/using-dynamodb-production\/\">Using DynamoDB in Production<\/a><\/li>\n<\/ul>\n<h3><a name=\"memcached\"><\/a>5.2. 
Memcached<\/h3>\n<p><a href=\"http:\/\/memcached.org\/\">Memcached<\/a> is an in-memory <a href=\"https:\/\/en.wikipedia.org\/wiki\/Key-value_database\">key\/value data store<\/a> for small chunks of arbitrary data (strings, objects) which could represent the results of database calls, API calls, or rendered pages. In many respects <a href=\"http:\/\/memcached.org\/\">Memcached<\/a> is one of the pioneers among <a href=\"https:\/\/en.wikipedia.org\/wiki\/Key-value_database\">key\/value data stores<\/a>, very widely used, primarily as a caching solution intended to speed up dynamic web applications by alleviating database load. <a href=\"http:\/\/memcached.org\/\">Memcached<\/a> does not have any out-of-the-box clustering support and, because of this design simplification, <a href=\"http:\/\/memcached.org\/\">Memcached<\/a> is very fast and effective but not very reliable (which could be unacceptable for many distributed applications these days).<\/p>\n<p>Recommendations on when to use:<\/p>\n<ul>\n<li>Caching Layer<\/li>\n<li>In-memory Session Store<\/li>\n<\/ul>\n<h3><a name=\"redis\"><\/a>5.3. Redis<\/h3>\n<p>Along with <a href=\"http:\/\/memcached.org\/\">Memcached<\/a>, <a href=\"http:\/\/redis.io\">Redis<\/a> is one of the most widely known and used <a href=\"https:\/\/en.wikipedia.org\/wiki\/Key-value_database\">key\/value data stores<\/a>. Although thinking about Redis as a key\/value store is a valid assumption, <a href=\"http:\/\/redis.io\">Redis<\/a> does much more, putting the power of complex data structures into the hands of developers. To quote <a href=\"http:\/\/redis.io\">http:\/\/redis.io<\/a>:<\/p>\n<blockquote>\n<p><em>\u201cRedis is an open source, BSD licensed, advanced key-value store, used as database, cache and message broker. 
It is often referred to as a data structure server since keys can contain strings, hashes, lists, sets and sorted sets.\u201d<\/em><\/p>\n<\/blockquote>\n<p>For distributed deployment, <a href=\"http:\/\/redis.io\">Redis<\/a> supports clustering (also known as <a href=\"http:\/\/redis.io\/topics\/cluster-spec\">Redis Cluster<\/a>). Interestingly, from the <a href=\"https:\/\/en.wikipedia.org\/wiki\/CAP_theorem\">CAP theorem<\/a> perspective, <a href=\"http:\/\/redis.io\/topics\/cluster-spec\">Redis Cluster<\/a> is neither a <strong>CP<\/strong> nor an <strong>AP<\/strong> system but something in between.<\/p>\n<p>Recommendations on when to use (for more details please refer to <a href=\"http:\/\/oldblog.antirez.com\/post\/take-advantage-of-redis-adding-it-to-your-stack.html\">How to take advantage of Redis just adding it to your stack<\/a>):<\/p>\n<ul>\n<li>Caching Layer<\/li>\n<li>In-memory Session Store<\/li>\n<li>Distributed locking<\/li>\n<li>Real-time analytics<\/li>\n<li>Publish\/Subscribe &amp; Queues<\/li>\n<li>Counters<\/li>\n<li>Leader boards<\/li>\n<\/ul>\n<p>To finish up with <a href=\"http:\/\/redis.io\">Redis<\/a>, it is worth mentioning a few case studies:<\/p>\n<ul>\n<li><a href=\"http:\/\/blog.zawodny.com\/2011\/02\/26\/redis-sharding-at-craigslist\/\">Redis Sharding at Craigslist<\/a><\/li>\n<li><a href=\"http:\/\/highscalability.com\/blog\/2014\/9\/8\/how-twitter-uses-redis-to-scale-105tb-ram-39mm-qps-10000-ins.html\">How Twitter Uses Redis to Scale<\/a><\/li>\n<\/ul>\n<h3><a name=\"riak\"><\/a>5.4. Riak<\/h3>\n<p><a href=\"http:\/\/basho.com\/products\/riak-kv\/\">Riak KV<\/a> is a distributed, scalable and fault-tolerant <a href=\"https:\/\/en.wikipedia.org\/wiki\/Key-value_database\">key\/value data store<\/a>. It supports advanced local and multi-cluster replication that guarantees reads and writes even in the event of hardware failures or network partitions. 
One of the distinguishing features of <a href=\"http:\/\/basho.com\/products\/riak-kv\/\">Riak KV<\/a> is complex querying support, which is built on top of full-text search, secondary indexes and <a href=\"https:\/\/en.wikipedia.org\/wiki\/MapReduce\">map\/reduce<\/a>. From the <a href=\"https:\/\/en.wikipedia.org\/wiki\/CAP_theorem\">CAP theorem<\/a> point of view, <a href=\"http:\/\/basho.com\/products\/riak-kv\/\">Riak KV<\/a> is an <strong>AP<\/strong> system which supports tunable consistency and availability levels.<\/p>\n<p>Recommendations on when to use (for more details please refer to <a href=\"https:\/\/docs.basho.com\/riak\/latest\/dev\/data-modeling\/\">Riak: Use Cases<\/a>):<\/p>\n<ul>\n<li>Content Management<\/li>\n<li>Session Storage<\/li>\n<li>Serving Advertisements<\/li>\n<li>Log Data<\/li>\n<li>Sensor Data<\/li>\n<li>User Profile \/ Settings \/ Preference Store<\/li>\n<\/ul>\n<h3><a name=\"aerospike\"><\/a>5.5. Aerospike<\/h3>\n<p><a href=\"http:\/\/www.aerospike.com\/\">Aerospike<\/a> is a flash-optimized, in-memory <a href=\"https:\/\/en.wikipedia.org\/wiki\/Key-value_database\">key\/value data store<\/a> for mission-critical applications requiring blazing speed, cost-efficient scaling and no downtime. <a href=\"http:\/\/www.aerospike.com\/\">Aerospike<\/a>\u2019s clustered architecture is designed to reliably store terabytes of data with automatic fail-over, replication, cross data-center synchronization and linear scalability. 
In terms of the <a href=\"https:\/\/en.wikipedia.org\/wiki\/CAP_theorem\">CAP theorem<\/a>, <a href=\"http:\/\/www.aerospike.com\/\">Aerospike<\/a> is by and large an <strong>AP<\/strong> system, which additionally tries to provide high consistency guarantees.<\/p>\n<p>Recommendations on when to use (for more details please refer to <a href=\"http:\/\/www.aerospike.com\/use-cases\/\">Aerospike: Use Cases<\/a>):<\/p>\n<ul>\n<li>Caching Layer<\/li>\n<li>User Profile Store<\/li>\n<li>Recommendation Engine<\/li>\n<li>Fraud Detection and Intervention<\/li>\n<\/ul>\n<p>To finish up with <a href=\"http:\/\/www.aerospike.com\/\">Aerospike<\/a>, it is worth mentioning a few interesting resources:<\/p>\n<ul>\n<li><a href=\"http:\/\/highscalability.com\/blog\/2014\/5\/6\/the-quest-for-database-scale-the-1-m-tps-challenge-three-des.html\">The Quest for Database Scale<\/a><\/li>\n<li><a href=\"http:\/\/highscalability.com\/blog\/2015\/3\/17\/in-memory-computing-at-aerospike-scale-when-to-choose-and-ho.html\">In-Memory Computing at Aerospike Scale<\/a><\/li>\n<\/ul>\n<h3><a name=\"foundationdb\"><\/a>5.6. FoundationDB<\/h3>\n<p><a href=\"https:\/\/foundationdb.com\/\">FoundationDB<\/a> is a member of the <a href=\"https:\/\/en.wikipedia.org\/wiki\/NoSQL\">NoSQL<\/a> \/ <a href=\"https:\/\/en.wikipedia.org\/wiki\/NewSQL\">NewSQL<\/a> family which combines the traditional scalability of <a href=\"https:\/\/en.wikipedia.org\/wiki\/Key-value_database\">key\/value data stores<\/a> with true multi-key <a href=\"https:\/\/en.wikipedia.org\/wiki\/ACID\">ACID<\/a> transactions. The presence of an SQL layer to access the stored data makes <a href=\"https:\/\/foundationdb.com\/\">FoundationDB<\/a> quite different from all other <a href=\"https:\/\/en.wikipedia.org\/wiki\/Key-value_database\">key\/value data stores<\/a> we have covered so far. 
Sadly, <a href=\"https:\/\/foundationdb.com\/\">FoundationDB<\/a> used to be an open source project but it suddenly ceased offering downloads and removed its public repositories after reportedly being bought by <strong>Apple<\/strong> in <strong>March, 2015<\/strong>.<\/p>\n<h2><a name=\"Columnar\"><\/a>6. Columnar data stores<\/h2>\n<p>The <a href=\"https:\/\/en.wikipedia.org\/wiki\/Wide_column_store\">columnar data stores<\/a> are designed for storing, querying and managing structured data. Much like <a href=\"https:\/\/en.wikipedia.org\/wiki\/Relational_database_management_system\">relational database management systems<\/a>, they use tables, rows, and columns; however, the names and format of the columns may vary from row to row within the same table. Many <a href=\"https:\/\/en.wikipedia.org\/wiki\/Wide_column_store\">columnar data stores<\/a> are built on the principles described by Google in the paper <a href=\"http:\/\/research.google.com\/archive\/bigtable.html\">Bigtable: A Distributed Storage System for Structured Data<\/a>.<\/p>\n<p>The <a href=\"https:\/\/en.wikipedia.org\/wiki\/Wide_column_store\">columnar data stores<\/a> have a very wide range of applicability, so it is very hard to come up with a precise list of recommendations. However, each of the data stores discussed in this section has a few unique characteristics which make it a good choice for certain scenarios.<\/p>\n<h3><a name=\"accumulo\"><\/a>6.1. Accumulo<\/h3>\n<p><a href=\"https:\/\/accumulo.apache.org\">Apache Accumulo<\/a> is a highly scalable structured <a href=\"https:\/\/en.wikipedia.org\/wiki\/NoSQL\">NoSQL<\/a> data store based on Google\u2019s <a href=\"http:\/\/research.google.com\/archive\/bigtable.html\">BigTable<\/a> design. 
<a href=\"https:\/\/accumulo.apache.org\">Apache Accumulo<\/a> supports efficient storage and retrieval of structured data, including queries for ranges, provides support for <a href=\"https:\/\/en.wikipedia.org\/wiki\/MapReduce\">map\/reduce<\/a> processing, and features automatic load-balancing and partitioning, data compression and fine-grained security labels. It is worth mentioning that <a href=\"https:\/\/accumulo.apache.org\">Apache Accumulo<\/a> operates over the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Apache_Hadoop#HDFS\">Hadoop Distributed File System<\/a> (HDFS). In terms of the <a href=\"https:\/\/en.wikipedia.org\/wiki\/CAP_theorem\">CAP theorem<\/a> trade-offs, <a href=\"https:\/\/accumulo.apache.org\">Apache Accumulo<\/a> is a <strong>CP<\/strong> system.<\/p>\n<p>Recommendations on when to use:<\/p>\n<ul>\n<li>Data access requires fine-grained security<\/li>\n<li>Fast read\/write access is required for <a href=\"https:\/\/hadoop.apache.org\/\">Apache Hadoop<\/a> integration<\/li>\n<li>And many, many more \u2026<\/li>\n<\/ul>\n<h3><a name=\"cassandra\"><\/a>6.2. Cassandra<\/h3>\n<p>Like almost all data stores in this category, <a href=\"http:\/\/cassandra.apache.org\/\">Apache Cassandra<\/a> is built on the foundations of Google\u2019s <a href=\"http:\/\/research.google.com\/archive\/bigtable.html\">BigTable<\/a> (and Amazon\u2019s <a href=\"http:\/\/www.allthingsdistributed.com\/files\/amazon-dynamo-sosp2007.pdf\">Dynamo<\/a>) and is the right choice when you need scalability and high availability without compromising performance. It offers linear scalability and proven fault-tolerance on commodity hardware or cloud infrastructure, which makes it the perfect platform for mission-critical data. 
It is totally decentralized, with no single point of failure.<\/p>\n<p>One of the killer features of <a href=\"http:\/\/cassandra.apache.org\/\">Apache Cassandra<\/a> is support for replication across multiple datacenters, a much-demanded capability for lowering latency and for disaster recovery. From the <a href=\"https:\/\/en.wikipedia.org\/wiki\/CAP_theorem\">CAP theorem<\/a> standpoint, <a href=\"http:\/\/cassandra.apache.org\/\">Apache Cassandra<\/a> is an <strong>AP<\/strong> system with tunable consistency levels.<\/p>\n<p>Recommendations on when to use (for more details please refer to <a href=\"http:\/\/www.planetcassandra.org\/apache-cassandra-use-cases\/\">Apache Cassandra: Use Cases<\/a>):<\/p>\n<ul>\n<li>Write-heavy workloads<\/li>\n<li>Product Catalog \/ Playlist<\/li>\n<li>Recommendation \/ Personalization<\/li>\n<li>Fraud Detection<\/li>\n<li>Messaging<\/li>\n<li>IOT \/ Sensor Data<\/li>\n<li>Real-time analytics<\/li>\n<li>And many, many more \u2026<\/li>\n<\/ul>\n<p>To finish up with <a href=\"http:\/\/cassandra.apache.org\/\">Apache Cassandra<\/a>, it is worth mentioning a few case studies:<\/p>\n<ul>\n<li><a href=\"https:\/\/moz.com\/devblog\/cassandra-in-production-things-we-learned\/\">Cassandra In Production: Things We Learned<\/a><\/li>\n<li><a href=\"http:\/\/target.github.io\/infrastructure\/tuning-cassandra\/\">Tuning Cassandra @ Target<\/a><\/li>\n<li>Cassandra Data Modeling Best Practices, <a href=\"http:\/\/www.ebaytechblog.com\/2012\/07\/16\/cassandra-data-modeling-best-practices-part-1\/\">part 1<\/a> and <a href=\"http:\/\/www.ebaytechblog.com\/2012\/08\/14\/cassandra-data-modeling-best-practices-part-2\/\">part 2<\/a><\/li>\n<\/ul>\n<h3><a name=\"hbase\"><\/a>6.3. HBase<\/h3>\n<p><a href=\"http:\/\/hbase.apache.org\/\">Apache HBase<\/a> is a great option when you need random, real-time read and write access to very large volumes of data. 
Heavily inspired by Google\u2019s <a href=\"http:\/\/research.google.com\/archive\/bigtable.html\">BigTable<\/a>, it is built on top of the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Apache_Hadoop#HDFS\">Hadoop Distributed File System<\/a> (HDFS), much like <a href=\"#accumulo\">Accumulo<\/a>. Linear scalability, strictly consistent reads and writes, automatic and configurable sharding, automatic failover and integration with <a href=\"https:\/\/hadoop.apache.org\/\">Apache Hadoop<\/a> are only a subset of the most important features to highlight with respect to the <a href=\"http:\/\/hbase.apache.org\/\">Apache HBase<\/a> design. In terms of the <a href=\"https:\/\/en.wikipedia.org\/wiki\/CAP_theorem\">CAP theorem<\/a>, <a href=\"http:\/\/hbase.apache.org\/\">Apache HBase<\/a> is a <strong>CP<\/strong> system.<\/p>\n<p>Recommendations on when to use (for more details please refer to <a href=\"http:\/\/wiki.apache.org\/hadoop\/Hbase\/PoweredBy\">Apache HBase: Use Cases<\/a>):<\/p>\n<ul>\n<li>Real-time analytics<\/li>\n<li>Batch Processing<\/li>\n<li>Time Series Data<\/li>\n<li>And many, many more \u2026<\/li>\n<\/ul>\n<p>To finish up with <a href=\"http:\/\/hbase.apache.org\/\">Apache HBase<\/a>, it is worth mentioning a few case studies:<\/p>\n<ul>\n<li>Why we\u2019re using HBase, <a href=\"http:\/\/hstack.org\/why-were-using-hbase-part-1\/\">part 1<\/a> and <a href=\"http:\/\/hstack.org\/why-were-using-hbase-part-2\/\">part 2<\/a><\/li>\n<li><a href=\"http:\/\/highscalability.com\/blog\/2010\/11\/16\/facebooks-new-real-time-messaging-system-hbase-to-store-135.html\">Facebook&#8217;s New Real-time Messaging System<\/a><\/li>\n<\/ul>\n<h2><a name=\"graph\"><\/a>7. 
Graph data stores<\/h2>\n<p>The family of <a href=\"https:\/\/en.wikipedia.org\/wiki\/Graph_database\">graph data stores<\/a> is based on <a href=\"https:\/\/en.wikipedia.org\/wiki\/Graph_theory\">graph theory<\/a> and as such its data model is built around nodes (vertices), edges and properties to represent and store linked data. This kind of data store is particularly powerful in solving graph-like problems, for example finding the shortest path between two nodes.<\/p>\n<h3><a name=\"neo4j\"><\/a>7.1. Neo4J<\/h3>\n<p><a href=\"http:\/\/neo4j.com\">Neo4j<\/a> is an open-source <a href=\"https:\/\/en.wikipedia.org\/wiki\/NoSQL\">NoSQL<\/a> graph data store which has been publicly available since 2007. <a href=\"http:\/\/neo4j.com\">Neo4j<\/a> provides a full set of scalable and reliable data store characteristics, including <a href=\"https:\/\/en.wikipedia.org\/wiki\/ACID\">ACID<\/a> transaction compliance, cluster support, and runtime failover, making it suitable for managing graph data in real production scenarios. The commercial offering of <a href=\"http:\/\/neo4j.com\">Neo4j<\/a> also complements the set of features with high availability, live backups, and comprehensive monitoring. From the standpoint of the <a href=\"https:\/\/en.wikipedia.org\/wiki\/CAP_theorem\">CAP theorem<\/a>, <a href=\"http:\/\/neo4j.com\">Neo4j<\/a> is a <strong>CA<\/strong> system.<\/p>\n<p>Recommendations on when to use (for more details please refer to <a href=\"http:\/\/neo4j.com\/use-cases\/\">Neo4j: Use Cases<\/a>):<\/p>\n<ul>\n<li>Master Data Management<\/li>\n<li>Network and IT Operations<\/li>\n<li>Real-Time Recommendations<\/li>\n<li>Fraud Detection<\/li>\n<li>Social Network<\/li>\n<li>Identity and Access Management<\/li>\n<li>Graph-Based Search<\/li>\n<\/ul>\n<h3><a name=\"titan\"><\/a>7.2. 
Titan<\/h3>\n<p><a href=\"http:\/\/thinkaurelius.github.io\/titan\/\">Titan<\/a> is a scalable graph data store optimized for storing and querying graphs containing hundreds of billions of vertices and edges distributed across a multi-node cluster. <a href=\"http:\/\/thinkaurelius.github.io\/titan\/\">Titan<\/a> is a transactional database that can support thousands of concurrent complex graph traversal executions in real time. In addition, <a href=\"http:\/\/thinkaurelius.github.io\/titan\/\">Titan<\/a> provides linear scalability, data distribution and replication, fault tolerance, multi-datacenter high availability and hot backups. <a href=\"http:\/\/thinkaurelius.github.io\/titan\/\">Titan<\/a> does not implement its own storage mechanism but instead can be backed by <a href=\"http:\/\/hbase.apache.org\/\">Apache HBase<\/a>, which makes it a <strong>CP<\/strong> system, or by <a href=\"http:\/\/cassandra.apache.org\/\">Apache Cassandra<\/a>, which makes it an <strong>AP<\/strong> system.<\/p>\n<p>Recommendations on when to use:<\/p>\n<ul>\n<li>Batch Processing and Analytics<\/li>\n<li>Network and IT Operations<\/li>\n<li>Fraud Detection<\/li>\n<li>Social Network<\/li>\n<li>Graph-Based Search<\/li>\n<\/ul>\n<p>To finish up with <a href=\"http:\/\/thinkaurelius.github.io\/titan\/\">Titan<\/a>, it is worth mentioning a case study:<\/p>\n<ul>\n<li><a href=\"http:\/\/blogs.cisco.com\/security\/attack-analysis-with-a-fast-graph\">Attack Analysis with a Fast Graph<\/a><\/li>\n<\/ul>\n<h3><a name=\"flockdb\"><\/a>7.3. FlockDB<\/h3>\n<p><a href=\"https:\/\/github.com\/twitter\/flockdb\">FlockDB<\/a> is a distributed, fault-tolerant graph database which came out of Twitter. By its design, <a href=\"https:\/\/github.com\/twitter\/flockdb\">FlockDB<\/a> is much simpler than other graph databases because it tries to solve fewer problems. It scales horizontally and is designed for real-time, low-latency, high-throughput environments. 
<a href=\"https:\/\/github.com\/twitter\/flockdb\">FlockDB<\/a> is built on top of the <a href=\"http:\/\/dev.mysql.com\/\">MySQL<\/a> storage layer, utilizing another Twitter framework called <a href=\"https:\/\/github.com\/twitter\/gizzard\">Gizzard<\/a> for sharding and replication support. Sadly, it seems like development of <a href=\"https:\/\/github.com\/twitter\/flockdb\">FlockDB<\/a> has been on hold since 2012.<\/p>\n<p>To finish up with <a href=\"https:\/\/github.com\/twitter\/flockdb\">FlockDB<\/a>, it is worth mentioning a case study:<\/p>\n<ul>\n<li><a href=\"https:\/\/blog.twitter.com\/2010\/introducing-flockdb\">Introducing FlockDB<\/a><\/li>\n<\/ul>\n<h3><a name=\"graphbase\"><\/a>7.4. GraphBase<\/h3>\n<p><a href=\"http:\/\/graphbase.net\/\">GraphBase<\/a> is a commercial, distributed <a href=\"https:\/\/en.wikipedia.org\/wiki\/NoSQL\">NoSQL<\/a> graph data store intended to store massive, highly-structured data and manage large graphs. It claims to provide a tiny memory and storage footprint, support for built-in traversal heuristics and graph-based transactions. The vendor does not provide enough details to reliably classify <a href=\"http:\/\/graphbase.net\/\">GraphBase<\/a> in terms of the <a href=\"https:\/\/en.wikipedia.org\/wiki\/CAP_theorem\">CAP theorem<\/a>.<\/p>\n<p>Recommendations on when to use (for more details please refer to <a href=\"http:\/\/graphbase.net\/Enterprise.html\">GraphBase: Use Cases<\/a>):<\/p>\n<ul>\n<li>Master Data Management<\/li>\n<li>Interactions within large networks of people and\/or things<\/li>\n<li>Social Network<\/li>\n<li>Complex natural models (biological, economic, environmental&#8230;)<\/li>\n<li>Large-scale Intelligence gathering<\/li>\n<\/ul>\n<h3><a name=\"infinitegraph\"><\/a>7.5. 
InfiniteGraph<\/h3>\n<p><a href=\"http:\/\/www.objectivity.com\/\">InfiniteGraph<\/a>, another player in the niche of commercial <a href=\"https:\/\/en.wikipedia.org\/wiki\/Graph_database\">graph data stores<\/a>, is the distributed and scalable graph database from Objectivity, designed specifically to traverse complex relationships and provide the foundation for making real-time business decisions. <a href=\"http:\/\/www.objectivity.com\/\">InfiniteGraph<\/a> naturally supports partitioning and distribution while preserving the ability to query data across node boundaries. Due to the limited information available, it is tough to classify <a href=\"http:\/\/www.objectivity.com\/\">InfiniteGraph<\/a> in terms of the <a href=\"https:\/\/en.wikipedia.org\/wiki\/CAP_theorem\">CAP theorem<\/a>, so it is better not to make any guesses here.<\/p>\n<p>Recommendations on when to use:<\/p>\n<ul>\n<li>Fraud Detection<\/li>\n<li>Surveillance<\/li>\n<li>Prescription Analytics<\/li>\n<li>Network Security Information and Event Management (SIEM)<\/li>\n<li>Large-scale Intelligence gathering<\/li>\n<\/ul>\n<h2><a name=\"document\"><\/a>8. Document data stores<\/h2>\n<p>Another class of <a href=\"https:\/\/en.wikipedia.org\/wiki\/NoSQL\">NoSQL<\/a> data stores, known as <a href=\"https:\/\/en.wikipedia.org\/wiki\/Document-oriented_database\">document data stores<\/a>, is designed for storing, querying and managing documents (semi-structured data). Instead of relying on the usual key \/ value or table \/ column \/ row representations, the data records are stored as full-fledged documents which are organized into collections. It is a very natural and easy model to deal with, particularly in modern web applications where <a href=\"http:\/\/www.json.org\/\">JSON documents<\/a> are a dominant data exchange and representation format.<\/p>\n<h3><a name=\"couchbase1\"><\/a>8.1. 
Couchbase<\/h3>\n<p><a href=\"http:\/\/www.couchbase.com\/\">Couchbase<\/a> positions itself as a <a href=\"https:\/\/en.wikipedia.org\/wiki\/NoSQL\">NoSQL<\/a> <a href=\"https:\/\/en.wikipedia.org\/wiki\/Document-oriented_database\">document data store<\/a> for interactive web applications. It supports a flexible data schema (as do most <a href=\"https:\/\/en.wikipedia.org\/wiki\/Document-oriented_database\">document data stores<\/a>), is easily scalable, and provides consistent high performance and high availability. <a href=\"http:\/\/www.couchbase.com\/\">Couchbase<\/a> is a good choice for web applications where low latency and high throughput are critical requirements. An outstanding capability provided by <a href=\"http:\/\/www.couchbase.com\/\">Couchbase<\/a> is native mobile readiness, which allows building mobile applications with offline support via an embedded database and automatic synchronization. As per the <a href=\"https:\/\/en.wikipedia.org\/wiki\/CAP_theorem\">CAP theorem<\/a> trade-offs, <a href=\"http:\/\/www.couchbase.com\/\">Couchbase<\/a> is an <strong>AP<\/strong> system.<\/p>\n<p>Recommendations on when to use (for more details please refer to <a href=\"http:\/\/www.couchbase.com\/use-cases\">Couchbase: Use Cases<\/a>):<\/p>\n<ul>\n<li>Content Management<\/li>\n<li>Fraud Detection<\/li>\n<li>Profile Management<\/li>\n<li>Digital Communication<\/li>\n<li>Personalization<\/li>\n<li>Product Data Management<\/li>\n<li>Real-Time Analytics<\/li>\n<\/ul>\n<h3><a name=\"couchdb1\"><\/a>8.2. 
CouchDB<\/h3>\n<p><a href=\"http:\/\/couchdb.apache.org\/\">CouchDB<\/a> is a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Document-oriented_database\">document data store<\/a> built for the web: it manages the data as <a href=\"http:\/\/www.json.org\/\">JSON<\/a> documents, has a built-in <a href=\"https:\/\/en.wikipedia.org\/wiki\/Hypertext_Transfer_Protocol\">HTTP<\/a> API to access and query documents directly from a web browser, and uses <a href=\"https:\/\/en.wikipedia.org\/wiki\/JavaScript\">JavaScript<\/a> to index, combine and transform documents. <a href=\"http:\/\/couchdb.apache.org\/\">CouchDB<\/a> is a scalable, highly available, partition-tolerant, eventually consistent system with automatic conflict detection. Following the <a href=\"https:\/\/en.wikipedia.org\/wiki\/CAP_theorem\">CAP theorem<\/a> design constraints, <a href=\"http:\/\/couchdb.apache.org\/\">CouchDB<\/a> is an <strong>AP<\/strong> system.<\/p>\n<p>It is worth mentioning that <a href=\"http:\/\/couchdb.apache.org\/\">CouchDB<\/a> and <a href=\"http:\/\/www.couchbase.com\/\">Couchbase<\/a>, although different products, have a lot in common, which sometimes makes it harder to choose between them. The article <a href=\"http:\/\/www.couchbase.com\/couchbase-vs-couchdb\">Couchbase vs. Apache CouchDB: A comparison of two open source NoSQL database technologies<\/a> takes a fair look at the history of these two solutions and the key differences between them.<\/p>\n<p>Recommendations on when to use:<\/p>\n<ul>\n<li>Content Management<\/li>\n<li>Product Data Management<\/li>\n<li>Profile Management<\/li>\n<li>Personalization<\/li>\n<\/ul>\n<h3><a name=\"mongodb1\"><\/a>8.3. 
MongoDB<\/h3>\n<p><a href=\"https:\/\/www.mongodb.org\/\">MongoDB<\/a> is a horizontally scalable and highly available <a href=\"https:\/\/en.wikipedia.org\/wiki\/Document-oriented_database\">document data store<\/a> which manages and stores <a href=\"http:\/\/www.json.org\/\">JSON<\/a>-style documents, grouped into collections. Among many other features, it supports dynamic schemas, advanced monitoring, complex querying, secondary indexes (including full-text search support), replication, automatic failover and sharding. One of the distinguishing features of <a href=\"https:\/\/www.mongodb.org\/\">MongoDB<\/a> is how easy it is to get started with and use, which promotes a faster development process. From the <a href=\"https:\/\/en.wikipedia.org\/wiki\/CAP_theorem\">CAP theorem<\/a> point of view, <a href=\"https:\/\/www.mongodb.org\/\">MongoDB<\/a> is a <strong>CP<\/strong> system.<\/p>\n<p>Recommendations on when to use (for more details please refer to <a href=\"http:\/\/docs.mongodb.org\/ecosystem\/use-cases\/\">MongoDB: Use Cases<\/a>):<\/p>\n<ul>\n<li>Operational Intelligence<\/li>\n<li>Product Data Management<\/li>\n<li>Inventory Management<\/li>\n<li>Content Management<\/li>\n<li>Log Data<\/li>\n<li>Reporting<\/li>\n<\/ul>\n<p>To finish up with <a href=\"https:\/\/www.mongodb.org\/\">MongoDB<\/a>, it is worth mentioning a few interesting resources:<\/p>\n<ul>\n<li><a href=\"http:\/\/highscalability.com\/blog\/2014\/3\/5\/10-things-you-should-know-about-running-mongodb-at-scale.html\">10 Things You Should Know About Running MongoDB at Scale<\/a><\/li>\n<li><a href=\"https:\/\/www.compose.io\/articles\/how-we-scale-mongodb\/\">How We Scale MongoDB<\/a><\/li>\n<\/ul>\n<h2><a name=\"time\"><\/a>9. 
Time series data stores<\/h2>\n<p><a href=\"https:\/\/en.wikipedia.org\/wiki\/Time_series_database\">Time series data stores<\/a> (also known as <strong>TSDB<\/strong>s) are a special class of <a href=\"https:\/\/en.wikipedia.org\/wiki\/NoSQL\">NoSQL<\/a> solutions which are highly optimized for handling time series data: arrays of data points ordered by time. Although it is generally possible to use other classes of <a href=\"https:\/\/en.wikipedia.org\/wiki\/NoSQL\">NoSQL<\/a> \/ <a href=\"https:\/\/en.wikipedia.org\/wiki\/NewSQL\">NewSQL<\/a> data stores (for example <a href=\"#_Document_data_stores_1\">Document data stores<\/a> or <a href=\"#_Columnar_data_stores\">Columnar data stores<\/a>) to manage time series data, they are not an adequate replacement in most cases due to the nature of the problem being solved.<\/p>\n<h3><a name=\"influxdb\"><\/a>9.1. InfluxDB<\/h3>\n<p><a href=\"https:\/\/influxdb.com\/\">InfluxDB<\/a> is a distributed time series, metrics, and analytics data store. Although at the time of this writing the support for clustering, replication and high availability is considered to be in an alpha state, there are a lot of other great stable features which <a href=\"https:\/\/influxdb.com\/\">InfluxDB<\/a> offers:<\/p>\n<ul>\n<li>Powerful SQL-like query language<\/li>\n<li>Native HTTP(S) API for data ingestion and queries<\/li>\n<li>Retention policies for data<\/li>\n<li>On-the-fly aggregation<\/li>\n<\/ul>\n<p>Probably the most distinguishing feature of <a href=\"https:\/\/influxdb.com\/\">InfluxDB<\/a> compared to other <a href=\"https:\/\/en.wikipedia.org\/wiki\/Time_series_database\">time series data stores<\/a> is the absence of external dependencies: just download, unpack and run. 
Although <a href=\"https:\/\/influxdb.com\/\">InfluxDB<\/a> has been designed to be a highly available and eventually consistent data store, from the <a href=\"https:\/\/en.wikipedia.org\/wiki\/CAP_theorem\">CAP theorem<\/a> perspective it is neither a <strong>CP<\/strong> nor an <strong>AP<\/strong> system.<\/p>\n<p>Recommendations on when to use:<\/p>\n<ul>\n<li>Metrics \/ Events data<\/li>\n<li>Sensor Data<\/li>\n<li>Real-time Analytics<\/li>\n<\/ul>\n<p>To finish up with <a href=\"https:\/\/influxdb.com\/\">InfluxDB<\/a>, it is worth mentioning a case study:<\/p>\n<ul>\n<li><a href=\"http:\/\/tech.trivago.com\/2015\/04\/14\/timeseries_influxdb\/\">Realtime metrics with Go: Running InfluxDB in production<\/a><\/li>\n<\/ul>\n<h3><a name=\"opentsdb\"><\/a>9.2. OpenTSDB<\/h3>\n<p><a href=\"http:\/\/opentsdb.net\">OpenTSDB<\/a> is a scalable <a href=\"https:\/\/en.wikipedia.org\/wiki\/Time_series_database\">time series data store<\/a> which is designed from the ground up to store and serve massive amounts of time series data without losing granularity. <a href=\"http:\/\/opentsdb.net\">OpenTSDB<\/a> is built on top of <a href=\"http:\/\/hbase.apache.org\/\">Apache HBase<\/a> (please see <a href=\"#_HBase\">HBase<\/a> for more details) and as such inherits most of its characteristics: linear scalability, automatic replication, efficient scans and high write throughput.<\/p>\n<p>Recommendations on when to use:<\/p>\n<ul>\n<li>Metrics data<\/li>\n<li>Sensor Data<\/li>\n<li>Real-time Analytics<\/li>\n<\/ul>\n<h3><a name=\"druid\"><\/a>9.3. Druid<\/h3>\n<p><a href=\"http:\/\/druid.io\/\">Druid<\/a> is a distributed, scalable, high-performance, column-oriented, real-time analytic data store built for interactive analytics. Among its most important features are high availability, low-latency data ingestion, flexible data exploration, and fast data aggregation. 
In many respects, <a href=\"http:\/\/druid.io\/\">Druid<\/a> is an OLAP data store and its focus is more on storing aggregated data (however, joins are not supported at the moment). <a href=\"http:\/\/druid.io\/\">Druid<\/a>&#8217;s native queries are expressed in JSON, and under the hood <a href=\"http:\/\/druid.io\/\">Druid<\/a> uses different storage for metadata and actual data.<\/p>\n<p>Recommendations on when to use:<\/p>\n<ul>\n<li>Real-time Analytics<\/li>\n<li>Metrics \/ Events data<\/li>\n<\/ul>\n<p>To finish up with <a href=\"http:\/\/druid.io\/\">Druid<\/a>, it is worth mentioning a few case studies:<\/p>\n<ul>\n<li><a href=\"https:\/\/metamarkets.com\/2014\/building-a-data-pipeline-that-handles-billions-of-events-in-real-time\/\">Building a Data Pipeline That Handles Billions of Events in Real-Time<\/a><\/li>\n<li><a href=\"http:\/\/techblog.netflix.com\/2013\/12\/announcing-suro-backbone-of-netflixs.html\">Announcing Suro: Backbone of Netflix&#8217;s Data Pipeline<\/a><\/li>\n<\/ul>\n<h3><a name=\"prometheus\"><\/a>9.4. Prometheus<\/h3>\n<p><a href=\"http:\/\/prometheus.io\/\">Prometheus<\/a> is a service monitoring system and <a href=\"https:\/\/en.wikipedia.org\/wiki\/Time_series_database\">time series data store<\/a> designed for reliability. It works exceptionally well for collecting any purely numeric time series, and its support of multi-dimensional data collection and querying is a particular strength. 
Perhaps the most crucial <a href=\"http:\/\/prometheus.io\/\">Prometheus<\/a> feature is that it is not a strictly distributed system: every server runs independently of the others and only relies on its local storage for core functionality.<\/p>\n<p>Recommendations on when to use:<\/p>\n<ul>\n<li>Metrics data<\/li>\n<\/ul>\n<p>To finish up with <a href=\"http:\/\/prometheus.io\/\">Prometheus<\/a>, it is worth mentioning a case study:<\/p>\n<ul>\n<li><a href=\"https:\/\/developers.soundcloud.com\/blog\/prometheus-monitoring-at-soundcloud\">Prometheus: Monitoring at SoundCloud<\/a><\/li>\n<\/ul>\n<h2><a name=\"hybrid\"><\/a>10. Hybrid data stores<\/h2>\n<p>There is a particular set of solutions which belong to more than one class of the <a href=\"https:\/\/en.wikipedia.org\/wiki\/NoSQL\">NoSQL<\/a> \/ <a href=\"https:\/\/en.wikipedia.org\/wiki\/NewSQL\">NewSQL<\/a> data stores we have seen so far. Those could be classified as hybrid data stores and typically they provide multiple ways to store and manage data (for example, the same <a href=\"https:\/\/en.wikipedia.org\/wiki\/NoSQL\">NoSQL<\/a> data store may be used as a <a href=\"#_Key\/Value_data_stores\">key\/value data store<\/a> for caching and as a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Document-oriented_database\">document data store<\/a> for content management).<\/p>\n<p>From an operational standpoint, a single hybrid data store may replace two or more specialized data stores, thereby reducing the complexity of the distributed system deployment, monitoring and maintenance. 
Similarly, from a use case perspective hybrid data stores may be a great fit and serve just as well as the data store(s) being replaced.<\/p>\n<p>Interestingly enough, from the <a href=\"https:\/\/en.wikipedia.org\/wiki\/CAP_theorem\">CAP theorem<\/a> perspective, in most cases hybrid data stores follow different trade-offs depending on the way they are being used by applications.<\/p>\n<h3><a name=\"arangodb1\"><\/a>10.1. ArangoDB<\/h3>\n<p><a href=\"https:\/\/www.arangodb.com\/\">ArangoDB<\/a> is a distributed, high-performance <a href=\"https:\/\/en.wikipedia.org\/wiki\/NoSQL\">NoSQL<\/a> data store with a flexible data model to manage documents, graphs, and key\/values. Essentially, it is designed to be a general-purpose data store by offering most of the features typically needed for modern web applications. Those include (but are not limited to) a schema-less data model, a general <a href=\"https:\/\/en.wikipedia.org\/wiki\/Hypertext_Transfer_Protocol\">HTTP<\/a> <a href=\"https:\/\/en.wikipedia.org\/wiki\/Representational_state_transfer\">REST<\/a> API, a SQL-like query language (AQL) which supports complex filter conditions and joins, and <a href=\"https:\/\/en.wikipedia.org\/wiki\/ACID\">ACID<\/a> multi-collection \/ single-document transactions. <a href=\"https:\/\/www.arangodb.com\/\">ArangoDB<\/a> also has a dedicated JavaScript framework for data-centric microservices called Foxx with a variety of microservices available (e.g. user- or session-service). From a scalability perspective, <a href=\"https:\/\/www.arangodb.com\/\">ArangoDB<\/a> supports asynchronous master-slave replication and sharded clusters.<\/p>\n<p>Recommendations on when to use: as an in-place replacement for specialized data stores when different data model projections (documents, graphs, key\/values) are required.<\/p>\n<h3><a name=\"orientdb1\"><\/a>10.2. 
OrientDB<\/h3>\n<p><a href=\"http:\/\/orientdb.com\">OrientDB<\/a> is a linearly scalable high-performance operational <a href=\"https:\/\/en.wikipedia.org\/wiki\/NoSQL\">NoSQL<\/a> data store which brings together the power of graphs and the flexibility of documents. The strongest points of <a href=\"http:\/\/orientdb.com\">OrientDB<\/a> are full support of <a href=\"https:\/\/en.wikipedia.org\/wiki\/ACID\">ACID<\/a> transactions, multi-master replication, sharding and use of SQL (with some extensions) to manipulate trees and graphs. The presence of an <a href=\"https:\/\/en.wikipedia.org\/wiki\/Hypertext_Transfer_Protocol\">HTTP<\/a> <a href=\"https:\/\/en.wikipedia.org\/wiki\/Representational_state_transfer\">REST<\/a> API is also worth mentioning.<\/p>\n<p>Recommendations on when to use: as an in-place replacement for specialized data stores when different data model projections (documents, graphs) are required. However, <a href=\"http:\/\/orientdb.com\">OrientDB<\/a> can go beyond that (as <a href=\"http:\/\/orientdb.com\/docs\/2.1\/Use-Cases.html\">OrientDB: Use Cases<\/a> highlights) and could be used for:<\/p>\n<ul>\n<li>Time series<\/li>\n<li>Chat rooms<\/li>\n<li>Queue systems<\/li>\n<\/ul>\n<h3><a name=\"hyperdex\"><\/a>10.3. HyperDex<\/h3>\n<p><a href=\"http:\/\/hyperdex.org\">HyperDex<\/a> is a next-generation distributed key-value and document <a href=\"https:\/\/en.wikipedia.org\/wiki\/NoSQL\">NoSQL<\/a> data store, designed from the ground up with reliability, robustness and performance in mind. It values strong consistency, fault tolerance, scalability, a rich API and ease of use. 
Although <a href=\"http:\/\/hyperdex.org\">HyperDex<\/a> claims to have full <a href=\"https:\/\/en.wikipedia.org\/wiki\/ACID\">ACID<\/a> transaction support, this feature is not a part of the standard <a href=\"http:\/\/hyperdex.org\">HyperDex<\/a> distribution and seems to be part of the commercial offering.<\/p>\n<p>Recommendations on when to use: as an in-place replacement for specialized data stores when different data model projections (documents, key\/value) are required.<\/p>\n<h2><a name=\"specialized\"><\/a>11. Specialized data stores<\/h2>\n<p>In this last section of the article we are going to take a look at some <a href=\"https:\/\/en.wikipedia.org\/wiki\/NoSQL\">NoSQL<\/a> \/ <a href=\"https:\/\/en.wikipedia.org\/wiki\/NewSQL\">NewSQL<\/a> solutions which do not fall into a particular well-defined class or category. They have been designed to address specific use cases faced by modern web applications, driven by responsive and reactive design principles.<\/p>\n<h3><a name=\"rethinkdb2\"><\/a>11.1. RethinkDB<\/h3>\n<p><a href=\"https:\/\/www.rethinkdb.com\">RethinkDB<\/a> is a scalable, distributed, highly-available <a href=\"https:\/\/en.wikipedia.org\/wiki\/NoSQL\">NoSQL<\/a> data store built from the ground up for the real-time web. It inverts the traditional polling architecture, in which data (and changes) are polled by applications from the data store. Instead, it offers a push architecture, in which updated query results are continuously pushed to applications in real time.<\/p>\n<p>To distinguish itself from the data stores which just push raw data changes, <a href=\"https:\/\/www.rethinkdb.com\">RethinkDB<\/a> allows subscribing to changes using an advanced query language that supports joins, sub-queries, and massively parallelized distributed computation. 
From the <a href=\"https:\/\/en.wikipedia.org\/wiki\/CAP_theorem\">CAP theorem<\/a> point of view, <a href=\"https:\/\/www.rethinkdb.com\">RethinkDB<\/a> is a <strong>CP<\/strong> system.<\/p>\n<p>Recommendations on when to use (please see <a href=\"https:\/\/www.rethinkdb.com\/faq\/\">RethinkDB: Use Cases<\/a>):<\/p>\n<ul>\n<li>Collaborative web and mobile apps<\/li>\n<li>Streaming analytics apps<\/li>\n<li>Multiplayer games<\/li>\n<li>Real-time marketplaces<\/li>\n<li>Connected devices<\/li>\n<\/ul>\n<h3><a name=\"crate2\"><\/a>11.2. Crate<\/h3>\n<p><a href=\"https:\/\/crate.io\">Crate<\/a> is a distributed <a href=\"https:\/\/en.wikipedia.org\/wiki\/NewSQL\">NewSQL<\/a> data store. Based on the familiar SQL syntax, it comes with high availability, resiliency, and scalability and allows querying large volumes of data in real time. It offers dynamic cluster resizing, auto-rebalancing \/ auto-sharding and replication, multi-index queries, distributed aggregations and sorting, full-text search, and simple cluster management. Although at the heart of <a href=\"https:\/\/crate.io\">Crate<\/a> lies a distributed query analyzer, planner and execution engine, a large set of its features relies heavily on the built-in functionality of other excellent solutions, <a href=\"https:\/\/lucene.apache.org\/\">Apache Lucene<\/a> and <a href=\"https:\/\/www.elastic.co\/\">Elasticsearch<\/a>.<\/p>\n<p>Recommendations on when to use (please see <a href=\"https:\/\/crate.io\/use-cases\/\">Crate: Use Cases<\/a>):<\/p>\n<ul>\n<li>Aggregation Across Large Data Sets<\/li>\n<li>Data Analytics<\/li>\n<li>Full text search<\/li>\n<li>Geospatial queries<\/li>\n<\/ul>\n<h2><a name=\"conclusions\"><\/a>12. 
Conclusions<\/h2>\n<p>The <a href=\"https:\/\/en.wikipedia.org\/wiki\/NoSQL\">NoSQL<\/a> \/ <a href=\"https:\/\/en.wikipedia.org\/wiki\/NewSQL\">NewSQL<\/a> revolution brought the world a lot of fresh ideas and a huge variety of excellent non-relational data stores to satisfy the needs of modern software systems in managing massive volumes of data. Although with time <a href=\"https:\/\/en.wikipedia.org\/wiki\/NoSQL\">NoSQL<\/a> \/ <a href=\"https:\/\/en.wikipedia.org\/wiki\/NewSQL\">NewSQL<\/a> data stores mature and are gaining more and more popularity, the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Relational_database_management_system\">relational database management systems<\/a> are not going anywhere: they are here to stay, adapting to new challenges. Despite their constraints and endless alternatives, these days <a href=\"https:\/\/www.mysql.com\/\">MySQL<\/a> and <a href=\"http:\/\/www.postgresql.org\/\">PostgreSQL<\/a> are still often the primary choice, successfully managing enormous volumes of data never seen before.<\/p>\n<p>The goal of this article is not to identify the winners and losers but to introduce the many options available. Picking the right <a href=\"https:\/\/en.wikipedia.org\/wiki\/Relational_database_management_system\">relational database management system<\/a> or <a href=\"https:\/\/en.wikipedia.org\/wiki\/NoSQL\">NoSQL<\/a> \/ <a href=\"https:\/\/en.wikipedia.org\/wiki\/NewSQL\">NewSQL<\/a> data store may significantly speed up the development process, help reduce time to market, and address short- or long-term scalability requirements. It is very common these days to have many heterogeneous data stores working side by side to serve different application workloads and data access patterns.<\/p>\n<p>But, be warned: to get the most out of your data store it is worth investing time in understanding its design, constraints, tradeoffs, limitations, and most importantly, data access and durability guarantees. 
Discovering corrupted data or even complete data loss in production because of bugs, edge cases or a pure misunderstanding of the key guarantees of the shiny new data store of your choice is too high a price to pay for following the hottest industry trends. Always make a conscious choice by learning and experimenting, paying particular attention to different failure conditions and heavy workloads.<\/p>\n<p><em>Do you have any other NoSQL or NewSQL systems you would like to suggest? Let us know in the comments!<\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Table Of Contents 1. Introduction 2. Distributed systems: the CAP theorem 3. Relational data stores 3.1. MySQL \/ MariaDB 3.2. PostgreSQL 3.3. Others 4. Why NoSQL? 5. Key\/Value data stores 5.1. DynamoDB 5.2. Memcached 5.3. Redis 5.4. Riak 5.5. Aerospike 5.6. FoundationDB 6. Columnar data stores 6.1. Accumulo 6.2. Cassandra 6.3. HBase 7. 
Graph data &hellip;<\/p>\n","protected":false},"author":141,"featured_media":2386,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[15],"tags":[834,584,850,415,974,1229,112,349,1236,113,657,117,939,244],"class_list":["post-48232","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-software-development","tag-cassandra","tag-couchbase","tag-couchdb","tag-dynamo","tag-hbase","tag-mariadb","tag-mongodb","tag-neo4j","tag-newsql","tag-nosql","tag-postgresql","tag-redis","tag-riak","tag-sql"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.5 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>NoSQL vs. SQL: Choosing a Data Management Solution - Java Code Geeks<\/title>\n<meta name=\"description\" content=\"Table Of Contents 1. Introduction 2. Distributed systems: the CAP theorem 3. Relational data stores 3.1. MySQL \/ MariaDB 3.2. PostgreSQL 3.3. Others 4.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.javacodegeeks.com\/2015\/10\/nosql-vs-sql.html\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"NoSQL vs. SQL: Choosing a Data Management Solution - Java Code Geeks\" \/>\n<meta property=\"og:description\" content=\"Table Of Contents 1. Introduction 2. Distributed systems: the CAP theorem 3. Relational data stores 3.1. MySQL \/ MariaDB 3.2. PostgreSQL 3.3. 
Others 4.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.javacodegeeks.com\/2015\/10\/nosql-vs-sql.html\" \/>\n<meta property=\"og:site_name\" content=\"Java Code Geeks\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/javacodegeeks\" \/>\n<meta property=\"article:published_time\" content=\"2015-10-28T13:21:50+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2023-12-08T07:46:23+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.javacodegeeks.com\/wp-content\/uploads\/2012\/10\/software-development-2-logo.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"150\" \/>\n\t<meta property=\"og:image:height\" content=\"150\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Andrey Redko\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@javacodegeeks\" \/>\n<meta name=\"twitter:site\" content=\"@javacodegeeks\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Andrey Redko\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"23 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/2015\\\/10\\\/nosql-vs-sql.html#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/2015\\\/10\\\/nosql-vs-sql.html\"},\"author\":{\"name\":\"Andrey Redko\",\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/#\\\/schema\\\/person\\\/771a6504862edc45322776832cbce413\"},\"headline\":\"NoSQL vs. 
SQL: Choosing a Data Management Solution\",\"datePublished\":\"2015-10-28T13:21:50+00:00\",\"dateModified\":\"2023-12-08T07:46:23+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/2015\\\/10\\\/nosql-vs-sql.html\"},\"wordCount\":5145,\"commentCount\":6,\"publisher\":{\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/#organization\"},\"image\":{\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/2015\\\/10\\\/nosql-vs-sql.html#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.javacodegeeks.com\\\/wp-content\\\/uploads\\\/2012\\\/10\\\/software-development-2-logo.jpg\",\"keywords\":[\"Cassandra\",\"Couchbase\",\"CouchDB\",\"Dynamo\",\"HBase\",\"MariaDB\",\"MongoDB\",\"Neo4j\",\"NewSQL\",\"NoSQL\",\"PostgreSQL\",\"Redis\",\"Riak\",\"SQL\"],\"articleSection\":[\"Software Development\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/www.javacodegeeks.com\\\/2015\\\/10\\\/nosql-vs-sql.html#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/2015\\\/10\\\/nosql-vs-sql.html\",\"url\":\"https:\\\/\\\/www.javacodegeeks.com\\\/2015\\\/10\\\/nosql-vs-sql.html\",\"name\":\"NoSQL vs. SQL: Choosing a Data Management Solution - Java Code Geeks\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/2015\\\/10\\\/nosql-vs-sql.html#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/2015\\\/10\\\/nosql-vs-sql.html#primaryimage\"},\"thumbnailUrl\":\"https:\\\/\\\/www.javacodegeeks.com\\\/wp-content\\\/uploads\\\/2012\\\/10\\\/software-development-2-logo.jpg\",\"datePublished\":\"2015-10-28T13:21:50+00:00\",\"dateModified\":\"2023-12-08T07:46:23+00:00\",\"description\":\"Table Of Contents 1. Introduction 2. Distributed systems: the CAP theorem 3. Relational data stores 3.1. MySQL \\\/ MariaDB 3.2. PostgreSQL 3.3. 
Others 4.\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/2015\\\/10\\\/nosql-vs-sql.html#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/www.javacodegeeks.com\\\/2015\\\/10\\\/nosql-vs-sql.html\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/2015\\\/10\\\/nosql-vs-sql.html#primaryimage\",\"url\":\"https:\\\/\\\/www.javacodegeeks.com\\\/wp-content\\\/uploads\\\/2012\\\/10\\\/software-development-2-logo.jpg\",\"contentUrl\":\"https:\\\/\\\/www.javacodegeeks.com\\\/wp-content\\\/uploads\\\/2012\\\/10\\\/software-development-2-logo.jpg\",\"width\":150,\"height\":150},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/2015\\\/10\\\/nosql-vs-sql.html#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\\\/\\\/www.javacodegeeks.com\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Software Development\",\"item\":\"https:\\\/\\\/www.javacodegeeks.com\\\/category\\\/software-development\"},{\"@type\":\"ListItem\",\"position\":3,\"name\":\"NoSQL vs. 
SQL: Choosing a Data Management Solution\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/#website\",\"url\":\"https:\\\/\\\/www.javacodegeeks.com\\\/\",\"name\":\"Java Code Geeks\",\"description\":\"Java Developers Resource Center\",\"publisher\":{\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/#organization\"},\"alternateName\":\"JCG\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\\\/\\\/www.javacodegeeks.com\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/#organization\",\"name\":\"Exelixis Media P.C.\",\"url\":\"https:\\\/\\\/www.javacodegeeks.com\\\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/#\\\/schema\\\/logo\\\/image\\\/\",\"url\":\"https:\\\/\\\/www.javacodegeeks.com\\\/wp-content\\\/uploads\\\/2022\\\/06\\\/exelixis-logo.png\",\"contentUrl\":\"https:\\\/\\\/www.javacodegeeks.com\\\/wp-content\\\/uploads\\\/2022\\\/06\\\/exelixis-logo.png\",\"width\":864,\"height\":246,\"caption\":\"Exelixis Media P.C.\"},\"image\":{\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/#\\\/schema\\\/logo\\\/image\\\/\"},\"sameAs\":[\"https:\\\/\\\/www.facebook.com\\\/javacodegeeks\",\"https:\\\/\\\/x.com\\\/javacodegeeks\"]},{\"@type\":\"Person\",\"@id\":\"https:\\\/\\\/www.javacodegeeks.com\\\/#\\\/schema\\\/person\\\/771a6504862edc45322776832cbce413\",\"name\":\"Andrey 
Redko\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/16419ce8394173028eddaeb992859862bab50cfcf74589fa9bb9a3dd8bb27518?s=96&d=mm&r=g\",\"url\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/16419ce8394173028eddaeb992859862bab50cfcf74589fa9bb9a3dd8bb27518?s=96&d=mm&r=g\",\"contentUrl\":\"https:\\\/\\\/secure.gravatar.com\\\/avatar\\\/16419ce8394173028eddaeb992859862bab50cfcf74589fa9bb9a3dd8bb27518?s=96&d=mm&r=g\",\"caption\":\"Andrey Redko\"},\"description\":\"Andriy is a well-grounded software developer with more than 12 years of practical experience using Java\\\/EE, C#\\\/.NET, C++, Groovy, Ruby, functional programming (Scala), databases (MySQL, PostgreSQL, Oracle) and NoSQL solutions (MongoDB, Redis).\",\"sameAs\":[\"http:\\\/\\\/aredko.blogspot.com\\\/\",\"http:\\\/\\\/ca.linkedin.com\\\/in\\\/aredko\"],\"url\":\"https:\\\/\\\/www.javacodegeeks.com\\\/author\\\/andrey-redko\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"NoSQL vs. SQL: Choosing a Data Management Solution - Java Code Geeks","description":"Table Of Contents 1. Introduction 2. Distributed systems: the CAP theorem 3. Relational data stores 3.1. MySQL \/ MariaDB 3.2. PostgreSQL 3.3. 
Others 4.","og_url":"https:\/\/www.javacodegeeks.com\/2015\/10\/nosql-vs-sql.html","og_site_name":"Java Code Geeks","article_publisher":"https:\/\/www.facebook.com\/javacodegeeks","article_published_time":"2015-10-28T13:21:50+00:00","article_modified_time":"2023-12-08T07:46:23+00:00","og_image":[{"width":150,"height":150,"url":"https:\/\/www.javacodegeeks.com\/wp-content\/uploads\/2012\/10\/software-development-2-logo.jpg","type":"image\/jpeg"}],"author":"Andrey Redko","twitter_card":"summary_large_image","twitter_creator":"@javacodegeeks","twitter_site":"@javacodegeeks","twitter_misc":{"Written by":"Andrey Redko","Est. reading time":"23 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.javacodegeeks.com\/2015\/10\/nosql-vs-sql.html#article","isPartOf":{"@id":"https:\/\/www.javacodegeeks.com\/2015\/10\/nosql-vs-sql.html"},"author":{"name":"Andrey Redko","@id":"https:\/\/www.javacodegeeks.com\/#\/schema\/person\/771a6504862edc45322776832cbce413"},"headline":"NoSQL vs. 
SQL: Choosing a Data Management Solution","datePublished":"2015-10-28T13:21:50+00:00","dateModified":"2023-12-08T07:46:23+00:00","mainEntityOfPage":{"@id":"https:\/\/www.javacodegeeks.com\/2015\/10\/nosql-vs-sql.html"},"wordCount":5145,"commentCount":6,"publisher":{"@id":"https:\/\/www.javacodegeeks.com\/#organization"},"image":{"@id":"https:\/\/www.javacodegeeks.com\/2015\/10\/nosql-vs-sql.html#primaryimage"},"thumbnailUrl":"https:\/\/www.javacodegeeks.com\/wp-content\/uploads\/2012\/10\/software-development-2-logo.jpg","keywords":["Cassandra","Couchbase","CouchDB","Dynamo","HBase","MariaDB","MongoDB","Neo4j","NewSQL","NoSQL","PostgreSQL","Redis","Riak","SQL"],"articleSection":["Software Development"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.javacodegeeks.com\/2015\/10\/nosql-vs-sql.html#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.javacodegeeks.com\/2015\/10\/nosql-vs-sql.html","url":"https:\/\/www.javacodegeeks.com\/2015\/10\/nosql-vs-sql.html","name":"NoSQL vs. SQL: Choosing a Data Management Solution - Java Code Geeks","isPartOf":{"@id":"https:\/\/www.javacodegeeks.com\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.javacodegeeks.com\/2015\/10\/nosql-vs-sql.html#primaryimage"},"image":{"@id":"https:\/\/www.javacodegeeks.com\/2015\/10\/nosql-vs-sql.html#primaryimage"},"thumbnailUrl":"https:\/\/www.javacodegeeks.com\/wp-content\/uploads\/2012\/10\/software-development-2-logo.jpg","datePublished":"2015-10-28T13:21:50+00:00","dateModified":"2023-12-08T07:46:23+00:00","description":"Table Of Contents 1. Introduction 2. Distributed systems: the CAP theorem 3. Relational data stores 3.1. MySQL \/ MariaDB 3.2. PostgreSQL 3.3. 
Others 4.","breadcrumb":{"@id":"https:\/\/www.javacodegeeks.com\/2015\/10\/nosql-vs-sql.html#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.javacodegeeks.com\/2015\/10\/nosql-vs-sql.html"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.javacodegeeks.com\/2015\/10\/nosql-vs-sql.html#primaryimage","url":"https:\/\/www.javacodegeeks.com\/wp-content\/uploads\/2012\/10\/software-development-2-logo.jpg","contentUrl":"https:\/\/www.javacodegeeks.com\/wp-content\/uploads\/2012\/10\/software-development-2-logo.jpg","width":150,"height":150},{"@type":"BreadcrumbList","@id":"https:\/\/www.javacodegeeks.com\/2015\/10\/nosql-vs-sql.html#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.javacodegeeks.com\/"},{"@type":"ListItem","position":2,"name":"Software Development","item":"https:\/\/www.javacodegeeks.com\/category\/software-development"},{"@type":"ListItem","position":3,"name":"NoSQL vs. 
SQL: Choosing a Data Management Solution"}]},{"@type":"WebSite","@id":"https:\/\/www.javacodegeeks.com\/#website","url":"https:\/\/www.javacodegeeks.com\/","name":"Java Code Geeks","description":"Java Developers Resource Center","publisher":{"@id":"https:\/\/www.javacodegeeks.com\/#organization"},"alternateName":"JCG","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.javacodegeeks.com\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.javacodegeeks.com\/#organization","name":"Exelixis Media P.C.","url":"https:\/\/www.javacodegeeks.com\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.javacodegeeks.com\/#\/schema\/logo\/image\/","url":"https:\/\/www.javacodegeeks.com\/wp-content\/uploads\/2022\/06\/exelixis-logo.png","contentUrl":"https:\/\/www.javacodegeeks.com\/wp-content\/uploads\/2022\/06\/exelixis-logo.png","width":864,"height":246,"caption":"Exelixis Media P.C."},"image":{"@id":"https:\/\/www.javacodegeeks.com\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/javacodegeeks","https:\/\/x.com\/javacodegeeks"]},{"@type":"Person","@id":"https:\/\/www.javacodegeeks.com\/#\/schema\/person\/771a6504862edc45322776832cbce413","name":"Andrey Redko","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/16419ce8394173028eddaeb992859862bab50cfcf74589fa9bb9a3dd8bb27518?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/16419ce8394173028eddaeb992859862bab50cfcf74589fa9bb9a3dd8bb27518?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/16419ce8394173028eddaeb992859862bab50cfcf74589fa9bb9a3dd8bb27518?s=96&d=mm&r=g","caption":"Andrey Redko"},"description":"Andriy is a well-grounded software developer with more than 12 years of practical experience using Java\/EE, 
C#\/.NET, C++, Groovy, Ruby, functional programming (Scala), databases (MySQL, PostgreSQL, Oracle) and NoSQL solutions (MongoDB, Redis).","sameAs":["http:\/\/aredko.blogspot.com\/","http:\/\/ca.linkedin.com\/in\/aredko"],"url":"https:\/\/www.javacodegeeks.com\/author\/andrey-redko"}]}},"_links":{"self":[{"href":"https:\/\/www.javacodegeeks.com\/wp-json\/wp\/v2\/posts\/48232","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.javacodegeeks.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.javacodegeeks.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.javacodegeeks.com\/wp-json\/wp\/v2\/users\/141"}],"replies":[{"embeddable":true,"href":"https:\/\/www.javacodegeeks.com\/wp-json\/wp\/v2\/comments?post=48232"}],"version-history":[{"count":0,"href":"https:\/\/www.javacodegeeks.com\/wp-json\/wp\/v2\/posts\/48232\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.javacodegeeks.com\/wp-json\/wp\/v2\/media\/2386"}],"wp:attachment":[{"href":"https:\/\/www.javacodegeeks.com\/wp-json\/wp\/v2\/media?parent=48232"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.javacodegeeks.com\/wp-json\/wp\/v2\/categories?post=48232"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.javacodegeeks.com\/wp-json\/wp\/v2\/tags?post=48232"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}