NoSQL
NoSQL
NoSQL means?
No + SQL?
No relational DBMS
NonRel, suggested alternative name
Not only SQL
Try to keep the relation with the RDMS
Now, NoSQL stands for
Db or data storage
Do not follow conventional RDBMS
2 Advanced Database system principle
History
Long long ago, there is . . .
With the first batch of computers
Hierarchical catalog:
needs storage authorization and authentication
Flourished in the era of
Parallelized computation (DBMS)
Distributed computation (DBMS)
Reborn in Web scale applications
AliBaba, FB, Weibo
3 Advanced Database system principle
Challenges to RDBMS (1)
Do you know the first search engine?
Archie: 1990
McGill University, 3 students
Search a file by filename
First one to index internet ftp files
Still not a real search engine
Veronica (Gopher)
Gopher, information search system
By Nevada System Computing Services University, 1993
Might be the first search engine
Inktomi
Could be seen as the second one
Yahoo later use this search engine
4 Advanced Database system principle
Challenges to RDBMS (2)
RDBMS is challenged a lot!
From Inktomi to Google
RDBMS suppose
data attributes are well defined
in a good structure
has relations among data (entity)
index could be built to speedup queries
allows some irregular or missing structure
However, is hard to handle large-scale, sparse data
5 Advanced Database system principle
Challenges to RDBMS (3)
To cope with challenges, RDBMS could
Denormalization
Remove constraints
Relax requirements on transactions
But, after that, RDBMS is more like a NoSQL
6 Advanced Database system principle
NoSQL & Hadoop
NoSQL is proposed, which could
Reduce the difficulty to handle big, sparse data
But, without the ability:
guarantee transaction integrity
flexible index & query
The most missing property of NoSQL is
SQL !!!
Google built and published many works
The first open source software is released
By the founder of Lucene - open source search engine
2003, core group of Lucene joined Yahoo!
Then, Hadoop is born! Core person Doug Cutting
7 Advanced Database system principle
Content
Types of NoSQL DB
Column-oriented DB
K-value Store
Document DBs
Map-reduce
8 Advanced Database system principle
Types & Produces of NoSQL DB
Key-value
Graph database
Document-oriented
Column-oriented
9 Advanced Database system principle
NoSQL databases key-value
Name Producer Data model Querying
SimpleDB Amazon set of couples (key, {attribute}), restricted SQL; select, delete,
where attribute is a couple GetAttributes, and
(name, value) PutAttributes operations
Redis Salvatore set of couples (key, value), primitive operations for each
Sanfilippo where value is simple typed value type
value, list, ordered (according
to ranking) or unordered set,
hash value
Dynamo Amazon like SimpleDB simple get operation and put
in a context
Voldemort LinkeId like SimpleDB similar to Dynamo
10 Advanced Database system principle
NoSQL databases Column-oriented
Name Producer Data model Querying
BigTable Google set of couples (key, {value}) selection (by combination of
(in doubt) row, column, and time stamp
ranges)
HBase Apache groups of columns (a BigTable JRUBY IRB-based shell
clone) (similar to SQL)
Hypertable Hypertable like BigTable HQL (Hypertext Query
Language)
CASSANDRA Apache columns, groups of columns simple selections on key,
(originally corresponding to a key range queries, column or
Facebook) (supercolumns) columns ranges
PNUTS Yahoo (hashed or ordered) tables, selection and projection from a
typed arrays, flexible schema single table (retrieve an
arbitrary single record by
primary key, range queries,
complex predicates, ordering,
top-k)
11 Advanced Database system principle
NoSQL databases document-based
Name Producer Data model Querying
MongoDB 10gen object-structured manipulations with objects in
documents stored in collections (find object or
collections; objects via simple selections
each object has a primary and logical expressions,
key called ObjectId delete, update,)
Couchbase Couchbase1 document as a list of by key and key range, views
named (structured) items via Javascript and
(JSON document) MapReduce
1after merging Membase and CouchOne
12 Advanced Database system principle
Content
Types of NoSQL DB
Column-oriented DB
K-value Store
Document DBs
Map-reduce
13 Advanced Database system principle
Column-oriented DB
A table like this: (example from wiki)
EmpId Lastname Firstname Salary
1 Smith Joe 40000
2 Jones Mary 50000
3 Johnson Cathy 44000
14 Advanced Database system principle
Column-oriented DB
Row-oriented DB (called row-stores)
stores by row, as
1,Smith,Joe,40000;
2,Jones,Mary,50000;
3,Johnson,Cathy,44000;
Column-oriented DB (column-stores)
Stores by column, as
1,2,3;
Smith,Jones,Johnson;
Joe,Mary,Cathy;
40000,50000,44000;
15 Advanced Database system principle
Column-oriented DB
Advantage, or fits for
Aggregates by values of certain columns
Frequently modify some columns
Can cope with temporal data
Do not need to define schema in advance
Column store cares more about disk storage
But not the upper level data model (column family)
16 Advanced Database system principle
Column-oriented DB
Frequently modify some columns
A table in RDBMS like
EmpId name Salary
1 Smith Joe 40000
2 Jones Mary 50000
3 Johnson Cathy 44000
Having millions of records, then split
name -> first name, last name
Then have to alter table structure, which is unacceptable
17 Advanced Database system principle
Column-oriented DB
Can cope with temporal data
Lastname Firstname street zip habit address
Consider different timestamp
Hard to handle in row-stores
Easy to handle in column-stores
Organized in column, with a time attribute
18 Advanced Database system principle
Column-oriented DB
Do not need to define schema in advance
Lastname Firstname street zip habit address
Row-stores needs to pre-define schema
Column-stores do not
Pre-defines column family
A column family contains any # of columns
Example can be:
3 Column family: {name}, {address}, {preferences}
19 Advanced Database system principle
Column-oriented DB
Friendly to sparse, mutated data
Only valid column value can be stored
Null will not be stored
Design is a scalable one
Single table is stored at multiple sites
could be easily extended to have millions, billions of columns
20 Advanced Database system principle
Column-oriented DB
With column family, column, timestamp, a record:
{
“row_key_i”:{
“name”:{
“firstname”: {
1:“Mary”,
5: “Angel”
}
}
…
}
21 Advanced Database system principle
}
Column-oriented DB
Compression
join
22 Advanced Database system principle
Benefits of Compression
Reduces the size of Data
Improve I/O Performance
Reduce seek time:
data are stored nearer to each other.
Reduce transfer time:
as there is less data
Increase buffer hit rate:
buffer can hold larger fraction of data
23 Advanced Database system principle
High Compressibility
Each attribute is stored in a separate column.
Not only use traditional compression techniques
Dictionary encoding, Huffman Encoding, etc
But also use column-oriented techniques
Run-Length Encoding
24 Advanced Database system principle
Run-Length Encoding
Principles:
A scanned row, compress same pixel with the color and its counter.
Example:
aaabccccccddeee, is compressed to
3a1b6c2d3e
Good for compressing images
with large, simple colored area
Various algorithms are proposed based on this
25 Advanced Database system principle
How to Query compressed Column?
De-compress the data
Query the compressed column directly?
For Run-Length Encoding
26 Advanced Database system principle
A Simple Example
In column C1, the value “42” appears 1000 times
consecutively.
Assume C1 is compressed by Run-Length Encoding
A simple query: SUM(C1), can be directly got
==42 * 1000
Match a key
Extract its occurence
27 Advanced Database system principle
Compression-Aware Optimization
Natural Join
One input column is compressed by RLE
Another input column is uncompressed
Do the join directly
Reduce the number of operations by a factor of k, where k is the
run-length of the RLE triple.
Count
28 Advanced Database system principle
29 Advanced Database system principle
Content
Types of NoSQL DB
Column-oriented DB
K-value Store
Document DBs
Map-reduce
30 Advanced Database system principle
Key-Value Store
Key-Value Database
queries faster
store large volumes of data
support high-concurrency
very suitable for query by primary key
Complex query conditions
Real-Time Search Engine
31 Advanced Database system principle
Amazon dynamo
32 Advanced Database system principle
Oracle NoSQL Database
distributed key/value pair storage
data is stored as key-value pairs
which are written to particular storage nodes
based on the hashed value of the primary key
33 Advanced Database system principle
Oracle NoSQL Database
34 Advanced Database system principle
Key-value Store
A collection of Storage Nodes which host a set of Replication
Nodes.
Given a traditional three-tier web architecture
Takes the place of the back-end database
Runs alongside it
35 Advanced Database system principle
Key-value Store
36 Advanced Database system principle
Content
Types of NoSQL DB
Column-oriented DB
K-value Store
Document DBs
Map-reduce
37 Advanced Database system principle
An example of Document store
To create a db for location-based services
Create db prefs
Create collection: location
Codes:
use prefs
w = {name: “John Doe”, zip: 10001};
x = {name: “Don Doe”, zip: 10001};
db.location.save(w);
db.location.save(x);
38 Advanced Database system principle
An example of Document store
use prefs : make prefs as the current db
But db is not created yet
Collection is never created
Only after db.location.save(w);
Query all records:
Code: db.location.find();
results:
{“_id”:ObjectedId(“45dfaft3daf…24B”), “name”: “John Doe”, “zip”:
10001};
…
39 Advanced Database system principle
An example of Document store
Query by zip:
Code: db.location.find({zip:10001});
results:
{“_id”:ObjectedId(“45dfaft3daf…24B”), “name”: “John Doe”, “zip”:
10001};
{“_id”:ObjectedId(“4cdfsf33daf…24B”), “name”: “Don Doe”, “zip”:
10001};
40 Advanced Database system principle
An example of Document store
ObjectedId uniquely identify a document
Stored in _id
Both are maintained by developers
41 Advanced Database system principle
Document store
A MongoDB deployment hosts a number of databases.
A database holds a set of collections.
A collection holds a set of documents.
A document is a set of key-value pairs.
The document structure in MongoDB are BSON objects with
support for the full range of BSON types
42 Advanced Database system principle
BSON
BSON: Binary Serialized Document Format
A kind of storage format
Like JSON: support document/array object
Difference: support data type: Date,BinData
Advantage: schema-less, flexible
Disadvantage: space waste
Example: {"hello":"world"}
Key name: “hello”
cstring::= (byte*) "/x00"
*: 0-> n, byte /x00: end of key
43 Advanced Database system principle
Term mapping
44 Advanced Database system principle
Schema design
MongoDB: embedding and link
Embedding:
the nesting of objects and arrays,
inside a BSON document(prejoined).
Links:
references between documents(client-side follow-up query).
"contains" relationships, one to many; duplication of data, many
to many
45 Advanced Database system principle
Schema design
46 Advanced Database system principle
Schema design
47 Advanced Database system principle
Content
Types of NoSQL DB
Column-oriented DB
K-value Store
Document DBs
Map-reduce
48 Advanced Database system principle
Map Reduce
Input & Output: a set of key/value pairs
Programmer specifies two functions:
map (in_key, in_value) -> list(out_key, intermediate_value)
Processes input key/value pair
Produces set of intermediate pairs
reduce (out_key, list(intermediate_value)) -> list(out_value)
Combines all intermediate values for a particular key
Produces a set of merged output values (usually just one)
49 Advanced Database system principle
Example: Count word occurrences
map(String input_key, String input_value):
// input_key: document name
// input_value: document contents
for each word w in input_value:
EmitIntermediate(w, "1");
reduce(String output_key, Iterator intermediate_values):
// output_key: a word
// output_values: a list of counts
int result = 0;
for each v in intermediate_values:
result += ParseInt(v);
Emit(AsString(result));
50 Advanced Database system principle
Implementation Environment- Google
Large clusters of commodity PCs connected together with
switched Ethernet
100s/1000s of 2-CPU x86 machines, 2-4 GB of memory
Commodity networking hardware: 100 MB/s or 1GB/s.
Storage is on local IDE disks
GFS: distributed file system manages data (SOSP'03)
Job scheduling system: jobs made up of tasks, scheduler assigns tasks
to machines
Implementation is a C++ library linked into user programs
51 Advanced Database system principle
Distributed Execution Overview
User
Program
fork fork fork
assign Master
assign
map
reduce
Input Data Worker
write Output
local Worker File 0
Split 0 read write
Split 1 Worker
Split 2 Output
Worker File 1
Worker remote
read,
sort
52 Advanced Database system principle
Dynamo:
Amazon’s Highly Available Key-value Store
53 Advanced Database system principle
Existing storage techniques
A number of
Famous one is simple DB
Full name:
Amazon Simple Storage Service
Known as Amazon S3
In Amazon, many applications only need primary-key access
Best seller list
Shopping carts
Customer preferences
Session management …
54 Advanced Database system principle
Dynamo
Provides:
Good scalability, high availability
Through a synthesis of:
Partitioning, replicating (consistent hashing)
Consistency (versioning)
Consistency (quorum-like protocol)
Failure detection (gossip based algorithm)
…
A decentralized one
Storage nodes can be added, removed without manual
partitioning or redistribution.
55 Advanced Database system principle
System Assumption and Requirements
Query model:
Simple read and write to a data item
Item is uniquely identified by a key
No operations span multiple data items
ACID (atomicity, consistency, isolation, durability) properties:
Weak consistency, as it results in high availability
Permits only single key updates
56 Advanced Database system principle
Service Level Agreements (SLA)
To provide functionality in a bounded time:
Most of systems measuring SLA, in terms of
Mean, median value
Amazon performs 99.9% of the distribution:
To provide a better experience
57 Advanced Database system principle
Design Considerations
Strong consistency cause low availability:
Most data replication algorithms synchronize data replicas to
provide strong consistency
Indicates program have to wait, until a correct answer could be
made to a query
Therefore, cause low availability
Dynamo is designed to be eventually consistent, i.e., all
updates reach all replicas eventually.
58 Advanced Database system principle
Design Considerations
Need to consider:
When to process update conflicts
Who to perform the process
When?
Conflicts should be resolved during either reads or writes
If resolve during writes, read is simple, but writes may be
rejected, within a given time
Dynamo tries to provide “always writeable” data store, process
it during reads
Shop cart: over 10m request, 3m checkout per day
Writes are never rejected
59 Advanced Database system principle
Design Considerations
Who?
Done by db store or applications
If done by db store
Limited choices (simple), such as
Last write wins
If done by application
Can merge different versions of conflicts, return a single unified
result
Shop cart
Could have other strategies
60 Advanced Database system principle
Design Considerations
Incremental scalability
Scale out one storage node, with minimal impact on the system
symmetry
Each node should have the same set of responsibilities as its
peers
No distinguishable node is allowed
Easy for maintenance
Decentralization
Centralization control may cause outage
Need a peer-to-peer decentralized control
61 Advanced Database system principle
Other Issues
All nodes are trusted
Do not support hierarchical namespaces
Nor complex schema
Latency sensitive system
99.9% reads, writes operations
Finished within 100 milliseconds
To meet latency requirement
Avoid routing requests through multiple nodes
Multiple hop routing increase response time
Dynamo is 0-hop, each node maintains enough routing info, route a
request to appropriate node directly.
62 Advanced Database system principle
System architecture
Should be scalable and robust
Have following components:
Load balancing
Membership and failure detection
Replica synchronization
Overload handling
State transfer
Concurrency and job scheduling
Request marshalling
Request routing
…
63 Advanced Database system principle
System interface
Get() and put()
Get(key)
Locates the object (or replicas) associated with the key in the
storage system
Returns a list of objects with conflicting versions, along with a
context.
Context
Encodes system metadata about the object
Opaque to caller
Includes info such as versions
Stores together with the objects
64 Advanced Database system principle
System interface
put(key, context, object)
Determines where the replicas of the object should be placed
Based on the key value
Writes the replicas to disk
65 Advanced Database system principle
Partitioning Algorithm
For incremental scalability
Need to partition data dynamically
Consistent hashing:
to partition data over all available data nodes
66 Advanced Database system principle
Consistent Hashing
Out range of hash function:
a fixed ring
i.e., the largest hash value wraps around
the smallest one
Each node -> a random value, indicate its
position
Item is hashed (by key)
get a position on the ring
Then it walks along the ring clockwise, to
find the first node with position larger
than the item’s position
67 Advanced Database system principle
Advantage of Consistent Hashing
Each node is responsible for the region in the ring between it
and its predecessor node
Departure or arrival of node
Only affects its immediate neighbors
Other nodes remain unaffected
Challenges:
The random assigned value, may cause
non-uniform data, non-uniform load distribution
Resolution
mapping a node to multiple points, instead of one
To this end, use virtual nodes
A node corresponds to multiple virtual nodes
68 Advanced Database system principle
Strategies adopted by Dynamo
Strategy 1: T random tokens per node and partition by token
value
Each node is assigned with T tokens (position in the ring)
Chose uniformly at random in the ring
Tokens are ordered by their values in the ring
Every two consecutive tokens form a range
The last token and
the first token form
the wrapping range
A,B,C, 3 unique
preferred nodes for k1
69 Advanced Database system principle
Strategies adopted by Dynamo
Strategy 2: T random tokens per node and equal-sized
partitions:
The hash space is divided into Q equally sized ranges,
Q>> S*T, Q>>N
S: # of nodes in the system
Tokens are used to build function, mapping the hash space to
the ordered list of nodes.
70 Advanced Database system principle
Strategies adopted by Dynamo
Strategy 3: Q/S tokens per node, equal-sized partitions:
Similar to S2, the hash space is divided into Q equally sized
ranges,
Each node is assigned Q/S tokens
A node leaves system, its tokens are randomly distributed to the
remaining nodes such that the properties keeps.
71 Advanced Database system principle
End of Chapter