Notes NoSQL Module 2 Leason 6

Uploaded by

shobhacr24

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

32 views3 pages

Notes NoSQL Module 2 Leason 6

Uploaded by

shobhacr24

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as DOCX, PDF, TXT or read online on Scribd

Chapter 6.

Version Stamps

Many critics of NoSQL databases focus on the lack of support for transactions. Transactions are a useful
tool that helps programmers support consistency. One reason why many NoSQL proponents worry less
about a lack of transactions is that aggregate-oriented NoSQL databases do support atomic updates
within an aggregate—and aggregates are designed so that their data forms a natural unit of update.
That said, it’s true that transactional needs are something to take into account when you decide what
database to use.

As part of this, it’s important to remember that transactions have limitations. Even within a
transactional system we still have to deal with updates that require human intervention and usually
cannot be run within transactions because they would involve holding a transaction open for too long.
We can cope with these using version stamps—which turn out to be handy in other situations as well,
particularly as we move away from the single-server distribution model.

6.1. Business and System Transactions

The need to support update consistency without transactions is actually a common feature of systems
even when they are built on top of transactional databases. When users think about transactions, they
usually mean business transactions. A business transaction may be something like browsing a product
catalog, choosing a bottle of Talisker at a good price, filling in credit card information, and confirming
the order. Yet all of this usually won’t occur within the system transaction provided by the database
because this would mean locking the database elements while the user is trying to find their credit card
and gets called off to lunch by their colleagues.

Usually applications only begin a system transaction at the end of the interaction with the user, so that
the locks are only held for a short period of time. The problem, however, is that calculations and
decisions may have been made based on data that’s changed. The price list may have updated the price
of the Talisker, or someone may have updated the customer’s address, changing the shipping charges.

Some databases provide a similar mechanism of conditional update that allows you to ensure updates
won’t be based on stale data. You can do this check yourself, although you then have to ensure no other
thread can run against the resource between your read and your update. (Sometimes this is called a
compare-and-set (CAS) operation, whose name comes from the CAS operations done in processors. The
difference is that a processor CAS compares a value before setting it, while a database conditional
update compares a version stamp of the value.)

There are various ways you can construct your version stamps. You can use a counter, always
incrementing it when you update the resource. Counters are useful since they make it easy to tell if one
version is more recent than another. On the other hand, they require the server to generate the counter
value, and also need a single master to ensure the counters aren’t duplicated.

Another approach is to create a GUID, a large random number that’s guaranteed to be unique. These
use some combination of dates, hardware information, and whatever other sources of randomness they
can pick up. The nice thing about GUIDs is that they can be generated by anyone and you’ll never get a
duplicate; a disadvantage is that they are large and can’t be compared directly for recentness.
A third approach is to make a hash of the contents of the resource. With a big enough hash key size, a
content hash can be globally unique like a GUID and can also be generated by anyone; the advantage is
that they are deterministic—any node will generate the same content hash for same resource data.
However, like GUIDs they can’t be directly compared for recentness, and they can be lengthy.

A fourth approach is to use the timestamp of the last update. Like counters, they are reasonably short
and can be directly compared for recentness, yet have the advantage of not needing a single master.
Multiple machines can generate timestamps—but to work properly, their clocks have to be kept in sync.
One node with a bad clock can cause all sorts of data corruptions. There’s also a danger that if the
timestamp is too granular you can get duplicates—it’s no good using timestamps of a millisecond
precision if you get many updates per millisecond.

6.2. Version Stamps on Multiple Nodes

The basic version stamp works well when you have a single authoritative source for data, such as a
single server or master-slave replication. In that case the version stamp is controlled by the master. Any
slaves follow the master’s stamps. But this system has to be enhanced in a peer-to-peer distribution
model because there’s no longer a single place to set the version stamps.

If you’re asking two nodes for some data, you run into the chance that they may give you different
answers. If this happens, your reaction may vary depending on the cause of that difference. It may be
that an update has only reached one node but not the other, in which case you can accept the latest
(assuming you can tell which one that is). Alternatively, you may have run into an inconsistent update, in
which case you need to decide how to deal with that. In this situation, a simple GUID or etag won’t
suffice, since these don’t tell you enough about the relationships.

The simplest form of version stamp is a counter. Each time a node updates the data, it increments the
counter and puts the value of the counter into the version stamp. If you have blue and green slave
replicas of a single master, and the blue node answers with a version stamp of 4 and the green node
with 6, you know that the green’s answer is more recent.

In multiple-master cases, we need something fancier. One approach, used by distributed version control
systems, is to ensure that all nodes contain a history of version stamps. That way you can see if the blue
node’s answer is an ancestor of the green’s answer. This would either require the clients to hold onto
version stamp histories, or the server nodes to keep version stamp histories and include them when
asked for data. This also detects an inconsistency, which we would see if we get two version stamps and
neither of them has the other in their histories. Although version control systems keep these kinds of
histories, they aren’t found in NoSQL databases.

A simple but problematic approach is to use timestamps. The main problem here is that it’s usually
difficult to ensure that all the nodes have a consistent notion of time, particularly if updates can happen
rapidly. Should a node’s clock get out of sync, it can cause all sorts of trouble. In addition, you can’t
detect write-write conflicts with timestamps, so it would only work well for the single master case—and
then a counter is usually better.

The most common approach used by peer-to-peer NoSQL systems is a special form of version stamp
which we call a vector stamp. In essence, a vector stamp is a set of counters, one for each node. A vector
stamp for three nodes (blue, green, black) would look something like [blue: 43, green: 54, black: 12].
Each time a node has an internal update, it updates its own counter, so an update in the green node
would change the vector to [blue: 43, green: 55, black: 12]. Whenever two nodes communicate, they
synchronize their vector stamps. There are several variations of exactly how this synchronization is
done. We’re coining the term “vector stamp” as a general term in this book; you’ll also come across
vector clocks and version vectors—these are specific forms of vector stamps that differ in how they
synchronize.

By using this scheme you can tell if one version stamp is newer than another because the newer stamp
will have all its counters greater than or equal to those in the older stamp. So [blue: 1, green: 2, black: 5]
is newer than [blue:1, green: 1, black 5] since one of its counters is greater. If both stamps have a
counter greater than the other, e.g. [blue: 1, green: 2, black: 5] and [blue: 2, green: 1, black: 5], then you
have a write-write conflict. There may be missing values in the vector, in which case we use treat the
missing value as 0. So [blue: 6, black: 2] would be treated as [blue: 6, green: 0, black: 2]. This allows you
to easily add new nodes without invalidating the existing vector stamps.

6.3. Key Points

• Version stamps help you detect concurrency conflicts. When you read data, then update it, you can
check the version stamp to ensure nobody updated the data between your read and write.

• Version stamps can be implemented using counters, GUIDs, content hashes, timestamps, or a
combination of these.

• With distributed systems, a vector of version stamps allows you to detect when different nodes have
conflicting updates.

Version Stamps in NoSQL Databases
No ratings yet
Version Stamps in NoSQL Databases
5 pages
4 - Key-Value Stores
No ratings yet
4 - Key-Value Stores
47 pages
Unit-6 Transactions & Replications Syllabus: Introduction, System Model and Group Communication, Concurrency Control in Distributed
No ratings yet
Unit-6 Transactions & Replications Syllabus: Introduction, System Model and Group Communication, Concurrency Control in Distributed
20 pages
Explain The Update Consistency - Update (Write-Write Conflict), Read (Read-Write Conflict) With An Example and A Neat Diagram
No ratings yet
Explain The Update Consistency - Update (Write-Write Conflict), Read (Read-Write Conflict) With An Example and A Neat Diagram
6 pages
Chap. 6 Consistency & Replication: Distributed Systems
No ratings yet
Chap. 6 Consistency & Replication: Distributed Systems
31 pages
Big Data Analytics Module 2 Continued....
No ratings yet
Big Data Analytics Module 2 Continued....
56 pages
Nosql Module 2
100% (1)
Nosql Module 2
87 pages
Amazon Dynamo: A Key-Value Store Overview
No ratings yet
Amazon Dynamo: A Key-Value Store Overview
23 pages
No SQL
No ratings yet
No SQL
12 pages
CT113H Lecture 4 - Key-Value Stores
No ratings yet
CT113H Lecture 4 - Key-Value Stores
55 pages
CT113H Lecture 4 - Key-Value Stores
No ratings yet
CT113H Lecture 4 - Key-Value Stores
55 pages
Consistency in Distributed Systems
No ratings yet
Consistency in Distributed Systems
21 pages
Module-2 NOSQL
No ratings yet
Module-2 NOSQL
5 pages
Replication
No ratings yet
Replication
11 pages
07 Replication
No ratings yet
07 Replication
14 pages
IAT-I Question Paper With Solution of 18CS823 Nosql Database May-2021-Poonam Tijare
100% (1)
IAT-I Question Paper With Solution of 18CS823 Nosql Database May-2021-Poonam Tijare
12 pages
Dynamo: Amazon's Highly Available Key-Value Store
No ratings yet
Dynamo: Amazon's Highly Available Key-Value Store
21 pages
Dynamo
No ratings yet
Dynamo
19 pages
Dos 6
No ratings yet
Dos 6
22 pages
Replication Module7 Selfstudy
No ratings yet
Replication Module7 Selfstudy
32 pages
DS CH6 - Consistency and Replication
No ratings yet
DS CH6 - Consistency and Replication
18 pages
Notes NoSQL Module 2 Leason 5
No ratings yet
Notes NoSQL Module 2 Leason 5
6 pages
Understanding Distributed File Systems
No ratings yet
Understanding Distributed File Systems
10 pages
Nosql 1
No ratings yet
Nosql 1
40 pages
An Assignment On Maintaining Cache Consistency: Group - 6 Roll: 160103, 160109 and 160137
No ratings yet
An Assignment On Maintaining Cache Consistency: Group - 6 Roll: 160103, 160109 and 160137
4 pages
Consistency
No ratings yet
Consistency
42 pages
Consistency and Rep Contd
No ratings yet
Consistency and Rep Contd
28 pages
Managing Replicated Objects: Deterministic Thread Scheduling
No ratings yet
Managing Replicated Objects: Deterministic Thread Scheduling
12 pages
Intro To DS Chapter 5
No ratings yet
Intro To DS Chapter 5
76 pages
Distributed Systems Overview
No ratings yet
Distributed Systems Overview
48 pages
4.3 Amazon Dynamodb
No ratings yet
4.3 Amazon Dynamodb
16 pages
NoSQL Data Management Techniques
No ratings yet
NoSQL Data Management Techniques
27 pages
On Object and Database
No ratings yet
On Object and Database
15 pages
Dos
No ratings yet
Dos
86 pages
Key-Value Databases Explained
No ratings yet
Key-Value Databases Explained
75 pages
Lecture 27
No ratings yet
Lecture 27
19 pages
Distributed File Systems Guide
No ratings yet
Distributed File Systems Guide
16 pages
WINSEM2012-13 CP0029 06-Mar-2013 RM01 DFT 2
No ratings yet
WINSEM2012-13 CP0029 06-Mar-2013 RM01 DFT 2
46 pages
Replication and Consistency in Distributed Systems (Cont'd)
No ratings yet
Replication and Consistency in Distributed Systems (Cont'd)
17 pages
DS Chapter V7replication
No ratings yet
DS Chapter V7replication
33 pages
Lecture 1
No ratings yet
Lecture 1
38 pages
Differences in Distributed Computing
No ratings yet
Differences in Distributed Computing
14 pages
3 Key Value
No ratings yet
3 Key Value
32 pages
Slides
No ratings yet
Slides
31 pages
Understanding Distributed File Systems
No ratings yet
Understanding Distributed File Systems
42 pages
Hashing Techniques in Distributed Systems
No ratings yet
Hashing Techniques in Distributed Systems
38 pages
Multiversion Schemes
100% (1)
Multiversion Schemes
2 pages
Distributed Systems: Naming & Consistency
No ratings yet
Distributed Systems: Naming & Consistency
65 pages
Operating System
No ratings yet
Operating System
14 pages
Nosql Data Management
No ratings yet
Nosql Data Management
13 pages
Notes On Memcached
No ratings yet
Notes On Memcached
5 pages
3 MultiVersion Protocol - STD
No ratings yet
3 MultiVersion Protocol - STD
7 pages
Active vs Passive Replication in Systems
No ratings yet
Active vs Passive Replication in Systems
20 pages
Consistency & Replication in Distributed Systems
No ratings yet
Consistency & Replication in Distributed Systems
57 pages
3distributed File System
No ratings yet
3distributed File System
42 pages
SF Topics List
No ratings yet
SF Topics List
3 pages
Fax Error Codes and Remedies Guide
No ratings yet
Fax Error Codes and Remedies Guide
18 pages
Lab Assignment 5: Task-01
No ratings yet
Lab Assignment 5: Task-01
9 pages
Aix Interview Qust
100% (2)
Aix Interview Qust
21 pages
Delphi Tutorial
No ratings yet
Delphi Tutorial
84 pages
Xiaomi Camera Plugin Initialization Log
No ratings yet
Xiaomi Camera Plugin Initialization Log
204 pages
Application Update Using The USB Device Firmware Upgrade Class
No ratings yet
Application Update Using The USB Device Firmware Upgrade Class
33 pages
Formats: PSD AI Indd Jpeg PNG Gif PDF Tiff Eps SVG
No ratings yet
Formats: PSD AI Indd Jpeg PNG Gif PDF Tiff Eps SVG
2 pages
Database Design Document LocADoc
No ratings yet
Database Design Document LocADoc
12 pages
Anton Chuvakin SANS GIAC GCFA Certification Document
No ratings yet
Anton Chuvakin SANS GIAC GCFA Certification Document
68 pages
Hitachi AMS 2100 - Datasheet
No ratings yet
Hitachi AMS 2100 - Datasheet
306 pages
JNTUA DBMS Unit5 QuestionBank
No ratings yet
JNTUA DBMS Unit5 QuestionBank
1 page
Everdrive-64 X7: User Manual
No ratings yet
Everdrive-64 X7: User Manual
12 pages
Practical File - Sagar Singh-1
No ratings yet
Practical File - Sagar Singh-1
36 pages
Set Up OpenLDAP Server On Centos
No ratings yet
Set Up OpenLDAP Server On Centos
15 pages
Overview of Computer Memory Types
No ratings yet
Overview of Computer Memory Types
42 pages
New Text Document
No ratings yet
New Text Document
11 pages
2021 FWC ICT 6th Term Paper
No ratings yet
2021 FWC ICT 6th Term Paper
27 pages
Samsung FLASH Product Guide
No ratings yet
Samsung FLASH Product Guide
11 pages
Unit 5 Database Quiz Questions
No ratings yet
Unit 5 Database Quiz Questions
3 pages
Terms of Reference: Database Administrator
No ratings yet
Terms of Reference: Database Administrator
2 pages
Database Guide for IT Professionals
No ratings yet
Database Guide for IT Professionals
4 pages
Maxdb Database Backup and Recovery White Paper
No ratings yet
Maxdb Database Backup and Recovery White Paper
44 pages
Computer Science Test
No ratings yet
Computer Science Test
16 pages
MySQL Sports Club DBMS Project
No ratings yet
MySQL Sports Club DBMS Project
5 pages
DriveWire Manual
No ratings yet
DriveWire Manual
37 pages
ICT208 Algorithms and Data Structures
No ratings yet
ICT208 Algorithms and Data Structures
11 pages
Mazatrol Manual PDF
100% (2)
Mazatrol Manual PDF
41 pages
Linked Lists: Suppose We Have A Long List of Integers
No ratings yet
Linked Lists: Suppose We Have A Long List of Integers
9 pages
NYC Data Science Bootcamp Curriculum
No ratings yet
NYC Data Science Bootcamp Curriculum
8 pages

Notes NoSQL Module 2 Leason 6

Uploaded by

Notes NoSQL Module 2 Leason 6

Uploaded by

Chapter 6.

6.1. Business and System Transactions

6.2. Version Stamps on Multiple Nodes

6.3. Key Points

You might also like