Introduction to Replication and
Replica Sets
Norberto Leite
Senior Solutions Architect
Agenda
• Replica Sets Lifecycle
• Developing with Replica Sets
• Operational Considerations
Why Replication?
• How many have faced node failures?
• How many have been woken up from sleep to do a failover?
• How many have experienced issues due to network latency?
• Different uses for data
– Normal processing
– Simple analytics
Replica Set Lifecycle
Node 1 Node 2
Node 3
Replica Set – Creation
Node 1 Node 2
Secondary Secondary
Heartbeat
Replication
Replication
Node 3
Primary
Replica Set – Initialize
Primary Election
Node 1 Node 2
Secondary Heartbeat Secondary
Node 3
Replica Set – Failure
Replication
Node 1 Node 2
Secondary Primary
Heartbeat
Node 3
Replica Set – Failover
Replication
Node 1 Node 2
Secondary Primary
Heartbeat
Replication
Node 3
Recovery
Replica Set – Recovery
Replication
Node 1 Node 2
Secondary Primary
Heartbeat
Replication
Node 3
Secondary
Replica Set – Recovered
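The failure and failover slides rest on one rule: a new primary can be elected only while a majority of the voting members can still reach each other. A minimal sketch of that arithmetic in plain JavaScript; `canElectPrimary` is a hypothetical helper for illustration, not part of MongoDB.

```javascript
// Illustrative model only: this is not MongoDB's election code.
// A primary can be elected only if a strict majority of voting members is reachable.
function canElectPrimary(votingMembers, reachableMembers) {
  const majority = Math.floor(votingMembers / 2) + 1;
  return reachableMembers >= majority;
}

// Three-node set from the slides: Node 3 (the primary) fails, two nodes remain.
console.log(canElectPrimary(3, 2)); // true: Node 1 and Node 2 can elect a primary
console.log(canElectPrimary(3, 1)); // false: a lone survivor stays secondary
```

This is why the three-node set in the diagrams survives the loss of Node 3 but would not survive losing two nodes at once.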
Replica Set Roles &
Configuration
Node 1 Node 2
Secondary Arbiter
Heartbeat
Replication
Node 3
Primary
Replica Set Roles
Configuration Options
> conf = {
    _id : "mySet",
    members : [
      {_id : 0, host : "A", priority : 3},
      {_id : 1, host : "B", priority : 2},
      {_id : 2, host : "C"},
      {_id : 3, host : "D", hidden : true},
      {_id : 4, host : "E", hidden : true, slaveDelay : 3600}
    ]
  }
> [Link](conf)
Configuration Options
> conf = {
    _id : "mySet",
    members : [
      {_id : 0, host : "A", priority : 3},   // Primary DC
      {_id : 1, host : "B", priority : 2},   // Primary DC
      {_id : 2, host : "C"},
      {_id : 3, host : "D", hidden : true},
      {_id : 4, host : "E", hidden : true, slaveDelay : 3600}
    ]
  }
> [Link](conf)
Configuration Options
> conf = {
    _id : "mySet",
    members : [
      {_id : 0, host : "A", priority : 3},
      {_id : 1, host : "B", priority : 2},
      {_id : 2, host : "C"},                 // Secondary DC, default priority = 1
      {_id : 3, host : "D", hidden : true},
      {_id : 4, host : "E", hidden : true, slaveDelay : 3600}
    ]
  }
> [Link](conf)
Configuration Options
> conf = {
    _id : "mySet",
    members : [
      {_id : 0, host : "A", priority : 3},
      {_id : 1, host : "B", priority : 2},
      {_id : 2, host : "C"},
      {_id : 3, host : "D", hidden : true},  // Analytics node
      {_id : 4, host : "E", hidden : true, slaveDelay : 3600}
    ]
  }
> [Link](conf)
Configuration Options
> conf = {
    _id : "mySet",
    members : [
      {_id : 0, host : "A", priority : 3},
      {_id : 1, host : "B", priority : 2},
      {_id : 2, host : "C"},
      {_id : 3, host : "D", hidden : true},
      {_id : 4, host : "E", hidden : true, slaveDelay : 3600}  // Backup node
    ]
  }
> [Link](conf)
Developing with
Replica Sets
Client Application
Driver
Write
Read
Primary
Secondary Secondary
Strong Consistency
Client Application
Driver
Write
Read
Read
Primary
Secondary Secondary
Delayed Consistency
Write Concern
• Network acknowledgement
• Wait for error
• Wait for journal sync
• Wait for replication
Driver
write
Primary
apply in
memory
Unacknowledged
Driver
getLastError
Primary
apply in
memory
MongoDB Acknowledged (wait for error)
Driver
getLastError
j:true
write
Primary
apply in memory
write to journal
Wait for Journal Sync
Driver
getLastError
write
w:2
Primary
replicate
apply in
memory
Secondary
Wait for Replication
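The four write concern diagrams differ only in what the driver waits for before it treats the write as done. A rough model in plain JavaScript, assuming a hypothetical `state` object that records how far the write has progressed; `isAcknowledged` is an illustrative helper, not a driver API.

```javascript
// Illustrative model of the four write concern levels; this is not driver code.
// state (hypothetical shape): { appliedOnPrimary, journaledOnPrimary, secondariesApplied }
function isAcknowledged(writeConcern, state) {
  const { w = 1, j = false } = writeConcern;
  if (w === 0) return true;                          // unacknowledged: never wait
  if (!state.appliedOnPrimary) return false;         // wait for error: primary applied in memory
  if (j && !state.journaledOnPrimary) return false;  // wait for journal sync
  // Wait for replication: the primary counts as one of the w members.
  return 1 + state.secondariesApplied >= w;
}

// The write is applied in memory on the primary, not yet journaled or replicated.
const progress = { appliedOnPrimary: true, journaledOnPrimary: false, secondariesApplied: 0 };
console.log(isAcknowledged({ w: 0 }, progress));          // true
console.log(isAcknowledged({ w: 1 }, progress));          // true
console.log(isAcknowledged({ w: 1, j: true }, progress)); // false
console.log(isAcknowledged({ w: 2 }, progress));          // false
```

Each stricter level trades latency for durability: the same write is "done" for `w: 1` well before it is "done" for `j: true` or `w: 2`.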
Tagging
• Control where data is written to, and read from
• Each member can have one or more tags
– tags: {dc: "ny"}
– tags: {dc: "ny",
subnet: "192.168",
rack: "row3rk7"}
• Replica set defines rules for write concerns
• Rules can change without changing app code
Tagging Example
{
_id : "mySet",
members : [
{_id : 0, host : "A", tags : {"dc": "ny"}},
{_id : 1, host : "B", tags : {"dc": "ny"}},
{_id : 2, host : "C", tags : {"dc": "sf"}},
{_id : 3, host : "D", tags : {"dc": "sf"}},
{_id : 4, host : "E", tags : {"dc": "cloud"}}],
settings : {
getLastErrorModes : {
allDCs : {"dc" : 3},
someDCs : {"dc" : 2}} }
}
> [Link]({...})
> [Link]({getLastError : 1, w : "someDCs"})
Driver
getLastError
w:allDCs
write
Primary (SF)
replicate
apply in
memory
Secondary (NY)
replicate
Secondary (Cloud)
Wait for Replication (Tagging)
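A rule such as `{"dc" : 2}` means the write must reach members carrying at least two distinct values of the `dc` tag, not merely two members. The sketch below models that check in plain JavaScript using the tags from the tagging example; `modeSatisfied` and the `hostsWithWrite` argument are hypothetical names for illustration, not server code.

```javascript
// Illustrative model of how a getLastErrorModes rule is evaluated; not server code.
const members = [
  { host: "A", tags: { dc: "ny" } },
  { host: "B", tags: { dc: "ny" } },
  { host: "C", tags: { dc: "sf" } },
  { host: "D", tags: { dc: "sf" } },
  { host: "E", tags: { dc: "cloud" } },
];
const modes = { allDCs: { dc: 3 }, someDCs: { dc: 2 } };

// A rule {tag: N} is met once the write has reached members that carry
// at least N distinct values of that tag.
function modeSatisfied(modeName, hostsWithWrite) {
  const rule = modes[modeName];
  return Object.entries(rule).every(([tag, needed]) => {
    const values = new Set(
      members
        .filter((m) => hostsWithWrite.includes(m.host) && tag in m.tags)
        .map((m) => m.tags[tag])
    );
    return values.size >= needed;
  });
}

console.log(modeSatisfied("someDCs", ["A", "B"]));     // false: both members are in ny
console.log(modeSatisfied("someDCs", ["A", "C"]));     // true: ny and sf
console.log(modeSatisfied("allDCs", ["A", "C", "E"])); // true: ny, sf and cloud
```

Because the application only names the mode (`someDCs`, `allDCs`), the rule behind it can be changed in the replica set configuration without touching application code.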
Read Preference Modes
• 5 modes
– primary (only) - Default
– primaryPreferred
– secondary
– secondaryPreferred
– nearest
When more than one member is eligible, the closest one is used for
reads (all modes except primary)
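The five modes can be read as a small decision table. The sketch below is a simplified model in plain JavaScript, assuming each member carries a hypothetical `pingMs` latency measurement and that "closest" means lowest latency; real drivers apply more detailed selection rules, and `chooseReadTarget` is not a driver API.

```javascript
// Illustrative model of the five read preference modes; this is not driver code.
// members: [{ host, state: "PRIMARY" | "SECONDARY", pingMs }] (hypothetical shape)
function chooseReadTarget(mode, members) {
  const byPing = (a, b) => a.pingMs - b.pingMs;
  const primary = members.find((m) => m.state === "PRIMARY") || null;
  const secondaries = members.filter((m) => m.state === "SECONDARY").sort(byPing);
  const closestSecondary = secondaries[0] || null;
  const readable = members
    .filter((m) => m.state === "PRIMARY" || m.state === "SECONDARY")
    .sort(byPing);

  switch (mode) {
    case "primary": return primary;
    case "primaryPreferred": return primary || closestSecondary;
    case "secondary": return closestSecondary;
    case "secondaryPreferred": return closestSecondary || primary;
    case "nearest": return readable[0] || null;
    default: throw new Error("unknown read preference mode: " + mode);
  }
}

const replicaSet = [
  { host: "A", state: "PRIMARY", pingMs: 20 },
  { host: "B", state: "SECONDARY", pingMs: 5 },
  { host: "C", state: "SECONDARY", pingMs: 50 },
];
console.log(chooseReadTarget("primary", replicaSet).host);   // A
console.log(chooseReadTarget("secondary", replicaSet).host); // B
console.log(chooseReadTarget("nearest", replicaSet).host);   // B
```

Only `primary` guarantees the strong consistency shown earlier; every other mode may return data from a secondary that lags behind.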
Operational Considerations
Maintenance and Upgrade
• No downtime
• Rolling upgrade/maintenance
– Start with Secondary
– Primary last
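The ordering rule is simple enough to write down. A minimal sketch in plain JavaScript, assuming a hypothetical list of members with a `state` field; `rollingUpgradeOrder` is an illustrative helper, not a MongoDB command.

```javascript
// Illustrative only: order members for a rolling upgrade, secondaries first, primary last.
function rollingUpgradeOrder(members) {
  const others = members.filter((m) => m.state !== "PRIMARY");
  const primaries = members.filter((m) => m.state === "PRIMARY");
  return [...others, ...primaries].map((m) => m.host);
}

const order = rollingUpgradeOrder([
  { host: "Node 3", state: "PRIMARY" },
  { host: "Node 1", state: "SECONDARY" },
  { host: "Node 2", state: "SECONDARY" },
]);
console.log(order.join(", ")); // Node 1, Node 2, Node 3
```

In practice the primary is usually stepped down (for example with `rs.stepDown()`) before its own maintenance, so that an already upgraded secondary takes over first.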
Replica Set – 1 Data Center
Single datacenter
Single switch & power
Points of failure:
– Power
– Network
– Data center
– Two node failure
Automatic recovery of single node crash
Datacenter
Member 1
Member 2
Datacenter 2
Member 3
Replica Set – 2 Data Centers
Multi data center
DR node for safety
Can’t do multi data center durable write safely since only 1 node in distant DC
Datacenter 1
Member 1
Member 2
Datacenter 2
Member 3
Replica Set – 3 Data Centers
Three data centers
Can survive full data center loss
Can do w= { dc : 2 } to guarantee write in 2 data centers (with tags)
Datacenter 1
Member 1
Member 2
Datacenter 2
Member 3
Member 4
Datacenter 3
Member 5
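The three layouts can be compared with one question: after losing any single data center, do the remaining members still form a majority? A minimal sketch in plain JavaScript, assuming every member has one vote; `survivesAnyDataCenterLoss` is a hypothetical helper for illustration.

```javascript
// Illustrative only: does the set keep a voting majority after losing any one data center?
// membersPerDc lists how many voting members sit in each data center.
function survivesAnyDataCenterLoss(membersPerDc) {
  const total = membersPerDc.reduce((sum, n) => sum + n, 0);
  const majority = Math.floor(total / 2) + 1;
  return membersPerDc.every((lost) => total - lost >= majority);
}

console.log(survivesAnyDataCenterLoss([3]));       // false: one data center holds everything
console.log(survivesAnyDataCenterLoss([2, 1]));    // false: losing the 2-member DC leaves 1 of 3
console.log(survivesAnyDataCenterLoss([2, 2, 1])); // true: any loss leaves at least 3 of 5
```

The 2 + 1 layout still gives a disaster recovery copy of the data, but only the 2 + 2 + 1 layout keeps a primary electable whichever data center goes down.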
Behind the Curtain
Implementation details
• Heartbeat every 2 seconds
– Times out in 10 seconds
• Local DB (not replicated)
– [Link]
– [Link]
• Capped collection
• Idempotent version of operation stored
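The last bullet is what lets a secondary replay part of the oplog more than once without corrupting data. A minimal sketch in plain JavaScript, assuming the rewrite shown in this deck's oplog excerpt (an `$inc` recorded as a `$set` of the resulting value); `toOplogUpdate` and `applySet` are hypothetical helpers, not MongoDB functions.

```javascript
// Illustrative model only: rewrite {$inc: {field: n}} as a $set of the resulting value,
// so that replaying the logged operation leaves the document unchanged.
function toOplogUpdate(doc, update) {
  const set = { ...(update.$set || {}) };
  for (const [field, n] of Object.entries(update.$inc || {})) {
    set[field] = (doc[field] || 0) + n;
  }
  return { $set: set };
}

// Apply a logged $set to a document, returning a new document.
function applySet(doc, oplogUpdate) {
  return { ...doc, ...oplogUpdate.$set };
}

const original = { _id: 1, value: 1 };
const entry = toOplogUpdate(original, { $inc: { value: 10 } });
const once = applySet(original, entry);
const twice = applySet(once, entry);
console.log(JSON.stringify(entry));   // {"$set":{"value":11}}
console.log(once.value, twice.value); // 11 11
```

Replaying the raw `$inc` twice would have produced 21; replaying the logged `$set` twice still produces 11.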
Op(erations) Log is idempotent
> [Link]({_id:1,value:1})
{ "ts" : Timestamp(1350539727000, 1), "h" :
NumberLong("6375186941486301201"), "op" : "i", "ns" :
"[Link]", "o" : { "_id" : 1, "value" : 1 } }
> [Link]({_id:1},{$inc:{value:10}})
{ "ts" : Timestamp(1350539786000, 1), "h" :
NumberLong("5484673652472424968"), "op" : "u", "ns" :
"[Link]", "o2" : { "_id" : 1 },
"o" : { "$set" : { "value" : 11 } } }
Single operation can have many entries
> [Link]({},{$set:{name : "foo"}}, false, true)
{ "ts" : Timestamp(1350540395000, 1), "h" :
NumberLong("-4727576249368135876"), "op" : "u", "ns" :
"[Link]", "o2" : { "_id" : 2 }, "o" : { "$set" : { "name" :
"foo" } } }
{ "ts" : Timestamp(1350540395000, 2), "h" :
NumberLong("-7292949613259260138"), "op" : "u", "ns" :
"[Link]", "o2" : { "_id" : 3 }, "o" : { "$set" : { "name" :
"foo" } } }
{ "ts" : Timestamp(1350540395000, 3), "h" :
NumberLong("-1888768148831990635"), "op" : "u", "ns" :
"[Link]", "o2" : { "_id" : 1 }, "o" : { "$set" : { "name" :
"foo" } } }
Recent improvements
• Read preference support with sharding
– Drivers too
• Improved replication over WAN/high-latency
networks
• [Link] command
• buildIndexes setting
• replIndexPrefetch setting
Just Use It
Use replica sets
Easy to set up
– Try on a single machine
Check doc page for RS tutorials
– [Link]
Questions?
Thank You