Load Balancers
Distribute load across resources
Can be done using both hardware and software
Different ways to balance – round robin, least active connections, lowest response time, source IP hash
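Two of the strategies above, round robin and least active connections, can be sketched in a few lines. This is a minimal illustration only: the class names and the server names ("app-1" and so on) are made up for the example, and a real balancer would also handle health checks and concurrency.

```python
import itertools


class RoundRobinBalancer:
    """Cycle through the servers in a fixed order."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConnectionsBalancer:
    """Send each request to the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        # On ties, min() returns the first server in insertion order.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when a request finishes so the count reflects live connections.
        self.active[server] -= 1


rr = RoundRobinBalancer(["app-1", "app-2", "app-3"])
rr_picks = [rr.pick() for _ in range(4)]  # wraps back to app-1 on the 4th pick

lc = LeastConnectionsBalancer(["app-1", "app-2"])
first, second = lc.pick(), lc.pick()
lc.release(first)   # app-1 finishes its request
third = lc.pick()   # so app-1 is the least loaded again
```

Round robin ignores how busy each server is, while least connections needs the balancer to track when requests finish.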
Cache
Types – application server cache (response caching)
Because the load balancer sends requests to different servers, each server builds up a different local cache – use either a global or a distributed cache instead
Distributed caching – consistent hashing directs each request to the node that holds the data in its cache
Global cache – a single cache shared by all servers; on a cache miss, the cache itself fetches the required data from the underlying store
CDNs – serve large static media like images; on a cache miss, the CDN requests the file from the back end and caches it
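Cache invalidation – when to expire a cache entry; mechanisms – write-through (write to cache and DB together), write-around (write to DB only, bypassing the cache), write-back (write to cache first, flush to DB later)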
Cache eviction – FIFO, LIFO, LRU (least recently used), LFU (least frequently used)
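The consistent hashing idea from the distributed caching note can be sketched as a hash ring. This is an illustrative sketch, not a production design: the node names, the `replicas` count of virtual nodes, and the use of MD5 (chosen only to spread keys, not for security) are all assumptions.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    # MD5 is used only to spread keys around the ring, not for security.
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class ConsistentHashRing:
    def __init__(self, nodes, replicas=100):
        self.replicas = replicas  # virtual nodes per real node
        self._ring = []           # sorted list of (position, node)
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self.replicas):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def get_node(self, key):
        if not self._ring:
            raise ValueError("ring has no nodes")
        # First virtual node at or after the key's position, wrapping around.
        index = bisect.bisect_left(self._ring, (_hash(key), ""))
        return self._ring[index % len(self._ring)][1]


ring = ConsistentHashRing(["cache-a", "cache-b", "cache-c"])
keys = [f"user:{i}" for i in range(1000)]
before = {key: ring.get_node(key) for key in keys}

ring.remove_node("cache-b")
after = {key: ring.get_node(key) for key in keys}
moved = [key for key in keys if before[key] != after[key]]
```

Only the keys that lived on the removed node change owner; every other key still maps to the same cache node, which is the reason to prefer a ring over a plain `hash(key) % node_count`.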
Sharding
Break big databases into smaller parts
Horizontal scaling – add more machines
Vertical scaling – improving machines
Horizontal partitioning – different rows in different DBs
Vertical partitioning – divide tables based on features
Directory partitioning – query a directory server that holds the mapping from each tuple key to its DB server
Challenges – ACID compliance is harder because the data is distributed; joins across shards are inefficient
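Directory partitioning can be sketched with an in-memory lookup table. Everything here is hypothetical: `ShardDirectory`, the shard names, and the "fewest rows" placement policy are assumptions for illustration, and plain dicts stand in for real DB servers.

```python
class ShardDirectory:
    """Lookup service mapping each key to the shard that stores its row."""

    def __init__(self, shard_names):
        self.shards = {name: {} for name in shard_names}  # stand-ins for DB servers
        self.directory = {}                               # key -> shard name

    def put(self, key, row):
        shard = self.directory.get(key)
        if shard is None:
            # Assumed placement policy: pick the shard with the fewest rows.
            shard = min(self.shards, key=lambda name: len(self.shards[name]))
            self.directory[key] = shard
        self.shards[shard][key] = row

    def get(self, key):
        shard = self.directory.get(key)
        return None if shard is None else self.shards[shard].get(key)

    def move(self, key, new_shard):
        # Rebalancing only updates the directory; callers keep using get().
        old_shard = self.directory[key]
        self.shards[new_shard][key] = self.shards[old_shard].pop(key)
        self.directory[key] = new_shard


db = ShardDirectory(["db-0", "db-1"])
db.put(1, {"name": "Ada"})
db.put(2, {"name": "Linus"})
db.move(1, "db-1")  # the row moves, lookups by key are unaffected
```

The directory makes rebalancing easy, at the cost of an extra lookup per query and a new single point of failure.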
Indexes
Improve data retrieval – faster reads, but slower writes because every index must be updated too
Ordered indexing – entries are kept in sorted key order, which supports range queries
Hash indexing – index is based on a hash function; faster for exact-match lookups, but no range queries
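The trade-off between the two index types, and the extra work on writes, can be shown with a small in-memory sketch. The `rows` data and the function names are invented for the example; a real database would use on-disk structures such as B-trees rather than Python lists.

```python
import bisect

rows = [
    {"id": 7, "name": "Grace"},
    {"id": 3, "name": "Alan"},
    {"id": 9, "name": "Edsger"},
    {"id": 5, "name": "Barbara"},
]

# Hash index: key -> row position. Exact-match lookups only.
hash_index = {row["id"]: pos for pos, row in enumerate(rows)}

# Ordered index: (key, row position) pairs kept sorted by key.
ordered_index = sorted((row["id"], pos) for pos, row in enumerate(rows))


def find_by_id(key):
    pos = hash_index.get(key)
    return None if pos is None else rows[pos]


def find_range(low, high):
    """Return rows with low <= id <= high, in key order."""
    start = bisect.bisect_left(ordered_index, (low, -1))
    end = bisect.bisect_right(ordered_index, (high, len(rows)))
    return [rows[pos] for _, pos in ordered_index[start:end]]


def insert(row):
    # Every write now has to update both indexes as well as the table.
    rows.append(row)
    pos = len(rows) - 1
    hash_index[row["id"]] = pos
    bisect.insort(ordered_index, (row["id"], pos))
```

For example, `find_range(4, 8)` walks a contiguous slice of the ordered index, something the hash index cannot do without scanning every key.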
Proxy Server
Intermediary between client and server
Filter requests, cache responses, collate multiple requests into a single request
Used to bypass IP address blocking
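The filtering and caching roles can be sketched without any networking. This is a toy model under stated assumptions: `CachingProxy`, `fake_origin`, and the blocked path are hypothetical, and the "origin server" is just a Python callable.

```python
class CachingProxy:
    """Toy proxy: filters requests and caches origin responses."""

    def __init__(self, origin, blocked_paths=()):
        self.origin = origin  # hypothetical callable: path -> response body
        self.blocked_paths = set(blocked_paths)
        self.cache = {}

    def get(self, path):
        if path in self.blocked_paths:
            return 403, "blocked by proxy"
        if path not in self.cache:
            # Cache miss: forward to the origin once and remember the answer.
            self.cache[path] = self.origin(path)
        return 200, self.cache[path]


origin_calls = []


def fake_origin(path):
    origin_calls.append(path)
    return f"content of {path}"


proxy = CachingProxy(fake_origin, blocked_paths=["/admin"])
responses = [
    proxy.get("/index.html"),
    proxy.get("/index.html"),  # served from the proxy's cache
    proxy.get("/admin"),       # filtered before reaching the origin
]
```

The origin sees only one request here, even though the client made three.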
Message Queue
Used for async communication
Producer and consumer – the producer adds messages to the queue, and the consumer reads them
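The producer and consumer roles can be sketched with Python's standard `queue.Queue` and a worker thread. This is an in-process stand-in for a real message broker; the message contents and the `STOP` sentinel are assumptions made for the example.

```python
import queue
import threading

messages = queue.Queue()
processed = []
STOP = object()  # sentinel telling the consumer to exit


def producer(count):
    # The producer returns as soon as the messages are queued.
    for i in range(count):
        messages.put(f"order-{i}")
    messages.put(STOP)


def consumer():
    while True:
        message = messages.get()  # blocks until a message is available
        if message is STOP:
            break
        processed.append(message.upper())


consumer_thread = threading.Thread(target=consumer)
consumer_thread.start()
producer(3)
consumer_thread.join()
```

The producer never waits for the work to be done, which is the async decoupling the queue provides.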
SQL vs No-SQL
SQL – rows and columns, fixed schema, vertically scalable
No-SQL – no fixed schema, queries run against collections of documents, ACID compliance is difficult, horizontally scalable
Wide-column DB – column families, suited for analyzing large datasets, e.g. Cassandra, HBase
Graph DB – data is saved in a graph structure with nodes, edges, and properties, e.g. Neo4j
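The fixed-schema versus schemaless difference can be seen with the standard `sqlite3` module on one side and plain dicts standing in for a document store on the other. The table, columns, and documents are invented for the example, and the dict list only imitates a document DB.

```python
import sqlite3

# SQL side: the table definition fixes the columns up front.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL, email TEXT)"
)
conn.execute(
    "INSERT INTO users (id, name, email) VALUES (?, ?, ?)",
    (1, "Ada", "ada@example.com"),
)

try:
    # Fails: the schema has no "nickname" column.
    conn.execute(
        "INSERT INTO users (id, name, nickname) VALUES (?, ?, ?)",
        (2, "Linus", "lt"),
    )
    schema_enforced = False
except sqlite3.OperationalError:
    schema_enforced = True

sql_names = [row[0] for row in conn.execute("SELECT name FROM users ORDER BY id")]

# Document side (dicts as a stand-in): each document can have its own fields.
documents = [
    {"_id": 1, "name": "Ada", "email": "ada@example.com"},
    {"_id": 2, "name": "Linus", "nickname": "lt", "langs": ["C"]},
]
with_nickname = [doc["name"] for doc in documents if "nickname" in doc]
```

The flexibility on the document side moves schema checks into application code.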
Microservices
Monolithic architecture – a single application contains all the logic; poor fault tolerance, little agility
Microservices – decouple logic into independent applications, easier to update and test
REST API
Endpoints handle requests and responses; they connect to the DB and contain the business logic
Uniform interface – resources are manipulated through their representations (e.g. JSON)
Stateless – each request carries all the state it needs (path, query parameters, headers, body); the server keeps no client session between requests
Cacheable – responses can be cached
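Statelessness can be illustrated with a paginated endpoint whose handler depends only on the request's own parameters. This is a framework-free sketch: the `/articles` path, the `page` and `limit` parameters, and the `ARTICLES` data are all hypothetical.

```python
ARTICLES = [{"id": i, "title": f"Article {i}"} for i in range(1, 8)]


def list_articles(query):
    """Handle GET /articles?page=N&limit=M using only the request's parameters."""
    page = int(query.get("page", 1))
    limit = int(query.get("limit", 3))
    start = (page - 1) * limit
    body = {"page": page, "limit": limit, "items": ARTICLES[start:start + limit]}
    if start + limit < len(ARTICLES):
        # The link to the next page carries the state the next request needs.
        body["next"] = f"/articles?page={page + 1}&limit={limit}"
    return 200, body


status, body = list_articles({"page": "2", "limit": "3"})
last_status, last_body = list_articles({"page": "3", "limit": "3"})
```

Because the server remembers no cursor per client, any server behind a load balancer can answer any page request.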
Hashing
Used to index data
Used in cryptography
A good hash function is easy to compute, has minimal collisions, and distributes keys evenly
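The "even distribution" property can be checked empirically by hashing many keys into a few buckets. This is a rough sketch: the key format, the bucket count of 8, and the choice of SHA-256 from `hashlib` are assumptions, and a non-cryptographic hash would normally be faster for indexing.

```python
import hashlib
from collections import Counter


def bucket_for(key: str, buckets: int) -> int:
    # Take the first 8 bytes of the digest as an integer, then reduce it.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % buckets


counts = Counter(bucket_for(f"user-{i}", 8) for i in range(8000))
spread = max(counts.values()) - min(counts.values())
```

With a well-behaved hash, each of the 8 buckets should end up with roughly 1000 of the 8000 keys, so `spread` stays small relative to the bucket size.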