In-Memory Data Structure Store - Major Report
A PROJECT REPORT
Submitted by
NAVEEN PAUDEL (211B192)
NIKHIL SAHU (211B195)
PANSHUL KHURCHWAL (211B202)
UNDER THE GUIDANCE OF:
DR AJAY KUMAR
November 2024
Submitted in partial fulfilment of the requirements for the award of the degree
of
BACHELOR OF TECHNOLOGY
IN
COMPUTER SCIENCE AND ENGINEERING
Department of Computer Science and Engineering
JAYPEE UNIVERSITY OF ENGINEERING & TECHNOLOGY,
A.B. ROAD, RAGHOGARH, DT. GUNA-473226, MP, INDIA
DECLARATION
We hereby declare that the work reported in the 7th semester major project entitled "In-Memory Data Structure Store", submitted in partial fulfillment for the award of the degree of B.Tech. at Jaypee University of Engineering and Technology, Guna, is our own. To the best of our knowledge and belief, there is no infringement of intellectual property rights or copyright. In case of any violation, we will be solely responsible.
Signature of Students
NAVEEN PAUDEL (211B192)
NIKHIL SAHU (211B195)
PANSHUL KHURCHWAL (211B202)
Date:
JAYPEE UNIVERSITY OF ENGINEERING & TECHNOLOGY
Accredited with Grade-A+ by NAAC & Approved U/S 2(f) of the UGC Act, 1956
A.B. Road, Raghogarh, District Guna (MP), India, Pin-473226 Phone: 07544
267051, 267310-14, Fax: 07544 267011
Website: www.juet.ac.in
CERTIFICATE
This is to certify that the work titled "In-Memory Data Structure Store", submitted by Naveen Paudel (211B192), Nikhil Sahu (211B195), and Panshul Khurchwal (211B202) in partial fulfillment for the award of the degree of Bachelor of Technology (Computer Science and Engineering) of Jaypee University of Engineering & Technology, Guna, has been carried out under my supervision. To the best of my knowledge and belief, there is no infringement of intellectual property rights or copyright. Further, this work has not been submitted, partially or wholly, to any other university or institute for the award of this or any other degree or diploma. In case of any violation, the concerned students will be solely responsible.
Signature of Supervisor
(Name of Supervisor)
Designation
Date
ACKNOWLEDGEMENT
No endeavour can succeed unless a proper platform is provided for it. We therefore consider ourselves fortunate to have completed our major project under the supervision of DR. AJAY KUMAR. We are sincerely grateful to him for having faith in us and allowing us to carry out a project on a technology completely new to us, one that required us to research and learn many new things that will help us take on advanced work in the future. He guided us immensely throughout the project.
We would also like to thank the Department of Computer Science & Engineering for creating this opportunity.
Last but not least, we thank our family and friends, who continuously encouraged and helped us in every way they could.
Executive Summary
The advent of digital transformation and the proliferation of data-driven applications have
underscored the need for high-performance, low-latency database systems capable of
processing vast amounts of information in real-time. Traditional disk-based databases, while
reliable, often fall short in scenarios requiring instantaneous data access and rapid event
processing. In response to these limitations, this project explores the development and
implementation of an In-Memory Database (IMDB), a system designed to revolutionize
how data is stored, accessed, and managed in real-time environments.
The IMDB operates by storing data directly in Random Access Memory (RAM) rather
than on traditional disk storage. This design eliminates the latency associated with disk
I/O operations, enabling faster query responses and higher throughput. By leveraging
RAM’s speed, the IMDB is particularly well-suited for applications where
performance is paramount, such as real-time analytics, IoT data streams, financial
transactions, and online gaming environments. This report provides a comprehensive
analysis of the IMDB's architecture, features, implementation, and its transformative
impact on modern computing systems.
A key aspect of the project was designing a hybrid storage architecture that balances
the speed of in-memory storage with the durability of persistent storage. To address
concerns about reliability and fault tolerance, the database integrates features like
replication, automated failover, and snapshotting, ensuring data consistency and
availability even in the event of system failures.
In conclusion, this project not only addresses the limitations of existing database
technologies but also sets a foundation for the next generation of real-time data
management systems. The IMDB's innovative features and architecture make it an
indispensable tool for modern, data-intensive applications, underscoring its role as a
cornerstone in the evolution of high-performance computing. The findings and
advancements outlined in this report pave the way for future research and
development, with potential enhancements including integration of non-volatile
memory (NVM) and predictive analytics to further extend its capabilities. This project
represents a significant contribution to the field of database systems, highlighting the
transformative power of in-memory technology in meeting the demands of a data-
driven world.
List of figures
TABLE OF CONTENTS
1. INTRODUCTION ……………………………………………………………………. 8
1.1 Problem Definition …………………………………………………………… 8
1.2 Project Overview …………………………………………………………….. 11
1.3 Hardware Specification ………………………………………………………. 11
1.4 Software Specification ……………………………………………………….. 12
2. LITERATURE REVIEW…..………………………………………………………. 14
2.1 Overview Of Existing Research on in Memory database…….………………. 14
2.2 Comparison with Traditional databases……….……..………………………… 22
2.3 Examples of data systems……………………………………………………… 25
3. METHODOLOGIES……………………………………………………………….. 27
3.1 Foundations of Data systems ….……………………………………………… 27
3.2 Design and Architecture….…………………………………………………… 29
3.3 Data Distribution Techniques .……………………………………………… 32
4. RESULTS & DISCUSSIONS………………………………………………………. 38
5. CONCLUSIONS ……………………………………………………………………… 64
6. REFERENCES ……………………………………………………………………….. 66
7. STUDENTS PROFILE ………………………………………………………………. 27
Chapter 1
INTRODUCTION
In-Memory Data Structure Databases (IMDSDBs) are a class of databases that store data entirely in a computer's main memory (RAM) rather than on disk. This design allows far faster data access and processing, making IMDSDBs well suited to applications that need high-speed operations, such as real-time analytics and caching. Minimizing disk I/O considerably improves performance and reduces latency. While IMDSDBs offer great speed advantages, they usually involve trade-offs in memory usage and durability, since data held in RAM is volatile, unlike data in traditional disk-based systems.
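To make the core idea concrete, the following minimal sketch (illustrative only, not the project's implementation) shows an in-memory key-value store in Go: all data lives in a map guarded by a mutex, so reads and writes never touch the disk.

package main

import (
	"fmt"
	"sync"
)

// Store keeps every key-value pair in RAM; no disk I/O is involved.
type Store struct {
	mu   sync.RWMutex
	data map[string]string
}

func NewStore() *Store {
	return &Store{data: make(map[string]string)}
}

func (s *Store) Set(key, value string) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.data[key] = value
}

func (s *Store) Get(key string) (string, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	v, ok := s.data[key]
	return v, ok
}

func main() {
	s := NewStore()
	s.Set("user:1", "naveen")
	if v, ok := s.Get("user:1"); ok {
		fmt.Println(v) // prints: naveen
	}
}

Everything beyond this sketch (persistence, replication, expiry, networking) is what distinguishes a production-grade in-memory database from a plain map.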
With the rise of data-intensive applications, the disk-based relational databases that have long been the backbone of data management are being challenged on performance and scalability. A significant limitation of conventional databases is disk I/O latency: because data is stored on physical disks, every read or write implies a disk access, which is considerably slower than retrieval from memory. This is particularly critical in high-throughput applications such as real-time analytics, online transactions, or gaming, where latency on the order of microseconds can impair performance or user experience. Although the gap between disk and memory speeds has narrowed significantly through improvements like solid-state drives, it remains considerable. Another limitation is scalability. Relational databases are optimized for vertical scaling; better performance therefore often requires upgrading server hardware, for example to more powerful processors and more memory. However, vertical scaling is bounded by physical and economic constraints, which limits its usefulness for the large data volumes and growing needs of modern large-scale applications.
Caching layers bring their own problems: when data in the cache and the database fall out of sync, queries can return incorrect results or the system can even crash. Another major disadvantage of traditional database systems is their weak support for real-time analytics. Their heavy dependence on disk-oriented architectures and batch processing for complex queries makes them unsuitable for applications requiring instantaneous insights, such as fraud detection or recommendation algorithms. Working with large data sets incurs heavy overheads, which greatly increase the time needed to reach actionable insights.
Finally, there are issues of cost and maintenance. Improving the performance of conventional databases frequently requires substantial investment in hardware upgrades and in additional software layers, such as data warehouses or distributed architectures. Such configurations add expense and complexity to deployment, implementation, and ongoing maintenance. In addition, the reliability and resilience of traditional databases depend heavily on replicating data across multiple servers, which further raises the total cost. In an environment that demands instant access to vast and diverse datasets, conventional databases struggle to meet these challenges. The inherent design restrictions of such systems, combined with the growing need for fast, agile, and scalable solutions, have driven interest in modern alternatives, namely in-memory databases, which in turn raise a set of interlinked challenges addressed in this project.
Guaranteeing Reliability: Incorporate mechanisms for data durability and fault tolerance.
Key Features
Hybrid Storage Model: Combines in-memory data storage for speed with persistent storage for durability.
Implementation
The project involves the following stages:
System Design: Development of a modular architecture to support high performance and flexibility.
Testing: Evaluation of the system under varied scenarios, including stress testing and comparative analysis with existing solutions.
Deployment and Validation: Implementing the database in a simulated real-world environment.
Significance
The IMDB developed in this project addresses critical gaps in existing database systems,
particularly for applications that demand real-time data access and processing. By
combining speed, scalability, and reliability, it offers a transformative solution for
industries that rely on rapid decision-making and high-frequency data analysis. Examples
include fraud detection systems in financial services, event monitoring in IoT, and
matchmaking algorithms in gaming platforms.
1.3 Hardware Specification
1. Memory (RAM):
o Size: Minimum 8 GB (32 GB recommended for larger data sets)
2. Storage:
o Reason: For persistence mechanisms like snapshot storage and logs; faster storage ensures quick data retrieval during restarts.
1.4 Software Requirements
1. Operating System:
o Options:
Windows 10/11
2. Programming Language:
o Language: Go
3. Development Tools:
o Compiler/IDE:
4. Version Control:
o Tool: Git
techniques.
compression.
7. Command-Line Tools:
o Redis CLI, Custom-built CLI commands for database interaction (e.g., GET,
SET, PSYNC).
o Tools like Telnet or Netcat for sending raw commands to test socket-based
communication.
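As a rough illustration of such raw socket-based testing, the Go sketch below (assumptions: the server listens on localhost:6379 and speaks the RESP protocol; the key and value are placeholders) sends a hand-encoded SET command and prints the first reply line, much as one would do interactively with Telnet or Netcat.

package main

import (
	"bufio"
	"fmt"
	"net"
)

func main() {
	conn, err := net.Dial("tcp", "localhost:6379")
	if err != nil {
		panic(err)
	}
	defer conn.Close()

	// RESP array of three bulk strings: SET mykey hello
	cmd := "*3\r\n$3\r\nSET\r\n$5\r\nmykey\r\n$5\r\nhello\r\n"
	if _, err := conn.Write([]byte(cmd)); err != nil {
		panic(err)
	}

	reply, err := bufio.NewReader(conn).ReadString('\n')
	if err != nil {
		panic(err)
	}
	fmt.Printf("server replied: %q\n", reply) // expected: "+OK\r\n"
}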
Chapter 2
LITERATURE REVIEW
Many new tools for data storage and processing have emerged in recent years. They are optimized for a variety of different use cases and no longer fit neatly into traditional categories [1]. For example, there are datastores that are also used as message queues (Redis), and there are message queues with database-like durability guarantees (Apache Kafka). The boundaries between the categories are becoming blurred. Moreover, many applications now have such demanding or wide-ranging requirements that a single tool can no longer meet all of their data processing and storage needs. Instead, the work is broken down into tasks that can be performed efficiently by a single tool, and those different tools are stitched together using application code.
We typically think of databases, queues, caches, etc. as very different categories of tools. Although a database and a message queue have some superficial similarity (both store data for some time), they have very different access patterns, which means different performance characteristics and thus very different implementations.
1. Architectural Advancements
Early research focused on developing architectures that fully leverage the speed of main
memory while addressing traditional database issues like durability and consistency:
Hybrid Models: Studies explore combining in-memory storage with persistent disk-
based storage to ensure durability. SAP HANA, for instance, utilizes columnar storage
with in-memory processing for analytical workloads.
2. Query Optimization
Indexing Mechanisms: Adaptive and dynamic indexing techniques, such as tree-
based and hash-based indexes, have been extensively studied. Research highlights
their role in speeding up queries in memory-intensive environments.
Vectorized Processing: Studies show that vectorized execution, where multiple rows
of data are processed simultaneously, can significantly boost performance.
Sharding and Partitioning: Techniques for partitioning data across multiple nodes in
a cluster have been extensively researched to maintain performance during scaling.
Real-Time Analytics: Studies highlight the use of IMDBs in scenarios like stock
trading, fraud detection, and IoT platforms, where real-time insights are critical.
Caching and Acceleration: Research has shown that using IMDBs as caching layers
can significantly improve the performance of backend systems.
Benchmarks: Research often uses benchmarks like YCSB (Yahoo! Cloud Serving
Benchmark) and TPC-C to evaluate IMDB performance.
1. Cost: High memory requirements make IMDBs expensive for large-scale adoption.
2. Durability vs. Performance: Balancing speed with data persistence remains a key
challenge.
3. Scalability: Maintaining low latency while scaling across multiple nodes is non-
trivial.
4. Specialized Workloads: Designing IMDBs that adapt to diverse workload
requirements, such as analytics vs. transactions.
Existing research provides a strong foundation for understanding in-memory databases,
their capabilities, and their limitations. The field continues to evolve, with emerging
technologies like NVM and edge computing offering new avenues for innovation. Future
research can build on these advancements to further optimize performance, scalability, and
integration into modern data architectures.
Performance
2. Suitability: On-disk databases are better suited to applications where the speed of data access is less critical.
Use Cases
2. Large Data Sets: Ideal for applications with large data sets that cannot be cost-effectively stored in memory.
Advantages
1. Storage Capacity: On-disk databases are not limited by the available system memory; they can efficiently manage terabytes or petabytes of data.
2. Durability: Data written to disk survives restarts and power failures. This ensures data durability and consistency, critical for transactional and mission-critical applications.
4. Flexible Storage Options: On-disk databases offer various storage
configurations and optimizations, such as partitioning, indexing, and
compression, to optimize performance and storage efficiency.
Limitations
1. Slower Performance: Accessing data from disk is slower compared to memory access,
leading to higher latency for database operations. This may be acceptable for batch processing
or non-real-time applications but can be a bottleneck for latency-sensitive workloads.
2. Disk I/O Bottleneck: Disk I/O operations can become a performance bottleneck,
particularly in high- concurrency environments or when dealing with large datasets.
Optimizing disk I/O is essential for maximizing database performance.
1. MySQL:
2. PostgreSQL:
3. MongoDB:
4. Oracle Database:
5. Microsoft SQL Server:
■ A relational database management system developed by Microsoft, offering a wide range of data analytics, business intelligence, and transaction processing capabilities.
6. SQLite:
Memory-First Databases
1. Storage Mechanism
2. Performance
Optimized Writes:
Writing to disk is typically asynchronous, using techniques like write-ahead logging
(WAL) to minimize performance impact.
Scalability:
Memory-first databases can scale horizontally by distributing data across multiple
nodes, though they remain limited by the available memory per node.
3. Use Cases
Real-Time Analytics:
Ideal for dashboards, fraud detection systems, and recommendation engines where
low-latency analytics are critical.
Event Processing:
Used in IoT platforms, gaming leaderboards, and social media for processing high
volumes of events in real-time.
Session Stores:
Memory-first databases are often used to manage user sessions in web and mobile
applications.
4. Advantages
Flexibility:
Supports both transactional (OLTP) and analytical (OLAP) workloads.
Cost-Efficiency:
While relying on RAM for speed, it also utilizes cheaper disk storage for less critical
data, optimizing resource usage.
5. Limitations
Memory Dependency:
Limited by the amount of available RAM, making it challenging for very large
datasets unless tiering to disk is well-optimized.
Complexity in Management:
Balancing memory and disk storage requires careful configuration and tuning.
Cost:
High-performance memory modules (e.g., DRAM) are more expensive than
traditional disk storage.
6. Examples
Redis:
A popular key-value store that supports persistence through RDB snapshots and AOF
(Append-Only File) mechanisms.
SAP HANA:
Combines in-memory storage with disk persistence, excelling in analytical and
transactional processing.
Memcached with Persistence:
A memory caching solution extended with disk persistence for durability.
VoltDB:
A memory-first database optimized for high-speed transaction processing with disk-
based durability.
Aerospike:
Hybrid architecture using in-memory data for speed and SSDs for persistent storage,
making it highly scalable and reliable.
Memory-first databases bridge the gap between the speed of in-memory databases and the
reliability of disk-based databases. By offering high performance and durability, they have
become integral to modern applications requiring real-time insights and high availability.
However, their success depends on efficient memory management and balanced resource
utilization to overcome limitations like memory dependency and recovery overhead.
1. Redis
Overview: Redis (Remote Dictionary Server) is an open-source, highly scalable in-
memory key-value store that supports multiple data structures such as strings, hashes,
lists, sets, and sorted sets.
Features:
o Persistence: Offers RDB snapshots and append-only file (AOF) options for
durability.
o Replication: Master-slave replication for high availability.
o Performance: Handles millions of read/write operations per second.
o Modules: Extensible with modules like RedisGraph and RedisJSON.
Use Cases:
o Real-time analytics
o Caching layer
o Message brokering (via Pub/Sub model)
2. Memcached
Overview: A simple, distributed memory caching system primarily used to speed up
dynamic web applications by reducing database load.
Features:
o Lightweight and highly performant.
o Limited to simple key-value pair storage.
o No persistence mechanism.
Use Cases:
o Session caching
o Content delivery networks (CDNs)
o Query caching in web applications
3. Apache Ignite
Overview: An open-source distributed database and computing platform that
combines in-memory data storage with compute capabilities.
Features:
o SQL support for relational queries.
o Durable memory with disk-based persistence.
o Built-in machine learning and streaming APIs.
Use Cases:
o Real-time transaction processing.
o High-performance data grids.
o Event-driven architectures.
4. SAP HANA
Overview: A relational database management system (RDBMS) developed by SAP
that uses in-memory storage for real-time processing.
Features:
o Columnar storage for faster query execution.
o Built-in advanced analytics (e.g., predictive analytics, machine learning).
o Enterprise-grade reliability and security.
Use Cases:
o Real-time business intelligence.
o Supply chain optimization.
o Financial data analysis.
5. Amazon ElastiCache
Overview: A fully managed in-memory data store offered by AWS, supporting Redis
and Memcached.
Features:
o High availability with auto-scaling.
o Monitoring via AWS CloudWatch.
o Easy integration with AWS services like Lambda and RDS.
Use Cases:
o Caching to reduce database query loads.
o Gaming leaderboards.
o Real-time analytics pipelines.
6. VoltDB
Overview: An in-memory database designed for high-speed transactions and real-time
analytics.
Features:
o ACID compliance for transactional integrity.
o Supports SQL and Java for application development.
o Built-in fault tolerance and high availability.
Use Cases:
o Fraud detection.
o Telecommunication networks.
o IoT data processing.
8. Aerospike
Overview: A distributed database with a hybrid architecture that uses in-memory data for speed and SSDs for persistent storage, making it highly scalable and reliable.
Features:
o Hybrid in-memory/SSD storage for speed with persistence.
o Designed for high throughput and horizontal scalability.
Use Cases:
o Ad-tech platforms.
Chapter 3
METHODOLOGIES
3.1 Foundations of Data Systems
Reliability
Everybody has an intuitive idea of what it means for something to be reliable or unreliable.
For software, typical expectations include:
It can tolerate the user making mistakes or using the software in unexpected ways.
Its performance is good enough for the required use case, under the expected load and
data volume.
Scalability
Even if a system is working reliably today, that doesn’t mean it will necessarily work reliably
in the future. One common reason for degradation is increased load: perhaps the system has
grown from 10,000 concurrent users to 100,000 concurrent users, or from 1 million to 10
million. Perhaps it is processing much larger volumes of data than it did before. Scalability is
the term we use to describe a system’s ability to cope with increased load. Note, however, that
it is not a one-dimensional label that we can attach to a system: it is meaningless to say “X is
scalable" or "Y doesn't scale." Rather, discussing scalability means considering questions like "If the system grows in a particular way, what are our options for coping with the growth?" and "How can we add computing resources to handle the additional load?"
Describing Load
Load can be described with a few numbers which we call load parameters. The best choice of
parameters depends on the architecture of your system: it may be requests per second to a web
server, the ratio of reads to writes in a database, the number of simultaneously active users in
a chat room, the hit rate on a cache, or something else. Perhaps the average case is what
matters for you, or perhaps your bottleneck is dominated by a small number of extreme cases.
To make this idea more concrete, let’s consider Twitter as an example, using data
published in November 2012 [16]. Two of Twitter’s main operations are:
Post tweet
A user can publish a new message to their followers (4.6k requests/sec on average, over
12k requests/sec at peak).
Home timeline
A user can view tweets posted by the people they follow (300k requests/sec). Simply
handling 12,000 writes per second (the peak rate for posting tweets) would be fairly easy.
Maintainability
It is well known that the majority of the cost of software is not in its initial development, but
in its ongoing maintenance—fixing bugs, keeping its systems operational, investigating
failures, adapting it to new platforms, modifying it for new use cases, repaying technical debt,
and adding new features.
Yet, unfortunately, many people working on software systems dislike maintenance of so-
called legacy systems—perhaps it involves fixing other people’s mistakes, or working with
platforms that are now outdated, or systems that were forced to do things they were never
intended for. Every legacy system is unpleasant in its own way, and so it is difficult to give
general recommendations for dealing with them. However, we can and should design software
in such a way that it will hopefully minimize pain during maintenance, and thus avoid
creating legacy software ourselves. To this end, we will pay particular attention to three
design principles for software systems:
Operability: Make it easy for operations teams to keep the system running smoothly.
Simplicity: Make it easy for new engineers to understand the system, by removing as much complexity as possible.
Evolvability: Make it easy for engineers to adapt the system to unanticipated use cases as requirements change.
3.2 Design and Architecture
Core Components:
o Data Engine: Handles query execution and transaction management with
focus on low-latency operations.
o Memory Storage: Acts as the primary data store for active datasets, avoiding
disk I/O bottlenecks.
o Persistence Layer: Ensures durability through mechanisms like snapshots, append-only files, and write-ahead logging (WAL); a minimal WAL sketch is shown after this list.
o Indexing Structures: Advanced in-memory indexes, such as hash tables, AVL
trees, or B-trees, enable faster lookups and updates.
Key Features of the Architecture:
o Data is fully or partially stored in RAM.
o Concurrent access is managed through efficient locking mechanisms or multi-
version concurrency control (MVCC).
o Redundant copies or replicas are maintained for fault tolerance and scalability.
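The write-ahead logging mentioned for the persistence layer can be illustrated with the following minimal sketch (a toy under stated assumptions, not the project's persistence code): every SET is appended to a log file and flushed before the in-memory map is updated, and on restart the log is replayed to rebuild the state. The file name and record format are made up for the example.

package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

type DB struct {
	mem map[string]string
	log *os.File
}

// Open replays the existing log (if any) to rebuild the in-memory state.
func Open(path string) (*DB, error) {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_RDWR|os.O_APPEND, 0o644)
	if err != nil {
		return nil, err
	}
	db := &DB{mem: make(map[string]string), log: f}
	sc := bufio.NewScanner(f)
	for sc.Scan() {
		// Each record is "SET <key> <value>"; keys or values containing
		// spaces are not handled in this sketch.
		parts := strings.SplitN(sc.Text(), " ", 3)
		if len(parts) == 3 && parts[0] == "SET" {
			db.mem[parts[1]] = parts[2]
		}
	}
	return db, sc.Err()
}

// Set logs the write durably first, then applies it in memory.
func (db *DB) Set(key, value string) error {
	if _, err := fmt.Fprintf(db.log, "SET %s %s\n", key, value); err != nil {
		return err
	}
	if err := db.log.Sync(); err != nil { // force the record to disk
		return err
	}
	db.mem[key] = value
	return nil
}

func main() {
	db, err := Open("wal.log")
	if err != nil {
		panic(err)
	}
	defer db.log.Close()
	if err := db.Set("counter", "1"); err != nil {
		panic(err)
	}
	fmt.Println(db.mem["counter"]) // state survives a restart via log replay
}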
Hash Indexes
Key-value stores are quite similar to the dictionary type that you can find in most
programming languages, and which is usually implemented as a hash map (hash table).
Let’s say our data storage consists only of appending to a file, as in the preceding
example. Then the simplest possible indexing strategy is this: keep an in-memory hash
map where every key is mapped to a byte offset in the data file—the location at which the
value can be found, as illustrated in Figure 3-1. Whenever you append a new key-value
pair to the file, you also update the hash map to reflect the offset of the data you just wrote
(this works both for inserting new keys and for updating existing keys).
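A compact sketch of this strategy is shown below (illustrative only; the record format and the 4 KB read size are assumptions): the index is an ordinary Go map from key to the byte offset of the most recent record in an append-only data file, and a read jumps straight to that offset instead of scanning the file.

package main

import (
	"fmt"
	"io"
	"os"
	"strings"
)

type HashIndexStore struct {
	file  *os.File
	index map[string]int64 // key -> byte offset of its latest record
}

func OpenStore(path string) (*HashIndexStore, error) {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_RDWR, 0o644)
	if err != nil {
		return nil, err
	}
	return &HashIndexStore{file: f, index: make(map[string]int64)}, nil
}

// Set appends a "key value" record and remembers where it starts.
func (s *HashIndexStore) Set(key, value string) error {
	off, err := s.file.Seek(0, io.SeekEnd)
	if err != nil {
		return err
	}
	if _, err := fmt.Fprintf(s.file, "%s %s\n", key, value); err != nil {
		return err
	}
	s.index[key] = off // a newer offset replaces the older one for the same key
	return nil
}

// Get uses the in-memory index to read only the relevant record.
func (s *HashIndexStore) Get(key string) (string, bool) {
	off, ok := s.index[key]
	if !ok {
		return "", false
	}
	buf := make([]byte, 4096) // assume one record fits in 4 KB for this sketch
	n, err := s.file.ReadAt(buf, off)
	if err != nil && err != io.EOF {
		return "", false
	}
	line := string(buf[:n])
	if i := strings.IndexByte(line, '\n'); i >= 0 {
		line = line[:i]
	}
	return strings.TrimPrefix(line, key+" "), true
}

func main() {
	s, err := OpenStore("data.log")
	if err != nil {
		panic(err)
	}
	defer s.file.Close()
	_ = s.Set("cat-video", "42")
	_ = s.Set("cat-video", "43")    // the index now points at the newer record
	fmt.Println(s.Get("cat-video")) // 43 true
}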
Fig 3.2 Compaction of a key-value update log (counting the number of times each cat video
was played), retaining only the most recent value for each key.
Segments are never modified after they have been written, so the merged segment is written
to a new file. The merging and compaction of frozen segments can be done in a background
thread, and while it is going on, we can still continue to serve read and write requests as
normal, using the old segment files. After the merging process is complete, we switch read requests to using the new merged segment instead of the old segments, and then the old segment files can simply be deleted.
SSTables and LSM-Trees
Each log-structured storage segment is a sequence of key-value pairs. These pairs appear
in the order that they were written, and values later in the log take precedence over values
for the same key earlier in the log. Apart from that, the order of key-value pairs in the file does not matter.
Now we can make a simple change to the format of our segment files: we require that the
sequence of key-value pairs is sorted by key. At first glance, that requirement seems to
break our ability to use sequential writes, but we’ll get to that in a moment.
In order to find a particular key in the file, you no longer need to keep an index of all the
keys in memory. See Figure 3-5 for an example: say you’re looking for the key
handiwork, but you don’t know the exact offset of that key in the segment file. However,
you do know the offsets for the keys handbag and handsome, and because of the sorting
you know that handiwork must appear between those two. This means you can jump to the
offset for handbag and scan from there until you find handiwork (or not, if the key is not
present in the file).
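The sparse-index lookup described above can be sketched as follows (the keys and offsets are placeholders): a sorted in-memory index holds offsets for only some keys, and a lookup finds the closest indexed key at or before the target, which gives the offset from which to scan.

package main

import (
	"fmt"
	"sort"
)

type indexEntry struct {
	key    string
	offset int64
}

// scanStart returns the file offset from which to start scanning for target.
func scanStart(sparse []indexEntry, target string) int64 {
	// Find the first indexed key that sorts after the target...
	i := sort.Search(len(sparse), func(i int) bool {
		return sparse[i].key > target
	})
	if i == 0 {
		return 0 // target sorts before every indexed key: scan from the start
	}
	return sparse[i-1].offset // ...and scan from the entry just before it
}

func main() {
	index := []indexEntry{
		{"handbag", 91},
		{"handsome", 2790},
		{"hangout", 5741},
	}
	fmt.Println(scanStart(index, "handiwork")) // 91: scan forward from "handbag"
}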
B-Trees
Like SSTables, B-trees keep key-value pairs sorted by key, which allows efficient key-value lookups and range queries. But that's where the similarity ends: B-trees have a very
different design philosophy. The log-structured indexes we saw earlier break the database
down into variable-size segments, typically several megabytes or more in size, and always
write a segment sequentially. By contrast, B-trees break the database down into fixed-size
blocks or pages, traditionally 4 KB in size (sometimes bigger), and read or write one page
at a time. This design corresponds more closely to the underlying hardware, as disks are
also arranged in fixed-size blocks.
If you want to update the value for an existing key in a B-tree, you search for the leaf page
containing that key, change the value in that page, and write the page back to disk (any
references to that page remain valid). If you want to add a new key, you need to find the
page whose range encompasses the new key and add it to that page. If there isn’t enough
free space in the page to accommodate the new key, it is split into two half-full pages, and
the parent page is updated to account for the new subdivision of key ranges.
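The lookup path through such a tree can be sketched in a few lines (an in-memory, simplified illustration rather than a real page-based B-tree): each node holds sorted keys, interior nodes hold one more child than keys, and the search descends into the child whose key range covers the target.

package main

import (
	"fmt"
	"sort"
)

type node struct {
	keys     []string
	values   []string // set only on leaf nodes in this sketch
	children []*node  // nil for leaf nodes
}

func search(n *node, key string) (string, bool) {
	for n != nil {
		i := sort.SearchStrings(n.keys, key)
		if n.children == nil { // leaf page: either the key is here or it is absent
			if i < len(n.keys) && n.keys[i] == key {
				return n.values[i], true
			}
			return "", false
		}
		if i < len(n.keys) && n.keys[i] == key {
			i++ // keys equal to a separator live in the right subtree here
		}
		n = n.children[i]
	}
	return "", false
}

func main() {
	leaf1 := &node{keys: []string{"apple", "banana"}, values: []string{"1", "2"}}
	leaf2 := &node{keys: []string{"cherry", "date"}, values: []string{"3", "4"}}
	root := &node{keys: []string{"cherry"}, children: []*node{leaf1, leaf2}}
	fmt.Println(search(root, "date")) // 4 true
}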
3.3 Data Distribution Techniques
REPLICATION
Replication means keeping a copy of the same data on multiple machines that are connected
via a network.
To keep data geographically close to your users (and thus reduce latency)
To allow the system to continue working even if some of its parts have failed (and thus
increase availability)
To scale out the number of machines that can serve read queries (and thus increase
read throughput)
Each node that stores a copy of the database is called a replica. With multiple replicas, a
question inevitably arises: how do we ensure that all the data ends up on all the replicas?
Every write to the database needs to be processed by every replica; otherwise, the replicas
would no longer contain the same data. The most common solution for this is called leader-based replication (also known as active/passive or master–slave replication), illustrated in Fig 3.6 below. It works as follows:
1. One of the replicas is designated the leader (also known as master or primary). When
clients want to write to the database, they must send their requests to the leader, which
first writes the new data to its local storage.
2. The other replicas are known as followers (read replicas, slaves, secondaries, or hot standbys). Whenever the leader writes new data to its local storage, it also sends the data change to all of its followers as part of a replication log or change stream. Each follower takes the log from the leader and updates its local copy of the database accordingly, by applying all writes in the same order as they were processed on the leader.
3. When a client wants to read from the database, it can query either the leader or any of the followers. However, writes are only accepted on the leader (the followers are read-only from the client's point of view).
Fig 3.6 Leader-based (master–slave) replication.
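The ordering guarantee in step 2 is the crux of leader-based replication. The sketch below (conceptual only, not the project's PSYNC/REPLCONF implementation) uses a Go channel as a stand-in for the replication log: the leader applies each write locally and ships it to a follower, which applies the writes in exactly the order they arrive. After the log is drained, leader and follower hold identical data, which is what allows followers to serve reads.

package main

import (
	"fmt"
	"sync"
)

type write struct{ key, value string }

type replica struct {
	mu   sync.Mutex
	data map[string]string
}

func (r *replica) apply(w write) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.data[w.key] = w.value
}

func main() {
	leader := &replica{data: map[string]string{}}
	follower := &replica{data: map[string]string{}}

	log := make(chan write, 16) // stands in for the replication log / change stream
	var wg sync.WaitGroup
	wg.Add(1)
	go func() { // the follower tails the log and applies writes in order
		defer wg.Done()
		for w := range log {
			follower.apply(w)
		}
	}()

	for _, w := range []write{{"x", "1"}, {"x", "2"}, {"y", "3"}} {
		leader.apply(w) // writes are accepted only on the leader...
		log <- w        // ...then shipped to every follower
	}
	close(log)
	wg.Wait()
	fmt.Println(follower.data) // map[x:2 y:3], the same state as the leader
}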
PARTITIONING
Partitioning is usually combined with replication so that copies of each partition are stored on multiple nodes. This means that, even though each record belongs to exactly one partition, it may still be stored on several different nodes for fault tolerance. A node may store more than one partition. If a leader–follower replication model is used, the combination of partitioning and replication can look like Fig 3.7 below. Each partition's leader is assigned to one node, and its followers are assigned to other nodes. Each node may be the leader for some partitions and a follower for other partitions.
Fig 3.7 Combining replication and partitioning: each node acts as leader for some
partitions and follower for other partitions.
Our goal with partitioning is to spread the data and the query load evenly across nodes. If
every node takes a fair share, then—in theory—10 nodes should be able to handle 10
times as much data and 10 times the read and write throughput of a single node (ignoring
replication for now). If the partitioning is unfair, so that some partitions have more data or
queries than others, we call it skewed. The presence of skew makes partitioning much less
effective. In an extreme case, all the load could end up on one partition, so 9 out of 10 nodes are idle and your bottleneck is the single busy node. A partition with disproportionately high load is called a hot spot.
• Key range partitioning, where keys are sorted, and a partition owns all the keys from some
minimum up to some maximum. Sorting has the advantage that efficient range queries are
possible, but there is a risk of hot spots if the application often accesses keys that are close
together in the sorted order. In this approach, partitions are typically rebalanced
dynamically by splitting the range into two subranges when a partition gets too big.
• Hash partitioning, where a hash function is applied to each key, and a partition owns a
range of hashes. This method destroys the ordering of keys, making range queries
inefficient, but may distribute load more evenly.
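A minimal hash-partitioning sketch (illustrative; FNV-1a and four partitions are arbitrary choices, not the project's scheme) shows how the owning partition for a key is derived:

package main

import (
	"fmt"
	"hash/fnv"
)

// partitionFor hashes the key and maps it onto one of numPartitions buckets.
func partitionFor(key string, numPartitions int) int {
	h := fnv.New32a()
	h.Write([]byte(key))
	return int(h.Sum32()) % numPartitions
}

func main() {
	for _, k := range []string{"user:1", "user:2", "user:3", "order:99"} {
		fmt.Printf("%-9s -> partition %d\n", k, partitionFor(k, 4))
	}
}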
TRANSACTIONS
A transaction is a way for an application to group several reads and writes together into a
logical unit. Conceptually, all the reads and writes in a transaction are executed as one
operation: either the entire transaction succeeds (commit) or it fails (abort, rollback). If it fails,
the application can safely retry. With transactions, error handling becomes much simpler for
an application, because it doesn’t need to worry about partial failure—i.e., the case where
some operations succeed and some fail (for whatever reason).
1. Dirty reads
One client reads another client’s writes before they have been committed. The read
committed isolation level and stronger levels prevent dirty reads.
2. Dirty writes
One client overwrites data that another client has written, but not yet committed. Almost
all transaction implementations prevent dirty writes.
3. Read skew (non-repeatable reads)
A client sees different parts of the database at different points in time. This issue is most
commonly prevented with snapshot isolation, which allows a transaction to read from a
consistent snapshot at one point in time. It is usually implemented with multi-version
concurrency control (MVCC).
4. Lost updates
Two clients concurrently perform a read-modify-write cycle. One overwrites the other’s
write without incorporating its changes, so data is lost. Some implementations of snapshot
isolation prevent this anomaly automatically, while others require a manual lock (SELECT
FOR UPDATE).
5. Write skew
A transaction reads something, makes a decision based on the value it saw, and writes the
decision to the database. However, by the time the write is made, the premise of the
decision is no longer true. Only serializable isolation prevents this anomaly.
6. Phantom reads
A transaction reads objects that match some search condition. Another client makes a
write that affects the results of that search. Snapshot isolation prevents straightforward
phantom reads, but phantoms in the context of write skew require special treatment, such
as index-range locks.
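As a small illustration of the lost-update anomaly (a generic sketch, not tied to any particular database), the Go program below runs two concurrent read-modify-write cycles on the same value; holding a lock across the whole cycle, analogous to SELECT FOR UPDATE, serializes them so that neither increment is lost.

package main

import (
	"fmt"
	"sync"
)

func main() {
	var (
		mu      sync.Mutex
		balance = 100
		wg      sync.WaitGroup
	)
	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			mu.Lock() // hold the lock across the whole read-modify-write cycle
			defer mu.Unlock()
			current := balance // read
			current += 10      // modify
			balance = current  // write
		}()
	}
	wg.Wait()
	fmt.Println(balance) // 120; without the lock one increment could be lost
}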
Chapter 4
RESULTS & DISCUSSION
To start the server, run the provided script:
./start_server.sh
This script will compile the Go code and start the Redis server.
Supported Commands
Basic Commands
- `KEYS <pattern>`: Returns all keys matching a pattern.
Stream Commands
Transaction Commands
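The command lists above are truncated in this copy. As an indicative example only (the command names are taken from the handlers shown later in this chapter; reply formatting follows Redis conventions and may differ slightly), a session against the server could look like this:

$ redis-cli -p 6379
> SET name "in-memory-store" px 60000
OK
> GET name
"in-memory-store"
> KEYS *
1) "name"
> TYPE name
string
> XADD weather 1-1 temperature 21
"1-1"
> XRANGE weather - +
1) 1) "1-1"
   2) 1) "temperature"
      2) "21"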
1.app.go
Description:
This file is the entry point of the application. It initializes the server, sets up
routes or commands, and starts the application.
Key Responsibilities:
o Managing application lifecycle (start, stop, etc.).
Example Features:
2.handler.go
Description:
Contains the logic to handle user commands or API requests. Handlers act as
intermediaries, parsing user inputs and invoking the appropriate database operations.
Key Responsibilities:
package main
import (
"bytes"
"fmt"
"io"
"math"
"strconv"
"strings"
"time"
radix "server/radix"
)
// ----------------------------------------------------------------------------
// General commands -------------------------------------------------------------
func commandFunc() *RESP {
return &RESP{Type: NULL, Value: "Command"}
}
// ----------------------------------------------------------------------------
// Header section
header := make([]byte, 9)
_, err := io.ReadFull(data, header)
if err != nil {
return ErrResp("Error reading RDB header")
}
if string(header[:5]) != "REDIS" {
return ErrResp("Invalid RDB file")
}
// version := string(header[5:])
// if version != "0007" {
// return ErrResp("Invalid RDB version")
// }
// Metadata section
for {
fa, err := data.ReadByte()
if err != nil {
return ErrResp("Error reading metadata section")
}
if fa != 0xfa {
data.UnreadByte()
break
}
// Metadata Key
_, err = decodeString(data)
if err != nil {
return ErrResp("Error reading metadata section")
}
// Metadata Value
_, err = decodeString(data)
if err != nil {
return ErrResp("Error reading metadata section")
}
}
for {
byt, _ := data.Peek(1)
if byt[0] == 0xff {
break
}
// Database section - 0xfe
data.ReadByte()
// This byte is the database index
// TODO - Implement support for multiple databases
decodeSize(data)
// Expiry size
_, err = decodeSize(data)
if err != nil {
return ErrResp("Error reading database section")
}
// Key
key, err := decodeString(data)
if err != nil {
return ErrResp("Error reading key")
}
// Value
value, err := decodeString(data)
if err != nil {
return ErrResp("Error reading value")
}
s.SETsMu.Lock()
s.SETs[string(key)] = string(value)
if expiryTime > 0 {
s.EXPs[string(key)] = expiryTime
fmt.Println("Key: ", key, "Value: ", value, "Expiry: ",
expiryTime)
}
s.SETsMu.Unlock()
}
next, _ := data.Peek(1)
if next[0] == 0xff {
break
}
}
return OkResp()
}
pattern := args[0].Value
keys := []string{}
if pattern == "*" {
s.SETsMu.Lock()
for k := range s.SETs {
keys = append(keys, k)
}
s.SETsMu.Unlock()
} else {
s.SETsMu.Lock()
for k := range s.SETs {
if strings.Contains(k, pattern) {
keys = append(keys, k)
}
}
s.SETsMu.Unlock()
}
return &RESP{
Type: ARRAY,
Values: ToRespArray(keys),
}
}
func (s *Server) set(args []*RESP) *RESP {
if !(len(args) == 2 || len(args) == 4) {
return &RESP{Type: ERROR, Value: "ERR wrong number of arguments for
'set' command"}
}
s.NeedAcks = true
key, value := args[0].Value, args[1].Value
var length int
if len(args) > 2 {
if strings.ToLower(args[2].Value) != "px" {
return &RESP{Type: ERROR, Value: "ERR syntax error"}
}
l, err := strconv.Atoi(args[3].Value)
if err != nil {
return &RESP{Type: ERROR, Value: "ERR value is not an integer
or out of range"}
}
length = l
}
s.SETsMu.Lock()
s.SETs[key] = value
if length > 0 {
// Set expiry time in milliseconds
s.EXPs[key] = time.Now().Add(time.Duration(length) *
time.Millisecond).UnixMilli()
}
s.SETsMu.Unlock()
return OkResp()
}
key := args[0].Value
s.SETsMu.Lock()
value, ok := s.SETs[key]
if exp, ok := s.EXPs[key]; ok {
expTime := time.UnixMilli(exp)
if time.Now().After(expTime) {
delete(s.SETs, key)
delete(s.EXPs, key)
s.SETsMu.Unlock()
return NullResp()
}
}
s.SETsMu.Unlock()
if !ok {
return NullResp()
}
streamKey := args[0].Value
stream, ok := s.XADDs[streamKey]
if !ok {
s.XADDsMu.Lock()
stream = radix.NewRadix()
s.XADDs[streamKey] = stream
stream.Insert("0-0", &StreamTop{Time: 0, Seq: 0})
s.XADDsMu.Unlock()
}
id := args[1].Value
time, seq, err := validateEntryID(stream, id)
if err != nil {
return ErrResp(err.Error())
}
entries := []*StreamKV{}
for i := 2; i < len(args); i += 2 {
entries = append(entries, &StreamKV{Key: args[i].Value, Value:
args[i+1].Value})
}
if s.XREADsBlock {
s.XADDsCh <- false
}
streamKey := args[0].Value
stream, ok := s.XADDs[streamKey]
if !ok {
return ErrResp("ERR stream not found")
}
// st = starttime, ss = startseq
st, ss, err := splitEntryId(args[1].Value)
if err != nil {
return ErrResp(err.Error())
}
if st == math.MinInt64 {
key, _, _ := stream.GetFirst()
st, ss, _ = splitEntryId(key)
}
// et = endtime, es = endseq
et, es, err := splitEntryId(args[2].Value)
if err != nil {
return ErrResp(err.Error())
}
if et == math.MaxInt64 {
key, _, _ := stream.GetLast()
et, es, _ = splitEntryId(key)
}
entries := []*RESP{}
blockTime := -1
if strings.ToUpper(args[0].Value) == "BLOCK" {
t, err := strconv.Atoi(args[1].Value)
if err != nil {
return ErrResp("ERR block time is not an integer or out of
range")
}
blockTime = t
args = args[2:]
}
if blockTime > 0 {
time.Sleep(time.Duration(blockTime) * time.Millisecond)
} else if blockTime == 0 {
s.XREADsBlock = true
s.XREADsBlock = <-s.XADDsCh
}
if args[0].Value != "streams" {
return &RESP{Type: ERROR, Value: "ERR can only read streams at the
moment"}
}
args = args[1:]
if len(args)%2 != 0 {
return ErrResp("Err wrong number of arguments for 'xread' command")
}
readLen := len(args) / 2
streamLst := []*RESP{}
start := args[i+readLen].Value
if start == "$" {
start, _, _ = stream.GetLast()
} else {
start, _, ok = stream.GetNext(start)
if !ok {
return NullResp()
}
}
// st = starttime, ss = startseq
st, ss, err := splitEntryId(start)
if err != nil {
return ErrResp(err.Error())
}
// et == endtime, es = endseq
last, _, _ := stream.GetLast()
et, es, _ := splitEntryId(last)
entryLst := []*RESP{BulkString(streamKey)}
numReplicas, _ := strconv.Atoi(args[0].Value)
timeout, _ := strconv.Atoi(args[1].Value)
s.RedirectRead = true
go func() {
for _, c := range s.Conns {
if c.Type != REPLICA {
continue
}
Write(c.Writer, getAck)
}
}()
for {
select {
case <-timeoutChan:
return &RESP{
Type: INTEGER,
Value: strconv.Itoa(acks),
}
default:
for _, c := range s.Conns {
if c.Type != REPLICA {
continue
}
select {
case parsedResp := <-c.Chan:
fmt.Println("Received ACK from replica")
_, args := parsedResp.getCmdAndArgs()
result := s.replConfig(args, c)
strconv.Atoi(result.Value)
// replOffset, _ := strconv.Atoi(result.Value)
// if replOffset == s.MasterReplOffset {
acks++
if acks == numReplicas {
return &RESP{
Type: INTEGER,
Value: strconv.Itoa(acks),
}
}
// }
case <-timeoutChan:
return &RESP{
Type: INTEGER,
Value: strconv.Itoa(acks),
}
default:
continue
}
}
}
}
}
for !q.IsEmpty() {
resp, _ := q.Dequeue()
results := s.Handler(resp.(*RESP), conn)
response.Values = append(response.Values, results...)
}
Write(conn.Writer, response)
conn.RedirectRead = false
return OkResp()
}
key := args[0].Value
s.SETsMu.Lock()
_, ok := s.SETs[key]
s.SETsMu.Unlock()
if ok {
return SimpleString("string")
}
s.XADDsMu.Lock()
_, ok = s.XADDs[key]
s.XADDsMu.Unlock()
if ok {
return SimpleString("stream")
}
return SimpleString("none")
}
// ----------------------------------------------------------------------------
server.go
Description:
Implements the core server functionality, managing client connections, request
routing, and execution.
Key Responsibilities:
package main
import (
"errors"
"flag"
"fmt"
"net"
"os"
"strings"
"sync"
"time"
"math/rand/v2"
queue "server/queue"
radix "server/radix"
)
// ----------------------------------------------------------------------------
func initRandom() {
// rand.Seed(uint64(time.Now().UnixNano()))
rand.New(rand.NewPCG(uint64(time.Now().UnixNano()), 0))
}
s.decodeRDB(NewBuffer(file))
}
// ----------------------------------------------------------------------------
if s.Role == MASTER {
go s.handleClientConnAsMaster(conn)
} else {
go s.handleClientConnAsReplica(conn)
}
}
resp := NewBuffer(conn)
writer := NewWriter(conn)
connRW := &ConnRW{MASTER, conn, resp, writer, nil, false, false,
queue.NewQueue()}
// Stage 1
Write(writer, PingResp())
parsedResp, _, err := resp.Read()
if err != nil {
return err
}
if !parsedResp.IsPong() {
return errors.New("master server did not respond with PONG")
}
// Stage 2
Write(writer, ReplconfResp(1, s.Port))
parsedResp, _, err = resp.Read()
if err != nil {
return err
}
if !parsedResp.IsOkay() {
return errors.New("master server did not respond with OK")
}
// Stage 3
Write(writer, Psync(0, 0))
rdb, err := resp.ReadFullResync()
if err != nil {
return err
}
s.Handler(rdb, connRW)
s.MasterReplOffset = 0
go s.handleMasterConnAsReplica(connRW)
return nil
}
// ----------------------------------------------------------------------------
if s.RedirectRead || connRW.RedirectRead {
fmt.Println("Handling client connection on redirect",
parsedResp)
connRW.Chan <- parsedResp
} else {
fmt.Println("Handling client connection on main loop",
parsedResp)
results := s.Handler(parsedResp, connRW)
// ----------------------------------------------------------------------------
flag.Parse()
if repl != "" {
config.IsReplica = true
ap := strings.Split(repl, " ")
if len(ap) != 2 {
return nil, errors.New("wrong argument count for --
replicaof")
}
config.MasterHost, config.MasterPort = ap[0], ap[1]
}
return config, nil
}
func main() {
config, err := parseFlags()
if err != nil {
fmt.Println(err)
os.Exit(1)
}
server.serverListen()
}
Chapter 5
CONCLUSION
At the core of in-memory databases lies the promise of unparalleled speed and efficiency. By
storing data primarily in RAM, these systems eliminate the latency associated with disk I/O,
making them indispensable for applications requiring real-time data processing. Whether it’s
financial trading, fraud detection, IoT data streams, or gaming, the low-latency nature of in-
memory databases ensures that critical operations can be executed with minimal delay. This
characteristic has positioned them as essential tools in industries where time-sensitive
decisions can have significant implications.
Moreover, the use of advanced indexing techniques, such as hash tables and AVL trees,
further enhances performance by enabling rapid data retrieval and updates. The integration of
compression algorithms ensures that memory resources are used efficiently, making these
systems viable for datasets of substantial size.
One of the challenges inherent to in-memory databases is their reliance on volatile memory,
which can lead to data loss in the event of power failures or crashes. However, modern
systems have introduced innovative solutions to mitigate this risk. Techniques such as write-
ahead logging (WAL), append-only files (AOF), and snapshot-based persistence ensure that
critical data can be recovered even after a failure. These advancements have struck a balance
between performance and durability, enabling in-memory databases to cater to both volatile
and non-volatile use cases.
This dual focus on speed and reliability has expanded their adoption in areas such as hybrid
transactional/analytical processing (HTAP), where both transactional consistency and
analytical insights are required in near-real time.
The versatility of in-memory databases extends beyond traditional data management tasks.
They are instrumental in powering high-performance caching solutions, enabling faster web
application responses, and supporting distributed computing frameworks. Their ability to
process vast amounts of data in parallel makes them ideal for big data analytics, where
insights must be derived from rapidly changing datasets.
Examples like Redis, SAP HANA, and Aerospike illustrate the diversity of use cases, ranging
from simple key-value storage to complex analytics. These systems demonstrate how in-
memory databases can adapt to varied requirements, from small-scale applications to
enterprise-level deployments.
Despite their advantages, in-memory databases are not without challenges. The cost of
maintaining large amounts of RAM can be prohibitive, particularly for startups or small
organizations. Furthermore, the reliance on memory imposes limitations on data storage
capacity, requiring careful consideration of sharding and replication strategies for scalability.
Another concern is the evolving nature of data compliance and privacy regulations. In-
memory databases must address challenges related to secure data handling, especially when
used in sensitive industries such as healthcare and finance.
REFERENCES
1. Depoortere, R., & Van Landeghem, H. (2022). A survey of in-memory databases. Journal of Database Management, 34(2), 45-60.
2. Bhuiyan, S., Zheludkov, M., & Setrakyan, D. High Performance In-Memory Computing with Apache Ignite.
3. da Silva, M. D., & Tavares, H. L. Redis Essentials.
4. Kleppmann, M. Designing Data-Intensive Applications.
5. https://aws.amazon.com/nosql/in-memory/
6. https://www.couchbase.com/resources/concepts/in-memory-database/
7. https://www.dragonflydb.io/guides/in-memory-databases
8. https://www.mongodb.com/resources/basics/databases/in-memory-database
9. https://en.wikipedia.org/wiki/In-memory_database
STUDENTS PROFILE
Email: [email protected]
Contact: 7049005595
Email: [email protected]
Contact: 8299115905
Email: [email protected]
Contact: 8813982207