Cassandra Notes

The document defines data as raw, unprocessed facts and information as processed data with context, highlighting their differences in structure, processing, and utility. It discusses RDBMS, its limitations in big data applications, and explains ACID properties in relation to Cassandra's behavior, noting its eventual consistency model. Additionally, it covers the evolution of Apache Cassandra, comparing it with traditional RDBMS in scalability, performance, and data structure, while also explaining the CAP theorem and how Cassandra addresses its trade-offs.


UNIT – 1

(10-Marks)

1. Define Data and Information. Explain how they differ with examples.

Definition:

⦁ Data: Raw, unprocessed facts or observations collected from various sources, lacking
context or meaning until analyzed. It can be numbers, text, images, or any other form
(e.g., sensor readings from an IoT device).

⦁ Information: Processed data that has been organized, analyzed, and given context to
provide meaning and support decision-making. It is the output of interpreting data (e.g.,
a report summarizing sensor trends).

Differences:

⦁ Structure: Data is unorganized (e.g., a list of temperatures: 25°C, 27°C, 23°C), while
information is structured and meaningful (e.g., "Average temperature today is 25°C,
indicating stable weather").

⦁ Processing: Data requires processing (e.g., aggregation, filtering), whereas information is the result of that processing.

⦁ Utility: Data is passive and needs interpretation, while information is actionable (e.g.,
data from a smart grid shows power usage; information predicts a peak load at 6 PM).

⦁ Example in Context: In an IoT healthcare system, raw data might be heart rate readings
(e.g., 72, 75, 80 bpm) collected every minute. Information emerges when these are
analyzed to show "Patient’s heart rate averaged 75 bpm with a spike at 80 bpm,
suggesting possible stress—alert doctor."

2. What is RDBMS? List its limitations in the context of big data applications.

Definition:
A Relational Database Management System (RDBMS) is a software system that manages data
stored in a structured format using tables (rows and columns) with predefined schemas. It uses
SQL (Structured Query Language) for querying and ensures data integrity through relationships
(e.g., primary and foreign keys). Examples include MySQL, Oracle, and PostgreSQL.

Limitations in the Context of Big Data Applications:

⦁ Scalability Issues: RDBMS is designed for vertical scaling (adding more power to a single
server), which is costly and limited compared to the horizontal scaling (adding more
servers) required for big data’s massive volume (e.g., petabytes of IoT sensor data).

⦁ Schema Rigidity: RDBMS requires a fixed schema, making it inflexible for handling
unstructured or semi-structured big data (e.g., JSON logs from smart devices) that
evolves rapidly.

⦁ Performance with High Velocity: The row-based storage and join operations in RDBMS
struggle with real-time processing of high-velocity data streams (e.g., millions of
transactions per second in a smart city).

⦁ Limited Parallel Processing: RDBMS lacks native support for distributed computing,
unlike big data frameworks (e.g., Hadoop), leading to bottlenecks when analyzing large
datasets across multiple nodes.

⦁ Cost and Complexity: Managing large-scale RDBMS instances (e.g., Oracle RAC) involves
high licensing costs and complex administration, whereas big data solutions like NoSQL
databases are more cost-effective for scale.

⦁ Example: In a big data IoT application tracking global shipping, an RDBMS might fail to
handle 1 TB of unstructured GPS and sensor data daily, whereas a NoSQL solution like
Cassandra could scale horizontally and process it efficiently.

3. Explain the ACID properties with respect to Cassandra's behavior.

Definition of ACID Properties:


ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties that ensure reliable
database transactions:

⦁ Atomicity: Ensures that a transaction is fully completed or fully rolled back; no partial
updates occur.

⦁ Consistency: Guarantees that a transaction brings the database from one valid state to
another, adhering to defined rules (e.g., constraints).

⦁ Isolation: Ensures that transactions are executed independently, preventing interference from concurrent operations.

⦁ Durability: Ensures that once a transaction is committed, it is permanently saved, even in case of system failure.

Cassandra's Behavior with Respect to ACID:

Cassandra is a distributed NoSQL database designed for high availability and scalability,
particularly for big data and IoT applications. However, it does not fully adhere to ACID
properties due to its eventual consistency model and distributed architecture. Here's how it
behaves:

⦁ Atomicity:

⦁ Cassandra ensures atomicity at the row level within a single partition. If a write
operation updates multiple columns in one row, it either succeeds or fails
entirely. However, atomicity across multiple rows or partitions is not
guaranteed unless explicitly managed (e.g., using lightweight transactions with
IF NOT EXISTS).

⦁ Consistency:

⦁ Cassandra follows an eventual consistency model rather than strict consistency. It uses a tunable consistency level (e.g., ONE, QUORUM, ALL) where data is consistent only after replication across nodes is complete. For example, with QUORUM, a majority of nodes must acknowledge a write for it to be considered consistent (see the cqlsh sketch after this list).

⦁ Isolation:

⦁ Cassandra provides isolation through its write path, where updates are written
to a commit log and memtable, ensuring no interference during a single
operation. However, it does not support full transaction isolation levels (e.g.,
Serializable) like RDBMS. Concurrent writes to the same row can lead to "last
write wins" conflicts unless resolved with timestamps or lightweight
transactions.

⦁ Durability:

⦁ Cassandra ensures durability by writing data to a commit log on disk before acknowledging a write, and data is later flushed to SSTables. Replication across multiple nodes (e.g., with a replication factor of 3) further guarantees data persistence even if a node fails.
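
To make the tunable-consistency and lightweight-transaction behavior described above concrete, here is a minimal cqlsh sketch; the keyspace and table names are hypothetical assumptions and the statements are illustrative only:

-- Illustrative keyspace and table (names are assumptions, not from these notes)
CREATE KEYSPACE IF NOT EXISTS smartgrid
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

CREATE TABLE IF NOT EXISTS smartgrid.sensor_owners (
    sensor_id text PRIMARY KEY,
    owner text
);

-- cqlsh session setting: require a majority of replicas for the requests that follow
CONSISTENCY QUORUM;

-- Lightweight transaction: a Paxos-backed conditional insert, applied only if the
-- row does not already exist; the response carries an [applied] flag
INSERT INTO smartgrid.sensor_owners (sensor_id, owner)
VALUES ('S001', 'grid-team')
IF NOT EXISTS;

-- Relax to a single replica for non-critical, latency-sensitive reads
CONSISTENCY ONE;
SELECT owner FROM smartgrid.sensor_owners WHERE sensor_id = 'S001';

With a replication factor of 3, QUORUM reads paired with QUORUM writes overlap on at least one replica, which is what provides the stronger, read-your-writes style consistency mentioned above.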

Example in Context:
In an IoT smart grid application, Cassandra stores power usage data. A write operation updating
voltage and current for a single sensor (row) is atomic and durable. However, if two nodes
update the same row concurrently with different values, the latest timestamp wins (weak
isolation), and consistency depends on the chosen level (e.g., QUORUM ensures eventual consistency across nodes).

(15-Marks questions)

1. Discuss in detail the evolution and history of Apache Cassandra. Why was it developed and
how has it grown?

Evolution and History:


Apache Cassandra, an open-source distributed NoSQL database, has evolved significantly since
its inception. Its history can be traced as follows:

⦁ Origins (2008): Cassandra was initially developed by Facebook to power its Inbox
Search feature, which required handling massive amounts of data with high availability
and fault tolerance. It was created by Avinash Lakshman (who also co-authored
Amazon’s Dynamo) and Prashant Malik, combining ideas from Amazon’s Dynamo
(decentralized design) and Google’s Bigtable (column-family data model).

⦁ Open-Source Release (2009): Facebook open-sourced Cassandra under the Apache License, contributing it to the Apache Software Foundation (ASF). This marked its transition from a proprietary tool to a community-driven project.

⦁ Apache Incubator (2009–2010): Cassandra entered the Apache Incubator program, gaining traction among developers. Version 0.6, released in 2010, introduced key features like multi-data-center replication.

⦁ Maturity (2011–2015): With the 1.0 release in 2011, Cassandra became production-ready, adding support for CQL (Cassandra Query Language), which resembled SQL and improved usability. Version 2.0 (2013) introduced lightweight transactions, while version 3.0 (2015) added materialized views and a rewritten storage engine with improved compaction.

⦁ Recent Developments (2016–Present): Version 4.0 (2021) introduced virtual tables, audit and full query logging, and Java 11 support, while version 5.0 added storage-attached indexing (SAI), reflecting its adaptation to modern big data needs. As of 2025, Cassandra continues to evolve with community contributions, supporting IoT, real-time analytics, and AI workloads.

Why It Was Developed:


Cassandra was developed to address limitations in traditional relational databases (RDBMS) for
handling large-scale, distributed data:

⦁ Scalability Needs: Facebook needed a system to scale horizontally across thousands of nodes to manage billions of inbox messages, which RDBMS struggled to achieve due to vertical scaling constraints.

⦁ High Availability: The requirement for zero downtime during server failures (e.g.,
during peak usage) led to a design inspired by Dynamo’s peer-to-peer architecture.

⦁ Write-Heavy Workloads: Inbox Search involved frequent writes (e.g., message indexing), necessitating a database optimized for write performance over complex queries.

⦁ Fault Tolerance: With data centers worldwide, Cassandra was designed to handle node
failures and network partitions, ensuring data accessibility.

Growth and Adoption:

⦁ Community and Ecosystem: The open-source model fostered a global community, with
companies like Netflix, Apple, and Uber adopting Cassandra for real-time applications.
The DataStax enterprise edition further commercialized it, adding tools like OpsCenter.

⦁ Use Cases: It grew to support diverse applications, from IoT sensor data storage to e-
commerce order processing, handling petabytes of data with high throughput.

⦁ Technological Advancements: Integration with Apache Spark, Kafka, and cloud platforms (e.g., Azure Cosmos DB's Cassandra API) expanded its ecosystem, making it a cornerstone for big data architectures.

⦁ Global Impact: By 2025, Cassandra’s adoption in smart cities (e.g., traffic data) and
healthcare (e.g., patient monitoring) reflects its growth into a robust, globally
recognized solution, with over 2,000 contributors and continuous updates.

Example:
Netflix uses Cassandra to manage 1.3 billion requests daily across multiple data centers,
showcasing its scalability and fault tolerance, a direct result of its original design goals.

2. Compare Cassandra with Traditional RDBMS in terms of scalability, performance, and data
structure.

Scalability:

⦁ Cassandra: Excels in horizontal scalability, allowing the addition of nodes to a cluster to handle increased load without downtime. Its peer-to-peer architecture distributes data across nodes using consistent hashing, making it ideal for big data (e.g., IoT sensor networks with millions of devices). For example, adding 10 nodes can linearly increase capacity for a smart grid application.

⦁ Traditional RDBMS: Relies on vertical scaling (upgrading hardware like CPU or RAM),
which is limited by physical constraints and costly. Systems like Oracle or MySQL
struggle with large-scale distributed environments, often requiring sharding or
replication with complex management.

Performance:

⦁ Cassandra: Optimized for write-heavy workloads with high throughput, leveraging an append-only log-structured merge-tree (LSM) storage engine. It achieves low-latency writes (e.g., <5ms for IoT data ingestion) and tunable consistency (e.g., QUORUM), but read performance can degrade with large datasets unless indexed properly. For instance, it handles 1 million writes per second at Netflix.

⦁ Traditional RDBMS: Excels in read-heavy, transactional workloads with strong consistency (e.g., ACID compliance), but write performance suffers under high concurrency due to locking mechanisms (e.g., MySQL might take 50ms per transaction in a busy e-commerce system). Complex joins and normalization also slow queries.

Data Structure:

⦁ Cassandra: A wide-column store NoSQL database, storing data in tables with flexible
schemas (e.g., rows can have different column sets). It uses a key-value pair approach
within column families, supporting unstructured or semi-structured data (e.g., JSON
from smart devices). This flexibility suits evolving IoT data models.

⦁ Traditional RDBMS: Uses a rigid, tabular structure with fixed schemas (e.g., rows and
columns in MySQL), enforcing relationships via foreign keys. This is efficient for
structured data (e.g., customer records) but inflexible for unstructured big data,
requiring schema changes for new data types.

Summary with Example:


In a smart city IoT system tracking traffic data, Cassandra scales to 1000 nodes to process 10 TB
daily, performs well with frequent sensor updates, and adapts to new data types (e.g., video
metadata). An RDBMS like PostgreSQL, however, might scale vertically to a single powerful
server, struggle with write-heavy traffic, and require schema updates for new sensor types,
limiting its big data applicability.

3. Explain CAP theorem with examples. How does Cassandra address the trade-offs?

Explanation of CAP Theorem:
The CAP theorem, proposed by Eric Brewer, states that a distributed system can only guarantee
two out of three properties simultaneously:

⦁ Consistency: All nodes see the same data at the same time after a write operation.

⦁ Availability: Every request receives a response, even if some nodes fail, ensuring no
downtime.

⦁ Partition Tolerance: The system continues to operate despite network partitions (e.g.,
node failures or communication breaks).

In a distributed environment, network partitions are inevitable, forcing a trade-off between consistency and availability.

Examples:

⦁ Consistency over Availability: A banking system using an RDBMS (e.g., Oracle) ensures
all accounts reflect the same balance after a transaction (consistency), but if a partition
occurs, it may reject requests until resolved (unavailable).

⦁ Availability over Consistency: A social media platform like Twitter uses eventual
consistency, allowing users to post during a partition (available), but some followers
might see updates later (inconsistent).

⦁ Partition Tolerance in Action: An IoT smart grid with Cassandra operates across
multiple regions; if a datacenter fails, it remains functional but may temporarily show
inconsistent data until synchronized.

Cassandra’s Approach to Trade-Offs:


Cassandra is designed as an AP (Availability and Partition Tolerance) system by default, but it offers tunable trade-offs to lean toward stronger consistency (CP-like behavior) when needed:

⦁ Tunable Consistency: Cassandra allows configuring consistency levels (e.g., ONE, QUORUM, ALL) per operation. For example, using QUORUM (majority of nodes) ensures stronger consistency during writes, while ONE prioritizes availability by accepting writes on a single node during a partition.

⦁ Eventual Consistency: In a partitioned scenario, Cassandra replicates data asynchronously across nodes, achieving eventual consistency once the network heals. This suits IoT applications where immediate consistency is less critical than uptime (e.g., logging sensor data).

⦁ Replication Strategy: With a replication factor (e.g., 3), Cassandra stores copies across
nodes, ensuring partition tolerance. If one node fails, others serve data, maintaining
availability.

⦁ Lightweight Transactions: For scenarios requiring stronger consistency (e.g., unique ID generation), Cassandra supports Paxos-based lightweight transactions, though at the cost of higher latency.

Example in Context:
In a distributed IoT healthcare system, Cassandra stores patient vitals across three data centers.
During a network partition, it uses a QUORUM consistency level to ensure most nodes agree on
critical updates (e.g., heart rate spikes), maintaining consistency and partition tolerance. For
non-critical data (e.g., routine logs), it switches to ONE, ensuring availability by accepting writes
on any available node, with consistency restored later. This flexibility addresses trade-offs based
on application needs.

UNIT-2

(10-Marks)

1. Explain the differences between a logical and a physical data model with examples.

Definition and Differences:

⦁ Logical Data Model: Represents the conceptual structure of data, focusing on entities,
relationships, and attributes without detailing how data is stored or implemented. It is
independent of the database management system (DBMS) and emphasizes business
requirements.

⦁ Key Characteristics: Defines what data exists (e.g., entities like "Customer" and
"Order"), their relationships (e.g., one-to-many), and attributes (e.g.,
CustomerID, OrderDate), using tools like Entity-Relationship (ER) diagrams.

⦁ Example: In an IoT smart home system, a logical model might include entities
"Sensor" and "Room," with a relationship "Sensor monitors Room," and
attributes like SensorID and RoomTemperature, abstracting the business need
to track temperature per room.

⦁ Physical Data Model: Specifies how data is stored, accessed, and managed in a specific
DBMS, including tables, columns, indexes, partitions, and storage details. It translates
the logical model into a technical implementation.

⦁ Key Characteristics: Includes physical storage details (e.g., table names, data
types, indexes), optimization strategies (e.g., partitioning), and performance
considerations (e.g., caching).

⦁ Example: For the same smart home system, a physical model in Cassandra
might define a table "SensorData" with columns (SensorID text, RoomID text,
Temperature float, Timestamp timestamp), partitioned by SensorID, and
indexed for quick timestamp-based queries.
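
A brief CQL sketch of how that physical model could be expressed; the keyspace name and replication settings are assumptions added for illustration:

CREATE KEYSPACE IF NOT EXISTS smart_home
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

-- Physical model: partitioned by sensor_id, rows ordered by timestamp for time-based queries
CREATE TABLE smart_home.sensor_data (
    sensor_id text,
    room_id text,
    temperature float,
    timestamp timestamp,
    PRIMARY KEY (sensor_id, timestamp)
) WITH CLUSTERING ORDER BY (timestamp DESC);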

Differences:

⦁ Abstraction Level: Logical models are abstract and business-focused, while physical
models are technical and implementation-specific.

⦁ Dependency: Logical models are DBMS-agnostic, whereas physical models depend on the chosen system (e.g., Cassandra vs. MySQL).

⦁ Detail: Logical models omit storage or performance details, while physical models
include them (e.g., data types, indexing).

⦁ Example in Context: A logical model for an e-commerce system might define "Product"
and "Order" entities with a many-to-one relationship. The physical model in an RDBMS
might create tables with primary keys (ProductID, OrderID) and foreign key constraints,
while in Cassandra, it might use a wide-row structure with OrderID as the partition key
and ProductID in a collection column.

2. Describe how Cassandra designs differ from RDBMS in terms of data modeling and querying.

Data Modeling Differences:

⦁ Cassandra:

⦁ Uses a wide-column store model, organizing data into tables with flexible,
sparse column families. Data is modeled around query patterns rather than
normalized relationships, using partition keys and clustering columns to
distribute data across nodes.

⦁ Approach: Denormalization is encouraged to optimize read performance, with duplication of data across tables if needed. For example, a table "SensorReadings" might be partitioned by SensorID and clustered by Timestamp to support time-series queries.

⦁ Flexibility: Supports unstructured or semi-structured data (e.g., JSON), suiting evolving IoT datasets.

⦁ RDBMS:

⦁ Employs a relational model with normalized tables linked by primary and foreign keys to avoid redundancy. Data is structured into fixed schemas with rows and columns (e.g., a "Customers" table with CustomerID as the primary key).

⦁ Approach: Normalization (e.g., 3NF) ensures data integrity but may require
joins, which can complicate queries. For instance, linking "Orders" and
"Customers" tables via CustomerID.

⦁ Rigidity: Requires a predefined schema, making it less adaptable to unstructured data.

Querying Differences:

⦁ Cassandra:

⦁ Uses CQL (Cassandra Query Language), a SQL-like language, but optimized for
high-throughput, write-heavy workloads. Queries are designed based on
partition keys, limiting flexibility (e.g., cannot query without specifying the
partition key unless indexed).

⦁ Performance: Excels at single-row or range queries within a partition (e.g., "SELECT * FROM SensorReadings WHERE SensorID = 'S001' AND Timestamp > '2025-06-20'"), with tunable consistency (e.g., QUORUM).

⦁ Limitation: Joins are not supported; data must be pre-joined during modeling.

⦁ RDBMS:

⦁ Uses SQL with full support for complex queries, including joins, aggregations, and subqueries (e.g., "SELECT * FROM Orders JOIN Customers ON Orders.CustomerID = Customers.CustomerID WHERE OrderDate > '2025-06-20'").

⦁ Performance: Strong for read-heavy, transactional workloads but slower for high write volumes due to locking and normalization overhead.

⦁ Flexibility: Offers rich querying capabilities but may incur latency with large
datasets or complex joins.

Example in Context:
In an IoT traffic monitoring system, Cassandra might model data with a table partitioned by
CameraID, storing timestamped vehicle counts, optimized for queries like "Get counts for
Camera C001 today." An RDBMS might normalize this into "Cameras" and "VehicleCounts"
tables, requiring a join to retrieve the same data, which could be slower with millions of
records.
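
A short CQL sketch of the denormalized design just described; the keyspace, table, and column names are assumptions for illustration:

CREATE KEYSPACE IF NOT EXISTS traffic
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

-- One partition per camera; rows clustered by time so a day's counts are one range scan
CREATE TABLE traffic.vehicle_counts (
    camera_id text,
    observed_at timestamp,
    vehicle_count int,
    PRIMARY KEY (camera_id, observed_at)
) WITH CLUSTERING ORDER BY (observed_at DESC);

-- "Get counts for Camera C001 today" becomes a single-partition range query
SELECT observed_at, vehicle_count
FROM traffic.vehicle_counts
WHERE camera_id = 'C001' AND observed_at >= '2025-06-20';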

3. What steps are involved in evaluating and refining a data model in Cassandra?

Steps in Evaluating and Refining a Data Model in Cassandra:

⦁ Understand Query Patterns:

⦁ Analyze the application’s read and write requirements (e.g., time-series queries
for IoT sensor data). Identify primary access patterns, such as retrieving data by
SensorID and Timestamp, to design tables around these needs.

⦁ Design Initial Model:

⦁ Create tables based on query patterns, selecting partition keys (e.g., SensorID)
to distribute data evenly and clustering columns (e.g., Timestamp) for ordering.
Define column families to denormalize data, avoiding joins. For example, a table
"SensorData" might include (SensorID, Timestamp, Value).

⦁ Test Data Distribution:

⦁ Load sample data and use tools like nodetool ring to check partition key
distribution. Ensure no hotspots (e.g., all data under one SensorID) to prevent
performance issues. Adjust partition keys if uneven distribution is detected.

⦁ Evaluate Performance:

⦁ Run benchmark tests with realistic workloads (e.g., 1 million inserts, 10,000
reads per second) using tools like Cassandra Stress Tool. Measure latency,
throughput, and error rates. For instance, check if a query on SensorID takes
<10ms.

⦁ Refine Partitioning and Indexing:

⦁ Adjust partition keys or add secondary indexes (e.g., Materialized Views or Storage-Attached Indexing) if query performance is suboptimal. For example, add a Materialized View for reverse lookups (Timestamp to SensorID) if needed.

⦁ Optimize Consistency and Replication:

⦁ Tune consistency levels (e.g., LOCAL_QUORUM) and replication factors (e.g., 3) based on availability vs. consistency needs. Test failure scenarios to ensure data durability and availability.

⦁ Iterate Based on Feedback:

⦁ Gather feedback from application performance and user experience (e.g., slow
dashboard updates). Redesign tables or add new ones (e.g., a summary table
for aggregated data) if query patterns change, such as adding weekly averages.

⦁ Monitor and Maintain:

⦁ Use Cassandra's monitoring tools (e.g., OpsCenter, Prometheus) to track metrics like compaction delays or disk usage. Refine the model by adjusting compaction strategies or adding TTL (time-to-live) for old data (e.g., expire sensor readings after 30 days).
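
For the TTL refinement mentioned in the last step, a one-line CQL sketch (the table name follows the SensorData example from step 2 and is an assumption) sets a default expiry so old readings age out automatically:

-- Expire rows 30 days (2,592,000 seconds) after they are written
ALTER TABLE sensor_data WITH default_time_to_live = 2592000;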

Example in Context:
For an IoT weather system, an initial model with a partition key of StationID and clustering by
Timestamp is tested. If queries for hourly averages are slow, a Materialized View is added to
precompute averages, and performance is re-evaluated, ensuring sub-second response times for 1000 stations.

(15-Marks)

1. Design and Explain a Complete Data Model for an E-commerce System Using Cassandra. Include Keyspaces, Tables, and Primary Keys.
Overview:
A data model for an e-commerce system using Cassandra must be query-driven,
denormalized, and optimized for high write throughput and read scalability, typical of
online retail platforms handling product catalogs, orders, and user interactions.
Cassandra’s distributed nature suits this use case, supporting millions of transactions
daily.

Keyspace Design:

⦁ Keyspace Name: ecommerce_db

⦁ Replication Strategy: NetworkTopologyStrategy with replication factor 3 across three data centers (e.g., US-East, US-West, EU-West) to ensure high availability and fault tolerance.

⦁ Configuration: DURABLE_WRITES = true for data durability, and LOCAL_QUORUM consistency for balancing availability and consistency.
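
A CQL statement matching this keyspace design, assuming data-center names us_east, us_west, and eu_west as they would be reported by the snitch:

CREATE KEYSPACE IF NOT EXISTS ecommerce_db
    WITH replication = {
        'class': 'NetworkTopologyStrategy',
        'us_east': 3, 'us_west': 3, 'eu_west': 3
    }
    AND durable_writes = true;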

Table Designs with Primary Keys:

⦁ Users Table

⦁ Purpose: Store customer details for authentication and personalization.

⦁ Structure:

CREATE TABLE ecommerce_db.users (
    user_id text PRIMARY KEY,
    email text,
    name text,
    address text,
    registration_date timestamp
);

⦁ Primary Key: user_id (partition key) ensures each user’s data is uniquely
distributed across nodes.

⦁ Explanation: Partitioned by user_id for efficient lookups (e.g., "Get user profile for user123"). No clustering key is needed for simple queries.

⦁ Products Table

⦁ Purpose: Maintain a catalog of products with details.

⦁ Structure:

CREATE TABLE ecommerce_db.products (
    product_id text,
    category text,
    name text,
    price decimal,
    stock int,
    PRIMARY KEY (category, product_id)
);

⦁ Primary Key: Composite key with category (partition key) and product_id (clustering column) to distribute data by category and order products within each category.

⦁ Explanation: Allows queries like "List all electronics products" or "Get product P001 in electronics," optimizing category-based browsing.

⦁ Orders Table

⦁ Purpose: Record customer orders with order details.

⦁ Structure:

CREATE TABLE ecommerce_db.orders (
    order_id text,
    user_id text,
    order_date timestamp,
    total_amount decimal,
    status text,
    PRIMARY KEY (user_id, order_date, order_id)
) WITH CLUSTERING ORDER BY (order_date DESC);

⦁ Primary Key: Composite key with user_id (partition key), order_date (clustering column), and order_id (clustering column) to distribute orders by user and order them chronologically.

⦁ Explanation: Supports queries like "Get last 10 orders for user123," with
CLUSTERING ORDER BY DESC for recent orders.

⦁ Order_Items Table

⦁ Purpose: Link orders to products with quantities.

⦁ Structure:

CREATE TABLE ecommerce_db.order_items (
    order_id text,
    product_id text,
    quantity int,
    unit_price decimal,
    PRIMARY KEY (order_id, product_id)
);

⦁ Primary Key: Composite key with order_id (partition key) and product_id
(clustering column) to store items per order.

⦁ Explanation: Enables queries like "List items in order O001," denormalizing data to avoid joins.

Explanation of Design:

⦁ Query-Driven Approach: Tables are designed around common queries (e.g., user orders, product listings), avoiding joins by denormalizing data (e.g., duplicating user_id in orders).

⦁ Partitioning Strategy: Partition keys (e.g., user_id, category) ensure even data
distribution, preventing hotspots in a system with 1 million users.

⦁ Scalability: Horizontal scaling is supported by adding nodes, handling peak traffic (e.g., Black Friday sales).

⦁ Example: For a user "user123" placing order "O001" with product "P001"
(electronics category), data is stored across nodes, with queries like "SELECT *
FROM orders WHERE user_id = 'user123' LIMIT 10" returning recent orders
efficiently.

2. Compare and Contrast RDBMS and Cassandra Design Philosophies. Include at Least Five Key Differences.
Overview:
RDBMS (e.g., MySQL, PostgreSQL) and Cassandra represent different design philosophies, rooted in their intended use cases: transactional systems vs. distributed, high-scale data stores.

Key Differences:

⦁ Data Model:

⦁ RDBMS: Uses a relational model with normalized tables and fixed schemas, linked by primary-foreign key relationships (e.g., "Customers" and "Orders" tables with CustomerID).

⦁ Cassandra: Employs a wide-column store model with flexible, denormalized schemas, designed around query patterns (e.g., "Orders" table partitioned by user_id).

⦁ Scalability:

⦁ RDBMS: Relies on vertical scaling (upgrading hardware), limiting scalability for large datasets due to single-server constraints.

⦁ Cassandra: Designed for horizontal scaling, adding nodes to a cluster to handle increased load, ideal for big data (e.g., scaling to 100 nodes for IoT traffic data).

⦁ Consistency Model:

⦁ RDBMS: Enforces strict ACID consistency, ensuring all transactions maintain data integrity (e.g., a bank transfer is either fully committed or rolled back).

⦁ Cassandra: Uses eventual consistency with tunable levels (e.g., QUORUM), prioritizing availability and partition tolerance (e.g., accepting writes during network splits).

⦁ Querying Approach:

⦁ RDBMS: Supports complex SQL queries with joins, aggregations, and subqueries (e.g., "SELECT * FROM Orders JOIN Customers ON Orders.CustomerID = Customers.CustomerID").

⦁ Cassandra: Uses CQL, optimized for simple, partition-key-based queries without joins (e.g., "SELECT * FROM orders WHERE user_id = 'user123'"), requiring pre-joined data.

⦁ Data Distribution:

⦁ RDBMS: Typically centralized or replicated with master-slave setups, requiring manual sharding for distribution (e.g., MySQL with Galera cluster).

⦁ Cassandra: Features a peer-to-peer, ring-based distribution using consistent hashing, automatically balancing data across nodes (e.g., IoT sensor data spread across a 10-node cluster).

Comparison and Contrast:

⦁ Philosophical Goal: RDBMS prioritizes data integrity and complex querying for
transactional systems (e.g., e-commerce payments), while Cassandra focuses on
high availability and scalability for write-heavy, distributed systems (e.g., real-
time analytics).

⦁ Trade-Offs: RDBMS sacrifices scalability for consistency, whereas Cassandra trades strict consistency for availability (CAP theorem), aligning with big data needs.

⦁ Example: In an e-commerce system, an RDBMS ensures a customer's order total is consistent across tables, but struggles with 1 million concurrent users. Cassandra handles this load by distributing orders, accepting temporary inconsistencies resolved later.

3. Explain the Entire Process of Building, Evaluating, and Refining a Data Model in Cassandra from Start to Deployment.
Process Overview:
Building, evaluating, and refining a Cassandra data model is an iterative, query-driven
process tailored to distributed systems, ensuring performance, scalability, and
maintainability from conception to deployment.

Steps:

⦁ Requirement Analysis and Query Identification:

⦁ Gather application requirements (e.g., track IoT sensor data). Identify key
queries (e.g., "Get last hour’s readings for Sensor S001") to drive table
design.

⦁ Initial Data Model Design:

⦁ Create keyspaces and tables based on queries. Define partition keys (e.g.,
SensorID) for data distribution and clustering columns (e.g., Timestamp)
for ordering. Example:
CREATE KEYSPACE sensor_db WITH replication = {'class': 'NetworkTopologyStrategy', 'dc1': 3};

CREATE TABLE sensor_db.readings (
    sensor_id text,
    timestamp timestamp,
    value float,
    PRIMARY KEY (sensor_id, timestamp)
) WITH CLUSTERING ORDER BY (timestamp DESC);

⦁ Denormalize data to avoid joins, ensuring each table serves a specific query pattern.

⦁ Data Population and Distribution Testing:

⦁ Load sample data (e.g., 1 million sensor readings) using tools like cqlsh or
a data generator. Use nodetool ring to verify even distribution across
nodes, adjusting partition keys if hotspots occur (e.g., too many readings
for one SensorID).

⦁ Performance Evaluation:

⦁ Benchmark the model with realistic workloads using Cassandra Stress Tool or custom scripts. Test latency (e.g., <10ms for reads), throughput (e.g., 100,000 writes/second), and consistency levels (e.g., LOCAL_QUORUM). Monitor with OpsCenter or Prometheus for CPU, memory, and disk usage.

⦁ Refinement Based on Feedback:

⦁ Analyze performance results. If reads are slow, add secondary indexes (e.g., Materialized View for reverse lookups) or adjust clustering order. For example, create a view:
CREATE MATERIALIZED VIEW sensor_db.readings_by_time AS
    SELECT * FROM sensor_db.readings
    WHERE timestamp IS NOT NULL AND sensor_id IS NOT NULL
    PRIMARY KEY (timestamp, sensor_id);

⦁ Optimize partition size (e.g., 100MB–1GB per partition) to balance load and reduce compaction overhead.

⦁ Consistency and Replication Tuning:

⦁ Adjust replication factor (e.g., from 3 to 5) and consistency levels based on availability needs. Test failure scenarios (e.g., node down) to ensure data durability and accessibility.

⦁ Integration and Testing with Application:

⦁ Integrate the model with the application (e.g., a Java app using DataStax
driver). Perform end-to-end tests (e.g., simulate 10,000 users querying
sensor data) to validate query performance and data integrity.

⦁ Deployment and Monitoring:

⦁ Deploy the model to a production cluster using a CI/CD pipeline (e.g., via
Terraform or Ansible). Monitor post-deployment with tools like Grafana,
tracking metrics like read/write latency and node health. Set alerts for
anomalies (e.g., latency > 50ms).

⦁ Continuous Refinement:

⦁ Collect user and system feedback (e.g., slow dashboard updates). Refine
by adding new tables (e.g., aggregated hourly data) or adjusting TTL (e.g.,
expire data after 30 days) to manage storage.

Example in Context:
For an IoT weather system, the process starts with designing a readings table, populating
it with 1 million records, and testing 100 reads/second. If latency exceeds 10ms, a
Materialized View for hourly averages is added. After deployment to a 5-node cluster,
monitoring reveals a hotspot, prompting a partition key change to (SensorID, Region),
ensuring scalability and efficiency.
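
The partition key change mentioned above could look like the following CQL sketch; the table name readings_v2 and the region column are illustrative assumptions:

-- Composite partition key (sensor_id, region); rows within a partition stay ordered by time
CREATE TABLE sensor_db.readings_v2 (
    sensor_id text,
    region text,
    timestamp timestamp,
    value float,
    PRIMARY KEY ((sensor_id, region), timestamp)
) WITH CLUSTERING ORDER BY (timestamp DESC);

In practice the application would write to a new table and backfill it, since the partition key of an existing Cassandra table cannot be altered.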

UNIT-3

10-Marks

1. Explain the role of the cassandra.yaml file in Cassandra configuration. Discuss at least five critical properties.

Role of cassandra.yaml File:


The cassandra.yaml file is the primary configuration file for Apache Cassandra, located in the
conf directory of the Cassandra installation. It defines the operational parameters and settings
for a Cassandra node or cluster, enabling administrators to customize behavior such as data
storage, replication, networking, and performance tuning. This file is critical for initializing and
managing a Cassandra instance, especially during high-demand scenarios like an e-commerce
sale at 10:21 PM IST on June 20, 2025.

Critical Properties:

⦁ cluster_name: Specifies the name of the Cassandra cluster (e.g., "ecommerce_cluster"). It ensures nodes join the correct cluster, preventing misconfiguration in multi-cluster environments.

⦁ listen_address: Sets the IP address or hostname for inter-node communication within the cluster (e.g., the node's private IP address). Proper configuration is vital for node discovery and data replication during peak traffic.

⦁ rpc_address: Defines the IP address for client connections (e.g., "0.0.0.0" for all
interfaces). This enables clients to query the node, critical for real-time order processing
tonight.

⦁ num_tokens: Determines the number of virtual nodes (vnodes) per physical node
(default 256). Increasing this (e.g., to 512) improves data distribution across a 10-node
cluster handling 1 million orders.

⦁ endpoint_snitch: Configures the snitch for data center awareness (e.g., "GossipingPropertyFileSnitch"). This optimizes replication and read/write operations across geographically distributed nodes (e.g., India-South, US-East).
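
As a hedged illustration, these five properties might appear in cassandra.yaml roughly as follows (values are the examples used above, and the IP addresses are placeholders, not recommendations):

# cassandra.yaml (excerpt, illustrative values only)
cluster_name: 'ecommerce_cluster'
num_tokens: 256
listen_address: 192.168.1.10         # node's private IP for inter-node traffic (placeholder)
rpc_address: 0.0.0.0                 # accept client connections on all interfaces
broadcast_rpc_address: 192.168.1.10  # required when rpc_address is 0.0.0.0 (placeholder)
endpoint_snitch: GossipingPropertyFileSnitch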

Explanation:

The cassandra.yaml file acts as the central control point, allowing fine-tuning to match workload
demands. For instance, during the current sale, adjusting num_tokens and endpoint_snitch
ensures balanced data distribution and efficient cross-region replication, while listen_address
and rpc_address maintain cluster connectivity and client access.

2. Describe the importance of proper directory configuration in Cassandra, including the purpose of data_file_directories, commitlog_directory, and hints_directory.

Importance of Proper Directory Configuration:


Proper directory configuration in Cassandra is crucial for performance, durability, and
recoverability. It separates different types of data and logs onto distinct storage devices (e.g.,
SSDs for data, HDDs for logs), optimizing I/O operations and ensuring fault tolerance.
Misconfiguration can lead to performance bottlenecks or data loss, especially under heavy
loads like the e-commerce sale at 10:21 PM IST on June 20, 2025.

Key Directories and Purposes:

⦁ data_file_directories:

⦁ Purpose: Specifies the directories where SSTables (on-disk data files) are stored
(e.g., "/var/lib/cassandra/data"). Multiple directories can be listed for load
balancing across disks.

⦁ Importance: Ensures data is distributed across high-performance storage (e.g., SSDs), reducing read/write latency for 1 million order records. Proper sizing prevents disk space exhaustion.

⦁ commitlog_directory:

⦁ Purpose: Defines the location for the commit log, a durable record of all writes
before they are flushed to SSTables (e.g., "/var/lib/cassandra/commitlog").

⦁ Importance: Isolating the commit log on a separate, fast device (e.g., another
SSD) improves write performance and recovery speed after a node restart,
critical for maintaining order integrity during peak traffic.

⦁ hints_directory:

⦁ Purpose: Stores hint files that track data for nodes that are temporarily down,
enabling repair when they recover (e.g., "/var/lib/cassandra/hints").

⦁ Importance: Ensures data consistency across a 10-node cluster during network partitions (e.g., a datacenter outage), allowing seamless operation and recovery, which is vital for real-time e-commerce updates.
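
An illustrative cassandra.yaml excerpt placing the three directories on separate devices (the mount paths are examples consistent with the separation described above):

# cassandra.yaml (excerpt) - separate devices for SSTables, commit log, and hints
data_file_directories:
    - /mnt/ssd1/cassandra/data
    - /mnt/ssd2/cassandra/data
commitlog_directory: /mnt/ssd3/cassandra/commitlog
hints_directory: /mnt/hdd/cassandra/hints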

Explanation:
Configuring these directories on separate drives (e.g., SSD for data_file_directories, another for
commitlog_directory) leverages I/O parallelism, enhancing throughput and fault tolerance. For
example, during tonight’s sale, a well-configured setup ensures quick writes to the commit log
and efficient SSTable access, while hints facilitate recovery if a node fails under load.

3. Discuss the use and significance of cassandra-env.sh in JVM tuning for Cassandra performance optimization.

Role of cassandra-env.sh File:


The cassandra-env.sh file, located in the conf directory, is a shell script used to configure the
Java Virtual Machine (JVM) settings for Cassandra. It allows administrators to tune memory
allocation, garbage collection (GC), and other JVM parameters to optimize performance for
specific workloads. This is particularly important for handling the high memory and CPU
demands of an e-commerce system during the sale peak at 10:21 PM IST on June 20, 2025.

Use and Significance:

⦁ Memory Configuration:

⦁ Use: Sets JVM heap size via MAX_HEAP_SIZE and HEAP_NEWSIZE (e.g.,
MAX_HEAP_SIZE="8G" for an 8GB heap).

⦁ Significance: Proper sizing prevents out-of-memory errors during 1 million order processes, balancing heap for young and old generation objects to reduce GC pauses.

⦁ Garbage Collection Tuning:

⦁ Use: Configures GC algorithms (e.g., -XX:+UseG1GC for the G1 collector) and parameters like -XX:MaxGCPauseMillis=200 to limit pause times.

⦁ Significance: Optimizes GC to minimize latency (e.g., <200ms pauses), ensuring real-time query responsiveness for order lookups and updates during peak traffic.

⦁ Thread and CPU Optimization:

⦁ Use: Adjusts JVM thread counts (e.g., JVM_OPTS="$JVM_OPTS -XX:ParallelGCThreads=4") based on CPU cores.

⦁ Significance: Enhances parallel processing of writes and reads across a 10-node cluster, improving throughput for the current sale's high concurrency.

⦁ JVM Options for Stability:

⦁ Use: Includes flags like -XX:+HeapDumpOnOutOfMemoryError to diagnose issues.

⦁ Significance: Provides debugging data if memory issues arise, ensuring system stability under load, which is critical for uninterrupted e-commerce operations.
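
Pulling the settings above together, an illustrative cassandra-env.sh excerpt might look as follows (sizes and thread counts are the example values from this answer, not tuning advice; newer Cassandra releases move many of these flags into the jvm*.options files):

# cassandra-env.sh (excerpt, illustrative values only)
MAX_HEAP_SIZE="8G"
HEAP_NEWSIZE="2G"
JVM_OPTS="$JVM_OPTS -XX:+UseG1GC"
JVM_OPTS="$JVM_OPTS -XX:MaxGCPauseMillis=200"
JVM_OPTS="$JVM_OPTS -XX:ParallelGCThreads=4"
JVM_OPTS="$JVM_OPTS -XX:+HeapDumpOnOutOfMemoryError"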

Explanation:
The cassandra-env.sh file bridges Cassandra's performance with JVM capabilities, allowing
customization for workload-specific needs. For instance, during tonight’s sale, increasing
MAX_HEAP_SIZE to 8GB and using G1GC ensures the cluster handles 100,000 writes/second
without GC-related delays, while thread tuning maximizes CPU utilization across nodes.

15-Marks

1. Design a Configuration Strategy for Deploying Cassandra in a Production Multi-Node Cluster. Include Details Such as Cluster Name, Seed Nodes, IP Settings, Directory Separation, and Memory Tuning.

Overview:
Deploying Cassandra in a production multi-node cluster requires a robust configuration strategy
to ensure scalability, high availability, and performance, especially for a high-traffic scenario like
an e-commerce sale at 10:43 PM IST on June 20, 2025. This strategy balances resource
allocation, data durability, and network efficiency across a 5-node cluster.

Configuration Strategy:

⦁ Cluster Name:

⦁ Setting: cluster_name: 'ecommerce_prod_cluster' in cassandra.yaml.

⦁ Rationale: A unique name ensures nodes join the intended cluster, preventing
misconfiguration during the sale peak with 1 million orders.

⦁ Seed Nodes:

⦁ Setting: seed_provider: [{class_name: org.apache.cassandra.locator.SimpleSeedProvider, parameters: [{seeds: "192.168.1.1,192.168.1.2"}]}] in cassandra.yaml.

⦁ Rationale: Designates two nodes (e.g., 192.168.1.1 and 192.168.1.2) as seeds for initial node discovery and gossip communication, ensuring cluster stability across 5 nodes in India-South and US-East data centers.

⦁ IP Settings:

⦁ listen_address: listen_address: 192.168.1.x (e.g., 192.168.1.1 for node 1) for inter-node communication.

⦁ rpc_address: rpc_address: 0.0.0.0 to allow client connections from any interface, critical for global order processing.

⦁ broadcast_address: broadcast_address: <public_ip> (the node's public IP) for nodes with NAT, ensuring cross-data-center connectivity.

⦁ Rationale: Proper IP configuration avoids network conflicts and enables efficient communication, supporting 100,000 writes/second during the sale.

⦁ Directory Separation:

⦁ data_file_directories: /mnt/ssd1/cassandra/data, /mnt/ssd2/cassandra/data on SSDs for SSTables.

⦁ commitlog_directory: /mnt/ssd3/cassandra/commitlog on a separate SSD.

⦁ hints_directory: /mnt/hdd/cassandra/hints on an HDD.

⦁ saved_caches_directory: /mnt/ssd4/cassandra/saved_caches on an SSD.

⦁ Rationale: Separating directories onto different drives (SSDs for data/caches, SSD for commit log, HDD for hints) optimizes I/O performance and recovery, handling 10 TB of order data tonight.

⦁ Memory Tuning:

⦁ JVM Settings in cassandra-env.sh:

⦁ MAX_HEAP_SIZE="8G" and HEAP_NEWSIZE="2G" for an 8GB heap with 2GB for the young generation.

⦁ -XX:+UseG1GC for the G1 garbage collector, -XX:MaxGCPauseMillis=200
to limit pauses.

⦁ -XX:ParallelGCThreads=4 based on 4-core CPUs.

⦁ Rationale: Tuning heap size and GC prevents out-of-memory errors, while G1GC
minimizes latency (<200ms) for real-time queries, supporting 10,000 concurrent
users.

Implementation Steps:

⦁ Configure cassandra.yaml on each node with the above settings, ensuring consistency.

⦁ Use NetworkTopologyStrategy with replication factor 3 in ecommerce_prod_cluster.

⦁ Start nodes sequentially, verifying cluster status with nodetool status at 10:43 PM IST.

⦁ Monitor with OpsCenter to confirm data distribution and performance.

Example:
During the sale, the 5-node cluster handles 1 million orders, with seed nodes facilitating join,
SSDs ensuring fast data access, and JVM tuning maintaining sub-10ms latency for order lookups.

2. Elaborate on the Step-by-Step Process to Modify Cassandra to Handle Large-Scale Workloads, Highlighting the Importance of Each Configuration File Involved.

Overview:
Modifying Cassandra to handle large-scale workloads (e.g., 1 million orders at 10:43 PM IST on
June 20, 2025) involves adjusting configuration files to optimize performance, scalability, and
reliability. This process ensures the system adapts to increased data volume and concurrency.

Step-by-Step Process:

⦁ Assess Workload Requirements:

⦁ Analyze expected load (e.g., 100,000 writes/second, 10 TB data). Identify query patterns (e.g., order retrieval by user_id).

⦁ Importance: Guides configuration adjustments to match real-time demands.

⦁ Update cassandra.yaml:

⦁ Modifications: Increase num_tokens to 512 for better data distribution, set endpoint_snitch to "GossipingPropertyFileSnitch" for multi-data-center support, and adjust concurrent_writes to 32 for high write throughput.

⦁ Importance: Optimizes data placement and write performance, preventing bottlenecks in a 5-node cluster.

⦁ Tune cassandra-env.sh for JVM:

⦁ Modifications: Set MAX_HEAP_SIZE="12G" and HEAP_NEWSIZE="3G" for larger datasets, use -XX:+UseG1GC with -XX:MaxGCPauseMillis=100 for lower latency, and adjust -XX:ParallelGCThreads=8 for 8-core nodes.

⦁ Importance: Enhances memory management and reduces GC pauses, ensuring sub-100ms response times for 10,000 users.

⦁ Configure logback.xml for Logging:

⦁ Modifications: Set log level to "INFO" and rotate logs daily to /var/log/cassandra to manage disk space.

⦁ Importance: Prevents log overload during peak traffic, aiding troubleshooting without performance impact.

⦁ Adjust cassandra-rackdc.properties:

⦁ Modifications: Define data centers (e.g., dc=India-South, rack=Rack1) and replicate with NetworkTopologyStrategy (replication factor 3).

⦁ Importance: Ensures data locality and fault tolerance across regions, critical for
global order processing.

⦁ Test and Benchmark:

⦁ Use Cassandra Stress Tool to simulate 1 million writes and 100,000 reads,
monitoring latency and throughput with OpsCenter.

⦁ Importance: Validates configuration changes, ensuring the cluster handles the sale load at 10:43 PM IST.

⦁ Deploy and Monitor:

⦁ Roll out changes to the 5-node cluster, draining each node with nodetool drain and then restarting the Cassandra service. Monitor with Grafana for metrics (e.g., CPU > 80%, latency > 50ms).

⦁ Importance: Ensures seamless transition and ongoing optimization during live operation.

Example:
For the sale, increasing num_tokens and heap size in cassandra.yaml and cassandra-env.sh allows the cluster to scale to 1.5 million orders, while logback.xml prevents log saturation, and cassandra-rackdc.properties ensures data availability across India-South and US-East.

3. Analyze the Implications of Incorrect Configuration in cassandra.yaml. Provide Examples of Misconfigurations and Their Effects on Cluster Behavior.

Overview:
Incorrect configuration in cassandra.yaml can lead to performance degradation, data loss, or
cluster instability, especially under the high load of an e-commerce sale at 10:43 PM IST on June
20, 2025. Understanding these implications is key to maintaining a 5-node cluster handling 1
million orders.

Implications:

⦁ Performance Degradation: Misconfigured settings can overload nodes, increasing latency and reducing throughput.

⦁ Data Inconsistency: Improper replication or consistency settings may lead to data loss
or unavailability.

⦁ Cluster Instability: Wrong network or seed node settings can cause nodes to fail to join,
disrupting operations.

⦁ Resource Exhaustion: Poor directory or memory settings can exhaust disk space or
memory, crashing nodes.

Examples of Misconfigurations and Effects:

⦁ Incorrect listen_address (e.g., listen_address: 127.0.0.1):

⦁ Effect: Limits inter-node communication to localhost, preventing the 5-node cluster from forming. Nodes cannot replicate data, leading to unavailability of order data during the sale.

⦁ Implication: Orders for 500,000 users may fail to sync, causing transaction losses.

⦁ Unbalanced num_tokens (e.g., num_tokens: 16 on some nodes, 256 on others):

⦁ Effect: Causes uneven data distribution, creating hotspots on nodes with fewer
tokens. This overloads disk I/O, increasing latency to >50ms for order queries.

⦁ Implication: Slows order processing, affecting 1 million transactions and customer satisfaction.

⦁ Improper endpoint_snitch (e.g., SimpleSnitch in a multi-DC setup):

⦁ Effect: Ignores data center awareness, leading to inefficient replication across India-South and US-East. Reads may hit remote nodes, adding 100ms latency.

⦁ Implication: Delays real-time order updates, impacting 10,000 concurrent users.

⦁ Low concurrent_writes (e.g., concurrent_writes: 4):

⦁ Effect: Limits write throughput to 10,000 writes/second, causing queue buildup during 100,000 writes/second demand. This leads to timeouts and failed orders.

⦁ Implication: Results in lost sales and customer complaints during the peak sale.

⦁ Missing data_file_directories (e.g., default only):

⦁ Effect: Forces all SSTables onto a single disk, exhausting 1 TB space with 10 TB
of order data. This causes node crashes and data unavailability.

⦁ Implication: Halts order processing, disrupting the entire e-commerce operation.

Mitigation:
Regular validation with nodetool status and monitoring with Grafana can detect these issues.
For example, adjusting listen_address to a node-specific IP and adding multiple
data_file_directories on SSDs ensures stability and performance during the sale.

Example:
If num_tokens is misconfigured to 16, a node handling 2 TB of orders may crash, while correct
configuration to 256 balances the load, maintaining sub-10ms latency for 1 million orders at
10:43 PM IST.

UNIT-4

10-Marks

1. Explain the output and significance of the 'nodetool status' command in Cassandra.

Output of 'nodetool status':


The nodetool status command provides a snapshot of the Cassandra cluster's health and node
status. When executed (e.g., at 10:49 PM IST on June 20, 2025), it displays a table with the
following columns:

⦁ Status: A two-letter code combining node status and state (e.g., "UN" for Up/Normal, "UL" for Up/Leaving, "DN" for Down/Normal).

⦁ State: The node’s role (e.g., "N" for Normal, "L" for Leaving).

⦁ Address: The IP address of each node (e.g., 192.168.1.1).

⦁ Load: The amount of data stored on the node (e.g., 1.23 TB).

⦁ Tokens: Number of virtual nodes (e.g., 256).

⦁ Owns: Percentage of data owned by the node (e.g., 20.0%).

⦁ Host ID: Unique identifier for the node (e.g., a UUID).

⦁ Rack/DC: Data center and rack information (e.g., "dc1/rack1").

⦁ Example Output:

Datacenter: dc1

Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address      Load     Tokens  Owns   Host ID                               Rack
UN  192.168.1.1  1.23 TB  256     20.0%  550e8400-e29b-11d4-a716-446655440000  rack1
UN  192.168.1.2  1.25 TB  256     20.0%  550e8400-e29b-11d4-a716-446655440001  rack1
DN  192.168.1.3  1.20 TB  256     20.0%  550e8400-e29b-11d4-a716-446655440002  rack1

Significance:

⦁ Cluster Health: Indicates operational status (e.g., "UN" nodes are healthy, "DN" nodes
need investigation), critical for ensuring 1 million e-commerce orders process smoothly
at 10:49 PM IST.

⦁ Load Balancing: Monitors data distribution (e.g., 1.23 TB vs. 1.20 TB) to detect
imbalances, preventing hotspots during peak traffic.

⦁ Fault Detection: Identifies down nodes (e.g., 192.168.1.3 in the output above) for repair or replacement,
ensuring high availability.

⦁ Capacity Planning: Helps assess if the 5-node cluster can handle 10 TB of data, guiding
node additions.

⦁ Example: During tonight’s sale, nodetool status reveals a "DN" node, prompting
immediate repair to maintain order processing for 10,000 concurrent users.

2. Describe how the 'nodetool info' command helps administrators monitor Cassandra nodes.

Output of 'nodetool info':


The nodetool info command provides detailed information about a specific Cassandra node.
When run on a node (e.g., at 10:49 PM IST), it outputs metrics such as:

⦁ ID: The node’s unique Host ID (e.g., 550e8400-e29b-11d4-a716-446655440000).

⦁ Gossip Active: Whether gossip is running (e.g., "true").

⦁ Thrift Active: Status of the Thrift interface (e.g., "true").

⦁ Native Transport Active: Status of CQL native transport (e.g., "true").

⦁ Load: Data size on the node (e.g., 1.23 TB).

⦁ Generation No: Startup timestamp (e.g., 1624200000).

⦁ Uptime: Time since last restart (e.g., 10d 2h).

31
⦁ Heap Memory Usage: Current and maximum heap (e.g., 6G/8G).

⦁ Off Heap Memory Usage: Non-heap memory (e.g., 512M).

⦁ Data Center/Rack: Location info (e.g., "dc1/rack1").

⦁ Example Output:

ID : 550e8400-e29b-11d4-a716-446655440000

Gossip active : true

Thrift active : true

Native Transport active: true

Load : 1.23 TB

Generation No : 1624200000

Uptime : 10d 2h

Heap Memory (mb) : 6144.00 / 8192.00

Off Heap Memory (mb) : 512.00

Data Center : dc1

Rack : rack1

How It Helps Administrators:

⦁ Health Monitoring: Confirms gossip and Thrift activity, ensuring node communication
and client access, vital for 100,000 writes/second during the sale.

⦁ Resource Utilization: Tracks heap usage (e.g., 6G/8G) to detect memory pressure,
guiding JVM tuning to avoid crashes under 10,000 users.

⦁ Load Assessment: Monitors data load (e.g., 1.23 TB) to identify storage constraints,
aiding capacity planning for 10 TB of order data.

⦁ Uptime and Stability: Verifies uptime (e.g., 10d 2h) to assess reliability, critical for uninterrupted e-commerce operations.

⦁ Troubleshooting: Identifies issues (e.g., "Native Transport active: false") for quick
resolution, preventing downtime.

⦁ Example: At 10:49 PM IST, nodetool info shows 90% heap usage, prompting an increase
to 12G in cassandra-env.sh to handle the sale peak.

3. List and Explain the Different Thread Pool Metrics Available Through 'nodetool tpstats'.

Overview of 'nodetool tpstats':


The nodetool tpstats command provides statistics on Cassandra’s thread pools, which manage
various tasks like reads, writes, and compaction. Executed at 10:49 PM IST, it helps optimize
performance for the current e-commerce workload.

Thread Pool Metrics and Explanations:

⦁ ReadStage:

⦁ Description: Handles read operations from the database.

⦁ Metrics: Active, Pending, Completed, Blocked tasks (e.g., Active: 5, Pending: 2).

⦁ Significance: High pending tasks (e.g., 10) indicate read bottlenecks, suggesting
index optimization for 100,000 order lookups.

⦁ MutationStage:

⦁ Description: Manages write operations (mutations) to the database.

⦁ Metrics: Active, Pending, Completed, Blocked (e.g., Active: 8, Pending: 3).

⦁ Significance: Excessive pending writes (e.g., 15) during 1 million order inserts
signal the need to increase concurrent_writes in cassandra.yaml.

⦁ CompactionExecutor:

⦁ Description: Handles SSTable compaction to merge and clean data.

⦁ Metrics: Active, Pending, Completed, Blocked (e.g., Active: 2, Pending: 1).

⦁ Significance: High pending compactions (e.g., 5) can slow reads, requiring tuning of compaction_throughput_mb_per_sec to free resources during peak traffic.

⦁ RequestResponseStage:

⦁ Description: Manages inter-node communication for reads and writes.

⦁ Metrics: Active, Pending, Completed, Blocked (e.g., Active: 3, Pending: 0).

⦁ Significance: Pending tasks (e.g., 4) indicate network delays, prompting listen_address verification across the 5-node cluster.

⦁ GossipStage:

⦁ Description: Manages node state propagation via gossip protocol.

⦁ Metrics: Active, Pending, Completed, Blocked (e.g., Active: 1, Pending: 0).

⦁ Significance: High activity (e.g., 5 pending) suggests network issues, affecting cluster stability and order replication.

⦁ MigrationStage:

⦁ Description: Handles schema changes or data migration.

⦁ Metrics: Active, Pending, Completed, Blocked (e.g., Active: 0, Pending: 0).

⦁ Significance: Pending tasks (e.g., 2) during schema updates can disrupt order
processing, requiring off-peak execution.

⦁ Example Output:

Pool Name Active Pending Completed Blocked

ReadStage 5 2 100000 0

MutationStage 8 3 150000 0

CompactionExecutor 2 1 5000 0

RequestResponseStage 3 0 80000 0

GossipStage 1 0 1000 0

MigrationStage 0 0 10 0

Explanation:

These metrics help diagnose performance issues. For instance, at 10:49 PM IST, 3 pending
MutationStage tasks during 1 million order writes suggest increasing concurrent_writes to 32,
while 2 pending ReadStage tasks indicate indexing needs for faster order retrieval, ensuring
sub-10ms latency for 10,000 users.
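A small shell sketch for keeping an eye on the read and write pools during the sale; the two-second refresh interval is an arbitrary choice, and no cluster-specific names are assumed:

# Refresh thread pool statistics every 2 seconds, showing only the read/write pools
watch -n 2 "nodetool tpstats | grep -E 'Pool Name|ReadStage|MutationStage'"

# One-off check of the compaction backlog
nodetool tpstats | grep CompactionExecutor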

15-Marks

nodetool is a command-line utility for managing and monitoring Cassandra clusters. It is essential for maintaining a 5-node cluster handling 1 million e-commerce orders at 10:56 PM IST on June 20, 2025. Below are demonstrations of major commands with practical scenarios.

Major Commands and Scenarios:

⦁ nodetool status:

⦁ Usage: nodetool status

⦁ Scenario: During the sale peak, a node ([Link]) shows "DN" (Down/Normal).

⦁ Demonstration: nodetool status lists each node's state, load, and token ownership, confirming which node is down.

⦁ Action: Restart the Cassandra service on the down node, or repair its data with nodetool repair, to restore availability for 500,000 order processes.

⦁ nodetool info:

⦁ Usage: nodetool info

⦁ Scenario: Heap memory usage reaches 90% (7.2G/8G) on node [Link], slowing order queries.

⦁ Demonstration: nodetool info reports the node's heap usage, load, uptime, and gossip state.

⦁ Action: Adjust MAX_HEAP_SIZE to 12G in [Link] and restart the node to handle 10,000 concurrent users.

⦁ nodetool tpstats:

⦁ Usage: nodetool tpstats

⦁ Scenario: MutationStage shows 15 pending tasks, delaying 100,000 writes/second.

⦁ Demonstration: nodetool tpstats reports Active, Pending, Completed, and Blocked counts for each thread pool, exposing the write backlog.

⦁ Action: Increase concurrent_writes to 32 in [Link] and retest to reduce pending tasks.

⦁ nodetool repair:

⦁ Usage: nodetool repair -full

⦁ Scenario: After a node reboot, data inconsistency affects 100,000 orders.

⦁ Demonstration: Runs a full repair to synchronize data across the cluster.

⦁ Action: Schedule during off-peak (e.g., 2 AM IST) to avoid impacting the sale,
ensuring data integrity.

⦁ nodetool cleanup:

⦁ Usage: nodetool cleanup

⦁ Scenario: After adding a new node, old data on [Link] causes redundancy.

⦁ Demonstration: Removes obsolete data, reducing load to 1.20 TB.

⦁ Action: Execute post-node addition to optimize storage for 10 TB total data.

Explanation:
These commands enable proactive maintenance, ensuring the cluster handles the sale load at
10:56 PM IST. For example, repair restores consistency, while cleanup optimizes space,
maintaining sub-10ms latency for order processing.
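The commands above can be combined into a routine health pass. A hedged sketch of one such sequence; the keyspace name ecommerce is hypothetical:

# 1. Cluster overview: look for nodes not in the UN (Up/Normal) state
nodetool status

# 2. Per-node health: gossip, heap, load, uptime
nodetool info

# 3. Thread pool backlog
nodetool tpstats

# 4. Off-peak full repair of the order keyspace (keyspace name assumed)
nodetool repair -full ecommerce

# 5. After adding a node, drop data this node no longer owns
nodetool cleanup ecommerce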

2. Analyze a Scenario Where nodetool tpstats Reveals Thread Pool Congestion. What
Corrective Actions Would You Recommend?

Scenario Analysis:
At 10:56 PM IST on June 20, 2025, during an e-commerce sale with 1 million orders, nodetool
tpstats on a 5-node cluster shows:

⦁ Observations:

⦁ ReadStage with 20 pending and 5 blocked tasks indicates read bottlenecks, slowing order lookups for 10,000 users.

⦁ MutationStage with 30 pending and 10 blocked tasks suggests write congestion, delaying 100,000 writes/second.

⦁ CompactionExecutor with 10 pending tasks shows compaction delays, impacting read performance.

⦁ RequestResponseStage with 15 pending tasks hints at inter-node communication issues.

⦁ Root Causes:

⦁ Insufficient thread pool sizes, high I/O contention, or inadequate JVM memory
(e.g., 8G heap at 90% usage) under peak load.

⦁ Possible misconfiguration of concurrent_reads, concurrent_writes, or compaction_throughput_mb_per_sec in [Link].

Corrective Actions:

⦁ Adjust Thread Pool Settings in [Link]:

⦁ Increase concurrent_reads to 32 and concurrent_writes to 64 to handle 20 pending reads and 30 pending writes.

⦁ Rationale: Reduces queue buildup, ensuring sub-10ms latency for reads and
writes.

⦁ Tune Compaction in [Link]:

⦁ Set compaction_throughput_mb_per_sec to 64 (from the default 16) to clear 10 pending compactions.

⦁ Rationale: Frees I/O resources, improving read performance for order queries.

⦁ Optimize JVM Memory in [Link]:

⦁ Increase MAX_HEAP_SIZE to 12G and HEAP_NEWSIZE to 3G, using -XX:MaxGCPauseMillis=100 with G1GC.

⦁ Rationale: Reduces blocked tasks (e.g., 10 in MutationStage) by minimizing GC pauses.

⦁ Monitor and Redistribute Load:

⦁ Use nodetool status to check load (e.g., 1.25 TB vs. 1.20 TB) and nodetool
cleanup on uneven nodes.

⦁ Rationale: Balances data, preventing hotspots that exacerbate congestion.

⦁ Scale Cluster if Needed:

⦁ Add a node if pending tasks exceed 50 after tuning, verified by nodetool tpstats.

⦁ Rationale: Distributes load across 6 nodes, supporting 1.2 million orders.

Implementation:
Apply changes, restart nodes with nodetool drain and nodetool start, and re-run nodetool
tpstats to confirm pending tasks drop (e.g., to 5 for MutationStage). Monitor with Grafana to
ensure stability.

Example:
After tuning, ReadStage pending drops to 5, and MutationStage to 10, reducing latency to 5ms,
ensuring smooth order processing for the sale peak.
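A sketch of how these corrective actions could be applied and verified from the shell; the config file locations and the service name vary by installation and are assumptions here:

# Raise compaction throughput at runtime (no restart needed)
nodetool setcompactionthroughput 64
nodetool getcompactionthroughput

# concurrent_reads/concurrent_writes and the heap settings require editing
# cassandra.yaml and cassandra-env.sh, followed by a rolling restart per node:
nodetool drain                      # flush memtables and stop accepting new writes
sudo systemctl restart cassandra    # service name assumed
nodetool tpstats | grep -E 'ReadStage|MutationStage'   # confirm pending tasks drop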

3. Explain the Role of Gossip in Cassandra. How Can nodetool info Help Verify Its Status and Issues?

Role of Gossip in Cassandra:


Gossip is Cassandra’s peer-to-peer protocol for node communication and cluster state
management. It enables:

⦁ Node Discovery: New nodes join the cluster by contacting seed nodes, sharing state via
gossip messages.

⦁ State Propagation: Each node periodically (every second) exchanges state information
(e.g., uptime, load) with up to three random peers, ensuring all nodes have a consistent
view.

⦁ Failure Detection: Marks nodes as down if they miss heartbeats, triggering repairs (e.g.,
hint replay).

⦁ Load Balancing: Distributes data ownership percentages based on node availability, critical for a 5-node cluster handling 1 million orders at 10:56 PM IST.

⦁ Example: If node [Link] fails, gossip updates other nodes, enabling the cluster to
redistribute 1.20 TB of data.

How nodetool info Helps Verify Status and Issues:

⦁ Gossip Active Check:

⦁ Output: Gossip active: true indicates the protocol is running.

⦁ Verification: Confirms node [Link] at 10:56 PM IST is exchanging state, ensuring order data replication.

⦁ Issue Detection: Gossip active: false signals a failure, possibly due to network
issues or listen_address misconfiguration, halting state updates.

⦁ Load and Uptime Correlation:

⦁ Output: Load: 1.23 TB, Uptime: 10d 2h.

⦁ Verification: Consistent load and uptime across nodes (e.g., 1.20–1.25 TB, 10d
uptime) suggest healthy gossip, balancing 10 TB of data.

⦁ Issue Detection: Divergent values (e.g., 0 TB load) indicate a node isn’t receiving
gossip, possibly due to a seed node failure.

⦁ Thrift and Native Transport Status:

⦁ Output: Thrift active: true, Native Transport active: true.

⦁ Verification: Ensures client communication aligns with gossip state, supporting 10,000 user queries.

⦁ Issue Detection: false values with active gossip suggest internal inconsistencies,
requiring nodetool drain and restart.

⦁ Actionable Insights:

⦁ If Gossip active: false, check network (e.g., ping [Link]) and [Link] settings. Use nodetool gossipinfo for detailed state to pinpoint issues.

⦁ Example: At 10:56 PM IST, nodetool info shows Gossip active: true and Load:
1.23 TB, confirming healthy gossip, while a false state would trigger a network
diagnostic.

Explanation:
Gossip ensures cluster cohesion, and nodetool info provides a real-time health check. For the
sale, verifying gossip status prevents data inconsistency, ensuring all 1 million orders are
processed reliably.
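A minimal shell sketch for verifying gossip health on a node; the seed address 10.0.0.1 is hypothetical:

# Confirm gossip is running locally
nodetool info | grep 'Gossip active'

# Detailed per-node gossip state: generation, heartbeat, status, load
nodetool gossipinfo

# If gossip is inactive, first confirm basic reachability to a seed node
ping -c 3 10.0.0.1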

UNIT-5

10-marks

1. Explain the role and purpose of the commit log and memtable in Cassandra's write path.

Commit Log:

⦁ Role: The commit log is a durable log file that records all write operations before they
are applied to the memtable. It is stored on disk (e.g., /var/lib/cassandra/commitlog)
and serves as a crash-recovery mechanism.

⦁ Purpose: Ensures data durability by persisting writes immediately, allowing Cassandra to recover data after a node failure or restart. For example, during the e-commerce sale at 11:07 PM IST with 100,000 writes/second, if a node crashes, the commit log enables replay of writes that had not yet been flushed to SSTables.

⦁ Process: When a write arrives, it is appended to the commit log synchronously, guaranteeing that even if the system fails, data up to the last commit is recoverable.

Memtable:

⦁ Role: The memtable is an in-memory data structure (e.g., a sorted skip list) that holds
recent write data before it is flushed to disk as an SSTable. It resides in the JVM heap.

⦁ Purpose: Provides fast write performance by allowing in-memory storage and quick
lookups for recent data. During the sale peak, the memtable handles 100,000 order
writes/second, enabling sub-10ms latency for updates.

⦁ Process: Writes are first written to the commit log, then inserted into the memtable.
Once the memtable reaches a size threshold (e.g., 128MB) or a time limit, it is flushed
to an SSTable.

Interaction in Write Path:

⦁ The write path begins with a client request, which is logged in the commit log for
durability and stored in the memtable for performance. This dual mechanism ensures
both speed (memtable) and reliability (commit log), critical for handling 1 million orders
tonight.

2. Describe the flushing process in Cassandra. When does it happen, and what are the
consequences?

Flushing Process:
The flushing process in Cassandra involves transferring data from the memtable to an SSTable
on disk. It occurs in the following steps:

⦁ When the memtable reaches a configurable size threshold (e.g., 128MB, governed by memtable_heap_space_in_mb and memtable_cleanup_threshold in [Link]) or a time interval (e.g., every 10 minutes, set per table via memtable_flush_period_in_ms), it is marked as immutable.

⦁ A new memtable is created to handle incoming writes, while the old memtable is
written to disk as an SSTable in the data_file_directories.

⦁ The commit log entries for the flushed data are cleared to free space, ensuring the log
doesn’t grow indefinitely.

When It Happens:

⦁ Size-Based Trigger: Occurs when the memtable size exceeds the threshold, e.g., after
100,000 order writes accumulate 150MB of data at 11:07 PM IST.

⦁ Time-Based Trigger: Happens periodically (e.g., every 10 minutes) even if the size limit
isn’t reached, ensuring regular data persistence.

⦁ Manual Trigger: Administrators can force flushing with nodetool flush during
maintenance, e.g., to stabilize the cluster under load.

Consequences:

⦁ Positive:

⦁ Frees up memory in the JVM heap, preventing out-of-memory errors during the
sale peak with 10,000 concurrent users.

⦁ Persists data to SSTables, enabling durable storage and reducing commit log
size for faster recovery.

⦁ Negative:

⦁ Increases I/O load on data_file_directories (e.g., SSDs), potentially causing latency spikes (e.g., >20ms) if disks are slow.

⦁ Triggers compaction later, which can consume CPU and I/O resources, slowing
reads if not tuned (e.g., via compaction_throughput_mb_per_sec).

⦁ Example: If flushing occurs for 200MB of order data, it ensures durability but may
temporarily delay reads, requiring SSD optimization to maintain sub-10ms performance.
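A small sketch for observing a flush in practice; the keyspace and table names (ecommerce.orders) are hypothetical:

# Manually flush the orders table during maintenance
nodetool flush ecommerce orders

# Compare memtable and SSTable statistics before and after the flush
nodetool tablestats ecommerce.orders | grep -E 'Memtable|SSTable count'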

3. List and Describe the Main Components of an SSTable and Their Purposes.

Main Components of an SSTable:


An SSTable (Sorted String Table) is the on-disk data file in Cassandra, created during the
memtable flush process. Its main components include:

⦁ Data File (.db):

⦁ Description: Contains the actual data in sorted order based on the partition key and clustering columns (e.g., order data sorted by user_id and order_date).

⦁ Purpose: Stores the persistent representation of memtable data, enabling efficient read access. For the 1 million orders at 11:07 PM IST, it holds 10 TB of sorted records.

⦁ Index File (.index):

⦁ Description: A summary index mapping partition keys to file offsets, allowing quick location of data ranges (e.g., offset for user_id = 'user123').

⦁ Purpose: Speeds up read operations by reducing seek time, critical for sub-10ms order lookups across a 5-node cluster.

⦁ Filter File (.filter or .bloom):

⦁ Description: A Bloom filter that predicts whether a partition key exists, minimizing disk reads (e.g., checks if user_id = 'user999' is present).

⦁ Purpose: Enhances read performance by avoiding unnecessary I/O, especially useful for the 10,000 concurrent queries during the sale.

⦁ Statistics File (.stats):

⦁ Description: Stores metadata about the SSTable, such as min/max partition keys, compression ratios, and row counts (e.g., min key = 'user001', max = 'user500').

⦁ Purpose: Assists in compaction and repair processes, optimizing space and ensuring data integrity across 10 TB of data.

⦁ Summary File (.summary):

⦁ Description: A sampled index of partition keys at intervals (e.g., every 100th key), reducing memory usage compared to the full index.

⦁ Purpose: Improves read efficiency by providing a coarse-grained lookup, supporting fast range queries (e.g., orders for 'user100' to 'user200').

⦁ TOC File (.toc):

⦁ Description: A table of contents listing all component files (e.g., .db, .index) for the SSTable.

⦁ Purpose: Facilitates file management and integrity checks during maintenance, ensuring all 10 TB of SSTables are correctly structured.

Explanation:
These components work together to balance storage efficiency and query performance. For
instance, during the sale, the Bloom filter and index file enable rapid order retrieval, while the
statistics file aids in optimizing compactions, maintaining cluster health under load.
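On disk, each component is a separate file per SSTable. A sketch of how an administrator might inspect them; the paths and keyspace/table names are assumptions, and exact file names vary by Cassandra version:

# List the component files of the orders table's SSTables (path assumed)
ls /var/lib/cassandra/data/ecommerce/orders-*/
# Typical components: Data.db, Index.db, Filter.db, Statistics.db,
# Summary.db, CompressionInfo.db, TOC.txt

# Print metadata (min/max timestamps, key estimates) for one data file
sstablemetadata /var/lib/cassandra/data/ecommerce/orders-*/*-Data.db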

15-Marks

1. Draw and Explain the Complete Write Path in Cassandra, Including Commit Log, Memtable,
and Flushing to SSTables.

Description of Write Path Diagram:


Imagine a flowchart starting with a "Client Write Request" (e.g., an order for "user123" at 11:13
PM IST). The path splits into two parallel steps:

⦁ An arrow points to "Commit Log" (a disk file, e.g., /var/lib/cassandra/commitlog), where the write is appended synchronously.

⦁ Another arrow points to "Memtable" (an in-memory structure), where the write is
inserted.
From the Memtable, an arrow labeled "Flush Trigger (Size/Time)" leads to "SSTable"
(on-disk file, e.g., /var/lib/cassandra/data), with a feedback loop to "Clear Commit Log"
after flushing.

Explanation of Write Path:

⦁ Step 1: Client Write Request: A write operation (e.g., inserting an order with user_id,
order_date, and total_amount) arrives from a client during the e-commerce sale peak.

⦁ Step 2: Commit Log Update: The write is synchronously appended to the commit log on
disk, ensuring durability. For 100,000 writes/second, this guarantees recovery if a node
fails, protecting 1 million orders.

⦁ Step 3: Memtable Insertion: The write is added to the memtable, a sorted in-memory
structure (e.g., a skip list), enabling fast lookups and writes with sub-10ms latency. The
memtable grows with each write.

⦁ Step 4: Flush Trigger: When the memtable reaches a size threshold (e.g., 128MB) or a
time limit (e.g., 10 minutes), it becomes immutable. A new memtable handles new
writes, and the old one is scheduled for flushing.

⦁ Step 5: Flushing to SSTable: The immutable memtable is written to an SSTable on disk in the data_file_directories (e.g., SSDs), sorted by partition key (e.g., user_id). This persists data for 10 TB of order data.

⦁ Step 6: Commit Log Clearance: After successful flushing, the corresponding commit log
segments are deleted, freeing space and preparing for the next cycle.

⦁ Example: At 11:13 PM IST, an order for "user123" is logged, inserted into the
memtable, and flushed to an SSTable when it hits 150MB, ensuring durability and
performance for 10,000 concurrent users.

Significance:
This write path balances speed (memtable) and reliability (commit log), with flushing ensuring
persistent storage, critical for handling the sale’s high throughput.
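The path can be exercised end to end from the shell. A hedged sketch, assuming cqlsh is available on the node and using a hypothetical ecommerce.orders table with the columns mentioned above:

# 1. Write: appended to the commit log and inserted into the memtable
cqlsh -e "INSERT INTO ecommerce.orders (user_id, order_date, total_amount) VALUES ('user123', '2025-06-20', 499.00);"

# 2. Force the memtable to flush, persisting the row as an SSTable
nodetool flush ecommerce orders

# 3. Confirm a new SSTable was written
nodetool tablestats ecommerce.orders | grep 'SSTable count'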

2. Illustrate and Discuss How Cassandra Reads Data with Optimizations Like Bloom Filters,
Row Cache, and Compression Maps.

Description of Read Path Diagram:


Visualize a flowchart starting with a "Client Read Request" (e.g., "Get orders for user123"). The
path flows through:

⦁ "Bloom Filter" (in-memory filter) to check key existence.

⦁ "Partition Key Lookup" in the "Index File" to locate data offsets.

⦁ "Row Cache" (in-memory cache) to serve recent data directly.

⦁ "SSTable Data File" with "Compression Map" to decompress and retrieve rows.
An arrow loops back to "Row Cache" for future reads if data is cached.

Explanation of Read Path with Optimizations:

⦁ Step 1: Client Read Request: A query (e.g., SELECT * FROM orders WHERE user_id =
'user123') arrives at 11:13 PM IST during the sale.

⦁ Step 2: Bloom Filter Check: The Bloom filter, an in-memory probabilistic data structure,
checks if the partition key exists in the SSTable. If likely present (low false positive rate),
it proceeds; if not, it skips the SSTable, reducing I/O for 10,000 queries/second.

⦁ Step 3: Partition Key Lookup: The index file maps the key to an offset in the SSTable
data file, enabling a seek to the relevant data range, optimizing read latency to
sub-10ms.

⦁ Step 4: Row Cache Utilization: If the requested row (e.g., recent order for "user123") is
in the row cache (configured via row_cache_size_in_mb), it is returned directly from
memory, bypassing disk I/O for frequent accesses.

⦁ Step 5: SSTable Data Retrieval with Compression Map: If not cached, the data file is
accessed. The compression map (stored in the SSTable) provides offsets for
decompressing compressed blocks, retrieving the row efficiently from 10 TB of data.

⦁ Step 6: Cache Update: The retrieved row is added to the row cache for future reads,
improving performance for repeated queries.

⦁ Example: For "user123"’s orders, the Bloom filter skips irrelevant SSTables, the index
locates the data, and the row cache serves a recent order, while the compression map
handles a 1GB compressed block, ensuring fast response times.

Discussion of Optimizations:

⦁ Bloom Filters: Reduce unnecessary disk reads, critical for large datasets (e.g., 10 TB),
though false positives may increase I/O slightly.

⦁ Row Cache: Enhances read performance for hot data (e.g., popular products), but
requires tuning to avoid memory pressure (e.g., 512MB limit).

⦁ Compression Maps: Minimize storage (e.g., 2:1 ratio) and I/O by decompressing only
needed blocks, though excessive compression can slow reads if CPU-bound.

⦁ Trade-Offs: These optimizations prioritize speed but require careful configuration (e.g.,
bloom_filter_fp_chance at 0.01) to balance accuracy and performance during the sale
peak.

3. Analyze the Role of SSTables in Cassandra. What Makes Them Efficient for Reads and How
Are They Structured?

Role of SSTables in Cassandra:


SSTables (Sorted String Tables) are the primary on-disk data storage format in Cassandra,
created when memtables are flushed. They play a critical role in:

⦁ Data Persistence: Store all committed data (e.g., 10 TB of e-commerce orders at 11:13
PM IST) durably on disk.

⦁ Read Optimization: Enable efficient data retrieval through indexing and sorting.

⦁ Compaction Support: Facilitate merging and cleanup of data during compaction,
maintaining performance.

⦁ Scalability: Support distributed reads across a 5-node cluster, handling 10,000 queries/second during the sale.

What Makes Them Efficient for Reads:

⦁ Sorted Structure: Data is sorted by partition key (e.g., user_id), allowing binary search-like access, reducing seek time to sub-10ms.

⦁ Index File: Provides quick offset lookups, minimizing disk scans for large datasets (e.g., 1
million orders).

⦁ Bloom Filters: Pre-filter non-existent keys, cutting I/O by up to 90% for irrelevant
SSTables.

⦁ Compression: Reduces storage (e.g., 2:1 ratio) and I/O, with compression maps
enabling block-level decompression, optimizing read bandwidth.

⦁ Row Cache Integration: Caches frequently accessed rows in memory, bypassing disk for
hot data (e.g., recent orders).

⦁ Immutability: Once written, SSTables are read-only, avoiding write locks and enabling
parallel reads across nodes.

Structure of SSTables:

⦁ Data File (.db): Contains the sorted key-value data (e.g., user_id, order_date,
total_amount), persisted from the memtable. It holds the bulk of 10 TB of order data.

⦁ Index File (.index): Stores a summary of partition keys with offsets, enabling rapid
location of data ranges (e.g., offset for user123’s orders).

⦁ Filter File (.bloom): A Bloom filter predicting key existence, reducing unnecessary reads
(e.g., checks for user999).

⦁ Statistics File (.stats): Metadata like min/max keys and row counts, aiding compaction
and repair (e.g., min key = 'user001').

⦁ Summary File (.summary): A sampled index of keys at intervals (e.g., every 100th key),
optimizing memory usage for range queries.

⦁ Compression Info File (CompressionInfo.db): Tracks compression details, including the compression map for block offsets.

⦁ TOC File (.toc): Lists all component files, ensuring integrity during maintenance.

⦁ Example: An SSTable for "user123"’s orders includes a .db file with 100 rows, an .index
for offsets, and a .bloom filter, enabling a 5ms read for the latest order.

Analysis:
SSTables’ efficiency stems from their immutable, sorted nature and supporting files, which
minimize I/O and leverage memory caches. However, their read performance depends on
proper indexing and compression tuning (e.g., chunk_length_in_kb at 64KB). During
the sale, their structure ensures scalability, but excessive SSTable growth requires compaction
to prevent read degradation.
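The growth and compaction behaviour described above can be tracked with nodetool. A minimal sketch (keyspace/table name assumed):

# Track SSTable count and read latency for the orders table
nodetool tablestats ecommerce.orders | grep -E 'SSTable count|Local read latency'

# Check whether compactions are keeping up with flushes
nodetool compactionstats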

