SQL Cheatsheet for Data Analytics Cheat Sheet
by sourabhsahu12345 via cheatography.com/192985/cs/40126/
RDBMS Softwares BETWEEN AND IN SUBQUERIES
Oracle MYSQL IN checks if a column value matches any Scalar Subquery: Returns a single value
MS SQL Server MariaDB value in a list, while BETWEEN checks if a and can be used within SELECT, WHERE,
column value falls within a specified range. or FROM clauses.
Follow @Sourabh Sahu SELECT * FROM product WHERE category SELECT product_name, (SELECT
IN ('Electronics', 'Clothing', 'Furniture'); MAX(price) FROM products) AS max_price
SELECT * FROM product WHERE price FROM products;
QUERYING DATA BETWEEN 20 AND 50; Single-Row Subquery: Subquery that
Fetch all columns from the products table: returns a single row of results, typically
SELECT * FROM products; CONSTRAINTS used in comparison operations.
Fetch product IDs and names from the Constraints are rules applied to database SELECT product_name FROM products
inventory table: SELECT product_id, tables to ensure data accuracy, integrity, WHERE price = (SELECT MAX(price)
product_name FROM inventory; and consistency. FROM products);
Fetch employee names sorted by their PRIMARY KEY enforces unique, non-null Multi-Row Subquery: Subquery that returns
salary in descending order: SELECT values in the "product_id" column. multiple rows, often used with IN, ANY, or
employee_name FROM employees ORDER ALL operators.
FOREIGN KEY links the "manufacturer_id"
BY salary DESC; column to a related table. SELECT product_name FROM products
Fetch customer names sorted by their WHERE manufacturer_id IN (SELECT
NOT NULL ensures the "product_name"
registration date in ascending order manufacturer_id FROM manufacturers
must have a value.
(default): SELECT customer_name FROM WHERE country = 'USA');
CHECK enforces a condition, like ensuring
customers ORDER BY registration_date; Correlated Subquery: References values
"price" is positive.
from the outer query within the subquery.
DEFAULT assigns a default value to a
ALIASING (AS) SELECT product_name FROM products p
column when no value is provided during
SELECT name AS product_name FROM WHERE price > (SELECT AVG(price)
insertion.
products; FROM products WHERE manufacturer_id =
CREATE TABLE product ( product_id INT
p.manufacturer_id);
PRIMARY KEY, product_name VARCHA‐
INDEXING Inline View Subquery: Creates a temporary
R(255) NOT NULL, price DECIMAL(10, 2)
Indexing improves data retrieval speed by table-like structure for complex subqueries
CHECK (price > 0), manufacturer_id INT,
creating a reference system. It enhances within the FROM clause.
FOREIGN KEY (manufacturer_id)
query performance but may slightly impact REFERENCES manufacturer(manufacture‐ SELECT AVG(subquery.avg_price) FROM
insert/update operations. r_id), discount DECIMAL(5, 2) DEFAULT (SELECT manufacturer_id, AVG(price) AS
Clustered Index: Dictates physical data 0.00 -- Default discount set to 0.00 ); avg_price FROM products GROUP BY
order, often used for primary keys. Non-Cl‐ manufacturer_id) AS subquery;
ustered Index: Creates separate index
structures, allowing multiple indexes for
specific columns, without affecting data
storage order.
By sourabhsahu12345 Not published yet. Sponsored by ApolloPad.com
Last updated 3rd September, 2023. Everyone has a novel in them. Finish
Page 1 of 4. Yours!
https://apollopad.com
cheatography.com/sourabhsahu12345/
SQL Cheatsheet for Data Analytics Cheat Sheet
by sourabhsahu12345 via cheatography.com/192985/cs/40126/
SQL and NO SQL SET OPERATION IS NULL / IS NOT NULL
Uses a structured, Offers various flexible INTERSECT: Returns the common rows IS NULL: Filters rows where the "manufa‐
tabular data data models, suitable between two SELECT statements. cturer_id" is not assigned. SELECT * FROM
model with for semi-structured or Retrieves products that exist in both sets of product WHERE manufacturer_id IS NULL;
predefined unstructured data. results. IS NOT NULL: Filters rows where the "man‐
schemas. SELECT product_id, product_name FROM ufacturer_id" is assigned. SELECT * FROM
Primarily Designed for products INTERSECT SELECT product_id, product WHERE manufacturer_id IS NOT
designed for horizontal scalability, product_name FROM some_other_table; NULL;
vertical scaling, making it easier to UNION: Combines the results of two
can be complex to handle high traffic and SELECT statements, removing duplicates. FILTERING
scale horizontally large datasets. Retrieves all unique products from both sets Retrieve all products with a price greater
of results. than $50: SELECT * FROM product
WINDOW AND RANKING FUNCTIONS
SELECT product_id, product_name FROM WHERE price > 50;
Window functions allow you to perform products UNION SELECT product_id, Get products with a quantity in stock less
calculations across a set of rows related to product_name FROM some_other_table; than or equal to 10: SELECT * FROM
the current row within the result set. Rank product WHERE stock_quantity <= 10;
UNION ALL: Similar to UNION, but includes
functions assign a rank or row number to
all rows, including duplicates. Retrieves all Retrieve products with a specific category
each row based on specified criteria.
products from both sets of results, allowing (e.g., "Electronics"): SELECT * FROM
SELECT product_name, price, ROW_NU‐ duplicates. product WHERE category = 'Electronics';
MBER() OVER (ORDER BY price) AS
SELECT product_id, product_name FROM Find products with names containing the
row_num, DENSE_RANK() OVER (ORDER
products UNION ALL SELECT product_id, word "phone": SELECT * FROM product
BY price) AS dense_rank, SUM(price)
product_name FROM some_other_table; WHERE product_name LIKE '%phone%';
OVER (PARTITION BY manufacturer_id)
EXCEPT: Returns rows that exist in the first Get products added after a certain date
AS manufacturer_total FROM products;
SELECT statement but not in the second. (e.g., '2023-01-01'): SELECT * FROM
we use window functions like ROW_NU‐
Retrieves products that are in the "products" product WHERE date_added > '2023-01-
MBER, DENSE_RANK, and SUM with the "‐
table but not in "some_other_table." 01';
products" table to assign row numbers,
SELECT product_id, product_name FROM Retrieve products with a price between $20
dense ranks, and calculate the total price for
products EXCEPT SELECT product_id, and $30: SELECT * FROM product WHERE
each manufacturer's products.
product_name FROM some_other_table; price BETWEEN 20 AND 30;
RANK assigns the same rank to rows with
Find products with low stock and a price
equal values, leaving gaps, while
greater than $50: SELECT * FROM product
DENSE_RANK assigns consecutive ranks
WHERE stock_quantity < 5 AND price > 50;
without gaps, and ROW_NUMBER assigns
a unique row number to each row.
By sourabhsahu12345 Not published yet. Sponsored by ApolloPad.com
Last updated 3rd September, 2023. Everyone has a novel in them. Finish
Page 2 of 4. Yours!
https://apollopad.com
cheatography.com/sourabhsahu12345/
SQL Cheatsheet for Data Analytics Cheat Sheet
by sourabhsahu12345 via cheatography.com/192985/cs/40126/
FILTERING (cont) JOINS (cont) PERFORMANCE OPTIMIZATION
Retrieve products from a specific manufa‐ SELECT s1.salesperson_name, s2.salesp‐ Indexing: Use appropriate indexes.
cturer (e.g., "Samsung"): SELECT * FROM erson_name FROM sales AS s1 JOIN sales Optimize Queries: Write efficient SQL.
product WHERE manufacturer = 'Samsung'; AS s2 ON s1.product_id = s2.product_id
Limit Data Retrieval: Fetch only needed
WHERE s1.salesperson_name <> s2.sal‐
data.
JOINS esperson_name;
Normalization: Properly structure tables.
INNER JOIN: Returns only the rows with
AGGREGATION AND GROUPING Stored Procedures: Use precompiled
matching values in both tables.
procedures.
SELECT * FROM sales INNER JOIN Aggregation functions like COUNT, SUM,
AVG, MAX, and MIN are used to perform Table Partitioning: Divide large tables.
products ON sales.product_id = products.p‐
roduct_id; calculations on data. GROUP BY clause is Regular Maintenance: Rebuild indexes,
used to group rows based on one or more update stats.
LEFT JOIN (or LEFT OUTER JOIN):
columns. Aggregation functions are applied Optimize Hardware: Ensure server
Returns all sales records and their corres‐
to each group, providing summary inform‐ resources.
ponding products. If a product has no sales,
ation.
it still appears with NULL values in sales-‐ Caching: Implement caching mechanisms.
related columns. -- Count the total number of products
Use Proper Data Types: Choose suitable
SELECT COUNT(*) AS total_products
SELECT * FROM products LEFT JOIN types.
FROM products;
sales ON products.product_id = sales.pro‐ Concurrency Control: Manage transactions
duct_id; -- Calculate the total price of all products
carefully.
SELECT SUM(price) AS total_price FROM
RIGHT JOIN (or RIGHT OUTER JOIN):
products;
Opposite of the LEFT JOIN. Returns all Data Types
sales records and includes products that -- Calculate the average price of products
CHAR(N): VARCHA VARCHAR and
have no sales SELECT AVG(price) AS average_price
Fixed-‐ R(N): VARCHAR2 are
FROM products;
SELECT * FROM sales RIGHT JOIN length Variable-‐ used interchan‐
products ON sales.product_id = products.p‐ -- Find the highest product price SELECT
character length geably in most
roduct_id; MAX(price) AS highest_price FROM
string with character databases, but
products;
FULL OUTER JOIN: Returns all sales a string VARCHAR2 is
records and all products, including those -- Find the lowest product price SELECT specified with a specific to Oracle
without sales. NULL values appear where MIN(price) AS lowest_price FROM maximum maximum databases.
there is no match. products; length of length of
N. N.
SELECT * FROM products FULL OUTER
DIMENSIONAL MODELLING
JOIN sales ON products.product_id = INTEGE‐ BLOB: DATETIME:
sales.product_id; A design approach for data warehousing. R,FLOA‐ Stores Combines date
Organizes data into fact tables (quantitative T,B‐ binary and time in
SELF JOIN: A self join could be used if the
data) and dimension tables (descriptive OOLEAN large 'YYYY-MM-DD
"sales" table contains information about
data). objects, HH:MM:SS'
salespeople who sell products. You might
such as format.
join the table with itself to identify salesp‐ Simplifies complex queries, improves
images or
eople who have sold the same products. performance, and supports business
files.
analytics by creating a logical structure for
data analysis.
By sourabhsahu12345 Not published yet. Sponsored by ApolloPad.com
Last updated 3rd September, 2023. Everyone has a novel in them. Finish
Page 3 of 4. Yours!
https://apollopad.com
cheatography.com/sourabhsahu12345/
SQL Cheatsheet for Data Analytics Cheat Sheet
by sourabhsahu12345 via cheatography.com/192985/cs/40126/
Data Types (cont)
DECIMAL(P, DATE: TIME:
S): Fixed-point Stores a Represents
decimal number date a time of
with P total value in day in
digits and S the format 'HH:MM:SS'
decimal places. 'YYYY-‐ format.
MM-DD'.
By sourabhsahu12345 Not published yet. Sponsored by ApolloPad.com
Last updated 3rd September, 2023. Everyone has a novel in them. Finish
Page 4 of 4. Yours!
https://apollopad.com
cheatography.com/sourabhsahu12345/