Ultimate Developer Command Guide
Complete Python, PySpark & SQL Reference
All essential commands in one optimized cheat sheet
Python Commands
Command: print()
Function: Outputs data to the console
Example:
    print("Welcome to Python!")  # Prints: Welcome to Python!

Command: len()
Function: Returns the length of an object
Example:
    my_list = [1, 2, 3]
    print(len(my_list))  # Prints: 3

Command: range()
Function: Generates a sequence of numbers
Example:
    for i in range(3):
        print(i)  # Prints 0, 1, 2 on separate lines

Command: def
Function: Defines a custom function
Example:
    def greet(name):
        return f"Hello, {name}"

    print(greet("Alice"))  # Prints: Hello, Alice

Command: import
Function: Imports a module or library
Example:
    import math
    print(math.pi)  # Prints: 3.141592653589793

Command: [x for x in iterable]
Function: Creates a list using a comprehension
Example:
    squares = [x**2 for x in [1, 2, 3]]
    print(squares)  # Prints: [1, 4, 9]

Command: if/elif/else
Function: Conditional logic
Example:
    x = 10
    if x > 5:
        print("Big")  # Prints: Big
    else:
        print("Small")

Command: for
Function: Iterates over a sequence
Example:
    for fruit in ["apple", "banana"]:
        print(fruit)  # Prints apple, banana on separate lines

Command: while
Function: Loops until the condition is false
Example:
    count = 0
    while count < 3:
        print(count)  # Prints 0, 1, 2 on separate lines
        count += 1

Command: try/except
Function: Handles exceptions
Example:
    try:
        print(1 / 0)
    except ZeroDivisionError:
        print("Cannot divide by zero")  # Prints: Cannot divide by zero

Command: open()
Function: Opens a file for reading or writing
Example:
    with open("[Link]", "w") as f:
        f.write("Hello")  # Creates the file containing the text "Hello"

Command: list.append()
Function: Adds an item to the end of a list
Example:
    my_list = []
    my_list.append(5)
    print(my_list)  # Prints: [5]

Command: dict.get()
Function: Retrieves a value from a dictionary (returns None if the key is missing)
Example:
    my_dict = {"key": "value"}
    print(my_dict.get("key"))  # Prints: value
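The commands above are shown one at a time; a minimal sketch of how several of them combine in one short script may help. The function names (`safe_divide`, `count_lengths`), the sample data, and the file name `demo.txt` are hypothetical and chosen only for illustration.

```python
import os
import tempfile


def safe_divide(a, b):
    """Return a / b, or None when b is zero (try/except)."""
    try:
        return a / b
    except ZeroDivisionError:
        return None


def count_lengths(words):
    """Build a dict mapping each word to its length (for loop + len)."""
    lengths = {}
    for word in words:
        lengths[word] = len(word)
    return lengths


fruits = ["apple", "banana"]
lengths = count_lengths(fruits)

# dict.get() accepts a default that is returned when the key is missing.
missing = lengths.get("cherry", 0)

# List comprehension over range().
squares = [x**2 for x in range(1, 4)]

# Write a file and read it back inside a temporary directory,
# so the sketch leaves nothing behind on disk.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo.txt")  # hypothetical file name
    with open(path, "w") as f:
        f.write("Hello")
    with open(path) as f:
        content = f.read()

print(lengths)
print(missing)
print(squares)
print(safe_divide(1, 0))
print(content)
```

Wrapping the division in a function keeps the `try/except` in one place, and passing a default to `dict.get()` avoids a separate membership check.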
PySpark Commands
Command: SparkSession.builder
Function: Initializes a Spark session
Example:
    from pyspark.sql import SparkSession
    spark = SparkSession.builder.appName("MyApp").getOrCreate()

Command: spark.read.csv()
Function: Loads a CSV file into a DataFrame
Example:
    df = spark.read.csv("[Link]", header=True, inferSchema=True)
    df.show()  # Displays CSV data

Command: df.show()
Function: Displays the first n rows of a DataFrame
Example:
    df.show(3)  # Shows first 3 rows

Command: df.printSchema()
Function: Displays the DataFrame schema
Example:
    df.printSchema()  # Shows column names and types

Command: df.select()
Function: Selects specific columns
Example:
    df.select("name", "age").show()  # Shows name and age columns

Command: df.filter()
Function: Filters rows based on a condition
Example:
    df.filter(df.age > 25).show()  # Shows rows where age > 25

Command: df.where()
Function: Alias for filter
Example:
    df.where("salary > 50000").show()  # Filters rows where salary > 50000

Command: df.groupBy().agg()
Function: Groups data and applies an aggregation
Example:
    df.groupBy("department").agg({"salary": "avg"}).show()  # Shows avg salary per dept

Command: df.join()
Function: Joins two DataFrames
Example:
    df.join(df2, df.id == df2.id, "inner").show()  # Inner join on id

Command: df.withColumn()
Function: Adds or modifies a column
Example:
    df.withColumn("age_plus_10", df.age + 10).show()  # Adds column with age + 10

Command: df.withColumnRenamed()
Function: Renames a column
Example:
    df.withColumnRenamed("old_name", "new_name").show()  # Renames column

Command: df.drop()
Function: Drops specified columns
Example:
    df.drop("salary").show()  # Drops salary column

Command: df.fillna()
Function: Replaces null values
Example:
    df.fillna({"age": 0}).show()  # Replaces null ages with 0

Command: df.dropDuplicates()
Function: Removes duplicate rows
Example:
    df.dropDuplicates(["name"]).show()  # Drops rows with duplicate names

Command: df.write.csv()
Function: Saves a DataFrame as CSV
Example:
    df.write.csv("[Link]", mode="overwrite")  # Saves DataFrame to CSV

Command: df.createOrReplaceTempView()
Function: Registers a DataFrame as a temporary SQL view
Example:
    df.createOrReplaceTempView("temp_table")  # Creates SQL view

Command: spark.sql()
Function: Runs a SQL query on a registered view
Example:
    spark.sql("SELECT name FROM temp_table WHERE age > 30").show()  # Runs SQL query

Command: Window.partitionBy()
Function: Defines a window for ranking/aggregation
Example:
    from pyspark.sql.window import Window
    from pyspark.sql.functions import row_number
    w = Window.partitionBy("dept").orderBy("salary")
    df.withColumn("rank", row_number().over(w)).show()  # Adds rank column
SQL Commands
Command: SELECT
Function: Retrieves data from a table
Example:
    SELECT name, age FROM employees;  -- Selects name and age columns

Command: WHERE
Function: Filters rows based on a condition
Example:
    SELECT * FROM employees WHERE age > 30;  -- Filters employees older than 30

Command: ORDER BY
Function: Sorts the result set
Example:
    SELECT * FROM employees ORDER BY salary DESC;  -- Sorts by salary in descending order

Command: GROUP BY
Function: Groups rows for aggregation
Example:
    SELECT department, AVG(salary) FROM employees GROUP BY department;  -- Avg salary per dept

Command: HAVING
Function: Filters grouped results
Example:
    SELECT department, COUNT(*) FROM employees
    GROUP BY department
    HAVING COUNT(*) > 5;  -- Departments with more than 5 employees

Command: JOIN
Function: Combines rows from multiple tables
Example:
    SELECT e.name, d.dept_name
    FROM employees e
    JOIN departments d ON e.dept_id = d.id;  -- Joins tables

Command: LEFT JOIN
Function: Includes all rows from the left table, matched or not
Example:
    SELECT e.name, d.dept_name
    FROM employees e
    LEFT JOIN departments d ON e.dept_id = d.id;  -- Left join

Command: LIMIT
Function: Restricts the number of returned rows
Example:
    SELECT * FROM employees LIMIT 5;  -- Returns first 5 rows

Command: INSERT INTO
Function: Adds new rows to a table
Example:
    INSERT INTO employees (name, age) VALUES ('Alice', 28);  -- Inserts a new employee

Command: UPDATE
Function: Modifies existing rows
Example:
    UPDATE employees SET salary = 60000 WHERE name = 'Alice';  -- Updates salary

Command: DELETE
Function: Removes rows from a table
Example:
    DELETE FROM employees WHERE age < 18;  -- Deletes rows where age < 18

Command: CREATE TABLE
Function: Creates a new table
Example:
    CREATE TABLE employees (id INT, name VARCHAR(50), age INT);  -- Creates employees table

Command: ALTER TABLE
Function: Modifies table structure
Example:
    ALTER TABLE employees ADD COLUMN salary DECIMAL(10,2);  -- Adds salary column

Command: DROP TABLE
Function: Deletes a table
Example:
    DROP TABLE employees;  -- Deletes employees table
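To try the SQL statements above without setting up a database server, one option is Python's built-in `sqlite3` module with an in-memory database. The sketch below is an illustration under that assumption: the sample rows and the exact column layout are hypothetical, and SQLite's type handling differs slightly from other engines.

```python
import sqlite3

# In-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# CREATE TABLE (hypothetical schema modeled on the examples above).
cur.execute("CREATE TABLE departments (id INTEGER PRIMARY KEY, dept_name TEXT)")
cur.execute(
    "CREATE TABLE employees ("
    "id INTEGER PRIMARY KEY, name TEXT, age INTEGER, salary REAL, dept_id INTEGER)"
)

# INSERT INTO, using ? placeholders instead of string formatting.
cur.executemany(
    "INSERT INTO departments (id, dept_name) VALUES (?, ?)",
    [(1, "Engineering"), (2, "Operations")],
)
cur.executemany(
    "INSERT INTO employees (name, age, salary, dept_id) VALUES (?, ?, ?, ?)",
    [
        ("Alice", 28, 50000, 1),
        ("Bob", 35, 65000, 1),
        ("Cara", 17, 20000, 2),
        ("Dan", 44, 58000, None),  # no department assigned
    ],
)

# UPDATE and DELETE.
cur.execute("UPDATE employees SET salary = ? WHERE name = ?", (60000, "Alice"))
cur.execute("DELETE FROM employees WHERE age < 18")
conn.commit()

# WHERE + ORDER BY.
older = cur.execute(
    "SELECT name FROM employees WHERE age > 30 ORDER BY salary DESC"
).fetchall()

# JOIN + GROUP BY + HAVING.
avg_by_dept = cur.execute(
    "SELECT d.dept_name, AVG(e.salary) "
    "FROM employees e JOIN departments d ON e.dept_id = d.id "
    "GROUP BY d.dept_name HAVING COUNT(*) > 1"
).fetchall()

# LEFT JOIN keeps employees that have no matching department.
left_joined = cur.execute(
    "SELECT e.name, d.dept_name "
    "FROM employees e LEFT JOIN departments d ON e.dept_id = d.id "
    "ORDER BY e.name"
).fetchall()

print(older)
print(avg_by_dept)
print(left_joined)

conn.close()
```

The `?` placeholders let the driver handle quoting, which is safer than building SQL strings by hand when values come from user input.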
Cheat Sheet Summary
Your comprehensive reference for daily development tasks across Python, PySpark and SQL
Version 2.0 | Updated: August 2024