Chapter 1:
print("The " + cake_name + " is $" + cake_price)
inventory = {"choco": 20, "vanilla": 15}
print(inventory["choco"])
When joining with +, every argument in a print statement must be a string (not float/integer); convert numbers with str().
ratings = [5, 4, 3]
sum_ratings = sum(ratings)

Chapter 2:
add +   subtract -   multiply *   divide /   power ** (for square root use **0.5 or math.sqrt())
if: else:
try: except ValueError:

Variables:
a = "Singapore Zoo"   (each example below starts from this value)
print(a[:3]) → first 3 characters
print(a[-3:]) → last 3 characters
print(a[5:9]) → prints "pore"
a = a.replace("Z", "B") → replaces Z with B (replace is case-sensitive)
a = a.replace("o", "a") → replaces all o's with a (output = Singapare Zaa)
a = a.replace("o", "a", 1) → replaces first occurrence of o with a (output = Singapare Zoo)
a = a + "bay" → adds bay at the end
a = a[:10] + " bay " + a[10:] → adds bay in the middle

Chapter 7 (SQL):
1) SELECT Username, DOB AS Birthday FROM Users;
2) SELECT Username, DOB AS Birthday FROM Users WHERE DOB < '1990-01-01';
3) SELECT Content, Likes
   FROM Posts
   INNER JOIN Users ON Posts.UserID = Users.UserID
   WHERE Users.Username = 'James Lee';
4) SELECT Location, COUNT(*) AS count
   FROM Users
   GROUP BY Location;
5) ON [Link] = [Link]
   WHERE Users.Username LIKE 'James Lee';
6) SELECT * FROM HR;

File creation (txt):
filename = "hello"
file = open(filename, "w")
row = var1 + "," + var2
file.write(row)  # write row to file
file.close()

File creation (csv):
import csv
filename = "[Link]"
with open(filename, "w", newline="") as fp:
    csv_w = csv.writer(fp)
    csv_w.writerow(['var1', 'var2'])  # creating titles (writerow takes one list)
    row = [var1, var2]
    csv_w.writerow(row)
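The CSV file creation steps above can be sketched end to end as follows. This is only an illustration: the file name, the helper names write_rows and read_rows, and the sample values are assumptions, not part of the notes. It writes a header plus two rows, then reads them back to show that csv returns every value as a string.

```python
import csv
import os
import tempfile


def write_rows(path, header, rows):
    """Write a header row followed by data rows to a CSV file."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="") as fp:
        csv_w = csv.writer(fp)
        csv_w.writerow(header)  # writerow takes ONE list, not separate arguments
        for row in rows:
            csv_w.writerow(row)


def read_rows(path):
    """Read every row of a CSV file back as a list of lists of strings."""
    with open(path, "r", newline="") as fp:
        return list(csv.reader(fp))


# Hypothetical file name and values, used only for this illustration.
demo_path = os.path.join(tempfile.gettempdir(), "cheatsheet_demo.csv")
write_rows(demo_path, ["name", "price"], [["choco", 20], ["vanilla", 15]])
demo_rows = read_rows(demo_path)
print(demo_rows)
```

Note that the prices come back as "20" and "15", so they need float() or int() before any comparison or arithmetic.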
File reading:
data = []
with open('[Link]', 'r') as fp:
    for line in fp:
        line_data = line.strip().split(',')
        line_data[3] = float(line_data[3])
        data.append(line_data)  # keep the row, otherwise data stays empty
x = 0
for i in data:
    if i[3] > 37.5:
        x += 1

Nested list from input:
data = []
month = [xxx]
item = [xxx]
for i in range(len(item)):
    row = []
    for m in month:
        sales = int(input(xxx))
        row.append(sales)
    data.append(row)
print("Sale for item 1 in March", data[1][2])

List:
list_a = ['a', 'b', 'c', 'e']
list_a[3] = 'd' → replaces 'e' with 'd'
list_a.insert(3, 'd') → adds 'd' before 'e'
list_a.remove('e') / list_a.pop(3) → removes 'e'

Dictionary:
a_dict = {'city': 'sg', 'flower': 'orchid'}
a_dict['location'] = 'bay' → adds new key and value
a_dict['flower'] = 'rose' → replaces value for an existing key
del a_dict['city'] / a_dict.pop('city') → removes value and key
a_dict['country'] = a_dict.pop('city') → renames the key 'city' to 'country'

Chapter 3:
for i in range(1, 11):
    print(i)
[11 is last number + 1; 11 - 1 = number of times loop should run]
i = 0
while i < x:
    i += 1
    print(i)
list.append(i) [list is name of list, i is name of variable]
mean = sum(list) / len(list)
print(f"The mean of the numbers entered is: {mean}")

Chapter 4:
for char in word:
for char in v:

list_a = []
def satisfaction(n):
    for i in range(1, n + 1):
        try:
            a = int(input(xxx))
            list_a.append(a)
        except ValueError:
            print(xxxx)
    return list_a

n = int(input(xxx))
print(satisfaction(n))

Chapter 5:
attraction = "Singapore Zoo"   * string is immutable
print(attraction[10:14]) → output: "Zoo"
list_a = ["Zoo", "Garden"]
print(list_a[0]) → output: "Zoo"
sales_L = [xxx]
sales_m = [xxx]
month = [xxx]
c_dict = {}
c_dict['L'] = sales_L
c_dict['m'] = sales_m
c_dict['month'] = month
for i in range(len(sales_L)):
    print("In {}, L gets ${} while m gets ${}.".format(c_dict['month'][i], c_dict['L'][i], c_dict['m'][i]))

Chapter 6:
from datetime import date
to_day = date.today()
print(to_day) → 2024-11-29
print(to_day.year) → 2024
print(to_day.month) → 11
print(to_day.day) → 29

%Y → year
%m → month
%d → day
%H → hour
%M → minute
%S → second
%a → day of the week (short form)
%A → day of the week
%B → month (full name)

import datetime / from datetime import datetime
to_day = datetime.datetime.now() / to_day = datetime.now()
x = datetime.timedelta(days=1)   (with import datetime)
[days can be weeks/hours/minutes/seconds and 1 can be positive/negative]
print("One day later: ", to_day + x)

independence = datetime(year=1965, month=8, day=9)   (with from datetime import datetime)
age = to_day - independence
age_in_years = age.days / 365.25

from datetime import datetime, date, time, timedelta
moment = datetime.now()
print(moment) → output: 2024-11-29 followed by the current time
print(moment.strftime("%d / %m / %Y"))
print(moment.strftime("%H : %M"))

Chapter 7 (SQL), continued:
7) SELECT DISTINCT position FROM HR ORDER BY position;
8) SELECT X, Y, Z,
   A AS B
   FROM HR;
9) SELECT COUNT(*) FROM HR WHERE sex = 'M';
10) SELECT MIN(salary) AS 'min pay',
    MAX(salary) AS 'max pay',
    AVG(salary) AS 'avg pay'
    FROM HR;
11) SELECT *
    FROM HR
    WHERE salary BETWEEN 20000 AND 30000;
12) WHERE position = 'Supervisor' OR position = 'Manager'; /
    WHERE position IN ('Supervisor', 'Manager');
13) WHERE NOT (position = 'Supervisor' OR position = 'Manager'); /
    WHERE position NOT IN ('Supervisor', 'Manager');
14) WHERE name LIKE '%o%'; [name contains the letter 'o']
    WHERE name LIKE '%y'; [name ends with the letter y]
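The Chapter 7 queries on the HR table can be tried without a database server. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the column layout and the four sample rows are assumptions made up for illustration, and other SQL engines may differ in details such as LIKE case sensitivity.

```python
import sqlite3

# In-memory database with a hypothetical HR table (sample rows are made up).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE HR (name TEXT, position TEXT, sex TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO HR VALUES (?, ?, ?, ?)",
    [
        ("Tony", "Supervisor", "M", 25000),
        ("Mary", "Manager", "F", 40000),
        ("Joe", "Clerk", "M", 18000),
        ("Amy", "Clerk", "F", 22000),
    ],
)


def names(sql):
    """Run a query that selects one column and return it as a list."""
    return [row[0] for row in conn.execute(sql)]


# DISTINCT + ORDER BY: each position once, sorted alphabetically.
positions = names("SELECT DISTINCT position FROM HR ORDER BY position")

# COUNT with a WHERE filter.
male_count = conn.execute("SELECT COUNT(*) FROM HR WHERE sex = 'M'").fetchone()[0]

# Several aggregates in one SELECT, separated by commas.
pay = conn.execute("SELECT MIN(salary), MAX(salary), AVG(salary) FROM HR").fetchone()

# BETWEEN is inclusive at both ends.
mid_band = names("SELECT name FROM HR WHERE salary BETWEEN 20000 AND 30000 ORDER BY name")

# IN is shorthand for a chain of OR comparisons.
leaders = names("SELECT name FROM HR WHERE position IN ('Supervisor', 'Manager') ORDER BY name")

# LIKE with % wildcards: contains 'o', and ends with 'y'.
contains_o = names("SELECT name FROM HR WHERE name LIKE '%o%' ORDER BY name")
ends_y = names("SELECT name FROM HR WHERE name LIKE '%y' ORDER BY name")

print(positions, male_count, pay, mid_band, leaders, contains_o, ends_y)
```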
Chapter 8: (Pandas)
df.head()
df.shape → no. of rows and columns
df.shape[1] → no. of columns
df.[Link]() → column names
df.dtypes → type of data in each column
df[df.duplicated()] → shows duplicated records
df.isna() → null records
df.isna().sum() → no. of null records
[Link] → statistical summary
df.head(5) → first 5 records
df.tail(2) → last 2 records
df.[Link].strip() → remove white space
df.dropna() → remove null values
df.drop(columns=['x', 'y'])
df.xxx.astype(str) → convert to string
df.drop_duplicates(subset=['xxx'], keep='last', inplace=True)
df['xxx'].unique()
df['xxx'].nunique() → count of unique values
df['xxx'].notnull().sum()
df[df['xxx'] == 'xxx'].count()
df.replace("xxx", "xxx", inplace=True)
df.rename(columns={'xxx': 'xxx'}, inplace=True)
df[(df['xxx'] > 10) | (df['xxx'] == 'xxx')]

Chapter 9:
Describing data:
df.describe()
df.corr()
df.quantile([0.5, 0.7, 0.8, 0.90])
[Link]()
[Link]()
[Link]()
[Link]()
df.min()
df.max()
df.sum()
[Link]()

Understanding data:
df.index
[Link]
[Link] | df['xxx']
df.loc[Row No]
df.loc[Row No, 'colname']
df.iloc[row no.]
df.iloc[row, col]
df[[Link] > xxx]
df.loc[([Link] > xxx) & (xxxx)]   (boolean filters need loc, not iloc; wrap each condition in parentheses)

df.iloc[144:147, ] (147 = last row plus 1)
df.iloc[2:5, 1:6]
df[['xxx', 'xxx']]
df.iloc[:, [0, 1, 2]]
df.groupby('room-type')['price'].std().max()
df.groupby('Distribution Channel').sum().sum(axis=1).sort_values(ascending=False)
[Link](df['sales'] > 1000).min()

Chapter 10:
Scatterplot:
import pandas as pd
import matplotlib.pyplot as plt
df = pd.read_csv("job_market.csv")
plt.figure(figsize=(12, 6))
plt.scatter(df['year'], df['recruitment_rate'], label='Recruitment Rate')
plt.title('Overall Recruitment Rate', fontsize=14)
plt.xlabel('Year', fontsize=12)
plt.ylabel('Recruitment Rate', fontsize=12)
plt.show()

categories = df["categories"].unique()  # Get unique categories
colors = ['blue', 'orange', 'green']  # Colors for the groups
for category, color in zip(categories, colors):
    category_data = df[df["categories"] == category]
    plt.scatter(category_data["year"], category_data["recruitment_rate"], label=category, color=color)

Line Chart:
average_resignation = df.groupby("year")["resignation_rate"].mean().reset_index()
plt.plot(average_resignation["year"], average_resignation["resignation_rate"], color='blue', label='Average Resignation Rate')
plt.title("Average Resignation Rates Over the Years", fontsize=14)
plt.xlabel("Year", fontsize=12)
plt.ylabel("Average Resignation Rate", fontsize=12)
plt.legend()
plt.show()

Pie Chart:
import pandas as pd
import matplotlib.pyplot as plt
channel_1_data = df[df["Distribution Channel"] == 1]
goods_columns = ['Cold Food', 'Dairy', 'Grocery', 'Frozen']
distribution = channel_1_data[goods_columns].sum()
plt.pie(distribution, labels=distribution.index, autopct='%1.1f%%', startangle=90, colors=[Link])
plt.title("Distribution of Goods in Channel 1", fontsize=14)
plt.axis("equal")
plt.show()

Bar Chart:
channel_1_data = df[df["Distribution Channel"] == 1]
goods_columns = ['Cold Food', 'Dairy', 'Grocery', 'Frozen']
distribution = channel_1_data[goods_columns].sum()
distribution.plot(kind='bar', color='skyblue')
plt.title("Total Distribution of Goods in Channel 1", fontsize=14)
plt.xlabel("Goods", fontsize=12)
plt.ylabel("Total Quantity", fontsize=12)
plt.xticks(rotation=45, ha='right')
plt.show()

Heatmap:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
correlation_matrix = df.corr()
sns.heatmap(correlation_matrix, annot=True, cmap='coolwarm', fmt='.2f', linewidths=0.5)
plt.title("Correlation Matrix Heatmap", fontsize=14)
plt.show()
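As a rough sketch of what each cell in the correlation heatmap holds: df.corr() uses the Pearson coefficient by default, which can be computed by hand in plain Python. The function name pearson and the sample columns below are assumptions for illustration only; real data should go through pandas.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations (covariance numerator).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical columns standing in for a small DataFrame.
columns = {
    "year": [2019, 2020, 2021, 2022],
    "recruitment_rate": [2.0, 1.5, 2.5, 3.0],
    "resignation_rate": [1.8, 1.2, 2.0, 2.1],
}

# Every pair of columns gets one coefficient, like the grid the heatmap draws.
matrix = {
    a: {b: pearson(columns[a], columns[b]) for b in columns}
    for a in columns
}

for name, row in matrix.items():
    print(name, {other: round(value, 2) for other, value in row.items()})
```

Values run from -1 (perfect inverse relationship) to 1 (perfect direct relationship); the diagonal is 1 because each column is compared with itself.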
Seaborn:
Bar plots
Box plots
Violin plots
Strip plots
Count plots
Distribution plots (e.g., histograms, KDEs)
Correlation heatmaps
Boxplots and violin plots
Pair plots (visualizing relationships between multiple variables)
Regression plots (fitting regression lines to data)