0% found this document useful (0 votes)
17 views2 pages

Python Cheatsheet 1

The document provides an overview of various programming concepts including exception handling, file operations, SQL commands, and data manipulation using Python libraries like Pandas and Matplotlib. It covers topics such as string manipulation, dictionary functions, list operations, and data visualization techniques. Additionally, it includes code examples for practical applications in data analysis and SQL queries.

Uploaded by

bryong05
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
17 views2 pages

Python Cheatsheet 1

The document provides an overview of various programming concepts including exception handling, file operations, SQL commands, and data manipulation using Python libraries like Pandas and Matplotlib. It covers topics such as string manipulation, dictionary functions, list operations, and data visualization techniques. Additionally, it includes code examples for practical applications in data analysis and SQL queries.

Uploaded by

bryong05
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

Types of Exceptions: String Sequence: Read/Write Text File: SQL DDL:

While True: * Index starts from 0 filep = open(“[Link]”, “r”) CREATE, ALTER, NOT NULL, UNIQUE, DEFAULT,
try: for each_line in filep: PRIMARY/FOREIGN KEY, CHECK, DROP
String Slicing Variable[index] -> Variable[start:stop] print (each_line)
x = int(input(“Please enter a valid [Link]()
number: “) Reverse Slicing Variable[-index] -> Variable[start:stop:step] SQL DML:
------------------------------------------------------------------------------
except ValueError: Concatenate Combine_text = text1 + “ “ + text2 + … filep = open(“[Link]”, “w”) INSERT, SELECT, WHERE, GROUP BY, HAVING,
print(“Oops! That was no valid data = [ ] ORDER BY, UPDATE, DELETE, COMMIT, ROLLBACK
Functions .capitalise(), .upper(), .isupper(), .isalpha() for each_line in data:
number.”) Comparison: =, <, >, <=, >=, <>, !=
-------------------------------------------------- Length len(text) print(each_line, file=filep) Logical: AND, OR, NOT
[Link]()
AttributeError, ImportError, IndexError, .format() ------------------------------------------------------------------------------ Special: BETWEEN, IS NULL, LIKE, IN , EXISTS,
KeyError, IndentationError, TypeError, “Text {:formatting for variable} other text“ [Link]() # Read as a single string DISTINCT
ValueError, ZeroDivisionError, IOError Format .format(matching variable/value) [Link]() # Read line by line as a string Aggregate: COUNT, MIN, MAX, SUM, AVG
Truth Table: s = string, d = decimal, f = float [Link]() # Read as a single list SQL Code Example:
< = left align, ^ = centre align, > = right align [Link](lines) # Lines contain list of string with data
“INSERT INTO table_name VALUES(“_”, ”_”,);”
Read With Open Text/CSV File: “SELECT * from table_name;”
Dictionary Functions: [ ] with open(“[Link]”) as filep:
{“key: ”, value} “UPDATE table_name SET “_”= “XX” WHERE “_”=
data = [Link]()
dict_name.XX() ------------------------------------------------------------------------------
“XX”;“
clear, items, keys, values import csv “DELETE from table_name WHERE “_” = “XX”;”
with open(“[Link]”) as filep:
Tuple Functions: ( ) csv_pointer = [Link](filep) CREATE TABLE table1_name (
Dictionary Comprehension:
tuple_name.count(“a”): returns the # of for each in csv_pointer: col1 INTEGER NOT NULL UNIQUE, col2
dict_name = {key: value for data in iterator if
items whose content matches with “a” print(each) VARCHAR(50) NOT NULL, PRIMARY KEY (col1),
condition} ------------------------------------------------------------------------------
tuple_name.index(“a”): returns the index ---------------------------------------------------------------------
Write With Open CSV File: FOREIGN KEY (col of table1) REFERENCES
number of first “a” in tuple Dictionary Iteration: import csv
table2_name(col PK) );
List Functions: ( ) For key, value in dict_name.items() data = [ ]
with open(“[Link]”, “w”) as filep: SQL Join: * LEFT JOIN supported, RIGHT & FULL JOIN
XX(list_name) OS Module: csv_pointer = [Link](filep) not supported by SQLite
len, min, max, sum csv_pointer.writerow(data)
list_name.XX() Current working dir [Link]() or csv_pointer.writerows(data)
Joining Tables:
append, insert, sort, reverse, split, split(“ ”) Change current dir [Link](path) def get_revenue(): SELECT * FROM MANAGER, EXECUTIVE WHERE
list_name(range(start, stop, step)) revenues = []
Rename file/dir [Link](“source”, “destination”) for i in range(1, 4): MANAGER.MANAGER_CODE =
[Link](float(input("Enter Rev {}:".format(i))) EXECUTIVE.MANAGER_CODE;
List Comprehension: Make dir [Link](file_path) return revenues
list_name = [data for data in iterator if Remove file/dir [Link](path) Inner Join: Combines the results similar to AND
condition] def print_revenue(revenues, filename):
operation look for values common to both tables.
[Link](path) fp = open(filename, "w")
List Iteration: List of files/dirs [Link](“.”) # Current dir print("Revenue Report\n", "-"*20, file=fp) SELECT * FROM MANAGER M INNER JOIN
names = [“John”, “Jane”, “Joey”] [Link](“..”) # Parent dir for index, value in enumerate(revenues): EXECUTIVE E on M.MANAGER_CODE =
print("Revenue for {}: ${:15,.2f} Cumm. Total: E.MANAGER_CODE;
id = [1033, 1034, 1035] import os ${}".format(index+1, value, sum(revenues[:index+1])),file=fp)
for index, element in enumerate(names): new_dir = [Link]() + [Link] + “sub” print("Total ${:,.2f}".format(sum(revenues)), file=fp)
print(“{}’s id is {}.”.format(element, [Link]() Outer Join: Keeps the non-matching results when
if not [Link](new_dir) join is done.
id[index]) [Link](new_dir) def task3(): SELECT * FROM MANAGER M LEFT JOIN EXECUTIVE
[Link](“old path”, “new path”) print_revenue(get_revenue(), "[Link]") E on M.MANAGER_CODE = E.MANAGER_CODE;
enumerate(list) # Returns index & val pair task3()
Pandas Summary: * inplace = True df.__.plot(kind=“bar”)
Code Example (1):
sales_latte = [10000, 12000, 14000, 34000, 12000, 14000] Getting rows [Link][1], [Link][2:4] E.g. [Link](“col_name”).mean().plot(kind=“bar”,
Bar graph

print("In {}, muffin sells for ${:.2f}.".format([Link][i], df.sales_muffin[i]))


months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"] y=“comparison_col_name")
[Link][number] # Drop certain row
# Print corresponding month(s) with the highest sales for sales E.g. df2[[“A”, “B”, “C”]].sum().plot(kind=“bar”)
Row operations [Link](another_df) # Insert another dataframe
latte df.”col_name”[row index] df.__.plot(kind=“pie”)
print(months[sales_latte.index(max(sales_latte))]) Pie chart
[Link](x, labels=y)
[Link][0]

latte_dollars = ["${:.2f}".format(each) for each in df.sales_latte]


for each in months: Getting cols # [Link][row index/slicing, “col_name”] df.col_name.plot(kind = "box")
i = [Link](each) [Link][:3, “col_name”] # Get first 3 rows of a column [Link](df.col_name1, df.col_name2)
Box plot
print("In the month of {months[i]}, revenue for latte is E.g. df[“Avg”]= sum([df.b1, df.b2, df.b3, df.b4])/4
Col operations [Link](“col_name”) # Drops certain column
{sales_latte[i]}”) [Link]([Link], [Link])
Replace chars [Link] = [[Link](" ", "_") for col in [Link]]
Code Example (2): [Link](df.col_name1, df.col_name2)
[Link] INCLUDES the last data as specified by ending # Violin plot
def get_revenue(): iloc vs loc [Link](df.col_name, showmeans = True),
revenues = [] [Link] EXCLUDES the last data as specified by ending #
[Link](df.col_name1, df.col_name2)
for each in range(3): # df[“col_name”][row index/slicing] Bar plot E.g. [Link]([Link].value_counts().index,
revenue = float(input("Enter revenue for month Getting first 3 rows df[[“col_name1”, “col_name2”]][:3] [Link].value_counts().values)
{}:".format(each+1))) for 2 cols # [Link][row index/slicing, col index/slicing]
[Link](revenue) # Adding the new revenue Line plot [Link](df.col_name1, df.col_name2)
[Link][:3, :2]
value to the list df.__.plot(kind = “hist”)

[Link]("mobile_fig.png")
Check null entries [Link]().sum() # Checking for null entries in column

for i in range([Link][0]):
return revenues E.g. 50 interval bars: [Link](kind = “hist”, bins = 50)
Count (rows, cols) [Link] # Returns (rows, columns) Histogram
E.g. Predefined bins: [Link](kind = “hist”, bins =
def print_file_revenue(revenues): [0.5,1.5,2.5])
with open(“[Link]”, “w”) as file_pointer: List all data for col df[“col_name”]

Save pic as png:


cummtotal = 0 Find duplicates [Link]() # Returns True if is a duplicated record Check datatype type([Link](“col_name").sum())
print(“Revenue Report\n”, “-“*20, file = file_pointer) [Link](columns={“Org_name”:”New_name”},
for i in list(range(len(revenues): # Call for index Drop duplicates df.drop_duplicates([“col_name”]) Rename columns
inplace = True)
cummtotal += revenues[i] df.”col_name”.fillna() # To fill with NA
print(“Revenue for {}: ${:>15,.2f} Cumm. Total: Fill in missing # Create just a figure and only 1 subplot
df.”col_name”.fillna(“XXX”) # To fill with other strings
${:>15,.2f}”.format(i+1, revenues[i], cummtotal), file = values fig, ax = [Link]()
df.”col_name”.fillna([Link]()) # To fill with mean
file_pointer) # Prints the file in file pointer [Link](x, y)
[Link]() # Drop record with NA

E.g. [Link](kind=“scatter”, x=“Year”, y=“Margin”, color=“_”)


ax.set_title(“XX”)
print(“The company has made a total of ${:>15,.2f} in the Drop record E.g. [Link](axis=0, how=“any”, thresh=None, Fig & Subplots # Create 2 subplots and unpack the output array
last quarter.”.format(cummtotal), file = file_pointer subset=None, inplace=False) fig, (ax1, ax2) = [Link](1, 2, sharey=True)
print_file_revenue(get_Revenue()) Inconsistent unit Use float() or int() to convert text value [Link](x, y)
Import Conventions: Check datatypes [Link] ax1.set_title(“Sharing Y axis”)
import numpy as np # Numerical Python [Link](x, y)

E.g. [Link](x=“Year”, y=“Revenue”,


import pandas as pd # DataFrame Statistical summ. [Link](), df.value_counts()
E.g. [Link][:, 2:8].sum().sort_values().plot(kind="bar")
import seaborn as sns # Visualisation Query df[(df.col_name < 5000) & (df.col_name> 4000)] Plot via sorted val
E.g. df_info.groupby(“col_name”).hist()
import [Link] as plt # [Link]()
%matplotlib inline * Input this too for Jupyter Grouping [Link](“XXX”).”sub_XXX”.describe() [Link](kind = “scatter”)

E.g. [Link](kind=“bar”)
from scipy import module # Data Processing Sort values df.sort_values(“Likes”, ascending = False) Scatterplot [Link](df_total.Year, df_total.Growth)
Data Cleaning: [Link](df_resident.Year, df_resident.Growth)
Summary [Link]() # Index & Column Dtypes
header = [“XX”, …] E.g. df_total = df[[Link] == "Total Pop Growth"]

s=df[“Margin”]*300)
df = pd.read_csv(“[Link]”, names = header) corr = [Link](method=“pearson”) df_resident = df[[Link] == "Resident Pop Growth"]
print([Link](number)) index, columns, values, mean, median, mode, var, std, [Link]([“Total Pop Growth”, “Resident Pop Growth”])
Statistics
----------------------------------------------------------------------------------- percentile, count, sum, min, max, abs, cov, kurt, skew E.g. [Link](data=df, x=“Year”, y=“Growth”,

1D Plotting:

2D Plotting:

3D Plotting:
print(df[df.col_name == “XXX"]) # Indicate () at the end hue=“Category”, fit_reg=False)
----------------------------------------------------------------------------------- Visualisation Summary: * Using pandas, matplotlib, and seaborn Heatmap of corr [Link]([Link]())
import webbrowser [Link](kind = “Typeofplot”, x = “axisname”, y = “yaxisname”, color = “ “, title = “title”,
webbrowser.open_new([Link][75, 3]) legend = True/False Pairplot [Link](df)

You might also like