Python for Data Science
This file is meant for personal use by [email protected] only.
Sharing or publishing the contents in part or full is liable for legal action.
Proprietary content. © Great Learning. All Rights Reserved. Unauthorized use or distribution prohibited.
Agenda
1. Pop Quiz
2. Common Python Libraries for Data Science
3. NumPy and Pandas
4. Common NumPy functions
5. Common Pandas functions
6. Merge vs Join in Pandas
7. Example of Join
Pop Quiz
1. What are the data types in Python?
2. What are some of the common Python libraries for Data Science?
3. Can you list some of the common functions in Pandas?
4. What are the applications of functions like groupby, merge, and join?
Common Python Libraries for Data Science
Library              Use
NumPy                Handling multi-dimensional arrays
SciPy                Scientific computation package
Matplotlib, Seaborn  Data visualisation
Pandas               Handling tabular data
Scikit-learn         Machine learning
NumPy
● Stands for Numerical Python
● It is one of the fundamental packages for mathematical, logical, and statistical operations with
Python
● It contains
○ Powerful N-dimensional array object, called ndarray
○ Large set of functions for creating, manipulating, and transforming ndarrays
● ndarrays can only contain data of a single datatype
● Useful for linear algebra, vector calculus, random number generation, etc.
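The points above can be illustrated with a minimal sketch. The variable names are illustrative only, and it assumes NumPy is installed and imported under the usual alias `np`.

```python
import numpy as np

# A 2-D ndarray built from nested lists; all elements share one dtype.
matrix = np.array([[1, 2, 3], [4, 5, 6]])
print(matrix.ndim)   # number of dimensions
print(matrix.shape)  # (rows, columns)

# Mixing ints and floats upcasts every element to a single float dtype.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)

# Element-wise arithmetic and a matrix product (linear algebra).
doubled = matrix * 2
product = matrix @ matrix.T
```

Because an ndarray holds a single dtype, NumPy converts the integers in `mixed` to floats rather than storing both types side by side.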
Pandas
● Pandas is one of the fundamental packages for analysis and manipulation of tabular data
● Offers two major data structures: Series and DataFrame
● We can think of a pandas dataframe as an Excel spreadsheet that stores data in rows and
columns.
● A pandas dataframe is made up of several pandas series
○ Each column of a dataframe is a series.
● Pandas dataframes can contain data of multiple datatypes
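As a rough illustration of these points, the sketch below builds a small dataframe from hypothetical data (the names and values are made up for this example) and shows that a single column is a series with its own datatype.

```python
import pandas as pd

# Hypothetical data: each dictionary key becomes a column.
df = pd.DataFrame({
    "name": ["Asha", "Ben", "Chen"],
    "age": [31, 25, 40],
    "height_m": [1.62, 1.80, 1.75],
})

# Selecting one column returns a Series.
ages = df["age"]
print(type(ages))

# Each column keeps its own dtype (text, integer, float here).
print(df.dtypes)
```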
Common NumPy Functions
Function        Description
np.array()      Create an array
np.arange()     Return evenly spaced values within a given interval, using a step size
np.linspace()   Return a given number of evenly spaced values over a specified interval
np.zeros()      Create an array of zeros
np.ones()       Create an array of ones
np.transpose()  Permute array dimensions
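A short sketch of the functions listed above, with arbitrary illustrative arguments. Note that `np.arange` takes a step and excludes the stop value, while `np.linspace` takes a count and includes the stop value by default.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])   # array from a nested list
steps = np.arange(0, 10, 2)            # start, stop (exclusive), step
points = np.linspace(0.0, 1.0, 5)      # start, stop (inclusive), number of values
zeros = np.zeros((2, 3))               # 2 rows, 3 columns of 0.0
ones = np.ones(4)                      # 1-D array of four 1.0 values
a_t = np.transpose(a)                  # shape (2, 3) becomes (3, 2)
```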
Common NumPy Functions
Function             Description
np.random.rand()     Create an array of the given shape filled with random values drawn uniformly from [0, 1)
np.random.randint()  Return random integers from low (inclusive) to high (exclusive)
np.random.randn()    Return a sample (or samples) from the "standard normal" distribution
np.concatenate()     Join a sequence of arrays along an existing axis
np.save()            Save an array to a binary file in .npy format
np.savez()           Save several arrays into a single file in uncompressed .npz format
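The functions above might be used as in the sketch below. The file names, the seed, and the use of a temporary directory are assumptions made for this example so that it leaves no files behind.

```python
import os
import tempfile

import numpy as np

np.random.seed(0)  # seeded only so the sketch is repeatable

uniform = np.random.rand(2, 3)            # uniform over [0, 1), shape (2, 3)
dice = np.random.randint(1, 7, size=10)   # integers from 1 to 6; 7 is excluded
normal = np.random.randn(1000)            # samples from the standard normal

# concatenate accepts a sequence of arrays, not just two.
joined = np.concatenate([np.array([1, 2]), np.array([3, 4]), np.array([5])])

with tempfile.TemporaryDirectory() as tmp:
    # One array per .npy file.
    single_path = os.path.join(tmp, "single.npy")
    np.save(single_path, joined)
    restored = np.load(single_path)

    # Several named arrays in one .npz file.
    bundle_path = os.path.join(tmp, "bundle.npz")
    np.savez(bundle_path, uniform=uniform, dice=dice)
    with np.load(bundle_path) as bundle:
        names = sorted(bundle.files)
        restored_dice = bundle["dice"]
```

Arrays saved with `np.save` or `np.savez` are read back with `np.load`; for an `.npz` file, the keyword names passed to `np.savez` become the keys.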
Common Pandas Functions
Function       Description
pd.read_csv()  Read a comma-separated values (CSV) file into a DataFrame
df.loc[]       Access a group of rows and columns by label(s)
df.iloc[]      Purely integer-location based indexing for selection by position
df.drop()      Drop specified labels from rows or columns
pd.concat()    Concatenate pandas objects along an axis
pd.merge()     Merge dataframes on key columns or indexes
df.groupby()   Group rows by key for split-apply-combine operations
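A minimal sketch of these functions on hypothetical sales data. The CSV text, column names, and values are invented for illustration; `pd.read_csv` is given an in-memory buffer here, though it also accepts a file path.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a file on disk.
csv_text = """city,product,units
Pune,pen,10
Pune,book,4
Delhi,pen,7
Delhi,book,3
"""
sales = pd.read_csv(io.StringIO(csv_text))

first_city = sales.loc[0, "city"]    # label-based: row label 0, column "city"
first_units = sales.iloc[0, 2]       # position-based: first row, third column
no_product = sales.drop(columns=["product"])

# Stack a second dataframe underneath the first.
extra = pd.DataFrame({"city": ["Agra"], "product": ["pen"], "units": [5]})
all_sales = pd.concat([sales, extra], ignore_index=True)

# Attach a region to each row by matching on the "city" column.
regions = pd.DataFrame({"city": ["Pune", "Delhi"], "region": ["West", "North"]})
with_region = pd.merge(all_sales, regions, on="city", how="left")

# Split by city, sum the units in each group, combine into one Series.
units_by_city = all_sales.groupby("city")["units"].sum()
```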
Common Pandas Functions
Function                Description
df[col].value_counts()  Count the occurrences of each unique value in a column
df[col].unique()        Get the unique values in a column
df.dtypes               Get the data type of each column
df.shape                Get the shape (number of rows and columns)
df.head()               Get the first rows
df.tail()               Get the last rows
df.describe()           Get quick summary statistics
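These inspection helpers can be tried on a small hypothetical dataframe, as sketched below. The example calls `unique()` on a single column, since it is a Series method, and uses `dtypes` for the per-column data types.

```python
import pandas as pd

# Hypothetical data for illustration.
df = pd.DataFrame({
    "fruit": ["apple", "pear", "apple", "fig", "apple"],
    "price": [1.0, 2.0, 1.5, 3.0, 1.2],
})

counts = df["fruit"].value_counts()   # occurrences of each value in the column
kinds = df["fruit"].unique()          # unique values, in order of appearance
types = df.dtypes                     # one dtype per column
n_rows, n_cols = df.shape             # (rows, columns)
top = df.head(2)                      # first two rows
bottom = df.tail(2)                   # last two rows
summary = df.describe()               # count, mean, std, min, quartiles, max
```

By default `describe()` summarises only the numeric columns, so `summary` here has a single `price` column.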
Merge vs Join
• Join: The join method works best when we are joining dataframes on their indexes (though you
can specify another column to join on for the left dataframe).
• Merge: The merge method is more versatile and allows us to specify columns besides the index
to join on for both dataframes.
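The difference between the two methods can be sketched as follows. The tables and column names are hypothetical and chosen only to show an index-based join next to a column-based merge.

```python
import pandas as pd

# Hypothetical tables for illustration.
employees = pd.DataFrame(
    {"name": ["Asha", "Ben", "Chen"], "dept_id": [10, 20, 10]},
    index=[1, 2, 3],
)
salaries = pd.DataFrame({"salary": [50, 60, 70]}, index=[1, 2, 3])
departments = pd.DataFrame({"id": [10, 20], "dept": ["Sales", "Ops"]})

# join: aligns the two dataframes on their indexes by default.
with_salary = employees.join(salaries)

# merge: the key column can be named separately for each side.
with_dept = employees.merge(departments, left_on="dept_id", right_on="id")
```

Here `join` needs no key arguments because both dataframes share the same index, while `merge` matches `dept_id` on the left against `id` on the right.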
Join type                     Argument      Rows kept
Natural join (intersection)   how='inner'   Only rows that match in both data frames
Full outer join (union)       how='outer'   All rows from both data frames
Left outer join               how='left'    All rows of data frame x and only those from y that match
Right outer join              how='right'   All rows of data frame y and only those from x that match
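The four join types can be compared on two small data frames, named `x` and `y` to follow the description above. The key and value columns are hypothetical.

```python
import pandas as pd

x = pd.DataFrame({"key": [1, 2, 3], "x_val": ["a", "b", "c"]})
y = pd.DataFrame({"key": [2, 3, 4], "y_val": ["B", "C", "D"]})

inner = pd.merge(x, y, on="key", how="inner")   # keys 2 and 3
outer = pd.merge(x, y, on="key", how="outer")   # keys 1, 2, 3 and 4
left = pd.merge(x, y, on="key", how="left")     # keys 1, 2 and 3
right = pd.merge(x, y, on="key", how="right")   # keys 2, 3 and 4
```

Rows without a match on the other side are kept by the outer, left, and right joins, with the missing columns filled with NaN.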
Example of Join
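One possible join example is sketched below. The orders and customers tables are hypothetical and are not taken from the slides. The sketch uses the `on` argument of `join`, which names a column of the left dataframe to match against the index of the right dataframe.

```python
import pandas as pd

# Hypothetical data for illustration.
orders = pd.DataFrame({
    "order_id": [101, 102, 103],
    "customer_id": ["c1", "c2", "c9"],
})
customers = pd.DataFrame(
    {"customer_name": ["Asha", "Ben"]},
    index=["c1", "c2"],
)

# join defaults to how="left": every order is kept, matched or not.
joined = orders.join(customers, on="customer_id")

# how="inner" keeps only the orders whose customer_id is in customers.
matched_only = orders.join(customers, on="customer_id", how="inner")
```

Order 103 refers to a customer that is absent from `customers`, so it appears in `joined` with a missing name and is dropped from `matched_only`.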
Happy Learning!