CMPUT 175
Introduction to Foundations
of Computing
A Quick Introduction to
Jupyter Notebook
You should view the vignettes:
Transposition Cipher
Need for Data Structures
Post-test will close on Thursday
January 17, 2024 © Osmar R. Zaïane : University of Alberta 1
Objectives
Learn about Jupyter Notebook, how it works
and how to use it.
Learn about the existence of some Python
packages
Use Jupyter Notebook with simple examples
What is Jupyter Notebook?
Jupyter Notebook is an open-source web application
It runs in your web browser
It allows you to create documents that run live
code (Python is an example among many others).
A notebook integrates code and its output into a
single document
It is called Jupyter because originally it was
designed for three languages: Julia, Python, & R.
Python runs within a kernel and there are over
100 other kernels you can use.
Jupyter Notebook Visually
For interactive Python coding
(Screenshot labels: A Notebook; A Cell; to run a cell)
A cell can be Code or Markdown
Current cell is framed in green
Variables and functions persist in subsequent cells
Jupyter Notebook and OS
Windows
Mac OS: https://www.geeksforgeeks.org/how-to-install-jupyter-notebook-on-macos/
Linux: https://linux.how2shout.com/how-to-install-jupyter-on-ubuntu-20-04-lts-linux/
See https://www.youtube.com/watch?v=mZ_0inQKMGk
Even on Android using pydroid
Functions in Python
Remember that in Python you can write functions.
def say_hello(recipient):
    return 'Hello, {}!'.format(recipient)

say_hello('CMPUT175')
'Hello, CMPUT175!'
Once you write a function you can call it as many
times as needed
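As a small illustration of calling one function repeatedly, the sketch below reuses say_hello with a made-up list of recipients (the list is only an example, not course material).

```python
def say_hello(recipient):
    """Return a greeting for the given recipient."""
    return 'Hello, {}!'.format(recipient)

# Illustrative values only.
recipients = ['CMPUT175', 'Jupyter', 'Python']

# The same function is called once per recipient.
greetings = [say_hello(name) for name in recipients]
for greeting in greetings:
    print(greeting)
```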
Many have written useful functions that are put in
packages or modules that you can import and use.
import math
math.factorial(5)
120
There are many available packages
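The sketch below shows three common forms of the import statement, using only modules from the Python standard library. The choice of math and statistics here is just for illustration.

```python
import math                # import the whole module
from math import sqrt      # import a single function from a module
import statistics as st    # import a module under a shorter alias

print(math.factorial(5))        # 120
print(sqrt(16))                 # 4.0
print(st.mean([1, 2, 3, 4]))    # 2.5
```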
Popular Packages for Python
NumPy: for advanced array operations
pandas: for data analysis and working with tabular, time series or matrix data
Matplotlib: for data exploration and visualization
Seaborn: for drawing attractive statistical graphics
scikit-learn: for predictive data analysis
Requests: for using an HTTP client to make HTTP requests
urllib3: for using an HTTP client to make HTTP requests
NLTK: Natural Language Toolkit for processing language data
Pillow: for working with image data
pytest: for testing new code
Pygame: for the development of multimedia applications like video games
PyTorch: for creating deep neural networks
SciPy: for scientific computing
How to use packages
Use pip, a Python package manager
For example, to download a package named
“numpy”, type the shell command
pip install numpy
In your code, use “import”
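import numpy as np
# Seeded generator, so the numbers are the same on every run
randomGenerator = np.random.default_rng(1)
# 10000 samples from a normal distribution with mean 2 and standard deviation 0.5
Vector = randomGenerator.normal(2, 0.5, 10000)
print(Vector)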
[2.1727921 2.41080907 2.16521854 ... 2.15062613 1.61436007 2.09274213]
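One way to check that the generated vector behaves as expected is to look at its shape, mean, and standard deviation. This is a minimal sketch; the variable names are illustrative, and the sample statistics should be close to 2 and 0.5 without matching them exactly.

```python
import numpy as np

rng = np.random.default_rng(1)          # same seed as in the slide
vector = rng.normal(2, 0.5, 10000)      # mean 2, standard deviation 0.5

sample_mean = vector.mean()
sample_std = vector.std()

print(vector.shape)                     # (10000,)
print(round(sample_mean, 2), round(sample_std, 2))
```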
Example with numpy
numpy to generate a random normal distribution into a vector
matplotlib to generate and plot a histogram from a vector of numbers
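The two steps can be sketched as follows. This is not the code from the slide's screenshot: the number of bins, the axis labels, and the output file name histogram.png are assumptions. The Agg backend line is only there so the sketch also runs without a display; inside a notebook the figure appears under the cell and that line is not needed.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: no display available; omit inside a notebook
import matplotlib.pyplot as plt
import numpy as np

# Step 1: numpy generates the vector of normally distributed numbers.
rng = np.random.default_rng(1)
vector = rng.normal(2, 0.5, 10000)

# Step 2: matplotlib builds and draws the histogram.
fig, ax = plt.subplots()
counts, bin_edges, _ = ax.hist(vector, bins=50)   # 50 bins is an arbitrary choice
ax.set_xlabel("value")
ax.set_ylabel("count")
ax.set_title("Histogram of 10000 normal samples")
fig.savefig("histogram.png")                      # hypothetical file name
```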
Installing Jupyter Notebook
You can install Jupyter with pip, the package
manager that comes with Python
Shell
$ pip install jupyter
And start the Jupyter Notebook server by
calling it
Shell
$ jupyter notebook
And Jupyter will start in your default web
browser at http://localhost:8888/tree
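To see whether the installation worked, one option is to ask Python which version of a package is installed. The sketch below uses the standard library's importlib.metadata; the helper name installed_version is made up for this example.

```python
from importlib import metadata

def installed_version(package_name):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None

for name in ("jupyter", "notebook"):
    version = installed_version(name)
    if version is None:
        print(f"{name}: not installed (try: pip install {name})")
    else:
        print(f"{name}: version {version}")
```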
Jupyter First Contact
That is the Notebook server.
To create a Jupyter
Notebook per se, click
on the “New” button
You can name your
notebook
What are the components
at play?
(Diagram labels: Web Browser; Page in a tab; Web Server, localhost; Notebook; Cell; Python Kernel; arrows labelled RUNS and CALLS)
The kernel keeps on running while the page exists.
The Notebook server can shut down or re-run the kernel.
What are the components
at play?
(Diagram labels: Web Browser; Web Server, localhost; Notebook Server; Notebook n; Cells; Python Kernel 1 … Python Kernel n)
You can rerun a cell or all cells
Each Notebook has its own kernel.
You can shut down or restart individual kernels
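The following is an analogy only, not how Jupyter is implemented: each dictionary stands in for the memory of one kernel, and run_cell (a made-up helper) executes one cell of code inside it. It illustrates why names defined in one cell are visible in later cells of the same Notebook but not in another Notebook.

```python
# Analogy only: each dict plays the role of one kernel's memory.
kernel_1 = {}
kernel_2 = {}

def run_cell(kernel, source):
    """Execute one 'cell' of source code inside the given namespace."""
    exec(source, kernel)

run_cell(kernel_1, "x = 21")
run_cell(kernel_1, "def double(n):\n    return 2 * n")
run_cell(kernel_1, "result = double(x)")  # uses x and double from earlier cells

result = kernel_1["result"]
print(result)               # 42
print("x" in kernel_2)      # False: the second 'kernel' never saw x

# 'Restarting' a kernel wipes everything it had defined.
kernel_1.clear()
print("x" in kernel_1)      # False
```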
How does it work
Now that you have your web server running
Your Notebook server is active
You created a Notebook with a Python kernel
You can write Python code in your cells
interactively
Better way to install
Jupyter
If you are interested in data science and want to
avoid installing individual packages:
The most popular way of installing Jupyter
Notebook is with Anaconda (a data science
platform), which comes with many libraries
preinstalled, including Jupyter Notebook and
many useful Python packages.
You just have to install Anaconda
http://anaconda.com/download
An advanced Example
Fortune 500 ranks the
biggest U.S. companies
by revenue
It is an annual ranking
done since 1955
We have a file with the
Fortune 500 rankings
from 1955 to 2005.
An advanced Example
Importing packages
Opening csv file
Displaying top records
with head()
Displaying last records
with tail()
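The steps listed above can be sketched with pandas as follows. The actual CSV file is not reproduced here, so the sketch reads a few made-up rows from a string; the column names and all values are assumptions for illustration, not real Fortune 500 figures. With a real file you would pass its path to pd.read_csv instead.

```python
import io
import pandas as pd   # importing packages

# Made-up rows, for illustration only.
csv_text = """year,rank,company,revenue,profit
1955,1,Company A,100.0,10.0
1955,2,Company B,90.0,9.0
1955,3,Company C,80.0,N.A.
1956,1,Company A,110.0,11.0
1956,2,Company B,95.0,8.0
1956,3,Company C,85.0,7.0
1957,1,Company A,120.0,12.0
1957,2,Company B,99.0,N.A.
"""

# Opening the CSV data (here from a string rather than a file on disk).
df = pd.read_csv(io.StringIO(csv_text))

top = df.head()      # first records
bottom = df.tail()   # last records
print(top)
print(bottom)
```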
An advanced Example
head() and tail() display by default 5 records
Displaying top 10 records
We are just exploring the data
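A quick way to confirm the default of 5 records, and to ask for 10 instead, is shown in this minimal sketch. The toy data frame of 100 rows is an assumption made only for the example.

```python
import pandas as pd

df = pd.DataFrame({"value": range(100)})   # toy data frame with 100 rows

default_top = df.head()       # no argument: 5 records
default_bottom = df.tail()    # no argument: 5 records
top_ten = df.head(10)         # explicit count: 10 records

print(len(default_top), len(default_bottom), len(top_ten))   # 5 5 10
```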
Let’s explore further
We name the columns
Check the data frame info
51 years * 500 companies
= 25500 entries
Why is profit an object
and not a float?
Probably missing
values entered as
a string?
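The sketch below illustrates these steps on three made-up rows. The column names and the "N.A." marker for a missing profit are assumptions, since the slide's screenshot is not reproduced here. A single text value in a column is enough to stop pandas from treating that column as numeric.

```python
import pandas as pd

# Made-up rows; "N.A." is an assumed marker for a missing profit.
rows = [
    [1955, 1, "Company A", 100.0, "10.5"],
    [1955, 2, "Company B", 90.0, "N.A."],
    [1956, 1, "Company A", 110.0, "12.0"],
]
df = pd.DataFrame(rows)

# We name the columns (assumed names).
df.columns = ["year", "rank", "company", "revenue", "profit"]

# Check the data frame info: number of entries and type of each column.
df.info()

profit_is_numeric = pd.api.types.is_numeric_dtype(df["profit"])
revenue_is_numeric = pd.api.types.is_numeric_dtype(df["revenue"])
print(profit_is_numeric, revenue_is_numeric)   # False True

# The entry count on the slide: years 1955 to 2005 inclusive, 500 companies each.
expected_entries = len(range(1955, 2006)) * 500
print(expected_entries)                        # 25500
```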
Let’s explore further
Let’s locate those
profits that are
not numeric
These are the first 5
What’s the set of
profit values that
are not numeric?
How many are there?
There are 369 entries with no specific profit available
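One possible way to answer these questions, not necessarily the one used in the slide's screenshot, is pd.to_numeric with errors="coerce", which turns anything that is not a number into a missing value. The data below is made up, and "N.A." is an assumed marker.

```python
import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame({
    "year": [1955, 1955, 1956, 1956, 1957],
    "profit": ["10.5", "N.A.", "12.0", "N.A.", "-3.2"],
})

# Values that cannot be parsed as numbers become NaN.
as_number = pd.to_numeric(df["profit"], errors="coerce")
non_numeric = as_number.isna()

print(df[non_numeric].head())             # the first (up to 5) offending rows
values = set(df.loc[non_numeric, "profit"])
print(values)                             # {'N.A.'}
how_many = int(non_numeric.sum())
print(how_many)                           # 2
```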
Let’s explore further
How are these non numeric values distributed?
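A simple way to look at the distribution is to count the non-numeric entries per year. The sketch below does this with value_counts on made-up data; the column names and the "N.A." marker are assumptions, and the slide itself may show the distribution as a plot instead.

```python
import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame({
    "year": [1955, 1955, 1956, 1957, 1957, 1957],
    "profit": ["10.5", "N.A.", "12.0", "N.A.", "N.A.", "7.1"],
})

non_numeric = pd.to_numeric(df["profit"], errors="coerce").isna()

# Number of non-numeric profits in each year, ordered by year.
per_year = df.loc[non_numeric, "year"].value_counts().sort_index()
print(per_year)
```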
Let’s clean the data
Let’s keep in the
data frame only
those rows whose profit is numeric
How many left?
25500-369=25131
Let’s check the types again.
Now profit is a float
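A minimal sketch of this cleaning step on made-up data (the column names and the "N.A." marker are assumptions): convert the column to numbers, drop the rows that could not be converted, then check the count and the type again.

```python
import pandas as pd

# Made-up data for illustration.
df = pd.DataFrame({
    "year": [1955, 1955, 1956, 1956, 1957],
    "profit": ["10.5", "N.A.", "12.0", "N.A.", "-3.2"],
})
rows_before = len(df)

# Non-numeric profits become NaN, then those rows are dropped.
df["profit"] = pd.to_numeric(df["profit"], errors="coerce")
df = df.dropna(subset=["profit"])

print(rows_before, len(df))      # 5 3
print(df["profit"].dtype)        # float64

# The same arithmetic as on the slide:
print(25500 - 369)               # 25131
```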
What is next?
These examples are only to introduce you to
Jupyter Notebook and some Python libraries
You need to explore on your own
There are many resources and examples online
For NumPy documentation see
https://numpy.org/doc/stable/
Here is a tutorial on how to use matplotlib
https://matplotlib.org/stable/tutorials/index
Documentation on pandas functions is here:
https://pandas.pydata.org/docs/reference/general_functions.html
Google Colab
(Colaboratory)
Google Colab is a free cloud service that allows
you to run and share Jupyter Notebooks
It is cloud-based so it doesn’t run on your
computer
You don’t have to download, install, or run
anything on your computer, except a browser
You have to be online and run Chrome or
Firefox
Go to: https://colab.research.google.com/
Start existing Notebook
Start new Notebook
(Screenshot labels: Notebook; Cell; Run with)
Colab simple example
Where are my files?
You can change the name of the file
And download as .ipynb
Click the folder
You can drag and
drop files from
your computer
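The .ipynb file you download is plain JSON text. The sketch below builds a very small notebook structure by hand, following the version 4 notebook format, and reads it back. The cell contents are made up for the example, and real notebooks usually carry more metadata than shown here.

```python
import json

# A minimal notebook with one Markdown cell and one Code cell.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# A tiny notebook"],
        },
        {
            "cell_type": "code",
            "execution_count": None,
            "metadata": {},
            "outputs": [],
            "source": ["print('Hello, CMPUT175!')"],
        },
    ],
}

text = json.dumps(notebook, indent=1)   # what would be stored in the .ipynb file
loaded = json.loads(text)               # reading it back

cell_types = [cell["cell_type"] for cell in loaded["cells"]]
print(cell_types)                       # ['markdown', 'code']
```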
What is the difference?
Jupyter Notebook (local)                             Google Colab
Runs on your computer                                Runs on the cloud
Works offline                                        Works online only
You have to install it and install packages          You don’t have to install anything
Files are saved on your device (hard drive)          Files are saved on Google servers
Files persist                                        Files disappear at end of session
Notebooks can’t be shared except via repositories    Notebooks can be shared
Notebooks are in .ipynb format                       Notebooks can be saved as .ipynb
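Since the same notebook can run in either place, code sometimes needs to know where it is running. A commonly used best-effort check is whether the google.colab module has been loaded. Treat this as an assumption about how Colab sets up its Python session rather than a guaranteed interface; the function name running_in_colab is made up for this sketch.

```python
import sys

def running_in_colab():
    """Best-effort check: assumes Colab has already loaded google.colab."""
    return "google.colab" in sys.modules

if running_in_colab():
    print("Running on Google Colab (cloud)")
else:
    print("Not on Colab: probably a local Python or Jupyter session")
```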
Another example