MSBD 5001
Foundations of Data Analytics
Fall 2025
Tutorial 2: NumPy Arrays
Cecia Chan
Department of Computer Science and Engineering
The Hong Kong University of
Science and Technology
MSBD5001 Fall 2025 1
NumPy
• NumPy is a fundamental package for scientific computing in Python.
• NumPy is a short form for "Numerical Python".
• The NumPy library contains:
• multidimensional array data structures, such as the homogeneous,
N-dimensional ndarray, and
• a large library of functions that operate efficiently on these data structures.
• https://numpy.org
Getting Started
• To install NumPy in Jupyter Notebook using pip, run the following cell:
!pip install numpy
• Import the NumPy package
• np is the widespread convention
• np. allows access to NumPy features with a short, recognizable prefix
import numpy as np
Array
• List
• Built-in data type of Python
• Lists are mutable sequences which can be used to store items of different
types.
• Array
• array module from the Python Standard Library
• Arrays are ordered, mutable sequences that store homogeneous items.
• The type of objects stored is constrained.
import array
arr1 = array.array('d', [1, 2, 3, 4, 5])
print(arr1)
array('d', [1.0, 2.0, 3.0, 4.0, 5.0])
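The type constraint can be seen by trying to store an item of the wrong type. The following is a minimal sketch; the variable names (mixed, arr, rejected) are only illustrative.

```python
import array

# A list accepts items of different types.
mixed = [1, "two", 3.0]

# An array created with type code 'd' only stores double-precision floats.
arr = array.array('d', [1, 2, 3])
arr.append(4)  # the int 4 is stored as the float 4.0

try:
    arr.append("five")  # a str cannot be converted to a double
    rejected = False
except TypeError:
    rejected = True

print(arr)
print(rejected)
```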
NumPy Array
• Standard Python Library array only handles one-dimensional arrays and offers
less functionality.
• NumPy’s array class is called ndarray (the N-dimensional array).
• It provides a powerful N-dimensional array object.
• It is a table of elements (usually numbers), all of the same type, indexed by a
tuple of non-negative integers.
• In NumPy, dimensions are called axes.
• The shape of an ndarray is a tuple of integers, where each integer is the size of the
array along the corresponding dimension.
• NumPy arrays are “0-indexed”: the first element of the array is accessed using
index 0.
Part 1 – ndarray Basics
NumPy Array Basics
• To create an ndarray, we can use the numpy.array() function.
• We can initialize an ndarray from a Python list of elements.
array1 = np.array([5, 3, 6])
print(array1)
[5 3 6]
print(type(array1))
<class 'numpy.ndarray'>
print(array1[0], array1[1])
5 3
print(array1[-1])
6
ndarray Attributes
• Dimensions
• To obtain the number of dimensions of an array, we can use the
numpy.ndarray.ndim attribute.
array1.ndim
1
• Shape
• To obtain the shape of an array, we can use the numpy.ndarray.shape attribute.
• https://numpy.org/doc/stable/reference/generated/numpy.ndarray.shape.html
• For the above example, since array1 has only 1 dimension, its shape is (3,).
• It indicates that the first (and only) dimension has 3 elements.
array1.shape
(3,)
ndarray Attributes
• Size
• To obtain the number of elements of an array, we can use the
numpy.ndarray.size attribute.
array1.size
3
• Data type
• To obtain the data type of an array, we can use the numpy.ndarray.dtype
attribute.
array1.dtype
dtype('int32')
array1.dtype.name
'int32'
Creating a 1D Array
(Figure: the elements 1 2 3 4 laid out along axis 0.)
a = np.array([1, 2, 3, 4]) # Create an array with 1 axis and a length of 4
print(a)
[1 2 3 4]
print(type(a)) # Return the type of the object a
<class 'numpy.ndarray'>
print(a.ndim) # Return the number of axes (dimensions) of the array
1
print(a.shape) # Return the dimensions of the array
(4,)
print(a.size) # Return the number of elements
4
print(a.dtype.name) # Return the type of the elements in the array
int32
print(a[0], a[1], a[2], a[3])
1 2 3 4
print(a[-1])
4
Creating a 2D Array
(Figure: the 2 x 3 array b, with axis 0 running down the rows and axis 1 running across the columns.)
b = np.array([[1., 0., 0.],
              [0., 1., 2.]]) # Create an array with 2 axes:
                             # the 1st axis has a length of 2,
                             # the 2nd axis has a length of 3
print(b)
[[1. 0. 0.]
 [0. 1. 2.]]
print(type(b))
<class 'numpy.ndarray'>
print(b.ndim)
2
print(b.shape)
(2, 3)
print(b.size)
6
print(b.dtype.name)
float64
print(b[0], b[1], sep="\t")
[1. 0. 0.]	[0. 1. 2.]
print(b[0, 0], b[0, 1], b[0, 2], b[1, 0], b[1, 1], b[1, 2], sep=", ")
1.0, 0.0, 0.0, 0.0, 1.0, 2.0
Creating a 3D Array
(Figure: the 2 x 2 x 3 array c, with axes 0, 1 and 2 labelled.)
c = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
print(c)
[[[ 1  2  3]
  [ 4  5  6]]

 [[ 7  8  9]
  [10 11 12]]]
print(c.ndim)
3
print(c.shape)
(2, 2, 3)
print(c.size)
12
print(c.dtype.name)
int32
print(c[0, 0, 0], c[1, 1, 1])
1 11
Iterating
• Iterating the elements of an array
• numpy.ndarray.flat
• an attribute of array
• a 1d iterator
b = np.array([[0, 2, 4, 6], [8, 10, 12, 14], [16, 18, 20, 22]])
for x in b.flat:
    print(x)
0
2
4
6
8
10
12
14
16
18
20
22
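For contrast, iterating over a multi-dimensional array directly walks along the first axis only. A small sketch, with illustrative variable names, comparing this with the flat iterator:

```python
import numpy as np

b = np.array([[0, 2, 4, 6], [8, 10, 12, 14], [16, 18, 20, 22]])

# Iterating over b directly yields one 1D array per row (3 items).
rows = [row for row in b]

# Iterating over b.flat yields every element (12 items).
elements = [int(x) for x in b.flat]

print(len(rows), rows[0])
print(len(elements), elements)
```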
Part 2 – Creating ndarray
More about Creating Arrays
• NumPy offers several functions to create arrays with initial placeholder content.
• See "Array creation routines" in the NumPy Manual.
Array type       Function                      Description
1D array         numpy.arange(…)               Creates an array with a sequence of evenly spaced numbers (with a given step) within a given interval
                 numpy.linspace(…)             Creates an array with a sequence of evenly spaced numbers over a given interval
2D array         numpy.eye(i)                  Creates an 𝑖 × 𝑖 identity matrix
                 numpy.diag(v)                 Creates a diagonal array with given values
General ndarray  numpy.zeros(shape)            Creates an array of zeros with specified shape
                 numpy.ones(shape)             Creates an array of ones with specified shape
                 numpy.full(shape, constant)   Creates a constant array with specified shape
                 numpy.empty(shape)            Creates an empty (uninitialized) array with specified shape
                 numpy.random.random(shape)    Creates an array of random values with specified shape
More about Creating Arrays – 1D Array
• Creates an array with a sequence of evenly spaced numbers within a given
interval
• np.arange(end): the values run from 0 to end (exclusive)
• np.arange(start, end): the values run from start to end (exclusive)
• np.arange(start, end, step): the values run from start to end (exclusive), with the spacing
between two consecutive values given by step
np.arange(10) # Return an array with a sequence of numbers from 0 to 10 (exclusive)
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
np.arange(1, 10) # Return an array with a sequence of numbers from 1 to 10 (exclusive)
array([1, 2, 3, 4, 5, 6, 7, 8, 9])
np.arange(1, 10, 2) # Return an array with a sequence of numbers from 1 to 10 (exclusive)
# and the step is 2 between two numbers in the sequence.
array([1, 3, 5, 7, 9])
More about Creating Arrays – 1D Array
• Creates an array with a sequence of evenly spaced numbers over a given interval
• np.linspace(start, stop, num=50, endpoint=True)
• start: The starting value of the sequence.
• stop: The end value of the sequence.
• num: Number of samples to generate. Must be non-negative. Optional; the default is 50.
• endpoint: If True, stop is the last sample. Otherwise, it is not included. Optional; the default
is True.
np.linspace(0, 9)
array([0.        , 0.18367347, 0.36734694, 0.55102041, 0.73469388,
       0.91836735, 1.10204082, 1.28571429, 1.46938776, 1.65306122,
       1.83673469, 2.02040816, 2.20408163, 2.3877551 , 2.57142857,
       2.75510204, 2.93877551, 3.12244898, 3.30612245, 3.48979592,
       3.67346939, 3.85714286, 4.04081633, 4.2244898 , 4.40816327,
       4.59183673, 4.7755102 , 4.95918367, 5.14285714, 5.32653061,
       5.51020408, 5.69387755, 5.87755102, 6.06122449, 6.24489796,
       6.42857143, 6.6122449 , 6.79591837, 6.97959184, 7.16326531,
       7.34693878, 7.53061224, 7.71428571, 7.89795918, 8.08163265,
       8.26530612, 8.44897959, 8.63265306, 8.81632653, 9.        ])
np.linspace(0, 9, 3)
array([0. , 4.5, 9. ])
More about Creating Arrays – 2D Array
• An 𝑖 × 𝑖 identity matrix
np.eye(3) # Return a 3x3 identity matrix
array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])
• A diagonal matrix
• a square 2D array with given values along the diagonal
np.diag([1, 2, 3]) # Return a 3x3 matrix
array([[1, 0, 0],
       [0, 2, 0],
       [0, 0, 3]])
• An array with given values along the kth diagonal
np.diag([1, 2, 3], 1) # Return a 4x4 matrix
array([[0, 1, 0, 0],
       [0, 0, 2, 0],
       [0, 0, 0, 3],
       [0, 0, 0, 0]])
More about Creating Arrays – ndarray
• An array of zeros
np.zeros( (3, 4) ) # Create a 3x4 array full of zeros
# (i.e. a two-dimensional array with 3 rows and 4 elements in each row)
array([[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]])
• An array of ones
np.ones( (3, 4) ) # Create an array full of ones
array([[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]])
More about Creating Arrays – ndarray
• A constant array
np.full( (3, 4), 5) # Create a constant array
array([[5, 5, 5, 5],
[5, 5, 5, 5],
[5, 5, 5, 5]])
• An empty array
• An array with the specified shape is created without initializing its entries; the initial
content is arbitrary and depends on the state of the memory.
• The reason to use empty() over zeros() (or something similar) is speed.
np.empty( (3, 4) )
array([[1.01851408e-311, 2.71736105e-322, 0.00000000e+000,
0.00000000e+000],
[1.11260619e-306, 1.16096346e-028, 9.82205649e+252,
1.11789342e+253],
[6.16418465e-114, 3.18110755e-110, 7.35413917e+223,
4.95182015e+185]])
More about Creating Arrays – ndarray
• An array of random values
np.random.random((3, 4)) # Return a 3x4 array filled with random values
array([[0.95807152, 0.02277773, 0.10667925, 0.36269869],
[0.51258402, 0.22810189, 0.03542813, 0.14546954],
[0.45486006, 0.79664748, 0.79413054, 0.78954938]])
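Because the values are random, the output above differs on every run. When reproducible numbers are needed, one option is a seeded generator. The sketch below uses np.random.default_rng, which is not covered on the slide; the seed value 42 is arbitrary.

```python
import numpy as np

# Two generators created with the same seed produce the same values.
rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)

r_a = rng_a.random((3, 4))  # 3x4 array of floats in the interval [0, 1)
r_b = rng_b.random((3, 4))

print(r_a.shape)
print(np.array_equal(r_a, r_b))
```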
More about Printing an Array
• If an array is too large to be printed, NumPy automatically skips the central part
of the array and only prints the corners.
print (np.arange(10000))
[ 0 1 2 ... 9997 9998 9999]
print (np.arange(10000).reshape(500, 20)) # reshape() returns an array
# with a modified shape
[[ 0 1 2 ... 17 18 19]
[ 20 21 22 ... 37 38 39]
[ 40 41 42 ... 57 58 59]
...
[9940 9941 9942 ... 9957 9958 9959]
[9960 9961 9962 ... 9977 9978 9979]
[9980 9981 9982 ... 9997 9998 9999]]
Part 3 – Changing the Shape of an Array
Changing the Shape of an Array – reshape
• Reshaping
• Through reshaping, we can create a new array by adding or removing
dimensions or changing the number of elements in each dimension, as long as
the total number of elements stays the same.
• numpy.ndarray.reshape(new_shape)
a = np.array([ [0, 1, 2, 3], [4, 5, 6, 7]]) # creates a 2x4 2d array
print ("a =", a)
b = a.reshape(4, 2) # change the shape to a 4x2 array
print ("b =", b)
a = [[0 1 2 3]
[4 5 6 7]]
b = [[0 1]
[2 3]
[4 5]
[6 7]]
Changing the Shape of an Array – reshape
• Using -1 to automatically deduce the size of a dimension
a = np.array([ [0, 1, 2, 3], [4, 5, 6, 7]]) # creates a 2x4 2d array
b = a.reshape(-1, 2) # size of the 2nd dimension is given (2);
                     # size of the 1st dimension is deduced automatically (4)
b.shape
(4, 2)
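The -1 placeholder works in any single position, provided the remaining sizes divide the total number of elements. A brief sketch with illustrative names:

```python
import numpy as np

a = np.arange(8)

b = a.reshape(-1, 2)     # 8 / 2 = 4 rows      -> shape (4, 2)
c = a.reshape(2, -1)     # 8 / 2 = 4 columns   -> shape (2, 4)
d = a.reshape(2, 2, -1)  # 8 / (2 * 2) = 2     -> shape (2, 2, 2)

# 8 elements cannot be split into 3 equal rows, so NumPy raises ValueError.
try:
    a.reshape(3, -1)
    failed = False
except ValueError:
    failed = True

print(b.shape, c.shape, d.shape, failed)
```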
Changing the Shape of an Array – resize
• Resizing
• We can change the shape and size of an array in-place.
• numpy.ndarray.resize(new_shape)
a = np.array([ [0, 1, 2, 3], [4, 5, 6, 7]])
print("a = ", a)
a =  [[0 1 2 3]
 [4 5 6 7]]
a.resize(4, 2)
print ("after resizing, a =", a)
after resizing, a = [[0 1]
 [2 3]
 [4 5]
 [6 7]]
Part 4 – Basic Array Operations
(Vectorizations)
Vectorization
• Arrays with same shape
(Figure: x + y is computed element by element.)
x =           y =          x + y =
0  1  2       1 1 1         1  2  3
3  4  5       1 1 1         4  5  6
6  7  8       1 1 1         7  8  9
9 10 11       1 1 1        10 11 12
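The picture above corresponds to replacing an explicit double loop with a single array expression. A minimal sketch, with the variable names looped and vectorized chosen for illustration:

```python
import numpy as np

x = np.arange(12).reshape(4, 3)
y = np.ones((4, 3), dtype=int)

# Explicit loops: add one pair of elements at a time.
looped = np.empty_like(x)
for i in range(x.shape[0]):
    for j in range(x.shape[1]):
        looped[i, j] = x[i, j] + y[i, j]

# Vectorized: one expression, applied elementwise by NumPy.
vectorized = x + y

print(vectorized)
print(np.array_equal(looped, vectorized))
```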
Arithmetic Operations with
2 Arrays of Same Shape
• Arithmetic operators on 2 arrays of same shape are applied elementwise.
• A new array is created.
a = np.arange(0, 24, 2).reshape(3, 4)
b = np.arange(1, 24, 2).reshape(3, 4)
print(a)
[[ 0  2  4  6]
 [ 8 10 12 14]
 [16 18 20 22]]
print(b)
[[ 1  3  5  7]
 [ 9 11 13 15]
 [17 19 21 23]]
print(a + b)
[[ 1  5  9 13]
 [17 21 25 29]
 [33 37 41 45]]
Arithmetic Operations with
2 Arrays of Same Shape
• Binary Operators:
• Addition (+)
• Subtraction (-)
• Multiplication (*)
• Division (/)
a = np.arange(0, 24, 2).reshape(3, 4)
b = np.arange(1, 24, 2).reshape(3, 4)
print(a)
[[ 0  2  4  6]
 [ 8 10 12 14]
 [16 18 20 22]]
print(b)
[[ 1  3  5  7]
 [ 9 11 13 15]
 [17 19 21 23]]
print(a - b)
[[-1 -1 -1 -1]
 [-1 -1 -1 -1]
 [-1 -1 -1 -1]]
print(a * b)
[[  0   6  20  42]
 [ 72 110 156 210]
 [272 342 420 506]]
print(a / b)
[[0.         0.66666667 0.8        0.85714286]
 [0.88888889 0.90909091 0.92307692 0.93333333]
 [0.94117647 0.94736842 0.95238095 0.95652174]]
Arithmetic Operations with
2 Arrays of Same Shape
• Assignment Operators:
• Addition (+=)
• Subtraction (-=)
• Multiplication (*=)
• Division (/=)
• modify an existing array in-place
a = np.arange(0, 24, 2).reshape(3, 4)
b = np.arange(1, 24, 2).reshape(3, 4)
print(a)
[[ 0  2  4  6]
 [ 8 10 12 14]
 [16 18 20 22]]
print(b)
[[ 1  3  5  7]
 [ 9 11 13 15]
 [17 19 21 23]]
a *= b
print(a)
[[  0   6  20  42]
 [ 72 110 156 210]
 [272 342 420 506]]
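One consequence of modifying an array in-place is that the result must fit the array's existing data type. The sketch below illustrates this with /= on an integer array; the flag name in_place_division_ok is only illustrative.

```python
import numpy as np

a = np.arange(0, 8, 2)  # integer array [0 2 4 6]
a *= 3                  # the result is still integer, so this works in-place

# True division produces floats, which cannot be stored back into an
# integer array, so NumPy raises an error (a subclass of TypeError).
try:
    a /= 2
    in_place_division_ok = True
except TypeError:
    in_place_division_ok = False

b = a / 2   # creating a new (float) array works
a //= 2     # in-place floor division keeps the integer type

print(in_place_division_ok)
print(b, b.dtype)
print(a, a.dtype)
```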
Comparison Operations with
2 Arrays of Same Shape
• Comparison Operators:
• <, >, <=, >=, ==, !=
a = np.arange(0, 24, 2).reshape(3, 4)
b = np.arange(1, 24, 2).reshape(3, 4)
print(a)
[[ 0  2  4  6]
 [ 8 10 12 14]
 [16 18 20 22]]
print(b)
[[ 1  3  5  7]
 [ 9 11 13 15]
 [17 19 21 23]]
print(a > b)
[[False False False False]
 [False False False False]
 [False False False False]]
print(a != b)
[[ True  True  True  True]
 [ True  True  True  True]
 [ True  True  True  True]]
Arithmetic Operations with Array and Scalar
• Arithmetic operators between an array and a single value
• elementwise
• A new array is created.
a = np.arange(0, 24, 2).reshape(3, 4)
print(a)
[[ 0  2  4  6]
 [ 8 10 12 14]
 [16 18 20 22]]
print(a + 10)
[[10 12 14 16]
 [18 20 22 24]
 [26 28 30 32]]
(Figure: the scalar 10 is conceptually stretched into a 3 x 4 array of 10s, which is then added to a elementwise.)
Arithmetic Operations with Array and Scalar
• Arithmetic operators between an array and a single value
• elementwise
a = np.arange(0, 12).reshape(3, 4)
print(a)
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
print(a - 10)
[[-10  -9  -8  -7]
 [ -6  -5  -4  -3]
 [ -2  -1   0   1]]
print(a * 10)
[[  0  10  20  30]
 [ 40  50  60  70]
 [ 80  90 100 110]]
print(a / 10)
[[0.  0.1 0.2 0.3]
 [0.4 0.5 0.6 0.7]
 [0.8 0.9 1.  1.1]]
print(a % 10)
[[0 1 2 3]
 [4 5 6 7]
 [8 9 0 1]]
Comparison Operations with Array and Scalar
• Comparison operators between an array and a single value
• elementwise
a = np.arange(0, 12).reshape(3, 4)
print(a)
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
print(a < 10)
[[ True  True  True  True]
 [ True  True  True  True]
 [ True  True False False]]
print(a == 10)
[[False False False False]
 [False False False False]
 [False False  True False]]
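Since True counts as 1 and False as 0, the Boolean result of a comparison can be summed to count how many elements satisfy the condition. A small sketch with illustrative names:

```python
import numpy as np

a = np.arange(0, 12).reshape(3, 4)
mask = a < 10

count = int(mask.sum())       # number of elements smaller than 10
per_row = mask.sum(axis=1)    # the same count, row by row

print(count)     # 10
print(per_row)   # [4 4 2]
```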
Dot Product and Matrix Product
• Matrix multiplication can be done by:
• Dot product: numpy.ndarray.dot()
• Matrix product: numpy.matmul() (@ operator)
a = np.array([[1, 0], [0, 1]])
b = np.array([[1, 2], [3, 4]])
print(a.dot(b)) # Return matrix product
[[1 2]
 [3 4]]
print(a @ b) # Return matrix product
[[1 2]
 [3 4]]
print(np.matmul(a, b)) # Return matrix product
[[1 2]
 [3 4]]
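Note that the * operator is elementwise and is not a matrix product. The sketch below uses a non-identity matrix so that the two results differ; the values are chosen only for illustration.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[0, 1], [1, 0]])

elementwise = a * b   # multiplies matching entries
matrix = a @ b        # rows of a times columns of b

print(elementwise)    # [[0 2]
                      #  [3 0]]
print(matrix)         # [[2 1]
                      #  [4 3]]
```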
Part 5 – Indexing and Slicing
Indexing and Slicing
• Similar to Python lists, NumPy arrays can be indexed and sliced.
• If an array is multi-dimensional, we can specify an index or a slice for each dimension of
the array, separated by commas; trailing dimensions that are left out are selected in full.
• For example, if a is a NumPy array,
Indexing   a[i]      Select the element at index i
           a[-i]     Select the i-th element counting from the end
Slicing    a[i:j]    Select the elements from index i to j-1
           a[:]      Select all elements in the corresponding dimension (axis)
           a[0:]     Select all elements in the corresponding dimension (axis)
           a[i:]     Select all elements from index i to the end (inclusive)
           a[:j]     Select all elements from index 0 to index j-1
           a[i:j:n]  Select the elements from index i to j-1, with a step of n
           a[::-1]   Select all elements in the reversed order
Indexing and Slicing with a 1D Array
a = np.arange(0, 8, 2) # creates a 1d array with 4 even numbers
print (a)
print (a[2]) # Indexing, access the 3rd element with index 2
print (a[:]) # Slicing, access all elements
print (a[1:3]) # Slicing, access the elements with index starting at 1 and ending at 3-1
print (a[::-1]) # Slicing, access all elements in the reversed order
[0 2 4 6]
4
[0 2 4 6]
[2 4]
[6 4 2 0]
Indexing and Slicing with a 2D Array
• Creating a 2d Array
x = [[ 0  1  2]
     [ 3  4  5]
     [ 6  7  8]
     [ 9 10 11]]
Indexing with a 2D Array
• x[1, 2] selects the element at index 1 on the 1st dimension (row 1) and index 2 on the
2nd dimension (column 2).
x[1, 2]
5
• How about x[1][2]?
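One way to explore the question about x[1][2] is to run both forms. In this sketch x is assumed to be the 4 x 3 array shown on the slide, built here with np.arange; the chained form first extracts row 1 and then indexes into that row.

```python
import numpy as np

x = np.arange(12).reshape(4, 3)

direct = x[1, 2]    # one indexing operation with a tuple of indices
row = x[1]          # intermediate 1D array [3 4 5]
chained = x[1][2]   # second indexing operation on that row

print(direct, chained)
print(row)
```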
Slicing with a 2D Array
• Each dimension takes its own slice in the form start:end:step.
x[:, 1:] # 1st dimension: all rows; 2nd dimension: columns from index 1 to the end
array([[ 1,  2],
       [ 4,  5],
       [ 7,  8],
       [10, 11]])
x[1::2, ::2] # 1st dimension: rows 1 and 3; 2nd dimension: columns 0 and 2
array([[ 3,  5],
       [ 9, 11]])
Indexing with a 2D Array using Boolean Array
• We can create an array by selecting elements of an array that satisfy certain
condition.
Indexing with a 2D Array using Boolean Array
x > 5 # compares every element with 5, e.g. 0 > 5 is False and 6 > 5 is True
array([[False, False, False],
       [False, False, False],
       [ True,  True,  True],
       [ True,  True,  True]])
x[x > 5] # selects the elements at the positions where the Boolean array is True
array([ 6,  7,  8,  9, 10, 11])
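A runnable version of this selection is sketched below, with x assumed to be the 4 x 3 array from the earlier slides. Combining conditions with & and assigning through a mask are common follow-ups that the slide does not show.

```python
import numpy as np

x = np.arange(12).reshape(4, 3)

mask = x > 5            # Boolean array with the same shape as x
selected = x[mask]      # 1D array of the elements where mask is True

# Conditions are combined elementwise with & (and) or | (or);
# the parentheses are required.
even_and_big = x[(x > 5) & (x % 2 == 0)]

# A mask can also be used on the left-hand side of an assignment.
y = x.copy()
y[y > 5] = 0

print(selected)        # [ 6  7  8  9 10 11]
print(even_and_big)    # [ 6  8 10]
print(y)
```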
Mixing Indexing with Slicing
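As a hedged illustration of this topic, the sketch below mixes an integer index on one dimension with a slice on the other, again assuming the 4 x 3 array x from the earlier slides. An integer index removes its dimension from the result, while a slice keeps it.

```python
import numpy as np

x = np.arange(12).reshape(4, 3)

row = x[1, :]       # index on the 1st dimension, slice on the 2nd -> 1D
row_2d = x[1:2, :]  # slices on both dimensions -> still 2D
col = x[:, 2]       # slice on the 1st dimension, index on the 2nd -> 1D
part = x[0, 1:]     # row 0, columns from index 1 to the end

print(row, row.shape)        # [3 4 5] (3,)
print(row_2d, row_2d.shape)  # [[3 4 5]] (1, 3)
print(col, col.shape)        # [ 2  5  8 11] (4,)
print(part)                  # [1 2]
```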
Indexing with an Array of Indices
• We can create an array using data from another array.
a = np.arange(9).reshape(3, 3) # [[0,1,2],[3,4,5],[6,7,8]]
print(a[[0, 1, 2], [0, 1, 0]])
# a[0, 0], a[1, 1], a[2, 0] are selected
[0 4 6]
print(a[[0, 2], [0, 1]])
# a[0, 0], a[2, 1] are selected
[0 7]
print(a[[0, 0], [2, 2]])
# a[0, 2] is selected twice
[2 2]
Part 6 – Broadcasting
Broadcasting
• NumPy broadcasting allows us to work with arrays of different shapes when
performing arithmetic operations.
• Input arrays do not need to have the same number of dimensions.
• Broadcasting will be more efficient than using explicit loop(s) when the matrix is
very large.
Broadcasting
• A set of arrays is called “broadcastable” to the same shape if the rules
below produce a valid result:
• When operating on two arrays, NumPy compares their shapes element-wise.
• It starts with the trailing (i.e. rightmost) dimension and works its way left.
• Arrays are compatible in all dimensions:
• Two dimensions are compatible when
• they are equal, or
• one of them is 1.
• The resulting array will have the same number of dimensions as the input array with
the greatest number of dimensions, where the size of each dimension is the largest
size of the corresponding dimension among the input arrays.
• Note that missing dimensions are assumed to have size one.
• If one array has more dimensions, e.g. shape (4, 1), and the other array has fewer dimensions,
e.g. shape (3, )
• The lower-dimensional array will have 1's prepended to its shape so that both have the same
number of dimensions,
• e.g. (3, ) ➔ (1, 3)
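The rules above can be sketched as a small function. broadcast_shape is a hypothetical helper written for this illustration and is not part of NumPy; its result is compared against the shape NumPy actually produces.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: apply the broadcasting rules to two shapes."""
    ndim = max(len(shape_a), len(shape_b))
    # Missing leading dimensions are assumed to have size 1.
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    result = []
    for da, db in zip(a, b):
        if da == db or da == 1 or db == 1:
            # Compatible: a dimension of size 1 is stretched to the other size.
            result.append(db if da == 1 else da)
        else:
            raise ValueError(f"shapes {shape_a} and {shape_b} are not broadcastable")
    return tuple(result)

x = np.array([[0], [3], [6], [9]])  # shape (4, 1)
y = np.array([1, 2, 3])             # shape (3,)

predicted = broadcast_shape(x.shape, y.shape)
actual = (x + y).shape
print(predicted, actual)            # (4, 3) (4, 3)

# Shapes (2, 3) and (4,) are incompatible: 3 vs 4, and neither is 1.
try:
    broadcast_shape((2, 3), (4,))
    incompatible = False
except ValueError:
    incompatible = True
print(incompatible)
```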
Broadcasting
• Arrays with different shapes
x = [[0]
     [3]
     [6]
     [9]]       shape: (4, 1)
y = [[1 2 3]]   shape: (1, 3)
1. Starting from the rightmost dimension: 1 vs 3 ➔ one is 1, so x is broadcast along the 2nd dimension.
2. Then the next dimension: 4 vs 1 ➔ one is 1, so y is broadcast along the 1st dimension.
x (broadcast)     y (broadcast)     x + y
0 0 0             1 2 3              1  2  3
3 3 3      +      1 2 3      =       4  5  6
6 6 6             1 2 3              7  8  9
9 9 9             1 2 3             10 11 12
Resulting shape: (4, 3)
Part 7 – More about changing the shape of
the array
Flattening an Array
• Flatten an array into 1 dimension
• numpy.ndarray.flatten()
• Returns a copy of the array collapsed into 1 dimension
a = np.array([ [0, 1, 2, 3], [4, 5, 6, 7]] )
print ("a = ", a)
a =  [[0 1 2 3]
 [4 5 6 7]]
b = a.flatten()
print ("b =", b)
b = [0 1 2 3 4 5 6 7]
• numpy.ravel()
• Returns the elements of the original array in a 1-dimensional array
a = np.array([ [0, 1, 2, 3], [4, 5, 6, 7]])
print ("a = ", a)
a =  [[0 1 2 3]
 [4 5 6 7]]
c = np.ravel(a)
print ("c =", c)
c = [0 1 2 3 4 5 6 7]
• The primary difference is that the new array created using ravel() is, whenever possible,
a reference to the parent array (i.e., a “view”), whereas flatten() always returns a copy.
• Any changes made to such a view will affect the parent array as well.
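The difference can be checked by modifying each result and looking at the parent array. A minimal sketch follows; np.shares_memory is used here only to make the view/copy distinction visible.

```python
import numpy as np

a = np.array([[0, 1, 2, 3], [4, 5, 6, 7]])

b = a.flatten()   # always a copy
c = np.ravel(a)   # a view here, because a is stored contiguously

b[0] = 100        # does not affect a
c[1] = 200        # also changes a[0, 1]

print(a)
print(np.shares_memory(a, b), np.shares_memory(a, c))
```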
Transposing an Array
• Transpose a matrix
• numpy.transpose() or numpy.ndarray.transpose() or numpy.ndarray.T
a = np.array([ [0, 1, 2, 3], [4, 5, 6, 7]]) # 2 x 4 array
print ("a =", a)
a = [[0 1 2 3]
 [4 5 6 7]]
print ("shape = ", a.shape)
shape =  (2, 4)
print (a.transpose()) # standard transpose of the 2D array
                      # same as np.transpose(a)
[[0 4]
 [1 5]
 [2 6]
 [3 7]]
print (a.T)
[[0 4]
 [1 5]
 [2 6]
 [3 7]]
Reversing an Array
• Reverse an array
• numpy.flip(a, axis=None)
• Shape of the reversed array remains unchanged.
a = np.array([ [0, 1, 2, 3], [4, 5, 6, 7]]) # 2 x 4 array
print ("a =", a)
a = [[0 1 2 3]
 [4 5 6 7]]
print ("shape = ", a.shape)
shape =  (2, 4)
b = np.flip(a)
print ("b = ", b)
b =  [[7 6 5 4]
 [3 2 1 0]]
print ("shape = ", b.shape)
shape =  (2, 4)
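With the default axis=None, np.flip reverses the array along every axis. The axis parameter restricts the reversal to one dimension, as sketched below with illustrative variable names.

```python
import numpy as np

a = np.array([[0, 1, 2, 3], [4, 5, 6, 7]])

rows_flipped = np.flip(a, axis=0)  # reverse the order of the rows
cols_flipped = np.flip(a, axis=1)  # reverse the elements within each row
both = np.flip(a)                  # reverse along both axes

print(rows_flipped)   # [[4 5 6 7]
                      #  [0 1 2 3]]
print(cols_flipped)   # [[3 2 1 0]
                      #  [7 6 5 4]]
print(both)
```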
More about Transposing an Array
• Transpose a matrix with specified axes
• numpy.ndarray.transpose(*axes) or numpy.transpose(a, axes=None)
• If no axes are given, the order of the axes is reversed.
a = np.array([ [0, 1, 2, 3], [4, 5, 6, 7]]) # 2 x 4 array
print ("a =", a)
a = [[0 1 2 3]
 [4 5 6 7]]
print ("shape = ", a.shape)
shape =  (2, 4)
print (a.transpose()) # standard transpose of the 2D array
                      # same as np.transpose(a)
[[0 4]
 [1 5]
 [2 6]
 [3 7]]
print (a.T)
[[0 4]
 [1 5]
 [2 6]
 [3 7]]
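Specifying the axes matters most for arrays with more than two dimensions. The following sketch uses a 3D array t, made up for this illustration, to show how the axes argument permutes the dimensions.

```python
import numpy as np

t = np.arange(24).reshape(2, 3, 4)

default = t.transpose()         # no axes given: the axis order is reversed
swapped = t.transpose(1, 0, 2)  # swap the first two axes, keep the last one

print(default.shape)   # (4, 3, 2)
print(swapped.shape)   # (3, 2, 4)

# Element [i, j, k] of swapped is element [j, i, k] of t.
print(swapped[2, 1, 3], t[1, 2, 3])
```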
Adding a New Axis to an Array
• Increase the dimensions of your existing array
• numpy.newaxis
• numpy.expand_dims()
a = np.arange(4) # 1D array
print ("a.shape=", a.shape)
a.shape= (4,)
print ("a=", a)
a= [0 1 2 3]
# add an axis along the 1st dimension
b = a[np.newaxis, :] # same as b = np.expand_dims(a, axis=0)
print ("b.shape=", b.shape)
b.shape= (1, 4)
print ("b=", b)
b= [[0 1 2 3]]
# add an axis along the 2nd dimension
c = a[:, np.newaxis] # same as c = np.expand_dims(a, axis=1)
print ("c.shape=", c.shape)
c.shape= (4, 1)
print ("c=", c)
c= [[0]
 [1]
 [2]
 [3]]
Part 8 – More Useful Array Operations
Aggregation Functions
• NumPy also provides many aggregation functions for doing computations on arrays.
• Mathematical functions
• numpy.ndarray.sum(axis=None)
• Return the sum of the array elements over a given axis
• numpy.ndarray.min(axis=None)
• Return the minimum of the array elements over a given axis
• numpy.ndarray.max(axis=None)
• Return the maximum of the array elements over a given axis
• Statistics
• numpy.mean(a, axis=None)
• Return the average along a given axis
• numpy.average(a, axis=None, weights=None)
• Return the weighted average along a given axis
• numpy.median(a, axis=None)
• Compute the median along the specified axis
• numpy.percentile(a, q, axis=None)
• Compute the q-th percentile of array a along the given axis
Aggregation
a = np.arange(0, 12).reshape(4, 3)
print(a)
[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]
print (a.sum()) # sum all elements
66
print (np.mean(a))
5.5
print (np.mean(a, 0))
[4.5 5.5 6.5]
print (np.percentile(a, 50))
5.5
print (np.median(a))
5.5
Aggregation
x = [[ 0  1  2]
     [ 3  4  5]
     [ 6  7  8]
     [ 9 10 11]]
x.sum(axis=0) # sum along axis 0 (1st dimension), i.e. down each column
array([18, 22, 26])
x.sum(axis=1) # sum along axis 1 (2nd dimension), i.e. across each row
array([ 3, 12, 21, 30])
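The axis argument names the dimension that is collapsed by the aggregation. A short sketch with x assumed to be the 4 x 3 array from the slide; the keepdims=True option, not shown on the slide, keeps the collapsed dimension with size 1.

```python
import numpy as np

x = np.arange(12).reshape(4, 3)

col_sums = x.sum(axis=0)                  # collapses the 4 rows    -> shape (3,)
row_sums = x.sum(axis=1)                  # collapses the 3 columns -> shape (4,)
row_sums_2d = x.sum(axis=1, keepdims=True)  # shape (4, 1)
col_max = x.max(axis=0)                   # maximum of each column

print(col_sums)           # [18 22 26]
print(row_sums)           # [ 3 12 21 30]
print(row_sums_2d.shape)  # (4, 1)
print(col_max)            # [ 9 10 11]
```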
Aggregate Functions
• Logic functions
• numpy.all(a, axis=None)
• Test whether all array elements along a given axis evaluate to True.
• numpy.any(a, axis=None)
• Test whether any array element along a given axis evaluates to True.
a = np.arange(0, 12).reshape(4, 3)
b = (a%2 == 0)
print (b)
[[ True False  True]
 [False  True False]
 [ True False  True]
 [False  True False]]
print (np.all(b))
False
print (np.any(b))
True
Other Functions
• Sorting, searching, and counting
• NumPy provides many functions for sorting an array, searching in an array or counting the
elements in an array.
• numpy.sort(a[, axis=-1, …]) or numpy.ndarray.sort([axis=-1, …])
• numpy.sort() returns a sorted copy, while numpy.ndarray.sort() sorts the array in-place.
By default, the array is sorted along the last axis.
• numpy.argmax(a[, axis=None, …])
• Return the indices of the maximum values along an axis.
• numpy.argmin(a[, axis=None, …])
• Return the indices of the minimum values along an axis.
• numpy.nonzero(a)
• Return the indices of the elements that are non-zero.
• numpy.where(condition, x, y)
• Return elements chosen from x or y depending on condition
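A compact sketch of these functions on a small 1D array; the values are made up for illustration.

```python
import numpy as np

a = np.array([3, 0, 7, 0, 5])

sorted_copy = np.sort(a)              # new sorted array; a is unchanged
i_max = int(np.argmax(a))             # index of the maximum value (7)
i_min = int(np.argmin(a))             # index of the first minimum value (0)
nonzero_idx = np.nonzero(a)[0]        # indices of the non-zero elements
replaced = np.where(a > 4, a, -1)     # keep values > 4, otherwise use -1

print(sorted_copy)   # [0 0 3 5 7]
print(i_max, i_min)  # 2 1
print(nonzero_idx)   # [0 2 4]
print(replaced)      # [-1 -1  7 -1  5]

original = a.copy()
a.sort()             # the method form sorts a itself, in-place
print(a)             # [0 0 3 5 7]
```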
Part 9 – Copies and Views
Copies and Views
• Some operations and manipulations of arrays copy the data into a new array, while
others do not.
• For example, simple assignment of arrays does not make a copy.
a = np.array([[1, 2, 3],[4, 5, 6]])
b = a # not copying, b refers to the same array object as a
b[0, 0] = 100
print(a)
[[100   2   3]
 [  4   5   6]]
print(b)
[[100   2   3]
 [  4   5   6]]
print(b is a)
True
• When b is changed, a is also changed.
Copies and Views
• The NumPy array is a data structure consisting of two parts:
• The contiguous data buffer with the actual data elements, and
• The metadata that contains information about the data buffer.
• View
• A new array that is a new way of looking at the data.
• It is to access the array differently by just changing certain metadata without
changing the data buffer (i.e. same data buffer).
• Any changes made to a view are reflected in the original array.
• Copy
• A new array that is created by duplicating the data buffer as well as the
metadata.
• Changes made to the copy do not reflect on the original array.
• Making a copy is slower and consumes more memory.
Views
• For example, basic array indexing and slicing create views:
a = np.array([[1, 2, 3],[4, 5, 6]])
v = a[0, :]
v[0] = 321
print(v)
[321   2   3]
print(a)
[[321   2   3]
 [  4   5   6]]
• Array a is changed when array v is changed.
Copies
• For example, numpy.ndarray.flatten() returns a copy:
a = np.array([[1, 2, 3],[4, 5, 6]])
c = a.flatten()
c[0] = 321
print(c)
[321   2   3   4   5   6]
print(a)
[[1 2 3]
 [4 5 6]]
• Array a is not changed when array c is changed.
Creating a View or a Copy
• Shallow Copy
• A view can be forced through using numpy.ndarray.view().
• Deep Copy
• A copy can be forced through using numpy.ndarray.copy().
• How to tell if an array is a view or a copy?
• Using the numpy.ndarray.base attribute: it is None if the array owns its data;
otherwise it is the array that owns the data buffer.
a = np.arange(6).reshape(2, 3)
b = a.view() # b is a view of a
print (b.base)
[0 1 2 3 4 5]
a = np.arange(6).reshape(2, 3)
b = a.copy() # b is a copy of a
print (b.base)
None
• What if we reshape b? Will a be reshaped as well?
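One way to investigate the question is the sketch below. It reshapes a view with reshape(), which returns a new array object, so the shape of a itself is untouched while the data buffer is still shared.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
b = a.view().reshape(3, 2)   # a reshaped view of a

print(a.shape, b.shape)      # (2, 3) (3, 2): only the metadata of b differs

b[0, 0] = 99                 # the data buffer is shared ...
print(a[0, 0])               # ... so a sees the change: 99
print(np.shares_memory(a, b))
```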
Creating a View or a Copy
• Advanced indexing creates copies.
• Indexing using an integer array
• Indexing using a Boolean array
a = np.array([[1, 2, 3], [4, 5, 6]])
# integer array indexing
c = a[ [0, 1], [0, 2]]
print(c.base)
None
c[0] = 123
print("a=", a)
a= [[1 2 3]
 [4 5 6]]
print("c=", c)
c= [123   6]
a = np.array([[1, 2, 3], [4, 5, 6]])
# boolean array indexing
b = a[ a > 3 ]
print(b.base)
None
b[0] = 123
print("a=", a)
a= [[1 2 3]
 [4 5 6]]
print("b=", b)
b= [123   5   6]
Part 10 – Data Types
Data Types
• NumPy tries to guess the datatype when we are creating an array.
• We can also create an array specifying the datatype explicitly.
• dtype: Data type object
• Array scalar: np.int_, np.double, np.str_, np.bool_
• For more about the datatypes,
• https://numpy.org/doc/stable/reference/arrays.scalars.html
w = np.array([0.5, 1.5, 2.5], dtype=np.double)
# create an array of double datatype values
print (w)
[0.5 1.5 2.5]
a = w.astype(np.int_) # Return a copy of the array cast to the given type
print (a)
[0 1 2]
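The sketch below illustrates how the data type is guessed from the input values, and that casting floats to integers with astype() drops the fractional part rather than rounding. The example values and the use of np.rint for rounding are illustrative additions.

```python
import numpy as np

ints = np.array([1, 2, 3])        # all integers -> an integer dtype
floats = np.array([1, 2.5, 3])    # one float    -> float64 for all elements

w = np.array([0.5, 1.5, -2.7])
truncated = w.astype(np.int_)         # fractional part dropped (toward zero)
rounded = np.rint(w).astype(np.int_)  # round to the nearest integer first
                                      # (halves go to the nearest even integer)

print(ints.dtype, floats.dtype)
print(truncated)   # [ 0  1 -2]
print(rounded)     # [ 0  2 -3]
```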
Exercises
• What is the shape of the following array?
A = np.array([ [1], [2], [3] ])
• How to create a 1D array B of size 15 with all zeros?
• How to create a 3 x 5 array C with values from 1 to 15 (from top to bottom, left to
right)?
• How to extract the sub-array consisting of the odd rows and even columns
of array D [[3, 6, 9, 12], [15, 18, 21, 24], [27, 30, 33, 36], [39, 42, 45, 48], [51, 54,
57, 60]]?
• How to find the indices of the maximum values in an array E?
Exercises
• Given two arrays, A = np.array([1, 2, 3]), and B = np.array([[4, 4, 4], [3, 3, 3]]),
what is the result of A + B?
• Given two arrays, A = np.ones((16, 6, 7)), and B = np.ones((16, 6)),
what is the result of A + B?