NumPy in Data Analysis
NumPy (Numerical Python) is a fundamental library for numerical
computing in Python. It provides support for arrays, matrices, and many
mathematical functions, making it essential for data analysis.
1. Why Use NumPy for Data Analysis?
Efficient Storage & Performance: NumPy arrays store elements of a
single data type in a contiguous block of memory, so they are more
memory-efficient than Python lists.
Fast Computation: Vectorized operations make mathematical
operations faster.
Integration: Works well with libraries like Pandas, Matplotlib,
and Scikit-learn.
Broadcasting: Enables element-wise operations between arrays of
different but compatible shapes without explicit loops.
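The storage and vectorization points above can be illustrated with a small comparison. This is a minimal sketch: the array length and variable names are illustrative, and the list size is only a rough estimate based on sys.getsizeof.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
arr = np.arange(n, dtype=np.int64)  # dtype fixed explicitly so the size is predictable

# Bytes used by the array's data buffer: 8 bytes per int64 element
array_bytes = arr.nbytes

# Rough size of the list: the list object plus each separate int object
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# Vectorized computation and an explicit Python loop give the same values
squares_vectorized = arr ** 2
squares_loop = [x ** 2 for x in py_list]

print("Array data bytes:", array_bytes)
print("List uses more memory:", list_bytes > array_bytes)
print("Same result:", squares_vectorized.tolist() == squares_loop)
```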
Installing and Importing NumPy
# Install NumPy if not installed (the leading ! runs a shell command in Jupyter/Colab)
!pip install numpy
# Import NumPy
import numpy as np
np is the alias used for NumPy, which makes it convenient to call
functions.
NumPy Arrays
Creating Arrays
# 1D Array
arr1 = np.array([1, 2, 3, 4, 5])
# 2D Array (Matrix)
arr2 = np.array([[1, 2, 3], [4, 5, 6]])
# 3D Array
arr3 = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
Creating a 1D Array
arr1 = np.array([1, 2, 3, 4, 5])
Explanation
np.array([1, 2, 3, 4, 5]) creates a 1-dimensional (1D) NumPy
array.
It holds the same values as the Python list [1, 2, 3, 4, 5], but
stores them as a NumPy array with a single data type.
A 1D array has a single axis; it is often pictured as one row of elements.
Structure of arr1
[1 2 3 4 5]
Attributes
print(arr1.ndim) # 1 → Number of dimensions
print(arr1.shape) # (5,) → Shape (5 elements in one row)
print(arr1.size) # 5 → Total number of elements
print(arr1.dtype) # int32 (or int64 depending on the system)
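Because the default integer type depends on the platform, it can help to set the dtype explicitly when the element size matters. A minimal sketch, with illustrative variable names:

```python
import numpy as np

arr_default = np.array([1, 2, 3, 4, 5])                    # platform-dependent integer type
arr_int64 = np.array([1, 2, 3, 4, 5], dtype=np.int64)      # always 64-bit integers
arr_float32 = np.array([1, 2, 3, 4, 5], dtype=np.float32)  # 32-bit floats

# itemsize is the number of bytes used by one element
print(arr_int64.dtype, arr_int64.itemsize)
print(arr_float32.dtype, arr_float32.itemsize)
```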
Creating a 2D Array (Matrix)
arr2 = np.array([[1, 2, 3], [4, 5, 6]])
Explanation
A 2-dimensional (2D) array (also called a matrix) is created.
It consists of multiple rows and columns.
arr2 contains 2 rows and 3 columns.
Structure of arr2
[
[1 2 3]
[4 5 6]
]
Attributes
print(arr2.ndim) # 2 → Number of dimensions
print(arr2.shape) # (2, 3) → 2 rows, 3 columns
print(arr2.size) # 6 → Total elements
arr2.shape returns (2,3), meaning 2 rows and 3 columns.
Creating a 3D Array
arr3 = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
Explanation
This is a 3-dimensional (3D) array.
It consists of multiple 2D arrays stacked together.
It contains 2 blocks, each having 2 rows and 2 columns.
Structure of arr3
[
[
[1 2]
[3 4]
]
[
[5 6]
[7 8]
]
]
Attributes
print(arr3.ndim) # 3 → Number of dimensions
print(arr3.shape) # (2, 2, 2) → 2 blocks, 2 rows, 2 columns
print(arr3.size) # 8 → Total elements
arr3.shape returns (2,2,2), meaning 2 matrices, each with 2
rows and 2 columns.
Summary
Key Takeaways
1D array: A single row of numbers.
2D array: A matrix with rows and columns.
3D array: A collection of 2D matrices stacked together.
In NumPy, ndim is an attribute of a NumPy array that returns the
number of dimensions (axes) of the array.
Example:
import numpy as np
# 1D array
arr1 = np.array([1, 2, 3, 4])
print(arr1.ndim) # Output: 1
# 2D array (matrix)
arr2 = np.array([[1, 2, 3], [4, 5, 6]])
print(arr2.ndim) # Output: 2
# 3D array
arr3 = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print(arr3.ndim) # Output: 3
Each dimension represents a different level of nested lists. The more
nested the array, the higher the ndim value.
In NumPy, the shape attribute returns a tuple representing the
dimensions of the array, where each value in the tuple corresponds to
the size of the array along that axis.
import numpy as np
# 1D array
arr1 = np.array([1, 2, 3, 4])
print(arr1.shape) # Output: (4,) → one axis with 4 elements
# 2D array (matrix)
arr2 = np.array([[1, 2, 3], [4, 5, 6]])
print(arr2.shape) # Output: (2, 3) → 2 rows, 3 columns
# 3D array
arr3 = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print(arr3.shape) # Output: (2, 2, 2) → 2 blocks, 2 rows, 2 columns
The shape attribute is useful for understanding the structure of an
array and performing reshaping operations.
In NumPy, the size attribute returns the total number of elements in an
array, regardless of its shape or dimensions.
Example:
import numpy as np
# 1D array
arr1 = np.array([1, 2, 3, 4])
print(arr1.size) # Output: 4 (total elements)
# 2D array (matrix)
arr2 = np.array([[1, 2, 3], [4, 5, 6]])
print(arr2.size) # Output: 6 (2 rows × 3 columns)
# 3D array
arr3 = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print(arr3.size) # Output: 8 (2 × 2 × 2)
The size attribute is helpful for quickly determining the total number of
elements in an array, especially when working with large datasets.
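The three attributes are related: ndim is the length of the shape tuple, and size is the product of the shape's entries. A short sketch to check this, using the same 3D array as above:

```python
import numpy as np

arr3 = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

# ndim equals the number of entries in shape
ndim_matches = arr3.ndim == len(arr3.shape)

# size equals the product of all axis lengths
size_from_shape = int(np.prod(arr3.shape))

# len() of an array is the length of its first axis only
first_axis_length = len(arr3)

print(ndim_matches, size_from_shape, first_axis_length)
```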
Indexing & Slicing in NumPy Arrays
Indexing and slicing allow us to access and manipulate elements within
NumPy arrays. Let's explore this for 1D, 2D, and 3D arrays.
1. Indexing in a 1D Array
A 1D array is indexed the same way as a Python list.
import numpy as np
arr1 = np.array([10, 20, 30, 40, 50])
print(arr1[0]) # 10 → First element
print(arr1[2]) # 30 → Third element
print(arr1[-1]) # 50 → Last element
Explanation:
Positive indexing starts from 0 (left to right).
Negative indexing starts from -1 (right to left).
2. Slicing in a 1D Array
print(arr1[1:4]) # [20 30 40] → From index 1 to 3
print(arr1[:3]) # [10 20 30] → First 3 elements
print(arr1[2:]) # [30 40 50] → From index 2 to end
print(arr1[::2]) # [10 30 50] → Every second element
print(arr1[::-1]) # [50 40 30 20 10] → Reverse the array
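One difference from Python lists is worth checking: a basic slice of a NumPy array is a view onto the same data, not a copy. The sketch below uses illustrative variable names and shows that writing through a slice changes the original, while a slice followed by .copy() does not.

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

# A basic slice shares memory with the original array
view = arr[1:4]
view[0] = 99          # also changes arr[1]

# An explicit copy is independent of the original
independent = arr[1:4].copy()
independent[0] = -1   # arr is not affected

print(arr)            # [10 99 30 40 50]
```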
3. Indexing in a 2D Array
arr2 = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
print(arr2[0, 1]) # 2 → First row, second column
print(arr2[2, -1]) # 9 → Last row, last column
Explanation:
Syntax: arr[row_index, column_index]
Rows and columns are zero-indexed.
4. Slicing in a 2D Array
print(arr2[0:2, 1:3])
# [[2 3]
# [5 6]] → First 2 rows, last 2 columns
print(arr2[:, 0])
# [1 4 7] → All rows, first column
print(arr2[1, :])
# [4 5 6] → Second row, all columns
print(arr2[::-1, ::-1])
# [[9 8 7]
# [6 5 4]
# [3 2 1]] → Reverse both rows and columns
5. Indexing in a 3D Array
arr3 = np.array([[[1, 2], [3, 4]],
[[5, 6], [7, 8]]])
print(arr3[0, 1, 1]) # 4 → First matrix, second row, second column
print(arr3[1, 0, 0]) # 5 → Second matrix, first row, first column
Structure of arr3
Matrix 1:
[1 2]
[3 4]
Matrix 2:
[5 6]
[7 8]
6. Slicing in a 3D Array
print(arr3[:, 1, :])
# [[3 4]
# [7 8]] → All matrices, second row, all columns
print(arr3[:, :, 0])
# [[1 3]
# [5 7]] → All matrices, all rows, first column
print(arr3[::-1, ::-1, ::-1])
# [[[8 7]
# [6 5]]
# [[4 3]
# [2 1]]] → Reverse across all dimensions
Conclusion
1D Arrays: Simple indexing and slicing like Python lists.
2D Arrays: Use [row, col] notation.
3D Arrays: Add another index [depth, row, col].
NumPy Indexing & Slicing Exercises 🚀
1. 1D Array Exercises
import numpy as np
arr1 = np.array([10, 20, 30, 40, 50, 60, 70])
🔹 Q1: Extract the element 40 from arr1.
🔹 Q2: Slice arr1 to get [30, 40, 50].
🔹 Q3: Get every second element from the array.
🔹 Q4: Reverse the array.
2. 2D Array Exercises
arr2 = np.array([[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12]])
🔹 Q5: Extract the number 7 from arr2.
🔹 Q6: Get the second row ([5, 6, 7, 8]).
🔹 Q7: Extract the first two columns from all rows.
🔹 Q8: Reverse the rows of arr2.
3. 3D Array Exercises
arr3 = np.array([[[1, 2, 3],
[4, 5, 6]],
[[7, 8, 9],
[10, 11, 12]]])
🔹 Q9: Extract the number 9 from arr3.
🔹 Q10: Get all elements from the first matrix.
🔹 Q11: Extract the first column from all matrices.
🔹 Q12: Reverse the entire array across all dimensions.
💡 Bonus Challenge:
Given arr2, write one line of code to extract all even numbers.
NumPy Indexing & Slicing Solutions
1. 1D Array Solutions
import numpy as np
arr1 = np.array([10, 20, 30, 40, 50, 60, 70])
# Q1: Extract the element 40
print(arr1[3]) # Output: 40
# Q2: Slice arr1 to get [30, 40, 50]
print(arr1[2:5]) # Output: [30 40 50]
# Q3: Get every second element
print(arr1[::2]) # Output: [10 30 50 70]
# Q4: Reverse the array
print(arr1[::-1]) # Output: [70 60 50 40 30 20 10]
2. 2D Array Solutions
arr2 = np.array([[1, 2, 3, 4],
[5, 6, 7, 8],
[9, 10, 11, 12]])
# Q5: Extract the number 7
print(arr2[1, 2]) # Output: 7
# Q6: Get the second row [5, 6, 7, 8]
print(arr2[1, :]) # Output: [5 6 7 8]
# Q7: Extract the first two columns from all rows
print(arr2[:, :2])
# Output:
# [[ 1 2]
# [ 5 6]
# [ 9 10]]
# Q8: Reverse the rows of arr2
print(arr2[::-1, :])
# Output:
# [[ 9 10 11 12]
# [ 5 6 7 8]
# [ 1 2 3 4]]
3. 3D Array Solutions
arr3 = np.array([[[1, 2, 3],
[4, 5, 6]],
[[7, 8, 9],
[10, 11, 12]]])
# Q9: Extract the number 9
print(arr3[1, 0, 2]) # Output: 9
# Q10: Get all elements from the first matrix
print(arr3[0, :, :])
# Output:
# [[1 2 3]
# [4 5 6]]
# Q11: Extract the first column from all matrices
print(arr3[:, :, 0])
# Output:
# [[ 1 4]
# [ 7 10]]
# Q12: Reverse the entire array across all dimensions
print(arr3[::-1, ::-1, ::-1])
# Output:
# [[[12 11 10]
# [ 9 8 7]]
# [[ 6 5 4]
# [ 3 2 1]]]
Bonus Challenge Solution
# Extract all even numbers from arr2
print(arr2[arr2 % 2 == 0])
# Output: [ 2 4 6 8 10 12]
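The bonus solution works because the comparison produces a Boolean array (a mask) with the same shape as arr2. A minimal sketch that makes the mask visible and combines conditions with & (and) and | (or); the variable names are illustrative:

```python
import numpy as np

arr2 = np.array([[1, 2, 3, 4],
                 [5, 6, 7, 8],
                 [9, 10, 11, 12]])

# The mask has the same shape as arr2 and holds True/False values
mask = arr2 % 2 == 0

# Each condition needs its own parentheses when combined
even_and_large = arr2[(arr2 % 2 == 0) & (arr2 > 5)]
odd_or_large = arr2[(arr2 % 2 == 1) | (arr2 > 10)]

print(mask.shape, mask.dtype)
print(even_and_large)   # [ 6  8 10 12]
print(odd_or_large)     # [ 1  3  5  7  9 11 12]
```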
Special Arrays in NumPy with Examples
NumPy provides several built-in functions to create special arrays that
are commonly used in data analysis, machine learning, and scientific
computing.
1️⃣ np.zeros() - Zero Matrix
Creates an array filled with zeros.
📌 Syntax: np.zeros(shape, dtype)
# 1D array of zeros
arr1 = np.zeros(5)
print(arr1) # Output: [0. 0. 0. 0. 0.]
# 2D array of zeros (3x3)
arr2 = np.zeros((3, 3))
print(arr2)
✅ Use Case: Initializing an array before storing computed values.
2️⃣ np.ones() - Ones Matrix
Creates an array filled with ones.
📌 Syntax: np.ones(shape, dtype)
# 1D array of ones
arr1 = np.ones(5)
print(arr1) # Output: [1. 1. 1. 1. 1.]
# 2D array of ones (3x3)
arr2 = np.ones((3, 3))
print(arr2)
✅ Use Case: Used in neural networks for bias initialization.
3️⃣ np.eye() - Identity Matrix
Creates an identity matrix (diagonal elements = 1).
📌 Syntax: np.eye(N, M, k)
# 3x3 Identity matrix
arr = np.eye(3)
print(arr)
✅ Use Case: Used in linear algebra for matrix operations.
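The syntax line lists M and k, which the example above does not use. As a small sketch: M sets the number of columns, and k shifts the diagonal (positive values move it up, negative values move it down).

```python
import numpy as np

# 3 rows, 4 columns, ones on the first diagonal above the main one
shifted_up = np.eye(3, 4, k=1)

# 3x3 with ones on the first diagonal below the main one
shifted_down = np.eye(3, k=-1)

print(shifted_up)
print(shifted_down)
```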
4️⃣ np.full() - Custom Value Matrix
Creates an array filled with a specific value.
📌 Syntax: np.full(shape, fill_value)
# 3x3 matrix filled with 7
arr = np.full((3, 3), 7)
print(arr)
✅ Use Case: Used when initializing an array with a fixed value.
5️⃣ np.arange() - Sequence of Numbers
Creates a sequence of numbers (like range() in Python).
📌 Syntax: np.arange(start, stop, step)
# Sequence from 1 up to (but not including) 10, step = 2
arr = np.arange(1, 10, 2)
print(arr) # Output: [1 3 5 7 9]
✅ Use Case: Used to create indexing or grid values.
6️⃣ np.linspace() - Evenly Spaced Values
Creates an array of evenly spaced values between a range.
📌 Syntax: np.linspace(start, stop, num)
# 5 values between 0 and 1
arr = np.linspace(0, 1, 5)
print(arr) # Output: [0. 0.25 0.5 0.75 1. ]
✅ Use Case: Used in plotting graphs.
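The main practical difference between np.arange() and np.linspace() is how the end of the range is handled. A minimal sketch with illustrative values: arange takes a step and excludes the stop value, while linspace takes a count and includes the stop value unless endpoint=False is passed.

```python
import numpy as np

# Step-based: stop value (1) is excluded
by_step = np.arange(0, 1, 0.25)

# Count-based: stop value (1) is included by default
by_count = np.linspace(0, 1, 5)

# Count-based with the stop value excluded
without_end = np.linspace(0, 1, 5, endpoint=False)

print(by_step)       # [0.   0.25 0.5  0.75]
print(by_count)      # [0.   0.25 0.5  0.75 1.  ]
print(without_end)   # [0.  0.2 0.4 0.6 0.8]
```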
7️⃣ np.random.rand() - Random Numbers (Uniform Distribution)
Generates random numbers from a uniform distribution over [0, 1).
📌 Syntax: np.random.rand(d0, d1, ..., dn)
# 1D random array of 5 numbers
arr = np.random.rand(5)
print(arr)
✅ Use Case: Used in data simulation & machine learning.
8️⃣ np.random.randint() - Random Integers
Generates random integers within a range.
📌 Syntax: np.random.randint(low, high, size)
# 3 random integers from 10 to 49 (high is exclusive)
arr = np.random.randint(10, 50, 3)
print(arr)
✅ Use Case: Used for random sampling.
9️⃣ np.random.randn() - Random Normal Distribution
Generates random numbers from a normal distribution (mean=0,
std=1).
📌 Syntax: np.random.randn(d0, d1, ..., dn)
# 5 random numbers from a normal distribution
arr = np.random.randn(5)
print(arr)
✅ Use Case: Used in statistics & probability models.
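The random functions above return different values on every run. When reproducible results are needed, one option is NumPy's newer Generator interface, created with np.random.default_rng() and a seed. This is a sketch rather than a replacement for the examples above; the seed value 42 and the variable names are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

uniform = rng.random(5)                   # floats in [0, 1)
integers = rng.integers(10, 50, size=3)   # integers from 10 to 49 (high is exclusive)
normal = rng.standard_normal(5)           # mean 0, standard deviation 1

# A second generator with the same seed produces the same sequence
rng_again = np.random.default_rng(seed=42)
uniform_again = rng_again.random(5)

print(np.array_equal(uniform, uniform_again))
```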
Modifying Arrays in NumPy
In NumPy, arrays can be modified in several ways, such as reshaping,
adding, deleting, updating, and merging elements. These
modifications are essential in data preprocessing, feature engineering,
and machine learning. Let’s explore each technique with practical
examples.
📌 1. Changing the Shape of an Array (reshape(), ravel(),
flatten())
1️⃣ Reshaping an Array (np.reshape())
Reshaping changes the dimensions of an array without changing its
data; the new shape must hold the same total number of elements.
import numpy as np
# Original 1D array
arr = np.array([1, 2, 3, 4, 5, 6])
# Reshape to a 2D array (2 rows, 3 columns)
reshaped_arr = arr.reshape(2, 3)
print(reshaped_arr)
🔹 Output:
[[1 2 3]
[4 5 6]]
✅ Use Case: Used in deep learning where inputs are often reshaped
into a specific format.
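Two details of reshape() are easy to check in code: passing -1 for one dimension asks NumPy to infer it from the array's size, and a shape that does not match the number of elements raises a ValueError. A minimal sketch with illustrative names:

```python
import numpy as np

arr = np.arange(1, 7)          # [1 2 3 4 5 6], 6 elements

# -1 means "work this dimension out from the total size"
inferred = arr.reshape(3, -1)  # becomes shape (3, 2)

# 6 elements cannot fill a 4x2 array, so this raises ValueError
error_message = None
try:
    arr.reshape(4, 2)
except ValueError as exc:
    error_message = str(exc)

print(inferred.shape)
print("Reshape failed:", error_message is not None)
```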
2️⃣ Flattening an Array (ravel(), flatten())
Flattening converts a multi-dimensional array into a 1D array.
arr = np.array([[1, 2, 3], [4, 5, 6]])
# Using .ravel() (returns a view when possible, so changes can affect the original array)
flattened_arr1 = arr.ravel()
# Using .flatten() (returns a copy, changes don't affect original array)
flattened_arr2 = arr.flatten()
print(flattened_arr1) # Output: [1 2 3 4 5 6]
print(flattened_arr2) # Output: [1 2 3 4 5 6]
✅ Use Case: Useful for preparing feature vectors for machine learning
models.
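The view-versus-copy difference between ravel() and flatten() can be verified directly. In this sketch the array is a freshly created, contiguous 2D array, which is the case where ravel() returns a view; the variable names are illustrative.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

flat_view = arr.ravel()     # view of arr for a contiguous array
flat_copy = arr.flatten()   # always a new, independent array

flat_copy[0] = -1           # does not touch arr
flat_view[0] = 100          # writes through to arr[0, 0]

print(arr)
print(np.shares_memory(arr, flat_view), np.shares_memory(arr, flat_copy))
```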
📌 2. Adding Elements to an Array (append(), insert())
1️⃣ Appending Elements (np.append())
np.append() adds elements to an array, but creates a new array
instead of modifying the original.
arr = np.array([1, 2, 3])
# Append a single value
new_arr = np.append(arr, 4)
print(new_arr) # Output: [1 2 3 4]
# Append multiple values
new_arr = np.append(arr, [5, 6])
print(new_arr) # Output: [1 2 3 5 6]
✅ Use Case: Adding new records dynamically in datasets.
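For 2D arrays, np.append() behaves differently depending on the axis argument: without it the result is flattened to 1D, and with it the new values must match the array's shape along the other axis. A small sketch with illustrative data:

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])

# No axis: both inputs are flattened before joining
flattened = np.append(matrix, [7, 8, 9])

# axis=0: add a new row (the value must be 2D with 3 columns)
with_row = np.append(matrix, [[7, 8, 9]], axis=0)

# axis=1: add a new column (the value must be 2D with 2 rows)
with_col = np.append(matrix, [[0], [0]], axis=1)

print(flattened.shape, with_row.shape, with_col.shape)
```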
2️⃣ Inserting Elements at a Specific Index (np.insert())
We can insert elements at a specific position using np.insert().
arr = np.array([1, 2, 3])
# Insert 99 at index 1
new_arr = np.insert(arr, 1, 99)
print(new_arr) # Output: [ 1 99 2 3]
✅ Use Case: Used in data preprocessing to insert missing values.
📌 3. Deleting Elements from an Array (np.delete())
To remove elements from an array, we use np.delete().
arr = np.array([1, 2, 3, 4, 5])
# Delete element at index 2
new_arr = np.delete(arr, 2)
print(new_arr) # Output: [1 2 4 5]
✅ Use Case: Removing unnecessary features or outliers from datasets.
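When removing features (columns) or records (rows) from a 2D array, np.delete() needs the axis argument; without it the array is flattened first. A minimal sketch, assuming a small illustrative matrix:

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6],
                   [7, 8, 9]])

# Remove the row at index 1
without_middle_row = np.delete(matrix, 1, axis=0)

# Remove the columns at indices 0 and 2
only_middle_col = np.delete(matrix, [0, 2], axis=1)

# No axis: the array is flattened, then index 0 is removed
flat_result = np.delete(matrix, 0)

print(without_middle_row)
print(only_middle_col)
print(flat_result)
```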
📌 4. Modifying Elements (array[index] = value)
We can directly modify specific elements in an array.
arr = np.array([1, 2, 3, 4, 5])
# Change element at index 2
arr[2] = 99
print(arr) # Output: [ 1 2 99 4 5]
✅ Use Case: Updating incorrect values in datasets.
📌 5. Changing Data Type of an Array (astype())
Convert an array to a different data type using astype().
arr = np.array([1.5, 2.3, 3.7])
# Convert to integer
new_arr = arr.astype(int)
print(new_arr) # Output: [1 2 3]
✅ Use Case: Converting floating-point numbers to integers for
categorical variables.
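Note that astype(int) drops the fractional part (it truncates toward zero) rather than rounding. If rounding is intended, one approach is to call np.round() first, as in this sketch with illustrative values:

```python
import numpy as np

values = np.array([1.5, 2.3, 3.7, -1.7])

# Truncation toward zero: the fractional part is discarded
truncated = values.astype(int)

# Round first, then convert
rounded = np.round(values).astype(int)

print(truncated)   # [ 1  2  3 -1]
print(rounded)     # [ 2  2  4 -2]
```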
📌 6. Merging Arrays (np.concatenate(), np.vstack(),
np.hstack())
1️⃣ Concatenating Arrays (np.concatenate())
Join two or more arrays along an axis.
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
merged_arr = np.concatenate((arr1, arr2))
print(merged_arr) # Output: [1 2 3 4 5 6]
✅ Use Case: Merging datasets.
2️⃣ Stacking Arrays Vertically (np.vstack())
Stacks arrays on top of each other, so each input becomes a row (joins along axis 0).
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
vstacked = np.vstack((arr1, arr2))
print(vstacked)
🔹 Output:
[[1 2 3]
[4 5 6]]
✅ Use Case: Combining multiple feature matrices.
3️⃣ Stacking Arrays Horizontally (np.hstack())
Joins arrays side by side (along axis 1 for 2D arrays; 1D arrays are simply concatenated).
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
hstacked = np.hstack((arr1, arr2))
print(hstacked) # Output: [1 2 3 4 5 6]
✅ Use Case: Merging features from different sources.
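With 2D inputs the three merging functions are closely related: np.concatenate() with axis=0 gives the same result as np.vstack(), and with axis=1 the same result as np.hstack(). A short sketch with illustrative matrices:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

stacked_rows = np.concatenate((a, b), axis=0)   # shape (4, 2)
stacked_cols = np.concatenate((a, b), axis=1)   # shape (2, 4)

print(stacked_rows)
print(stacked_cols)
print(np.array_equal(stacked_rows, np.vstack((a, b))))
print(np.array_equal(stacked_cols, np.hstack((a, b))))
```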
📌 7. Splitting Arrays (np.split())
To divide an array into smaller parts, use np.split().
arr = np.array([1, 2, 3, 4, 5, 6])
# Split into 3 equal parts
split_arr = np.split(arr, 3)
print(split_arr)
🔹 Output:
[array([1, 2]), array([3, 4]), array([5, 6])]
✅ Use Case: Splitting datasets into training, validation, and test
sets.
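np.split() with an integer requires the array to divide evenly and raises a ValueError otherwise. Two alternatives are sketched below with illustrative data: np.array_split() allows uneven parts, and passing a list of indices to np.split() cuts at chosen positions, which suits a train/validation/test split.

```python
import numpy as np

arr = np.arange(1, 8)   # 7 elements, not divisible by 3

# Equal split is impossible here
split_failed = False
try:
    np.split(arr, 3)
except ValueError:
    split_failed = True

# array_split tolerates uneven parts
parts = np.array_split(arr, 3)

# Cut positions: indices 0-5, 6-7, 8-9
data = np.arange(10)
train, val, test = np.split(data, [6, 8])

print(split_failed)
print([p.tolist() for p in parts])
print(train, val, test)
```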
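🔎 Summary of Array Modification Methods
reshape(): Changes the shape without changing the data.
ravel() / flatten(): Flatten to 1D (a view when possible / always a copy).
np.append(), np.insert(): Return a new array with elements added.
np.delete(): Returns a new array with elements removed.
array[index] = value: Modifies elements in place.
astype(): Returns a copy converted to another data type.
np.concatenate(), np.vstack(), np.hstack(): Merge arrays.
np.split(): Divides an array into sub-arrays.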
Advanced Array Operations in NumPy
📌 1. Vectorized Operations (Fast Element-wise Calculations)
NumPy allows fast element-wise operations without using loops, known
as vectorization.
Example: Element-wise Addition, Multiplication
import numpy as np
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
# Vectorized operations
sum_arr = arr1 + arr2 # Element-wise addition
mul_arr = arr1 * arr2 # Element-wise multiplication
print(sum_arr) # Output: [5 7 9]
print(mul_arr) # Output: [ 4 10 18]
✅ Use Case: Faster computations in machine learning (e.g., applying
operations on matrices).
📌 2. Broadcasting (Handling Different Shapes)
Broadcasting allows operations between arrays of different
shapes without explicit looping.
Example: Broadcasting a Scalar to an Array
arr = np.array([1, 2, 3])
# Scalar multiplication (broadcasting)
result = arr * 2
print(result) # Output: [2 4 6]
🔹 How does it work? NumPy treats the scalar 2 as if it were stretched
to match the shape of arr (no actual copy is made).
Example: Broadcasting in 2D Arrays
arr1 = np.array([[1, 2, 3], [4, 5, 6]])
arr2 = np.array([1, 2, 3]) # 1D array
# Adding a 1D array to a 2D array
result = arr1 + arr2
print(result)
🔹 Output:
[[ 2 4 6]
[ 5 7 9]]
✅ Use Case: Used in deep learning when adding biases to neurons.
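Broadcasting compares shapes from the last axis backwards, so a 1D array of length 2 cannot be added directly to a (2, 3) array. A minimal sketch of the failure and one common fix, reshaping the 1D array into a column with np.newaxis; names and values are illustrative.

```python
import numpy as np

matrix = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
row_offsets = np.array([10, 20])            # shape (2,)

# Shapes (2, 3) and (2,) are not compatible: last axes are 3 and 2
broadcast_failed = False
try:
    matrix + row_offsets
except ValueError:
    broadcast_failed = True

# Shape (2, 1) broadcasts across the 3 columns
column = row_offsets[:, np.newaxis]
result = matrix + column

print(broadcast_failed)
print(result)
```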
📌 3. Sorting & Filtering Data (np.sort(), Boolean Indexing)
Sorting and filtering are essential for data preprocessing.
Example: Sorting an Array
arr = np.array([5, 3, 8, 1, 4])
# Sort in ascending order
sorted_arr = np.sort(arr)
print(sorted_arr) # Output: [1 3 4 5 8]
✅ Use Case: Sorting datasets before visualization.
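Two common variations on sorting are sketched here: descending order (by reversing the sorted result) and np.argsort(), which returns the indices that would sort the array and is useful for reordering a second array in the same way. The variable names are illustrative.

```python
import numpy as np

arr = np.array([5, 3, 8, 1, 4])

# Descending order: sort ascending, then reverse
descending = np.sort(arr)[::-1]

# Indices that would sort arr in ascending order
order = np.argsort(arr)

print(descending)    # [8 5 4 3 1]
print(order)         # [3 1 4 0 2]
print(arr[order])    # [1 3 4 5 8]
```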
Example: Boolean Indexing (Filtering Values)
arr = np.array([10, 20, 30, 40, 50])
# Filter values greater than 25
filtered_arr = arr[arr > 25]
print(filtered_arr) # Output: [30 40 50]
✅ Use Case: Selecting only relevant data points in machine learning.
📌 4. Mathematical & Statistical Operations
NumPy has built-in functions for mathematical calculations.
Example: Basic Mathematical Functions
arr = np.array([1, 2, 3, 4])
# Compute sum, mean, standard deviation
print(np.sum(arr)) # Output: 10
print(np.mean(arr)) # Output: 2.5
print(np.std(arr)) # Output: 1.118
✅ Use Case: Feature scaling in machine learning.
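On 2D data these functions accept an axis argument: axis=0 aggregates down the rows (one result per column) and axis=1 aggregates across the columns (one result per row). A short sketch with an illustrative matrix:

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])

total = np.sum(matrix)                 # all elements
column_sums = np.sum(matrix, axis=0)   # one value per column
row_sums = np.sum(matrix, axis=1)      # one value per row
row_means = np.mean(matrix, axis=1)

print(total)         # 21
print(column_sums)   # [5 7 9]
print(row_sums)      # [ 6 15]
print(row_means)     # [2. 5.]
```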
Example: Matrix Operations (np.dot())
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
# Matrix multiplication
C = np.dot(A, B)
print(C)
🔹 Output:
[[19 22]
[43 50]]
✅ Use Case: Used in neural networks and computer graphics.
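A frequent source of confusion is that * on arrays is element-wise, not matrix multiplication. For 2D arrays the @ operator gives the same result as np.dot(). A minimal sketch using the matrices from the example above:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

matrix_product = A @ B      # same as np.dot(A, B) for 2D arrays
elementwise = A * B         # multiplies matching positions only

print(matrix_product)
print(elementwise)
```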
📌 5. Conditional Operations & Masking
Conditional operations allow element-wise filtering and
modification.
Example: Applying Conditions on an Array
arr = np.array([1, 2, 3, 4, 5])
# Replace all values greater than 3 with 99
arr[arr > 3] = 99
print(arr) # Output: [ 1 2 3 99 99]
✅ Use Case: Used in data cleaning to cap outliers or replace invalid values.
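Mask assignment changes the array in place. When the original should be kept, np.where(condition, value_if_true, value_if_false) returns a new array instead, and np.clip() limits values to a range. A small sketch with illustrative values:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# New array: 99 where the condition holds, the original value elsewhere
replaced = np.where(arr > 3, 99, arr)

# New array: values below 2 become 2, values above 4 become 4
clipped = np.clip(arr, 2, 4)

print(replaced)   # [ 1  2  3 99 99]
print(clipped)    # [2 2 3 4 4]
print(arr)        # [1 2 3 4 5] (unchanged)
```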
📌 6. Advanced Indexing Techniques
NumPy supports fancy indexing for accessing specific elements.
Example: Indexing with Lists
arr = np.array([10, 20, 30, 40, 50])
# Extract values at indices 0, 2, and 4
indexed_values = arr[[0, 2, 4]]
print(indexed_values) # Output: [10 30 50]
✅ Use Case: Selecting specific features from datasets.
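Unlike basic slicing, indexing with a list returns a copy, so editing the result leaves the original array untouched. On 2D arrays, two index lists are paired up element by element. A minimal sketch with illustrative names:

```python
import numpy as np

arr = np.array([10, 20, 30, 40, 50])

picked = arr[[0, 2, 4]]
picked[0] = -1              # arr is not affected: picked is a copy

grid = np.array([[1, 2, 3],
                 [4, 5, 6],
                 [7, 8, 9]])

# Pairs (0, 0), (1, 1), (2, 2): the main diagonal
diagonal = grid[[0, 1, 2], [0, 1, 2]]

print(arr)        # [10 20 30 40 50]
print(diagonal)   # [1 5 9]
```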
Hands-on NumPy Exercises for Practice
🔹 Exercise 1: Vectorized Operations
Task: Perform element-wise addition, subtraction,
multiplication, and division between two arrays.
import numpy as np
arr1 = np.array([10, 20, 30, 40, 50])
arr2 = np.array([2, 4, 6, 8, 10])
# Perform vectorized operations
add_result = arr1 + arr2
sub_result = arr1 - arr2
mul_result = arr1 * arr2
div_result = arr1 / arr2
print("Addition:", add_result)
print("Subtraction:", sub_result)
print("Multiplication:", mul_result)
print("Division:", div_result)
✅ Challenge: Modify the code to compute square, square root, log,
and exponential of arr1.
🔹 Exercise 2: Broadcasting
Task: Multiply a 2D array with a 1D array using broadcasting.
arr1 = np.array([[1, 2, 3], [4, 5, 6]])
arr2 = np.array([10, 20, 30])
result = arr1 * arr2 # Broadcasting happens here
print(result)
✅ Challenge: Change arr2 to have different shapes and observe what
happens.
🔹 Exercise 3: Sorting and Filtering
Task: Sort an array and keep only the values greater than 50.
arr = np.array([90, 30, 60, 20, 50, 10, 70])
# Sort the array
sorted_arr = np.sort(arr)
# Filter values greater than 50
filtered_arr = sorted_arr[sorted_arr > 50]
print("Sorted:", sorted_arr)
print("Filtered (greater than 50):", filtered_arr)
✅ Challenge: Modify the code to filter even numbers only.
🔹 Exercise 4: Matrix Operations
Task: Multiply two matrices using np.dot().
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
# Matrix multiplication
C = np.dot(A, B)
print(C)
✅ Challenge: Find the determinant and inverse of matrix A.
🔹 Exercise 5: Conditional Operations
Task: Replace all values greater than 25 with 99.
arr = np.array([[10, 20, 30], [40, 50, 60]])
arr[arr > 25] = 99 # Conditional modification
print(arr)
✅ Challenge: Replace even numbers with 0.
🔹 Exercise 6: Fancy Indexing
Task: Extract values at specific positions.
arr = np.array([5, 10, 15, 20, 25, 30, 35])
indices = [1, 3, 5] # Select values at these positions
selected_values = arr[indices]
print("Selected values:", selected_values)
✅ Challenge: Modify the code to select every alternate element.
🔹 Exercise 7: Splitting a Dataset
Task: Split an array into training and test sets.
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
# Split into 70% training, 30% testing
train_size = int(len(arr) * 0.7)
train_set, test_set = np.split(arr, [train_size])
print("Train set:", train_set)
print("Test set:", test_set)
✅ Challenge: Try splitting randomly instead of sequentially.
Mathematical Operations
1. Basic Arithmetic Operations
NumPy supports element-wise addition, subtraction, multiplication,
and division.
Example: Basic Arithmetic
import numpy as np
arr1 = np.array([10, 20, 30])
arr2 = np.array([2, 4, 6])
# Element-wise operations
add_result = arr1 + arr2 # Addition
sub_result = arr1 - arr2 # Subtraction
mul_result = arr1 * arr2 # Multiplication
div_result = arr1 / arr2 # Division
print("Addition:", add_result) # [12 24 36]
print("Subtraction:", sub_result) # [ 8 16 24]
print("Multiplication:", mul_result) # [20 80 180]
print("Division:", div_result) # [5. 5. 5.]
✅ Use Case: Used in feature scaling in machine learning.
📌 2. Aggregation Functions
Aggregation functions help summarize an array’s data.
Example: Sum, Mean, Min, Max
arr = np.array([1, 2, 3, 4, 5])
print("Sum:", np.sum(arr)) # 15
print("Mean (Average):", np.mean(arr)) # 3.0
print("Min:", np.min(arr)) # 1
print("Max:", np.max(arr)) # 5
✅ Use Case: Compute the average salary of employees in a dataset.
📌 3. Exponential & Logarithmic Functions
NumPy provides exponential and logarithmic functions, which are
commonly used in machine learning algorithms.
Example: Exponential & Logarithm
arr = np.array([1, 2, 3, 4])
print("Exponential:", np.exp(arr)) # [ 2.718 7.389 20.086 54.598]
print("Natural Log (ln):", np.log(arr)) # [0. 0.693 1.099 1.386]
print("Base 10 Log:", np.log10(arr)) # [0. 0.301 0.477 0.602]
✅ Use Case: Log transformations help normalize skewed
datasets.
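A log transform fails on zeros because np.log(0) is negative infinity. One common workaround, sketched here with made-up count data, is np.log1p(), which computes log(1 + x); np.expm1() reverses it.

```python
import numpy as np

counts = np.array([0, 9, 99, 999])   # illustrative skewed data containing a zero

transformed = np.log1p(counts)       # log(1 + x), defined at x = 0
restored = np.expm1(transformed)     # exp(x) - 1, the inverse of log1p

print(transformed)
print(restored)
```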
📌 4. Power and Square Root Operations
Example: Square, Cube, Square Root
arr = np.array([4, 9, 16, 25])
print("Square Root:", np.sqrt(arr)) # [2. 3. 4. 5.]
print("Power 2:", np.power(arr, 2)) # Squares each element
print("Power 3:", np.power(arr, 3)) # Cubes each element
✅ Use Case: Used in data normalization.
📌 5. Trigonometric Functions
NumPy provides all major trigonometric functions, useful in signal
processing and physics.
Example: Sine, Cosine, Tangent
angles = np.array([0, 30, 45, 60, 90])
# Convert degrees to radians
angles_radians = np.radians(angles)
print("Sine:", np.sin(angles_radians))
print("Cosine:", np.cos(angles_radians))
print("Tangent:", np.tan(angles_radians))
✅ Use Case: Used in robotics and engineering simulations.
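Because the angles are converted to floating-point radians, some results are only approximately equal to the textbook values: the cosine of 90 degrees comes out as a tiny nonzero number, and the tangent as a very large number rather than infinity. A sketch that compares with np.isclose() instead of ==:

```python
import numpy as np

angles = np.array([0, 30, 45, 60, 90])
angles_radians = np.radians(angles)

sin_values = np.sin(angles_radians)
cos_values = np.cos(angles_radians)
tan_values = np.tan(angles_radians)

# Compare with a tolerance rather than exact equality
print(np.isclose(sin_values[1], 0.5))   # sine of 30 degrees
print(np.isclose(cos_values[-1], 0.0))  # cosine of 90 degrees
print(tan_values[-1] > 1e15)            # tangent of 90 degrees is huge, not inf
```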
📌 6. Rounding & Floor/Ceil Operations
Example: Rounding, Floor, and Ceil
arr = np.array([1.2, 2.7, 3.5, 4.9])
print("Round:", np.round(arr)) # [1. 3. 4. 5.]
print("Floor:", np.floor(arr)) # [1. 2. 3. 4.]
print("Ceil:", np.ceil(arr)) # [2. 3. 4. 5.]
✅ Use Case: Used in financial applications (e.g., rounding currency
values).
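Two details matter when rounding: np.round() takes a second argument for the number of decimals, and values exactly halfway between two integers are rounded to the nearest even one (which is why 3.5 became 4 above, while 2.5 would become 2). A minimal sketch with illustrative prices:

```python
import numpy as np

halves = np.array([0.5, 1.5, 2.5, 3.5])
prices = np.array([19.987, 5.5049])

rounded_halves = np.round(halves)      # halfway cases go to the even neighbour
rounded_prices = np.round(prices, 2)   # keep 2 decimal places

print(rounded_halves)   # [0. 2. 2. 4.]
print(rounded_prices)   # [19.99  5.5 ]
```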
📌 7. Statistical Functions
NumPy makes statistical analysis faster and easier.
Example: Variance, Standard Deviation, Median
arr = np.array([1, 2, 3, 4, 5])
print("Variance:", np.var(arr)) # 2.0
print("Standard Deviation:", np.std(arr)) # 1.414
print("Median:", np.median(arr)) # 3.0
✅ Use Case: Used in risk assessment in stock markets.
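By default np.var() and np.std() divide by N (the population formulas). For the sample versions, which divide by N - 1, pass ddof=1. A short sketch using the same array as above:

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

population_var = np.var(arr)          # divides by N
sample_var = np.var(arr, ddof=1)      # divides by N - 1
sample_std = np.std(arr, ddof=1)

print(population_var)   # 2.0
print(sample_var)       # 2.5
print(sample_std)
```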
📌 8. Matrix Operations
NumPy supports matrix multiplication, inverse, determinant,
and transpose.
Example: Matrix Multiplication
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
C = np.dot(A, B) # Matrix multiplication
print(C)
✅ Use Case: Used in deep learning for weight calculations.
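The other operations named above (transpose, determinant, inverse) can be sketched as follows, using the .T attribute and the np.linalg module. Results from np.linalg are floating-point, so the check against the identity matrix uses np.allclose().

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])

transposed = A.T                   # rows become columns
determinant = np.linalg.det(A)     # approximately -2.0
inverse = np.linalg.inv(A)

# A multiplied by its inverse should be (approximately) the identity matrix
is_identity = np.allclose(A @ inverse, np.eye(2))

print(transposed)
print(determinant)
print(inverse)
print(is_identity)
```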
📌 9. Conditional Mathematical Operations
Example: Conditional Replacement
arr = np.array([10, 20, 30, 40, 50])
arr[arr > 30] = 99 # Replace values greater than 30 with 99
print(arr) # [10 20 30 99 99]
✅ Use Case: Replacing missing values in datasets.
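In NumPy, missing values are usually represented as np.nan in a float array. They cannot be found with ==, because nan is not equal to anything (including itself), so np.isnan() is used to build the mask. A minimal sketch, assuming the mean of the remaining values is an acceptable fill value for the illustrative data:

```python
import numpy as np

data = np.array([1.0, np.nan, 3.0, np.nan, 5.0])

missing = np.isnan(data)        # True where a value is missing
fill_value = np.nanmean(data)   # mean that ignores nan entries

cleaned = data.copy()
cleaned[missing] = fill_value   # conditional replacement, as above

print(missing)
print(fill_value)   # 3.0
print(cleaned)      # [1. 3. 3. 3. 5.]
```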
🔹 Exercise 1: Basic Arithmetic Operations
Task: Perform element-wise arithmetic operations on two
NumPy arrays.
import numpy as np
arr1 = np.array([5, 10, 15, 20, 25])
arr2 = np.array([2, 4, 6, 8, 10])
# TODO: Perform element-wise addition, subtraction, multiplication, and division
add_result = None
sub_result = None
mul_result = None
div_result = None
print("Addition:", add_result)
print("Subtraction:", sub_result)
print("Multiplication:", mul_result)
print("Division:", div_result)
✅ Challenge: Modify the code to compute modulus (arr1 % arr2) and
power (arr1 ** arr2).
🔹 Exercise 2: Aggregation Functions
Task: Calculate the sum, mean, min, and max of an array.
arr = np.array([12, 45, 67, 89, 23, 56])
# TODO: Find the sum, mean, min, and max of the array
total = None
mean_value = None
minimum = None
maximum = None
print("Sum:", total)
print("Mean:", mean_value)
print("Min:", minimum)
print("Max:", maximum)
✅ Challenge: Find the variance and standard deviation of the
array.
🔹 Exercise 3: Exponential & Logarithm
Task: Compute the exponential and logarithm of an array.
arr = np.array([1, 5, 10, 20])
# TODO: Compute exponential, natural log, and base 10 log
exp_values = None
log_values = None
log10_values = None
print("Exponential:", exp_values)
print("Natural Log:", log_values)
print("Base 10 Log:", log10_values)
✅ Challenge: Convert logarithm values back to original
numbers.
🔹 Exercise 4: Trigonometric Functions
Task: Calculate sine, cosine, and tangent values for given
angles.
angles = np.array([0, 30, 45, 60, 90])
# TODO: Convert degrees to radians and apply trigonometric functions
angles_radians = None
sin_values = None
cos_values = None
tan_values = None
print("Sine:", sin_values)
print("Cosine:", cos_values)
print("Tangent:", tan_values)
✅ Challenge: Find inverse sine (arcsin) for sin_values.
🔹 Exercise 5: Matrix Operations
Task: Perform matrix multiplication.
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
# TODO: Multiply A and B using np.dot()
C = None
print("Matrix Multiplication:\n", C)
✅ Challenge: Find the determinant and inverse of matrix A.
🔹 Exercise 6: Conditional Operations
Task: Replace all values greater than 50 with 100.
arr = np.array([20, 40, 60, 80, 100])
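# TODO: Replace values greater than 50 with 100
# (uncomment the line below and fill in the value; assigning None to an integer array raises an error)
# arr[arr > 50] = ...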
print(arr)
✅ Challenge: Replace even numbers with -1.
🔹 Exercise 7: Statistical Functions
Task: Compute variance, standard deviation, and median.
arr = np.array([10, 20, 30, 40, 50])
# TODO: Compute variance, standard deviation, and median
variance = None
std_dev = None
median_value = None
print("Variance:", variance)
print("Standard Deviation:", std_dev)
print("Median:", median_value)
✅ Challenge: Find the percentile (e.g., 25th, 50th, 75th
percentile).
🔹 Exercise 8: Working with Random Data
Task: Generate random numbers and find their mean.
# TODO: Generate an array of 10 random numbers between 0 and 100
random_numbers = None
# TODO: Compute the mean
mean_value = None
print("Random Numbers:", random_numbers)
print("Mean Value:", mean_value)
✅ Challenge: Find the highest and lowest random values.
Mathematical Operations Solutions 🚀
🔹 Exercise 1: Basic Arithmetic Operations
Solution:
import numpy as np
arr1 = np.array([5, 10, 15, 20, 25])
arr2 = np.array([2, 4, 6, 8, 10])
# Element-wise operations
add_result = arr1 + arr2
sub_result = arr1 - arr2
mul_result = arr1 * arr2
div_result = arr1 / arr2
print("Addition:", add_result) # [ 7 14 21 28 35]
print("Subtraction:", sub_result) # [ 3 6 9 12 15]
print("Multiplication:", mul_result) # [10 40 90 160 250]
print("Division:", div_result) # [2.5 2.5 2.5 2.5 2.5]
✅ Bonus Challenge:
modulus_result = arr1 % arr2 # [1 2 3 4 5]
power_result = arr1 ** arr2 # Element-wise power operation
print("Modulus:", modulus_result)
print("Power:", power_result)
🔹 Exercise 2: Aggregation Functions
Solution:
arr = np.array([12, 45, 67, 89, 23, 56])
# Aggregation functions
total = np.sum(arr)
mean_value = np.mean(arr)
minimum = np.min(arr)
maximum = np.max(arr)
print("Sum:", total) # 292
print("Mean:", mean_value) # 48.67
print("Min:", minimum) # 12
print("Max:", maximum) # 89
✅ Bonus Challenge:
variance = np.var(arr) # 672.22
std_dev = np.std(arr) # 25.93
print("Variance:", variance)
print("Standard Deviation:", std_dev)
🔹 Exercise 3: Exponential & Logarithm
Solution:
arr = np.array([1, 5, 10, 20])
# Exponential and log functions
exp_values = np.exp(arr)
log_values = np.log(arr)
log10_values = np.log10(arr)
print("Exponential:", exp_values)
print("Natural Log:", log_values)
print("Base 10 Log:", log10_values)
✅ Bonus Challenge:
# Convert logs back to original values
original_values = np.exp(log_values) # Should return [1, 5, 10, 20]
print("Converted Back:", original_values)
🔹 Exercise 4: Trigonometric Functions
Solution:
angles = np.array([0, 30, 45, 60, 90])
# Convert degrees to radians
angles_radians = np.radians(angles)
# Compute trigonometric values
sin_values = np.sin(angles_radians)
cos_values = np.cos(angles_radians)
tan_values = np.tan(angles_radians)
print("Sine:", sin_values)
print("Cosine:", cos_values)
print("Tangent:", tan_values)
✅ Bonus Challenge:
inverse_sin = np.arcsin(sin_values) # Inverse sine
print("Inverse Sine:", np.degrees(inverse_sin)) # Convert back to degrees
🔹 Exercise 5: Matrix Operations
Solution:
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
# Matrix multiplication
C = np.dot(A, B)
print("Matrix Multiplication:\n", C)
✅ Bonus Challenge:
det_A = np.linalg.det(A) # Determinant
inv_A = np.linalg.inv(A) # Inverse matrix
print("Determinant of A:", det_A)
print("Inverse of A:\n", inv_A)
🔹 Exercise 6: Conditional Operations
Solution:
arr = np.array([20, 40, 60, 80, 100])
# Replace values greater than 50 with 100
arr[arr > 50] = 100
print(arr) # [20 40 100 100 100]
✅ Bonus Challenge:
arr[arr % 2 == 0] = -1 # Replace even numbers with -1
print(arr)
🔹 Exercise 7: Statistical Functions
Solution:
arr = np.array([10, 20, 30, 40, 50])
variance = np.var(arr)
std_dev = np.std(arr)
median_value = np.median(arr)
print("Variance:", variance) # 200.0
print("Standard Deviation:", std_dev) # 14.14
print("Median:", median_value) # 30.0
✅ Bonus Challenge:
percentiles = np.percentile(arr, [25, 50, 75])
print("25th, 50th, 75th Percentiles:", percentiles)
🔹 Exercise 8: Working with Random Data
Solution:
# Generate an array of 10 random integers from 0 to 99 (high is exclusive)
random_numbers = np.random.randint(0, 100, size=10)
# Compute the mean
mean_value = np.mean(random_numbers)
print("Random Numbers:", random_numbers)
print("Mean Value:", mean_value)
✅ Bonus Challenge:
max_value = np.max(random_numbers)
min_value = np.min(random_numbers)
print("Highest Value:", max_value)
print("Lowest Value:", min_value)