Numpy - Data Science
In [2]: print(np.__version__)
1.24.3
Linear algebra
- Linear algebra deals with vector spaces and the linear mappings between
them. Here's a simple program that
demonstrates basic linear algebra operations using NumPy:
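import numpy as np
# Define two 2x2 matrices (values match the output shown below)
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
print(f"A = \n{A}")
print(f"B = \n{B}")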
# Matrix addition
C = A + B
print("\nMatrix Addition:")
print(C)
# Matrix multiplication
D = np.dot(A, B)
print("\nMatrix Multiplication:")
print(D)
# Matrix transpose
A_transpose = np.transpose(A)
print("\nMatrix Transpose:")
print(A_transpose)
# Matrix determinant
det_A = np.linalg.det(A)
print("\nDeterminant of A:", det_A)
A =
[[1 2]
[3 4]]
B =
[[5 6]
[7 8]]
Matrix Addition:
[[ 6 8]
[10 12]]
Matrix Multiplication:
[[19 22]
[43 50]]
Matrix Transpose:
[[1 3]
[2 4]]
Determinant of A: -2.0000000000000004
Fourier Transform:
- The Fourier Transform is a mathematical technique used to decompose a
function into its constituent frequencies.
Here's a simple program that computes and plots the Fourier Transform
of a signal using NumPy and Matplotlib:
plt.subplot(2, 1, 2)
plt.plot(freq, np.abs(dft))
plt.title('Fourier Transform')
plt.xlabel('Frequency (Hz)')
plt.ylabel('Magnitude')
plt.xlim(0, 20) # Limit the x-axis to show frequencies up to 20 Hz
plt.tight_layout()
plt.show()
In the above program:
- We generate a signal composed of two sine waves with frequencies 5 Hz
and 10 Hz.
- We compute the Fourier Transform of the signal using np.fft.fft().
- We plot the original signal and its Fourier Transform.
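The plotting code relies on the names signal, dft and freq, whose definitions are not shown. A minimal sketch of how they could be produced follows. The 100 Hz sampling rate, the 1-second duration and the 0.5 amplitude of the second sine wave are assumptions chosen for illustration, not values from the original program.

```python
import numpy as np

fs = 100                      # assumed sampling rate in Hz
n = 100                       # assumed number of samples (1 second of data)
t = np.arange(n) / fs         # time axis in seconds

# Signal composed of two sine waves at 5 Hz and 10 Hz
signal = np.sin(2 * np.pi * 5 * t) + 0.5 * np.sin(2 * np.pi * 10 * t)

dft = np.fft.fft(signal)               # complex spectrum
freq = np.fft.fftfreq(n, d=1 / fs)     # frequency of each bin in Hz

# Look only at the non-negative frequencies and pick the two strongest bins
half = n // 2
magnitude = np.abs(dft[:half])
peak_freqs = np.sort(freq[:half][np.argsort(magnitude)[-2:]])
print("Dominant frequencies (Hz):", peak_freqs)
```

With these assumptions the two strongest bins fall at the frequencies of the two sine waves, which is what the magnitude plot is meant to show.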
Matrices:
- Matrices are rectangular arrays of numbers arranged in rows and
columns. Here's a simple program that demonstrates
basic matrix operations using NumPy:
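import numpy as np
# Create a 3x3 matrix (values match the output shown below)
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# Scalar addition: 10 is added to every element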
result_add = matrix + 10
print("Matrix Addition (Scalar):")
print(result_add)
# Element-wise multiplication
result_mul = matrix * 2
print("\nElement-wise Multiplication (Scalar):")
print(result_mul)
# Matrix transpose
matrix_transpose = np.transpose(matrix)
print("\nMatrix Transpose:")
print(matrix_transpose)
Matrix Addition (Scalar):
[[11 12 13]
[14 15 16]
[17 18 19]]
Element-wise Multiplication (Scalar):
[[ 2 4 6]
[ 8 10 12]
[14 16 18]]
Matrix Transpose:
[[1 4 7]
[2 5 8]
[3 6 9]]
Array
- In NumPy, an array is represented as an n-dimensional array object (ndarray).
- Arrays are created with the array() function, which is called through
the NumPy module (for example, np.array()).
- The array() function takes a list or tuple as its argument. Nested
lists or tuples produce multi-dimensional arrays.
- An array is a collection of items.
- The idea is to store multiple items of the same type together, so
operations on an array run faster than on a Python list.
Creating an Array
- A one-dimensional array can also be created in plain Python by importing the built-in array module.
Syntax:
- array(typecode, value_list)
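As a small illustration of the syntax, the sketch below builds a one-dimensional array with the built-in array module. The values are hypothetical and chosen only for the example.

```python
import array as arr

# 'i' is the type code for signed integers
num = arr.array('i', [1, 23, 5])
print(num)        # array('i', [1, 23, 5])
print(len(num))   # 3
print(num[1])     # 23
```

The array module only accepts a flat sequence of values, which is why nested lists are rejected.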
Out[7]: 3
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[9], line 3
1 import array as arr
2 # now will try to create multi dimentional array
----> 3 num1 = arr.array('i',[[1,23,5],[9,2,32]])
Note: To create a multidimensional array, use the NumPy library.
In [10]: import numpy as np
num1 = np.array([[1, 23, 5], [9, 2, 32]])
num1
[1 2 3]
type= <class 'numpy.ndarray'>
1-Dimensional Array
[[1 2 3]
[4 5 6]]
type= <class 'numpy.ndarray'>
2-Dimensional Array
[[[1 2 3]
[4 5 6]]
[[7 8 9]
[0 1 2]]]
type= <class 'numpy.ndarray'>
3-Dimensional Array
In [12]: # 1) numpy.asarray()
#-------------------
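# The numpy.asarray() function converts input to an array. If the input is already an ndarray
# of a matching dtype, it is returned without copying.
# Otherwise, it creates a new array. This function is particularly useful when you want
# to convert a Python sequence, such as a list or tuple, into a NumPy array.
# Example: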
# Convert a list to a NumPy array
my_list = [1, 2, 3, 4, 5]
np_array = np.asarray(my_list)
print(np_array)
[1 2 3 4 5]
In [13]: # 2. numpy.frombuffer()
# ----------------------
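# The numpy.frombuffer() function interprets a buffer as a one-dimensional array. This function
# is useful when data is stored in a buffer-like object (e.g., a bytes object) and you want to
# create a NumPy array from it without copying.
# Example:
# Create a bytes object and read it one byte at a time (dtype 'S1')
my_buffer = b'Hello World!'
np_array = np.frombuffer(my_buffer, dtype='S1')
print(np_array)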
[b'H' b'e' b'l' b'l' b'o' b' ' b'W' b'o' b'r' b'l' b'd' b'!']
In [14]: # 3. numpy.fromiter()
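# The numpy.fromiter() function creates a new one-dimensional array from an iterable object.
# This function is useful when you have an iterable object (e.g., a Python range) and you
# want to turn it into a NumPy array.
# Example:
# Create a Python range object
my_range = range(10)
# Build the array; fromiter() requires an explicit dtype
np_array = np.fromiter(my_range, dtype=int)
print(np_array)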
[0 1 2 3 4 5 6 7 8 9]
- Here are some functions in NumPy used for creating arrays with
numerical ranges:
- numpy.arange()
- numpy.linspace()
- numpy.logspace()
In [15]: # 1) numpy.arange(): This function creates an array with evenly spaced values within a specified interval [start, stop).
In [16]: # 2) numpy.linspace(): This function creates an array with evenly spaced values over a specified interval.
# Syntax : numpy.linspace(start, stop, num=50, endpoint=True, retstep=False, dtype=None, axis=0)
In [17]: # 3) numpy.logspace(): This function creates an array with evenly spaced values on a log scale.
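The three range functions can be compared side by side. This is a minimal sketch; the start, stop and count values are arbitrary choices for illustration.

```python
import numpy as np

# arange(start, stop, step): stop is excluded
ar = np.arange(0, 10, 2)
# linspace(start, stop, num): stop is included by default
ls = np.linspace(0, 1, 5)
# logspace(start, stop, num): values run from 10**start to 10**stop
lg = np.logspace(0, 3, 4)

print(ar)   # [0 2 4 6 8]
print(ls)   # [0.   0.25 0.5  0.75 1.  ]
print(lg)   # [   1.   10.  100. 1000.]
```

Use arange() when the step size matters and linspace() when the number of points matters.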
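# Create a 1D array and reshape it into 2 rows and 3 columns
arr = np.array([1, 2, 3, 4, 5, 6])
reshaped_arr = arr.reshape(2, 3)
print("Original array:")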
print(arr)
print("\nReshaped array:")
print(reshaped_arr)
Original array:
[1 2 3 4 5 6]
Reshaped array:
[[1 2 3]
[4 5 6]]
Ones Array =
[[1. 1. 1. 1.]
[1. 1. 1. 1.]]
Twos Array =
[[2 2]
[2 2]
[2 2]]
Range Array =
[0 1 2 3 4 5 6 7 8 9]
Linspace Array =
[0. 0.11111111 0.22222222 0.33333333 0.44444444 0.55555556
0.66666667 0.77777778 0.88888889 1. ]
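The code that produced this output is not shown. A minimal sketch that yields arrays with the same shapes and values might look like the following. The variable names are hypothetical, and np.full() is only one plausible way to build the array of twos.

```python
import numpy as np

ones_arr = np.ones((2, 4))            # 2x4 array of 1.0 (float by default)
twos_arr = np.full((3, 2), 2)         # 3x2 array filled with the integer 2
range_arr = np.arange(10)             # integers 0 through 9
linspace_arr = np.linspace(0, 1, 10)  # 10 evenly spaced values from 0 to 1

print(f"Ones Array = \n{ones_arr}")
print(f"Twos Array = \n{twos_arr}")
print(f"Range Array = \n{range_arr}")
print(f"Linspace Array = \n{linspace_arr}")
```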
# numpy.random.rand(): Generate random numbers from a uniform distribution over [0, 1).
# numpy.random.randn(): Generate random numbers from a standard normal distribution.
# numpy.random.randint(): Generate random integers from low (inclusive) to high (exclusive).
# numpy.random.random(): Generate random floats in the half-open interval [0.0, 1.0).
random_normal_array =
[[-0.1556638 -0.42272671 -0.95054239 -0.40332646]
[-0.77956962 -0.43401801 1.45729616 0.77205238]]
random_int_array =
[[5 4 2 1]
[0 6 3 8]
[4 5 4 3]
[3 2 5 1]]
In [23]: a=np.array([1,2,3,4,5])
print(f"Array = {a[::-1]}")
b=[1,2,3,4,5]
print(f"List = {b[::-1]}")
# In the output, array elements are separated by spaces, while list elements are separated by commas.
Array = [5 4 3 2 1]
List = [5, 4, 3, 2, 1]
In [24]: a=np.array([[10,20,30],[40,50,60]])
print(f"Accessing 10,20 elements by using Slicing = \n{a[0,0:2]}")
print(f"Accessing 10,20,40,50 elements by using Slicing = \n{a[0:2,0:2]}")
Accessing 10,20 elements by using Slicing =
[10 20]
Accessing 10,20,40,50 elements by using Slicing =
[[10 20]
[40 50]]
In [25]: a=np.array([[[10,20,30],[40,50,60]],[[1,2,3],[4,5,6]]])
print(f"Accessing 40,50 elements by using Slicing = \n{a[0,1,0:2]}")
Accessing 40,50 elements by using Slicing =
[40 50]
In [26]: a=np.array([[[10,20,30],[40,50,60]],[[1,2,3],[4,5,6]]])
# Accessing 50 value
print("Accessing 50 value = ",a[0,1,1])
print("Accessing 2 value = ",a[1,0,1])
Accessing 50 value = 50
Accessing 2 value = 2
In [27]: # Single Element Access : We can access a single element of an ndarray using its indices
# Create a 2D array (matrix)
arr = np.array([[1, 2, 3],
[4, 5, 6]])
In [28]: arr[-1]
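In [29]: # Slicing : We can slice ndarrays to access a subset of elements along one or more dimensions.
# Slice the first row
print("First row:", arr[0])
# Slice the second column
print("Second column:", arr[:, 1])
# Slice a submatrix
submatrix = arr[:2, 1:]
print("Submatrix:")
print(submatrix)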
First row: [1 2 3]
Second column: [2 5]
Submatrix:
[[2 3]
[5 6]]
In [30]: # Boolean Indexing : We can use boolean arrays to index ndarrays and filter elements based on a condition.
# Create a 1D array
arr = np.array([1, 2, 3, 4, 5])
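The rest of this cell is not shown. A minimal sketch of boolean indexing on the same array could look like this; the condition arr > 2 is an assumption chosen for illustration.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
mask = arr > 2          # boolean array: [False False  True  True  True]
filtered = arr[mask]    # keeps only the elements where mask is True
print(filtered)         # [3 4 5]
```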
In [31]: # Fancy Indexing : We can use arrays of indices (fancy indexing) to access elements from an array in any order.
# Create a 1D array
arr = np.array([1, 2, 3, 4, 5])
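The rest of this cell is not shown. A minimal sketch of fancy indexing on the same array might be the following; the index values are hypothetical.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
indices = np.array([0, 2, 4])   # positions to pick
picked = arr[indices]
print(picked)                   # [1 3 5]

# Indices may repeat and appear in any order
repeated = arr[[4, 4, 0]]
print(repeated)                 # [5 5 1]
```

Unlike slicing, fancy indexing always returns a copy of the selected elements rather than a view.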
# Create a 2D array
arr = np.array([[3, 2, 5],
[1, 4, 6],
[7, 0, 2]])
print("Original array:")
print(arr)
In [34]: # 3) One-dimensional Arrays : For one-dimensional arrays, there is only one axis (axis 0).
# Operations along this axis are straightforward, as there's only one dimension to consider.
# Here, axis is not explicitly specified because there's only one axis (axis 0)
Total sum: 15
In [35]: # 4) Multi-dimensional Arrays : For multi-dimensional arrays, the axis parameter becomes important.
# Each axis is identified by an integer, where axis 0 corresponds to the first dimension, axis 1
# to the second dimension, and so on.
# Here, axis=0 sums down the rows (one total per column), and axis=1 sums across the columns (one total per row)
Sum along axis 0: [5 7 9]
Sum along axis 1: [ 6 15]
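The code for this cell is not shown. A minimal sketch that reproduces the two sums, assuming a 2x3 input array consistent with the printed results, is:

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6]])        # assumed input, consistent with the output
col_totals = np.sum(arr, axis=0)   # collapses the rows: one total per column
row_totals = np.sum(arr, axis=1)   # collapses the columns: one total per row
print("Sum along axis 0:", col_totals)   # [5 7 9]
print("Sum along axis 1:", row_totals)   # [ 6 15]
```

A useful rule of thumb is that the axis passed to the function is the one that disappears from the result's shape.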
In [36]: # 5) Understanding Axis in Functions : The axis parameter is interpreted according to the function being used.
# For functions like np.sum(), np.mean(), np.max(), etc., specifying the axis parameter determines the axis along
# which the operation is performed. For functions like np.argmax(), the axis parameter determines the axis
# along which the maximum value is searched for.
# Here, axis=0 finds the index of the maximum along each column,
# and axis=1 finds the index of the maximum along each row
Index of max along axis 0: [1 1 1]
Index of max along axis 1: [2 2]
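The code for this cell is not shown. A minimal sketch that yields the same indices, assuming the same 2x3 array as in the previous example, is:

```python
import numpy as np

arr = np.array([[1, 2, 3],
                [4, 5, 6]])           # assumed input, consistent with the output
idx_axis0 = np.argmax(arr, axis=0)    # row index of the largest value in each column
idx_axis1 = np.argmax(arr, axis=1)    # column index of the largest value in each row
print("Index of max along axis 0:", idx_axis0)   # [1 1 1]
print("Index of max along axis 1:", idx_axis1)   # [2 2]
```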
- i: Signed integer
- f: Floating-point number
- S: Byte string
- U: Unicode string
- u: Unsigned integer
- M: Datetime
- m: Timedelta
- O: Object
- V: Void
These shorthand notations are commonly used when specifying the data
type of NumPy arrays using the dtype parameter.
For example:
- np.int32 can be abbreviated as i4
- np.float64 can be abbreviated as f8
- np.bytes_ can be abbreviated as S
- np.str_ can be abbreviated as U
- np.uint16 can be abbreviated as u2
- np.datetime64 can be abbreviated as M8
- np.timedelta64 can be abbreviated as m8
- np.object_ can be abbreviated as O
- np.void can be abbreviated as V
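A short sketch of how these shorthand codes are passed through the dtype parameter. The array values are arbitrary.

```python
import numpy as np

a = np.array([1, 2, 3], dtype='i4')    # 32-bit signed integers
b = np.array([1, 2, 3], dtype='f8')    # 64-bit floats
c = np.array([1, 2, 3], dtype='u2')    # 16-bit unsigned integers
print(a.dtype, b.dtype, c.dtype)       # int32 float64 uint16

# The shorthand and the full type name describe the same dtype
same = np.dtype('i4') == np.int32
print(same)                            # True
```

The digit after the letter is the item size in bytes, so i4 occupies 4 bytes per element and f8 occupies 8.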
Copy() vs View()
- In NumPy, both copy() and view() are methods used to create a new
array object from an existing array, but they behave differently:
In [38]: # copy() Method : The copy() method creates a deep copy of the array. This means that it creates a new
# array object with its own data, separate from the original array. Changes made to the copy do not affect the original
# array, and vice versa.
In [39]: # view() Method : The view() method creates a shallow copy of the array. It creates a new array object that is a
# view of the same data as the original array. Changes made to the view affect the original array, and vice versa.
# However, the view may have a different shape or strides, allowing you to reinterpret the same data.
# Create an original array
original_arr = np.array([1, 2, 3])
Summary:
- copy(): Creates a deep copy of the array, resulting in a new array
with its own data. Changes to the copy do not
affect the original array.
- view(): Creates a shallow copy of the array, resulting in a new array
with a different view of the same data.
Changes to the view affect the original array.
Choose between copy() and view() based on your specific requirements. If you need a completely
independent array, use copy(). If you want to manipulate the data in different ways but share memory between
the arrays, use view().
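The difference can be checked directly. This is a minimal sketch; the replacement values 100 and 200 are arbitrary.

```python
import numpy as np

original_arr = np.array([1, 2, 3])

copied = original_arr.copy()   # owns its own data
viewed = original_arr.view()   # shares data with original_arr

copied[0] = 100                # does not touch original_arr
viewed[1] = 200                # writes through to original_arr

print(original_arr)            # [  1 200   3]
print(np.shares_memory(original_arr, viewed))   # True
print(np.shares_memory(original_arr, copied))   # False
```

np.shares_memory() is a convenient way to confirm whether two arrays are backed by the same buffer.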
Joining Arrays
- concatenate()
- stack()
- vstack()
- hstack()
- dstack()
In [40]: # concatenate()
a=np.arange(6).reshape(2,3)
print(f"A = \n{a}")
b=np.arange(7,13).reshape(2,3)
print(f"\nB = \n{b}")
B =
[[ 7 8 9]
[10 11 12]]
In [41]: # stack()
In [44]: # np.dstack() : The np.dstack() function stacks arrays along the third dimension (depth)
[[3 7]
[4 8]]]
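Most of the joining cells are incomplete here. The sketch below runs all five functions on two small 2x2 arrays. The input values are assumptions, chosen so that the dstack() result ends with the rows shown in the output fragment.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

cat0 = np.concatenate((a, b), axis=0)  # join along existing axis 0 -> shape (4, 2)
stk = np.stack((a, b), axis=0)         # join along a new axis      -> shape (2, 2, 2)
vst = np.vstack((a, b))                # stack vertically (rows)    -> shape (4, 2)
hst = np.hstack((a, b))                # stack horizontally (cols)  -> shape (2, 4)
dst = np.dstack((a, b))                # stack along depth          -> shape (2, 2, 2)

for name, result in [("concatenate", cat0), ("stack", stk),
                     ("vstack", vst), ("hstack", hst), ("dstack", dst)]:
    print(name, result.shape)
```

concatenate() keeps the number of dimensions, while stack() and dstack() on 2D inputs add one.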
In [45]: # split()
arr = np.arange(9)
In [46]: # array_split()
arr = np.arange(10)
In [47]: # hsplit()
arr = np.arange(1, 13).reshape(3, 4)
# Split the array along the third dimension into two sub-arrays
np.dsplit(arr, 2)
[[13, 14],
[17, 18],
[21, 22]]]),
array([[[ 3, 4],
[ 7, 8],
[11, 12]],
[[15, 16],
[19, 20],
[23, 24]]])]
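The splitting cells are only partly shown. A minimal sketch of the four functions follows. The 2x3x4 array passed to dsplit() is an assumption that is consistent with the output fragment; the section counts are illustrative.

```python
import numpy as np

# split(): the array must divide evenly
parts = np.split(np.arange(9), 3)             # three arrays of 3 elements

# array_split(): uneven splits are allowed
uneven = np.array_split(np.arange(10), 3)     # sizes 4, 3, 3

# hsplit(): split a 2D array column-wise
left, right = np.hsplit(np.arange(1, 13).reshape(3, 4), 2)

# dsplit(): split a 3D array along the third axis
cube = np.arange(1, 25).reshape(2, 3, 4)      # assumed shape
front, back = np.dsplit(cube, 2)
print(back[0])
```

split() raises an error when the array cannot be divided evenly, which is the situation array_split() is meant to handle.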
Search in NumPy
- Finding Elements Meeting Conditions:
- np.where() : Returns the indices of elements in an array where a
given condition is satisfied.
- Boolean Indexing : Creates a boolean array based on a condition
and uses it to filter elements.
- np.searchsorted() : Returns the indices at which values should be
inserted into a sorted array to keep it sorted.
- Searching for Specific Values:
- np.argmax() : Returns the index of the maximum value in an array.
- np.argmin() : Returns the index of the minimum value in an array.
- Array Comparison:
- np.allclose() : Checks if all elements of two arrays are equal
within a tolerance.
- Membership Testing:
- np.isin(): Checks if elements of one array are present in another
array.
In [50]: # np.where()
a=np.arange(10,100,10)
print(a)
print(f"array of Index Numbers = {np.where(a%15==0)}")
arr = np.array([1, 2, 3, 4, 5])
indices = np.where(arr > 3)
print("\nIndices where elements > 3:", indices)
[10 20 30 40 50 60 70 80 90]
array of Index Numbers = (array([2, 5, 8], dtype=int64),)
In [51]: # Boolean Indexing: Creates a boolean array based on a condition and uses it to filter elements.
In [52]: # searchsorted() is a function in NumPy that returns the indices at which elements should be inserted
# into a sorted array so that the array stays sorted.
# Search for the indices where elements should be inserted to maintain sorted order
indices = np.searchsorted(arr, [2, 14, 6, 8])
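The array being searched is not shown in this cell. A minimal sketch with a hypothetical sorted array illustrates what the call returns:

```python
import numpy as np

arr = np.array([1, 3, 5, 7, 9])   # hypothetical sorted array
indices = np.searchsorted(arr, [2, 14, 6, 8])
print(indices)   # [1 5 3 4]
```

A value larger than every element, such as 14 here, gets the index len(arr), meaning it would be appended at the end.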
In [53]: # np.argmax() and np.argmin(): Return the indices of the maximum and minimum values in an array.
In [54]: # np.allclose(): Checks if all elements of two arrays are equal within a tolerance.
arr1 = np.array([1, 2, 3])
arr2 = np.array([1.001, 2.002, 3.003])
are_close = np.allclose(arr1, arr2, atol=0.01)
print("Arrays are close:", are_close)
Arrays are close: True
In [55]: # np.isin(): Checks if elements of one array are present in another array.
arr = np.array([1, 2, 3, 4, 5])
values = [2, 4, 6]
is_present = np.isin(arr, values)
print("Elements present:", is_present)
Elements present: [False True False True False]
Sort() in Numpy
In [56]: # Ex1
# Create an array
arr = np.array([3, 1, 2, 5, 4])
# Sort the array
sorted_arr = np.sort(arr)
In [57]: # Ex2
# Sorting along Axis 0 (down the rows) : When sorting along axis 0, each column of the array is treated separately, and the
# elements within each column are sorted independently.
In [58]: # Ex3
# Sorting along Axis 1 (across the columns) : When sorting along axis 1, each row of the array is treated separately,
# and the elements within each row are sorted independently.
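The code for Ex2 and Ex3 is not shown. A minimal sketch of sorting along each axis, using the 3x3 array defined earlier in these notes, is:

```python
import numpy as np

arr = np.array([[3, 2, 5],
                [1, 4, 6],
                [7, 0, 2]])

sorted_axis0 = np.sort(arr, axis=0)   # reorders values within each column
sorted_axis1 = np.sort(arr, axis=1)   # reorders values within each row

print(sorted_axis0)
# [[1 0 2]
#  [3 2 5]
#  [7 4 6]]
print(sorted_axis1)
# [[2 3 5]
#  [1 4 6]
#  [0 2 7]]
```

np.sort() returns a sorted copy and leaves arr unchanged; arr.sort() would sort in place instead.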
In [59]: # Ex4
d=np.dtype([('name','S10'),('perc',float)])
stud=np.array([("abc",90.3),("def",95.5),("ghi",65.3)],dtype=d)
print(stud)
np.sort(stud,order="perc")
[(b'abc', 90.3) (b'def', 95.5) (b'ghi', 65.3)]
Out[59]: array([(b'ghi', 65.3), (b'abc', 90.3), (b'def', 95.5)],
dtype=[('name', 'S10'), ('perc', '<f8')])
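import numpy as np
# Create two arrays (values match the output shown below)
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
# Addition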
result_add = arr1 + arr2
# Subtraction
result_sub = arr1 - arr2
# Multiplication
result_mul = arr1 * arr2
# Division
result_div = arr1 / arr2
# Exponentiation
result_exp = arr1 ** 2
print("Addition:", result_add)
print("Subtraction:", result_sub)
print("Multiplication:", result_mul)
print("Division:", result_div)
print("Exponentiation:", result_exp)
Addition: [5 7 9]
Subtraction: [-3 -3 -3]
Multiplication: [ 4 10 18]
Division: [0.25 0.4 0.5 ]
Exponentiation: [1 4 9]
# Create an array
arr = np.array([10, 20, 30])
# Modulo operation
result_mod = np.mod(arr, 3) # Equivalent to arr % 3
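# Reciprocal operation (cast to float first: np.reciprocal on an integer array
# uses integer arithmetic, so 1/10 would come out as 0)
result_reciprocal = np.reciprocal(arr.astype(float))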
In [64]: # Complex Arithmetic Functions : NumPy also supports complex arithmetic operations such as complex conjugate, real part,
# imaginary part, and absolute value.
# Create a complex array
arr_complex = np.array([1 + 2j, 3 + 4j, 5 + 6j])
# Complex conjugate
result_conjugate = np.conj(arr_complex)
# Real part
result_real = np.real(arr_complex)
# Imaginary part
result_imag = np.imag(arr_complex)
# Absolute value
result_abs = np.abs(arr_complex)
In [65]: # Broadcasting : NumPy allows you to perform arithmetic operations between arrays of different shapes.
scalar = 10
# Addition (broadcasting)
result_add_scalar = arr + scalar
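Broadcasting also applies when both operands are arrays. In this minimal sketch, a 1D array is added to every row of a 2D array; the values are chosen only for illustration.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])        # shape (2, 3)
row = np.array([10, 20, 30])          # shape (3,)

# row is broadcast across both rows of matrix
result = matrix + row
print(result)
# [[11 22 33]
#  [14 25 36]]
```

Shapes are compared from the trailing dimension backwards, and each pair of sizes must either match or contain a 1.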
In [66]: # Universal Functions (ufuncs) : NumPy provides universal functions (ufuncs) for performing element-wise
# arithmetic operations efficiently.
# Create an array
arr = np.array([1, 2, 3])
# Square root
result_sqrt = np.sqrt(arr)
# Exponential
result_exp = np.exp(arr)
# Trigonometric functions (sin, cos, tan)
result_sin = np.sin(arr)
result_cos = np.cos(arr)
result_tan = np.tan(arr)
In [67]: arr=np.array([1,2,3,45,67,3,65,7,34,76,3,2])
In [68]: arr
Out[69]: 25.666666666666668
Out[70]: 5.0
Out[72]: 820.8888888888888
Out[73]: 28.65115859592573
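The input cells for these four outputs are not shown. The values are consistent with the mean, median, variance and standard deviation of the array defined in In [67], so the calls were presumably along the following lines. This pairing of outputs with functions is an inference, not something the notes state.

```python
import numpy as np

arr = np.array([1, 2, 3, 45, 67, 3, 65, 7, 34, 76, 3, 2])

mean = np.mean(arr)        # arithmetic mean
median = np.median(arr)    # middle value of the sorted data
variance = np.var(arr)     # population variance (ddof=0 by default)
std = np.std(arr)          # population standard deviation

print(mean, median, variance, std)
```

Passing ddof=1 to np.var() or np.std() gives the sample statistic instead of the population one.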
In [74]: # percentile
arr = np.array([1, 2, 3, 4, 5])
Out[75]: 1
Out[76]: 5
Out[77]: 15
In [79]: np.corrcoef(arr)
Out[79]: 1.0
In [80]: np.cov(arr)
Out[80]: array(2.5)
In [82]: # np.random.rand(): Generates random values in a given shape from a uniform distribution over [0, 1).
# Generate a 2x3 array of random numbers
random_array = np.random.rand(2, 3)
print("Random array from uniform distribution:", random_array)
Random array from uniform distribution: [[0.81790752 0.15824763 0.60906654]
[0.28188703 0.40432175 0.30668917]]
In [84]: # np.random.normal(): Draws random samples from a normal (Gaussian) distribution with specified mean and standard deviation.
# Generate 5 random numbers from a normal distribution with mean 0 and standard deviation 1
random_normal = np.random.normal(0, 1, size=5)
print("Random values from normal distribution:", random_normal)
Random values from normal distribution: [ 0.44586891 0.21942117 -1.20103574 -1.16410359
-1.11527723]
In [86]: # np.random.choice(): Generates a random sample from a given 1-D array with or without replacement.
# Generate a random sample from a given array
arr = np.array([1, 2, 3, 4, 5])
random_sample = np.random.choice(arr, size=3, replace=False)
print("Random sample:", random_sample)
Random sample: [1 5 3]
In [87]: # np.random.seed(): Seeds the random number generator to produce reproducible random numbers.
# Seed the random number generator
np.random.seed(0)
random_value = np.random.rand()
print("Random value with seed:", random_value)
Random value with seed: 0.5488135039273248
In [88]: # np.random.permutation(): Randomly permutes a sequence or returns a permuted range.
# Permute a sequence
arr = np.array([1, 2, 3, 4, 5])
permuted_arr = np.random.permutation(arr)
print("Permuted array:", permuted_arr)
Permuted array: [5 3 2 4 1]
In [89]: # np.random.uniform(): Draws samples from a uniform distribution within a specified range.
# Generate 5 random numbers from a uniform distribution [1, 10)
random_uniform = np.random.uniform(1, 10, size=5)
print("Random values from uniform distribution:", random_uniform)
Random values from uniform distribution: [6.81304702 4.9382849 9.02595701 9.67296484 4.45097367]
In [91]: # np.random.poisson(): Draws samples from a Poisson distribution with specified rate parameter (lambda).
# Generate 5 random samples from a Poisson distribution with lambda=2.0
random_poisson = np.random.poisson(2.0, size=5)
print("Random samples from Poisson distribution:", random_poisson)
Random samples from Poisson distribution: [0 0 7 1 3]
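The functions above belong to NumPy's legacy random interface. For comparison, the same kinds of draws can be made with a Generator object created by np.random.default_rng(), which the NumPy documentation recommends for new code. This is a minimal sketch; the seed and the sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(seed=42)      # seed value is arbitrary

uniform = rng.random((2, 3))              # floats in [0, 1)
normal = rng.normal(0, 1, size=5)         # mean 0, standard deviation 1
ints = rng.integers(0, 10, size=4)        # integers in [0, 10)
sample = rng.choice([1, 2, 3, 4, 5], size=3, replace=False)

# Re-creating the generator with the same seed reproduces the same numbers
again = np.random.default_rng(seed=42).random((2, 3))
print(np.array_equal(uniform, again))     # True
```

Each Generator carries its own state, so seeding one does not affect other generators or the legacy np.random functions.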
import numpy as np
x=np.zeros((3,4))
x
Out[4]: array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26,
27, 28, 29])
In [18]: # 7) Reshape the array from problem 6 into shape (3, 2).
temp=x
print(f"Before Reshape = \n{temp}")
y=np.reshape(x,(3,2))
print(f"\nAfter Reshape = \n{y}")
Before Reshape =
[[0.08686746 0.01323134 0.21084767]
[0.08998506 0.95201384 0.21804003]]
After Reshape =
[[0.08686746 0.01323134]
[0.21084767 0.08998506]
[0.95201384 0.21804003]]
# 3rd Problem
x=np.arange(0,10)
print(f"the NUMBERS are = {x}")
# 8th Problem
total=np.sum(x)
print(f"sum of Total Numbers are = {total}")
the NUMBERS are = [0 1 2 3 4 5 6 7 8 9]
sum of Total Numbers are = 45
# 4th Problem
x=np.arange(10,30)
x
# 9th Problem
np.max(x)
Out[24]: 29
In [35]: # 10) Compute the dot product of two arrays: [1, 2, 3] and [4, 5, 6].
one=np.array([1, 2, 3])
two=np.array([4, 5, 6])
tot=np.dot(one,two)
print(f"First Method = {tot}")
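The label "First Method" suggests that other approaches were intended. Two equivalent ways of computing the same dot product are sketched here; they are offered as possibilities, not as the methods the original notebook used.

```python
import numpy as np

one = np.array([1, 2, 3])
two = np.array([4, 5, 6])

via_operator = one @ two          # matrix-multiplication operator
via_sum = np.sum(one * two)       # element-wise product, then sum
print(via_operator, via_sum)      # 32 32
```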
In [50]: # 12) Create a NumPy array of evenly spaced values between 1 and 10.
print(f"Evenly spaced Values are = {np.linspace(1,10,10)}")
Evenly spaced Values are = [ 1. 2. 3. 4. 5. 6. 7. 8. 9. 10.]
In [70]: # 15) Calculate the mean, median, and standard deviation of elements in a NumPy array.
x=np.array([1,2,3,4,5,23,9,56,23,6.34])
print(f"Mean = {np.mean(x)}")
print(f"Median = {np.median(x)}")
print(f"SD = {np.std(x)}")
Mean = 13.234
Median = 5.67
SD = 16.18273166063134
In [73]: # Find the unique elements and their counts in a NumPy array
import numpy as np
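The solution to this exercise is not shown. One way to approach it is np.unique() with return_counts=True, sketched here on hypothetical sample data.

```python
import numpy as np

arr = np.array([1, 2, 2, 3, 3, 3])   # hypothetical sample data
values, counts = np.unique(arr, return_counts=True)
print(values)   # [1 2 3]
print(counts)   # [1 2 3]
```

The two returned arrays line up by position, so counts[i] is the number of times values[i] occurs.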