Convert Covariance Matrix to Correlation Matrix using Python
Last Updated: 30 Sep, 2025
In statistics, covariance measures how two variables vary together, while correlation standardizes this relationship to a value between -1 and 1, making it easier to interpret. In this article, we discuss the relationship between covariance and correlation and write our own functions for calculating both in Python.
Relationship Between Covariance and Correlation
Correlation is covariance normalized by the standard deviations of the two variables, as the formula below shows:
\text{corr}(x, y) = \frac{\text{cov}(x, y)}{\sigma_x \cdot \sigma_y}
where \sigma_x and \sigma_y are the standard deviations of x and y respectively.
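As a quick illustration of the formula, the sketch below applies it to two short arrays. The values of x and y are made up for this example, and the population forms of covariance and standard deviation (dividing by n) are assumed throughout.

```python
import numpy as np

# Hypothetical toy data, chosen only for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 9.0])

# Population covariance: the mean of the products of deviations.
cov_xy = np.mean((x - x.mean()) * (y - y.mean()))

# Population standard deviations (NumPy's default, ddof=0).
sigma_x, sigma_y = x.std(), y.std()

# corr(x, y) = cov(x, y) / (sigma_x * sigma_y)
corr_xy = cov_xy / (sigma_x * sigma_y)
print(corr_xy)
```

The result agrees with np.corrcoef(x, y)[0, 1], because the choice between n and n - 1 cancels out as long as the covariance and the standard deviations use the same one.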
Program to convert Covariance to Correlation matrix
We will use the Iris dataset for demonstration. The goal is to first compute the covariance matrix manually and then convert it to a correlation matrix.
1. Load and Display the Dataset
Python
import numpy as np
import pandas as pd
data = pd.read_csv("iris.csv")
data.head()
In this example, we exclude the target column (species) since we only want numeric features:
Python
# Keep every column except the last one (species) and convert to a NumPy array
data = data.iloc[:, :-1].values
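The slice above relies on the label column being the last one. The sketch below shows that positional approach next to a dtype-based alternative on a tiny stand-in DataFrame. The column names and values are assumptions made for illustration, not the actual contents of iris.csv.

```python
import pandas as pd

# Hypothetical stand-in for a few rows of the dataset; names are assumptions.
df = pd.DataFrame({
    "sepal_length": [5.1, 4.9, 7.0],
    "sepal_width": [3.5, 3.0, 3.2],
    "species": ["setosa", "setosa", "versicolor"],
})

# Positional: drop the last column, whatever it is called.
by_position = df.iloc[:, :-1].values

# By dtype: keep only numeric columns, wherever the label column sits.
by_dtype = df.select_dtypes(include="number").values

print(by_position.shape)  # (3, 2)
```

Selecting by dtype is a little more robust if the file layout changes, at the cost of also keeping any numeric ID column that happens to be present.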
2. Define a Function to Calculate Covariance Between Two Variables
The covariance between two variables x and y measures how much they vary together.
Python
# Calculates covariance between two 1D arrays x and y.
def calcCov(x, y):
    mean_x, mean_y = x.mean(), y.mean()  # Calculate mean of each variable
    n = len(x)  # Number of observations
    # Compute covariance using the formula: sum((x - mean_x) * (y - mean_y)) / n
    return ((x - mean_x) * (y - mean_y)).sum() / n
Explanation:
- x.mean() and y.mean() calculate the average of each variable.
- The formula multiplies deviations of each observation from the mean and averages them.
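Dividing by n gives the population covariance, while many libraries default to the sample covariance, which divides by n - 1. The sketch below is one way to support both. The function name calc_cov, its ddof parameter, and the toy arrays are assumptions introduced here for illustration.

```python
import numpy as np

def calc_cov(x, y, ddof=0):
    """Covariance of two 1D arrays; ddof=0 divides by n, ddof=1 by n - 1."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return ((x - x.mean()) * (y - y.mean())).sum() / (len(x) - ddof)

# Hypothetical toy data.
x = np.array([2.0, 4.0, 6.0, 8.0])
y = np.array([1.0, 3.0, 2.0, 5.0])

population = calc_cov(x, y)      # divides by n
sample = calc_cov(x, y, ddof=1)  # divides by n - 1

# np.cov returns a 2x2 matrix; entry [0, 1] is cov(x, y).
print(np.isclose(population, np.cov(x, y, bias=True)[0, 1]))
print(np.isclose(sample, np.cov(x, y)[0, 1]))
```

The two results differ by a factor of n / (n - 1), which matters for small samples and becomes negligible as n grows.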
3. Compute the Covariance Matrix
We can now build the full covariance matrix for all numeric features.
Python
# Calculates the covariance matrix for the dataset.
def covMat(data):
    rows, cols = data.shape
    cov_matrix = np.zeros((cols, cols))  # Initialize a square matrix
    # Fill the covariance matrix
    for i in range(cols):
        for j in range(cols):
            cov_matrix[i][j] = calcCov(data[:, i], data[:, j])
    return cov_matrix

# Compute covariance matrix
covMat(data)
Explanation:
- cov_matrix[i][j] stores covariance between the i-th and j-th feature.
- The matrix is symmetric because cov(x, y) = cov(y, x)
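The nested loop can also be written as a single matrix product over the mean-centered data. The sketch below assumes the same population convention (division by n) and uses randomly generated data in place of the Iris features. The function name cov_matrix_vectorized is hypothetical.

```python
import numpy as np

def cov_matrix_vectorized(data):
    """Population covariance matrix of a (n_samples, n_features) array."""
    centered = data - data.mean(axis=0)  # subtract each column's mean
    return centered.T @ centered / data.shape[0]

# Synthetic data standing in for the numeric features.
rng = np.random.default_rng(0)
data = rng.normal(size=(50, 4))

cov = cov_matrix_vectorized(data)
print(np.allclose(cov, cov.T))  # symmetry: cov(x, y) == cov(y, x)
```

Entry (i, j) of centered.T @ centered is the sum over observations of the product of deviations for features i and j, which is the same quantity the loop computes one pair at a time.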
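NumPy's np.cov computes the same matrix, provided bias=True is passed so that it also divides by n. By default np.cov divides by n - 1, which makes every entry larger by a factor of n / (n - 1):
Python
np.cov(data, rowvar=False, bias=True)
rowvar=False ensures columns are treated as features and rows as observations.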
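The effect of rowvar is easiest to see from the shape of the result. The sketch below uses a small synthetic array (an assumption for illustration) with 10 observations and 3 features.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(10, 3))  # 10 observations (rows), 3 features (columns)

# Default rowvar=True treats each ROW as a variable.
print(np.cov(data).shape)                # (10, 10)

# rowvar=False treats each COLUMN as a variable.
print(np.cov(data, rowvar=False).shape)  # (3, 3)

# Transposing the data is an equivalent way to get the column-wise result.
print(np.allclose(np.cov(data, rowvar=False), np.cov(data.T)))
```

Forgetting rowvar=False on a dataset with many rows silently produces a large observation-by-observation matrix rather than an error.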
4. Convert Covariance Matrix to Correlation Matrix
Correlation is covariance divided by the product of the two variables' standard deviations.
Python
def corrMat(data):
    rows, cols = data.shape
    corr_matrix = np.zeros((cols, cols))
    # Compute correlation for each pair of variables
    for i in range(cols):
        for j in range(cols):
            x, y = data[:, i], data[:, j]
            # Normalize covariance by product of standard deviations
            corr_matrix[i][j] = calcCov(x, y) / (x.std() * y.std())
    return corr_matrix

# Compute correlation matrix
corrMat(data)
Explanation:
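- For each pair of features (i, j), the covariance is divided by x.std() * y.std() to get the correlation. NumPy's std() divides by n by default, which matches the division by n in calcCov.
- Values are always between -1 and 1, indicating the strength and direction of the relationship, and every diagonal entry is 1 because each feature is perfectly correlated with itself.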
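When only a covariance matrix is available and not the raw data, the same normalization can be applied to the matrix directly: the standard deviations are the square roots of its diagonal. The following is a minimal sketch of that idea. The function name cov_to_corr and the 2x2 example matrix are assumptions for illustration.

```python
import numpy as np

def cov_to_corr(cov):
    """Convert a covariance matrix to a correlation matrix."""
    cov = np.asarray(cov, dtype=float)
    std = np.sqrt(np.diag(cov))  # standard deviation of each variable
    # Entry (i, j) is divided by std[i] * std[j].
    return cov / np.outer(std, std)

# Hypothetical covariance matrix: variances 4 and 9, covariance 2.
cov = np.array([[4.0, 2.0],
                [2.0, 9.0]])

corr = cov_to_corr(cov)
print(corr)
```

Here the standard deviations are 2 and 3, so the off-diagonal correlation is 2 / (2 * 3) = 1/3. This sketch assumes every variance is strictly positive; a constant feature would lead to division by zero.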
5. Verify Using NumPy
Python
np.corrcoef(data, rowvar=False)
This gives the same correlation matrix in a single step.
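Comparing printed matrices by eye is error-prone, so a numerical check with np.allclose can be useful. The sketch below uses synthetic data instead of the Iris features (an assumption for illustration) and computes the correlation matrix a third way, from standardized columns, before comparing it with np.corrcoef.

```python
import numpy as np

# Synthetic data standing in for the numeric features.
rng = np.random.default_rng(42)
data = rng.normal(size=(100, 4))

# Standardize each column to zero mean and unit (population) standard deviation.
z = (data - data.mean(axis=0)) / data.std(axis=0)

# The covariance of standardized columns is the correlation matrix.
manual_corr = z.T @ z / data.shape[0]

print(np.allclose(manual_corr, np.corrcoef(data, rowvar=False)))
```

np.allclose tolerates the tiny floating-point differences that arise when the same quantity is computed through different sequences of operations.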