A Q-Q plot, short for “quantile-quantile” plot, is often used to assess whether or not a set of data potentially came from some theoretical distribution.
In most cases, this type of plot is used to determine whether or not a set of data follows a normal distribution.
This tutorial explains how to create a Q-Q plot for a set of data in Python.
Example: Q-Q Plot in Python
Suppose we have the following dataset of 1,000 values:
import numpy as np

#create dataset with 1,000 values that follow a normal distribution
np.random.seed(0)
data = np.random.normal(0, 1, 1000)

#view first 10 values
data[:10]

array([ 1.76405235,  0.40015721,  0.97873798,  2.2408932 ,  1.86755799,
       -0.97727788,  0.95008842, -0.15135721, -0.10321885,  0.4105985 ])
To create a Q-Q plot for this dataset, we can use the qqplot() function from the statsmodels library:
import statsmodels.api as sm
import matplotlib.pyplot as plt

#create Q-Q plot with 45-degree line added to plot
fig = sm.qqplot(data, line='45')
plt.show()

In a Q-Q plot, the x-axis displays the theoretical quantiles. This means it doesn’t show your actual data, but instead shows the values you would expect to see if the data came from a standard normal distribution.
The y-axis displays your actual data, sorted from smallest to largest. If the points fall along a roughly straight line, the data is consistent with a normal distribution. The 45-degree line is the right reference only when the data also has a mean of 0 and a standard deviation of 1, as in this example.
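To make the two axes concrete, here is a minimal sketch of how the points in a normal Q-Q plot can be computed by hand. The helper name qq_points and the plotting positions (i - 0.5) / n are assumptions chosen for illustration. statsmodels may use a slightly different convention, so its points can differ a little from these.

```python
import numpy as np
from statistics import NormalDist


def qq_points(sample):
    """Return (theoretical, observed) quantiles for a normal Q-Q plot.

    Hypothetical helper for illustration, not part of statsmodels.
    """
    # y-axis: the actual data, sorted from smallest to largest
    observed = np.sort(np.asarray(sample, dtype=float))
    n = observed.size

    # Assumed plotting positions: one probability per data point
    probs = (np.arange(1, n + 1) - 0.5) / n

    # x-axis: standard normal quantiles at those probabilities
    std_normal = NormalDist()  # mean 0, standard deviation 1
    theoretical = np.array([std_normal.inv_cdf(p) for p in probs])
    return theoretical, observed


rng = np.random.default_rng(0)
theoretical, observed = qq_points(rng.normal(0, 1, 1000))

# Show the three smallest (theoretical, observed) pairs
for x, y in zip(theoretical[:3], observed[:3]):
    print(f"theoretical={x:.3f}, observed={y:.3f}")
```

Plotting observed against theoretical as a scatter plot should give roughly the same picture as qqplot().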
We can see in our Q-Q plot above that the data values closely follow the 45-degree line, which means the data is likely normally distributed. This shouldn’t be surprising, since we generated the 1,000 data values using the numpy.random.normal() function with a mean of 0 and a standard deviation of 1.
Now suppose we instead generate a dataset of 1,000 uniformly distributed values and create a Q-Q plot for that dataset:
#create dataset of 1,000 uniformly distributed values
data = np.random.uniform(0, 1, 1000)

#generate Q-Q plot for the dataset
fig = sm.qqplot(data, line='45')
plt.show()

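The data values clearly do not follow the red 45-degree line. Part of that gap comes from scale alone, since these values range from 0 to 1 while the 45-degree line assumes a standard deviation of 1. The shape of the points is the more reliable evidence: for uniform data they bend into an S-curve instead of following any straight line, which is easier to see with a fitted reference line such as line='s'. That curvature is the indication that the data does not follow a normal distribution.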
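Because the uniform values range only from 0 to 1, distance from the 45-degree line mixes a difference in scale with a difference in shape. One rough way to look at shape alone is to measure how straight the Q-Q points are, regardless of slope. The sketch below does this with a correlation coefficient. The helper name qq_correlation and the plotting positions are assumptions for illustration, and the result is a descriptive number, not a formal test.

```python
import numpy as np
from statistics import NormalDist


def qq_correlation(sample):
    """Correlation between sorted data and standard normal quantiles.

    Hypothetical helper: values near 1 mean the Q-Q points are close
    to a straight line, whatever its slope or intercept.
    """
    observed = np.sort(np.asarray(sample, dtype=float))
    n = observed.size
    probs = (np.arange(1, n + 1) - 0.5) / n  # assumed plotting positions
    std_normal = NormalDist()
    theoretical = np.array([std_normal.inv_cdf(p) for p in probs])
    return np.corrcoef(theoretical, observed)[0, 1]


rng = np.random.default_rng(0)

# Normal data with standard deviation 5: far from the 45-degree line, but straight
r_normal = qq_correlation(rng.normal(0, 5, 1000))

# Uniform data: the points curve, so the correlation should be lower
r_uniform = qq_correlation(rng.uniform(0, 1, 1000))

print(f"normal (sd 5): {r_normal:.4f}")
print(f"uniform:       {r_uniform:.4f}")
```

The normal sample should score closer to 1 than the uniform sample, even though neither would sit on a 45-degree line.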
Notes on Q-Q Plots
Keep in mind the following notes about Q-Q plots:
- Although a Q-Q plot isn’t a formal statistical test, it offers an easy way to visually check whether or not a data set is normally distributed.
- Be careful not to confuse Q-Q plots with P-P plots, which are less commonly used and not as useful for analyzing data values that fall on the extreme tails of the distribution.
Zach, please correct this tutorial as it’s super misleading. If, for example, you plot
data = np.random.normal(0, 5, 1000)
and then add a 45-degree line, the points won’t be on the line as this tutorial indicates.
For comparing data with a distribution, it is better to use line='r', which fits a regression line to the data. The analysis you use for your second example is simply wrong…