Pandas DataFrame.sample() | Python
Last Updated: 11 Jul, 2025
The Pandas DataFrame.sample() function randomly selects rows or columns from a DataFrame. It is particularly helpful with large datasets, where we want to test or analyze a small random subset instead of the full data. We can set the number or proportion of items to sample with n or frac, and make the result repeatable with random_state.
Example: Sampling a Single Random Row
In this example, we load a dataset and generate a single random row using the sample() method by setting n=1.
Python
import pandas as pd
# Load dataset
d = pd.read_csv("employees.csv")
# Sample one random row
r_row = d.sample(n=1)
# Display the result
r_row
Output
(Output image: one row of the DataFrame)
Calling sample(n=1) selects one random row from the DataFrame, so the row returned can differ on each run unless random_state is set.
Syntax
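DataFrame.sample(n=None, frac=None, replace=False, weights=None, random_state=None, axis=None, ignore_index=False)
Parameters:
- n: int, optional. Number of items (rows by default) to return. Defaults to 1 when frac is not given. Cannot be used together with frac.
- frac: float, optional. Fraction of items to return, so the result has about frac * (number of rows) rows. Cannot be used together with n.
- replace: bool, default False. If True, sampling is done with replacement, so the same row can be selected more than once.
- weights: str or array-like, optional. Relative selection probability for each item. If omitted, all items are equally likely. When sampling rows, a string is taken as the name of a column holding the weights.
- random_state: int or numpy.random.RandomState, optional. If set to a particular integer, the same rows are returned on every run.
- axis: 0 or 'index' for rows, 1 or 'columns' for columns. Defaults to rows for a DataFrame.
- ignore_index: bool, default False. If True, the result is relabeled 0, 1, ..., n - 1 instead of keeping the original index labels (available in pandas 1.3 and later).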
Return Type: New object of same type as caller.
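The weights and axis parameters are not used in the examples in this article, so here is a minimal sketch of both. The small DataFrame and its column names are hypothetical stand-ins, not the contents of employees.csv.

```python
import pandas as pd

# Hypothetical stand-in data; these columns are assumptions for illustration.
df = pd.DataFrame({
    "name": ["Ana", "Ben", "Cara", "Dev"],
    "team": ["Sales", "Legal", "Sales", "IT"],
    "salary": [50000, 62000, 58000, 71000],
})

# axis=1 samples columns instead of rows: all 4 rows, 2 random columns.
two_cols = df.sample(n=2, axis=1, random_state=0)

# weights sets a relative selection probability per row.
# A weight of 0 means that row can never be picked.
weighted = df.sample(n=2, weights=[0, 0, 1, 1], random_state=0)

print(two_cols.columns.tolist())
print(weighted)
```

With these weights only the last two rows can be drawn, which is an easy way to check that the weights are being applied.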
Examples of Pandas DataFrame.sample()
Example 1: Sample 25% of the DataFrame
In this example, we generate a random sample consisting of 25% of the entire DataFrame by using the frac parameter.
Python
import pandas as pd
d = pd.read_csv("employees.csv")
# Sample 25% of the data
sr = d.sample(frac=0.25)
# Verify the number of rows
print(f"Original rows: {len(d)}")
print(f"Sampled rows (25%): {len(sr)}")
# Display the result
sr
Output
(Output image: 25% of the DataFrame)
As the printed row counts show, the sample contains 25% of the DataFrame's rows, with the count rounded to a whole number. The rows themselves are chosen at random.
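To check the row count without the CSV file, the same idea can be sketched on a small inline DataFrame. The 12-row frame and the emp_id column are assumptions for illustration only. The sketch also shows that passing n and frac together is rejected.

```python
import pandas as pd

# Hypothetical 12-row frame standing in for employees.csv.
df = pd.DataFrame({"emp_id": range(1, 13)})

# 25% of 12 rows is exactly 3 rows.
quarter = df.sample(frac=0.25, random_state=1)
print(f"Original rows: {len(df)}")
print(f"Sampled rows (25%): {len(quarter)}")

# Without replacement, no row appears twice in the sample.
print(quarter.index.is_unique)

# n and frac are mutually exclusive; pandas raises ValueError if both are given.
both_rejected = False
try:
    df.sample(n=2, frac=0.25)
except ValueError as err:
    both_rejected = True
    print("n and frac together are rejected:", err)
```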
Example 2: Sampling with Replacement and a Fixed Random State
This example demonstrates how to sample multiple rows with replacement (i.e., allowing repetition of rows) and ensures reproducibility using a fixed random seed.
Python
import pandas as pd
d = pd.read_csv("employees.csv")
# Sample 3 rows with replacement and fixed seed
sd = d.sample(n=3, replace=True, random_state=42)
sd
Output
(Output image: sampling with replacement)
The replace=True parameter allows the same row to be sampled more than once, which makes it suitable for bootstrapping. Setting random_state=42 makes the result reproducible across runs, which is useful during testing and debugging.
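Both points can be checked with a short sketch. The inline DataFrame, its salary values, and the choice of 200 resamples are assumptions for illustration. This is a rough outline of a bootstrap, not a complete statistical procedure.

```python
import pandas as pd

# Hypothetical stand-in data; the values are made up for illustration.
df = pd.DataFrame({
    "emp_id": range(1, 7),
    "salary": [40, 55, 61, 48, 73, 66],
})

# Two calls with the same seed return the same rows in the same order.
first = df.sample(n=3, replace=True, random_state=42)
second = df.sample(n=3, replace=True, random_state=42)
print(first.equals(second))  # True, because both calls share the seed

# Bootstrap sketch: resample the full frame with replacement many times
# and record the mean salary of each resample.
boot_means = [
    df.sample(frac=1.0, replace=True, random_state=seed)["salary"].mean()
    for seed in range(200)
]
print(min(boot_means), max(boot_means))
```

The spread of boot_means gives a rough sense of how much the mean would vary from sample to sample.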