sample() for random rows in Data Analysis Python - Step-by-Step Execution

Concept Flow - sample() for random rows

Start with DataFrame

↓

Call sample(n=3)

↓

Randomly select 3 rows

↓

Return new DataFrame with selected rows

↓

Use or display sampled rows

The sample() method picks random rows from a DataFrame and returns them as a new, smaller DataFrame.

Execution Sample

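import pandas as pd

# Build a small DataFrame with five rows (index labels 0 to 4)
df = pd.DataFrame({'A': [10, 20, 30, 40, 50], 'B': ['a', 'b', 'c', 'd', 'e']})

# Pick 3 rows at random; random_state=1 makes the pick repeatable
sample_df = df.sample(n=3, random_state=1)
print(sample_df)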

This code creates a DataFrame and randomly selects 3 rows using sample().

Execution Table

Step	Action	DataFrame Rows Before	Random Rows Selected	Output DataFrame Rows
1	Create DataFrame	[0,1,2,3,4]	N/A	[0,1,2,3,4]
2	Call sample(n=3, random_state=1)	[0,1,2,3,4]	[1,4,2]	[1,4,2]
3	Print sampled DataFrame	[0,1,2,3,4]	[1,4,2]	[1,4,2]
4	End	[0,1,2,3,4]	N/A	[1,4,2]

💡 Sampled 3 random rows; execution ends after printing.

Variable Tracker

Variable	Start	After sample() call	Final
df	[0,1,2,3,4]	[0,1,2,3,4]	[0,1,2,3,4]
sample_df	N/A	[1,4,2]	[1,4,2]
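
The tracker above can be checked in code. The following is a minimal sketch, assuming the same five-row DataFrame as in the Execution Sample; the names before, unchanged and labels_kept are illustrative and not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({'A': [10, 20, 30, 40, 50], 'B': ['a', 'b', 'c', 'd', 'e']})
before = df.copy()  # snapshot of df taken before sampling (illustrative helper)

sample_df = df.sample(n=3, random_state=1)

# df should be identical to its snapshot: sample() does not modify it
unchanged = df.equals(before)

# Every index label in sample_df should come from df's index
labels_kept = set(sample_df.index).issubset(set(df.index))

print(unchanged, labels_kept, len(sample_df))  # True True 3
```

Because the sampled rows keep their original index labels, df.loc[sample_df.index] returns the same rows from the original DataFrame.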

Key Moments - 3 Insights

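Why does sample_df have rows with indices [1,4,2] instead of [0,1,2]?
sample() picks rows at random rather than taking the first n, and each selected row keeps its original index label.

What does random_state=1 do in sample()?
It seeds the random number generator, so the same rows are picked on every run.

Does sample() change the original DataFrame df?
No. sample() returns a new DataFrame; df keeps all five rows, as the Variable Tracker shows.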
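
The effect of random_state can be tried directly. This is a small sketch, assuming the same DataFrame as in the Execution Sample; the names first, second, same_rows and unseeded are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [10, 20, 30, 40, 50], 'B': ['a', 'b', 'c', 'd', 'e']})

# Two calls with the same seed return the same rows in the same order
first = df.sample(n=3, random_state=1)
second = df.sample(n=3, random_state=1)
same_rows = first.equals(second)
print(same_rows)  # True

# Without random_state, the selection may differ from run to run
unseeded = df.sample(n=3)
print(unseeded)
```

The unseeded call still returns 3 distinct rows, since sample() draws without replacement by default; only which rows appear is unpredictable.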

Visual Quiz - 3 Questions

Test your understanding

Look at the Execution Table at step 2. Which rows were randomly selected?

A. [1,4,2]

B. [0,1,2]

C. [1,3,4]

D. [3,4,0]

Concept Snapshot

sample(n) picks n random rows from a DataFrame.
Use random_state to get repeatable samples.
Returns a new DataFrame; original stays unchanged.
Useful for quick random data checks or splits.
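
As one possible illustration of the "splits" use case, the sketch below separates a DataFrame into two parts. It assumes a hypothetical ten-row DataFrame with a unique index, uses the frac parameter of sample() instead of n, and the names train and test are illustrative only.

```python
import pandas as pd

# Hypothetical ten-row DataFrame, larger than the tutorial's example
df = pd.DataFrame({'A': list(range(10)), 'B': list('abcdefghij')})

# frac=0.8 asks for 80% of the rows (8 of 10) instead of a fixed count
train = df.sample(frac=0.8, random_state=42)

# The remaining rows are those whose index labels were not sampled
test = df.drop(train.index)

print(len(train), len(test))  # 8 2
```

Dropping by index label only works cleanly when the index is unique, which is an assumption of this sketch.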

Full Transcript

The sample() function in pandas lets you pick random rows from a DataFrame. You start with your full DataFrame, then call sample(n=3) to get 3 random rows. Setting random_state ensures the same rows are picked every time, which helps with testing or sharing results. The original DataFrame does not change; sample() returns a new smaller DataFrame with the selected rows. This is useful when you want to look at a random subset of your data quickly.