sample() for random rows in Data Analysis Python - Step-by-Step Execution

Concept Flow - sample() for random rows
Start with DataFrame
Call sample(n=3)
Randomly select 3 rows
Return new DataFrame with selected rows
Use or display sampled rows
The sample() function picks random rows from a DataFrame and returns them as a new smaller DataFrame.
Execution Sample
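import pandas as pd

# Build a small DataFrame with 5 rows (index labels 0 to 4)
df = pd.DataFrame({'A': [10,20,30,40,50], 'B': ['a','b','c','d','e']})
# Pick 3 rows at random; random_state=1 makes the selection repeatable
sample_df = df.sample(n=3, random_state=1)
print(sample_df)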
This code creates a DataFrame and randomly selects 3 rows using sample().
Execution Table
| Step | Action | DataFrame Rows Before | Random Rows Selected | Output DataFrame Rows |
|---|---|---|---|---|
| 1 | Create DataFrame | [0,1,2,3,4] | N/A | [0,1,2,3,4] |
| 2 | Call sample(n=3, random_state=1) | [0,1,2,3,4] | [1,4,2] | [1,4,2] |
| 3 | Print sampled DataFrame | [0,1,2,3,4] | [1,4,2] | [1,4,2] |
| 4 | End | [0,1,2,3,4] | N/A | [1,4,2] |
💡 Sampled 3 random rows; execution ends after printing.
Variable Tracker
| Variable | Start | After sample() call | Final |
|---|---|---|---|
| df | [0,1,2,3,4] | [0,1,2,3,4] | [0,1,2,3,4] |
| sample_df | N/A | [1,4,2] | [1,4,2] |
Key Moments - 3 Insights
Why does sample_df have rows with indices [1,4,2] instead of [0,1,2]?
Because sample() picks rows at random rather than in order. Step 2 of the Execution Table shows the indices that were selected.
What does random_state=1 do in sample()?
It seeds the random number generator, so the same rows are picked every time the code runs. This is why sample_df is consistent across runs (see step 2 of the Execution Table).
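The repeatability described above can be checked directly. This is a minimal sketch that reuses the lesson's DataFrame; the names `first`, `second`, and `same_rows` are introduced here only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [10, 20, 30, 40, 50], 'B': ['a', 'b', 'c', 'd', 'e']})

# Two separate calls that share the same seed
first = df.sample(n=3, random_state=1)
second = df.sample(n=3, random_state=1)

# equals() compares values and index labels, so this checks
# that both calls picked the same rows in the same order
same_rows = first.equals(second)
print(same_rows)  # prints True
```

Changing the seed (for example to `random_state=2`) would generally give a different, but equally repeatable, selection.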
Does sample() change the original DataFrame df?
No, sample() returns a new DataFrame with the selected rows. The original df stays the same, as shown in the Variable Tracker.
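A short sketch of that check, assuming the same DataFrame as in the lesson. The helper names `before`, `unchanged`, `labels_from_df`, and `renumbered` are hypothetical and not part of the original example. It also shows that the sampled rows keep their original index labels, which can be reset if a fresh 0-based index is preferred.

```python
import pandas as pd

df = pd.DataFrame({'A': [10, 20, 30, 40, 50], 'B': ['a', 'b', 'c', 'd', 'e']})
before = df.copy()  # snapshot of df taken before sampling

sample_df = df.sample(n=3, random_state=1)

# df still matches the snapshot, so sample() did not modify it
unchanged = df.equals(before)

# Every sampled row keeps the index label it had in df
labels_from_df = sample_df.index.isin(df.index).all()

# Optional: renumber the sampled rows 0, 1, 2
renumbered = sample_df.reset_index(drop=True)

print(unchanged, labels_from_df)
print(renumbered)
```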
Visual Quiz - 3 Questions
Test your understanding
In step 2 of the Execution Table, which rows were randomly selected?
A. [1,4,2]
B. [0,1,2]
C. [1,3,4]
D. [3,4,0]
💡 Hint
Check the 'Random Rows Selected' column at step 2 of the Execution Table.
At which step does the sample_df variable get its value?
A. Step 1
B. Step 2
C. Step 3
D. Step 4
💡 Hint
Look at the Variable Tracker and the Execution Table rows where sample_df changes.
If we remove random_state=1, what will happen to the sampled rows?
A. No rows will be selected.
B. They will always be the same rows.
C. They can differ from run to run.
D. An error will occur.
💡 Hint
random_state fixes the random selection; without it, sample() can pick different rows on each run.
Concept Snapshot
sample(n) picks n random rows from a DataFrame.
Use random_state to get repeatable samples.
Returns a new DataFrame; original stays unchanged.
Useful for quick random data checks or splits.
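The "splits" use mentioned above could look like the sketch below. It relies on the `frac` parameter of sample(), which the lesson itself does not cover, and the names `train` and `test` are chosen here for illustration. It also assumes the index labels are unique, since `drop()` removes rows by label.

```python
import pandas as pd

df = pd.DataFrame({'A': [10, 20, 30, 40, 50], 'B': ['a', 'b', 'c', 'd', 'e']})

# frac=0.8 asks for 80% of the rows (4 of the 5 here)
train = df.sample(frac=0.8, random_state=1)

# Rows that were not sampled form the second part of the split
test = df.drop(train.index)

print(len(train), len(test))  # prints 4 1
```

Because the second part is built from the index labels of the first, the two parts never overlap and together cover every row of df.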
Full Transcript
The sample() function in pandas lets you pick random rows from a DataFrame. You start with your full DataFrame, then call sample(n=3) to get 3 random rows. Setting random_state ensures the same rows are picked every time, which helps with testing or sharing results. The original DataFrame does not change; sample() returns a new smaller DataFrame with the selected rows. This is useful when you want to look at a random subset of your data quickly.