Challenge - 5 Problems
❓ Predict Output
Level: intermediate
What is the output of this sample() code?
Given the DataFrame df below, what will df.sample(n=3, random_state=1) return?
import pandas as pd

df = pd.DataFrame({'A': [10, 20, 30, 40, 50], 'B': ['a', 'b', 'c', 'd', 'e']})
sample_df = df.sample(n=3, random_state=1)
print(sample_df)
💡 Hint
Remember that setting random_state fixes the random selection order.
Explanation
The sample() method with n=3 and random_state=1 picks rows with indices 2, 0, and 1 in that order.
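A prediction like this is easiest to confirm by running the snippet and inspecting the index labels that come back. The sketch below reuses the question's DataFrame and deliberately does not hard-code the selected rows, since the exact draw is determined by NumPy's seeded generator and is best checked in your own environment.

```python
import pandas as pd

df = pd.DataFrame({'A': [10, 20, 30, 40, 50],
                   'B': ['a', 'b', 'c', 'd', 'e']})

sample_df = df.sample(n=3, random_state=1)

print(sample_df)                 # three rows; original index labels are kept
print(sample_df.index.tolist())  # selected labels, in the order they were drawn

# Each sampled row matches the row with the same label in df.
print(sample_df.equals(df.loc[sample_df.index]))  # True
```

Note that sample() does not reset the index, so the labels in the output identify which original rows were picked.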
❓ Data Output
Level: intermediate
How many rows are returned by sample() with frac=0.4?
If a DataFrame has 10 rows, what is the number of rows returned by df.sample(frac=0.4, random_state=5)?
import pandas as pd

df = pd.DataFrame({'X': range(10)})
sample_df = df.sample(frac=0.4, random_state=5)
print(len(sample_df))
💡 Hint
frac means fraction of total rows.
Explanation
frac=0.4 means 40% of 10 rows, which is 4 rows.
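The arithmetic can be checked directly. This minimal sketch compares frac=0.4 with the equivalent n=4 on the same 10-row DataFrame; the name same_size is illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'X': range(10)})

# frac is a fraction of the total row count: 0.4 * 10 = 4 rows.
sample_df = df.sample(frac=0.4, random_state=5)
print(len(sample_df))  # 4

# Asking for the same number of rows explicitly with n.
same_size = df.sample(n=4, random_state=5)
print(len(same_size))  # 4
```

Use either n or frac in a single call, not both.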
🔧 Debug
Level: advanced
What error does this sample() code raise?
What error will this code produce?
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]})
sample_df = df.sample(n=5)
💡 Hint
Check whether the requested sample size exceeds the number of rows when sampling without replacement.
Explanation
By default, sample() draws without replacement (replace=False), so requesting more rows than the DataFrame contains raises a ValueError.
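To see the failure without stopping the script, the call can be wrapped in try/except. This is a small sketch; the variable caught is introduced here only for illustration, and the exact message text may differ between pandas versions.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]})

caught = None
try:
    # 5 rows requested from 3, and replace defaults to False.
    df.sample(n=5)
except ValueError as exc:
    caught = exc
    print(type(exc).__name__)  # ValueError
    print(exc)                 # wording may vary across pandas versions
```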
🚀 Application
Level: advanced
Which code produces a random sample with replacement?
You want to randomly select 4 rows from a DataFrame of 3 rows, allowing repeats. Which code does this?
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]})
💡 Hint
Sampling with replacement allows repeats and can exceed original size.
Explanation
Only option C uses replace=True to allow sampling more rows than exist with repeats.
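One call that meets the requirement is sketched below. It is an illustration of replace=True, not a reproduction of any listed option, and random_state=0 is an assumption added only to make the run repeatable.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]})

# replace=True lets the same row be drawn more than once,
# so n may exceed len(df).
sample_df = df.sample(n=4, replace=True, random_state=0)

print(sample_df)
print(len(sample_df))                      # 4
print(sample_df.index.duplicated().any())  # True: 4 draws from 3 rows must repeat
```

Because repeated rows keep their original labels, the resulting index contains duplicates; call reset_index(drop=True) afterwards if unique labels are needed.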
🧠 Conceptual
Level: expert
What is the effect of setting random_state in sample()?
Why do we set the random_state parameter in df.sample()?
💡 Hint
Think about reproducibility in random operations.
Explanation
random_state fixes the random seed so the sample is reproducible and consistent across runs.
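The reproducibility claim can be demonstrated in a few lines. This sketch assumes a hypothetical 100-row DataFrame and an arbitrary seed of 42; neither comes from the quiz.

```python
import pandas as pd

df = pd.DataFrame({'A': range(100)})  # hypothetical data for illustration

run_1 = df.sample(n=5, random_state=42)
run_2 = df.sample(n=5, random_state=42)
print(run_1.equals(run_2))  # True: same seed, same rows in the same order

# Without random_state, each call draws from fresh randomness, so
# repeated calls will usually (not always) return different rows.
unseeded = df.sample(n=5)
print(unseeded.index.tolist())
```

Fixing the seed is useful when results need to be shared, tested, or debugged, since everyone running the code sees the same sample.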