Recall & Review
beginner
What does the drop_duplicates() method do in pandas?
It removes duplicate rows from a DataFrame, comparing all columns or only selected ones, and by default keeps the first row of each duplicate group.
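A minimal sketch of the default behavior, assuming a small made-up DataFrame (the column names and values are illustrative only):

```python
import pandas as pd

# Hypothetical sample data: rows 0 and 2 are identical in every column.
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Ann", "Cid"],
    "city": ["Oslo", "Rome", "Oslo", "Oslo"],
})

# With no arguments, all columns are compared and a new DataFrame is returned.
unique_rows = df.drop_duplicates()

print(unique_rows.index.tolist())  # [0, 1, 3]
print(len(df))                     # 4, the original is left untouched
```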
beginner
How do you keep the first occurrence of duplicates when using drop_duplicates()?
By default, drop_duplicates() keeps the first occurrence and removes later duplicates. You can also set keep='first' explicitly.
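As a rough illustration, the sketch below compares the default with keep='last' on made-up data, so the effect of the keep parameter is visible in the surviving index labels:

```python
import pandas as pd

# Hypothetical sample data: the row ("a", 1) appears at index 0 and index 2.
df = pd.DataFrame({"key": ["a", "b", "a"], "value": [1, 2, 1]})

first_kept = df.drop_duplicates()            # same as keep="first"
last_kept = df.drop_duplicates(keep="last")  # keeps the final occurrence instead

print(first_kept.index.tolist())  # [0, 1]
print(last_kept.index.tolist())   # [1, 2]
```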
intermediate
What parameter do you use to remove duplicates based on specific columns only?
Use the subset parameter with a list of column names to consider only those columns when identifying duplicates.
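A short sketch of subset, assuming a toy DataFrame with columns named A, B, and C (the data is invented for illustration):

```python
import pandas as pd

# Hypothetical sample data: rows 0 and 1 match on A and B but differ on C.
df = pd.DataFrame({
    "A": [1, 1, 2],
    "B": ["x", "x", "y"],
    "C": [10, 20, 30],
})

# Only columns A and B are compared; C is ignored when spotting duplicates.
deduped = df.drop_duplicates(subset=["A", "B"])

print(deduped.index.tolist())  # [0, 2]
```

Without subset, no row would be dropped here, because no two rows agree on all three columns.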
intermediate
How can you remove duplicates and modify the original DataFrame directly?
Set the parameter inplace=True in drop_duplicates() to remove duplicates without creating a new DataFrame. The call then returns None, so do not assign its result back to the variable.
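The following sketch, on made-up data, shows what an in-place call returns and how the original object changes:

```python
import pandas as pd

# Hypothetical sample data: the id 1 appears twice.
df = pd.DataFrame({"id": [1, 1, 2]})

# With inplace=True the call returns None and df itself is modified.
result = df.drop_duplicates(inplace=True)

print(result)   # None
print(len(df))  # 2
```

Reassigning with df = df.drop_duplicates() gives the same final contents and is a common alternative.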
advanced
What happens if you set keep=False in drop_duplicates()?
Every row that has a duplicate is removed, including the first occurrence, so only rows that appear exactly once remain.
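A minimal sketch of keep=False, using an invented single-column DataFrame:

```python
import pandas as pd

# Hypothetical sample data: "a" appears twice, "b" appears once.
df = pd.DataFrame({"key": ["a", "b", "a"]})

# keep=False drops every member of a duplicate group, not just the repeats.
only_unique = df.drop_duplicates(keep=False)

print(only_unique["key"].tolist())  # ['b']
```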
What is the default behavior of drop_duplicates() in pandas?
By default, drop_duplicates() keeps the first occurrence and removes later duplicates.

Which parameter lets you specify columns to check for duplicates?
The subset parameter takes a list of columns to consider when identifying duplicates.

How do you remove duplicates and update the original DataFrame without creating a new one?
Using inplace=True modifies the original DataFrame directly.

What does keep=False do in drop_duplicates()?
Setting keep=False removes every row that has a duplicate, leaving only rows that appear exactly once.

If you want to remove duplicates based on columns 'A' and 'B' only, which code is correct?
The subset parameter specifies columns to check for duplicates.

Explain how drop_duplicates() works and how you can control which duplicates to keep or remove.
Think about how to keep or remove duplicates and which columns to consider.

Describe a real-life example where removing duplicates from data is important and how drop_duplicates() helps.
Imagine cleaning a list of names or transactions.