beginner

What does the 'keep' parameter do in pandas' drop_duplicates() method?

The 'keep' parameter decides which row of each duplicate group survives: 'first' keeps the first occurrence, 'last' keeps the last occurrence, and False keeps none of them, dropping every row that has a duplicate.
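A minimal sketch of the three settings side by side. The DataFrame and its column names are made up for illustration; the surviving index labels show which rows each option keeps.

```python
import pandas as pd

# Illustrative data: rows 0/2 and rows 1/4 are exact duplicates of each other.
df = pd.DataFrame({
    "name": ["Ann", "Bob", "Ann", "Cy", "Bob"],
    "city": ["Oslo", "Rome", "Oslo", "Lima", "Rome"],
})

first = df.drop_duplicates(keep="first")  # keep the earliest row of each group
last = df.drop_duplicates(keep="last")    # keep the latest row of each group
none = df.drop_duplicates(keep=False)     # drop every row that has a duplicate

print(first.index.tolist())  # [0, 1, 3]
print(last.index.tolist())   # [2, 3, 4]
print(none.index.tolist())   # [3]
```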

intermediate

In pandas, what happens if you set keep=False in drop_duplicates()?

Every row that belongs to a duplicate group is removed, including the first occurrence, so only rows that appear exactly once remain.

beginner

How does keep='first' differ from keep='last' in drop_duplicates()?

keep='first' keeps the first occurrence of each duplicate group and removes the rest, while keep='last' keeps the last occurrence and removes earlier ones.
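The difference matters most when duplicates are defined on a subset of columns, because the other columns may differ between the rows. A small sketch, assuming a hypothetical table of sensor readings listed in chronological order:

```python
import pandas as pd

# Hypothetical readings, oldest first; "sensor" is the column checked for duplicates.
readings = pd.DataFrame({
    "sensor": ["a", "a", "b", "b"],
    "value": [10, 12, 7, 9],
})

# subset= restricts the duplicate check to the listed column(s).
oldest = readings.drop_duplicates(subset="sensor", keep="first")
newest = readings.drop_duplicates(subset="sensor", keep="last")

print(oldest["value"].tolist())  # [10, 7]
print(newest["value"].tolist())  # [12, 9]
```

Both results have one row per sensor; only the choice of which row changes.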

beginner

True or False: Using keep=False in drop_duplicates() will keep one row from each duplicate group.

False. keep=False removes all duplicates, so no rows from duplicate groups are kept.

intermediate

Why might you use keep=False instead of 'first' or 'last' when removing duplicates?

To ensure that only rows appearing exactly once remain. This is useful when a repeated record is ambiguous and you cannot tell which copy is correct, so discarding every copy is safer than arbitrarily keeping the first or last.
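One way this might look in practice, sketched with a made-up orders table in which customer 2 appears twice with conflicting emails. The table and column names are assumptions for illustration; `duplicated()` accepts the same `subset` and `keep` arguments and is used here to pull out the discarded rows for review.

```python
import pandas as pd

# Hypothetical data: customer_id 2 has two conflicting records.
orders = pd.DataFrame({
    "customer_id": [1, 2, 2, 3],
    "email": ["a@x.com", "b@x.com", "b2@x.com", "c@x.com"],
})

# Keep only customers whose id occurs exactly once.
unambiguous = orders.drop_duplicates(subset="customer_id", keep=False)

# Collect the conflicting rows separately so they can be inspected.
conflicts = orders[orders.duplicated(subset="customer_id", keep=False)]

print(unambiguous["customer_id"].tolist())  # [1, 3]
print(conflicts["customer_id"].tolist())    # [2, 2]
```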

What does keep='first' do in pandas drop_duplicates()?

A. Keeps the first occurrence of duplicates and removes the rest

B. Keeps the last occurrence of duplicates and removes the rest

C. Removes all duplicates completely

D. Keeps all duplicates

If you want to remove all rows that have duplicates, which keep option should you use?

A. last

B. False

C. first

D. all

What is the default value of the keep parameter in drop_duplicates()?

A. first

B. last

C. all

D. none

Which keep option keeps the last duplicate row?

A. all

B. first

C. none

D. last

What happens if you set keep=False and there are no duplicates in the data?

A. All rows are removed

B. Only the first row is kept

C. No rows are removed

D. Only the last row is kept

Explain the difference between keep='first', keep='last', and keep=False in pandas drop_duplicates().

Describe a situation where using keep=False would be better than keep='first' or keep='last'.