Challenge - 5 Problems

Duplicate Column Mastery

❓ Predict Output

intermediate

Output of DataFrame with duplicate columns after selection

What does the following snippet print when column 'A' is selected from a DataFrame built from a dict literal that repeats the key 'A'?

Data Analysis Python

import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4], "A": [5, 6]})
result = df["A"]
print(result)

A. A Series with values [5, 6]

B. KeyError: 'A'

C. A Series with values [1, 2]

D. A DataFrame with two columns named 'A'
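
One way to check a prediction here is to look at what pandas actually receives from the dict literal. The sketch below relies only on standard Python dict semantics and the `DataFrame` constructor; the variable name `data` is illustrative and not part of the original snippet.

```python
import pandas as pd

# A dict literal cannot hold the same key twice: Python keeps one entry
# per key, and the last value written for that key is the one that stays.
data = {"A": [1, 2], "B": [3, 4], "A": [5, 6]}
print(list(data.keys()))

df = pd.DataFrame(data)

# Inspect the column labels and the type returned by the selection.
print(df.columns.tolist())
print(type(df["A"]).__name__)
```

Running this before looking at the options again shows whether the DataFrame ever had two columns named 'A' in the first place.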

❓ Data Output

intermediate

Number of columns after reading CSV with duplicate headers

Given a CSV file with the header row 'X,Y,X', how many columns does the DataFrame have after reading it with pandas' default settings?

Data Analysis Python

import pandas as pd
from io import StringIO

csv_data = "X,Y,X\n1,2,3\n4,5,6"
df = pd.read_csv(StringIO(csv_data))
print(len(df.columns))

A. Raises an error
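
To see how `read_csv` treats the repeated header, it can help to print the column labels themselves rather than only their count. A minimal sketch, reusing the data from the question:

```python
import pandas as pd
from io import StringIO

csv_data = "X,Y,X\n1,2,3\n4,5,6"
df = pd.read_csv(StringIO(csv_data))

# Show the labels pandas assigned and whether any of them repeat.
print(df.columns.tolist())
print(df.columns.is_unique)
```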

🔧 Debug

advanced

Identify the behavior when accessing duplicate columns by attribute

What happens when a duplicated column name is accessed as an attribute of a pandas DataFrame?

Data Analysis Python

import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], columns=["A", "A"])
print(df.A)

A. AttributeError: 'DataFrame' object has no attribute 'A'

B. Returns a DataFrame with both 'A' columns

C. Raises a KeyError

D. Returns the last 'A' column as a Series
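
Attribute access on a column label is generally a shortcut for bracket selection, so comparing the two is a quick way to check an answer. A minimal sketch; the names `by_key` and `by_attr` are illustrative:

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]], columns=["A", "A"])

# Compare bracket selection with attribute access on the repeated label.
by_key = df["A"]
by_attr = df.A

print(type(by_key).__name__, by_key.shape)
print(type(by_attr).__name__, by_attr.shape)
```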

🚀 Application

advanced

Resolving duplicate columns after concatenation

After concatenating two DataFrames with overlapping column names, which method correctly renames duplicate columns to unique names?

Data Analysis Python

import pandas as pd

df1 = pd.DataFrame({"A": [1], "B": [2]})
df2 = pd.DataFrame({"A": [3], "B": [4]})
df_concat = pd.concat([df1, df2], axis=1)
# Which code renames duplicates correctly?

A. df_concat.columns = list(set(df_concat.columns))

B. df_concat.columns = df_concat.columns.unique()

C. df_concat.columns = [f'{col}_{i}' for i, col in enumerate(df_concat.columns)]

D. df_concat.columns = df_concat.columns.drop_duplicates()
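
Whatever approach is chosen, the list assigned to `.columns` needs exactly one entry per existing column, so it is worth checking the length of each candidate. As a separate illustration, the sketch below suffixes only the repeated labels and leaves first occurrences untouched. `dedupe_column_names` is a hypothetical helper, not a pandas function, and it assumes a generated name such as `A_1` does not already exist among the labels.

```python
import pandas as pd
from collections import Counter


def dedupe_column_names(names):
    """Return names with a numeric suffix added to repeated entries.

    Hypothetical helper: the first occurrence keeps its name, later
    occurrences become name_1, name_2, and so on.
    """
    seen = Counter()
    result = []
    for name in names:
        count = seen[name]
        result.append(name if count == 0 else f"{name}_{count}")
        seen[name] += 1
    return result


df1 = pd.DataFrame({"A": [1], "B": [2]})
df2 = pd.DataFrame({"A": [3], "B": [4]})
df_concat = pd.concat([df1, df2], axis=1)

# Flag the repeated labels before renaming them.
print(df_concat.columns.duplicated().tolist())

df_concat.columns = dedupe_column_names(df_concat.columns)
print(df_concat.columns.tolist())
```

`Index.duplicated()` marks every occurrence after the first, which makes it a convenient check before and after any renaming step.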

🧠 Conceptual

expert

Effect of duplicate columns on groupby aggregation

If a DataFrame has duplicate column names and you perform a groupby aggregation on one of these columns, what is the expected behavior?

A. Aggregation applies only to the last occurrence of the column name

B. Aggregation applies to all columns with that name, returning multiple results per group

C. Aggregation applies only to the first occurrence of the column name

D. Raises a ValueError due to ambiguous column names
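
This question has no snippet, so a small experiment can stand in for one. The sketch below uses made-up data with a grouping column `key` and a repeated label 'A'; both the data and the names are assumptions for illustration. Handling of non-unique columns in groupby has changed between pandas releases, so treat the printed result as specific to the installed version.

```python
import pandas as pd

# Illustrative data: one grouping column and a repeated label "A".
df = pd.DataFrame(
    [["x", 1, 10], ["x", 2, 20], ["y", 3, 30]],
    columns=["key", "A", "A"],
)

# Select the repeated label from the groupby and aggregate it.
result = df.groupby("key")["A"].sum()

print(type(result).__name__)
print(result)
```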
