Challenge - 5 Problems

Column Name Mastery

❓ Predict Output

intermediate

What is the output of this code after standardizing column names?

Given a DataFrame with columns having spaces and uppercase letters, what will be the column names after applying the standardization code?

Pandas

import pandas as pd

df = pd.DataFrame(columns=['First Name', 'Last Name', 'Age'])
df.columns = df.columns.str.strip().str.lower().str.replace(' ', '_')
print(list(df.columns))

A. ['first name', 'last name', 'age']

B. ['First_Name', 'Last_Name', 'Age']

C. ['first_name', 'last_name', 'age']

D. ['FIRST_NAME', 'LAST_NAME', 'AGE']

❓ Data Output

intermediate

How many columns remain after filtering standardized columns?

After standardizing column names, you want to keep only columns that start with 'user_'. How many columns remain?

Pandas

import pandas as pd

df = pd.DataFrame(columns=['User ID', 'User Name', 'Age', 'user_email'])
df.columns = df.columns.str.strip().str.lower().str.replace(' ', '_')
filtered_cols = [col for col in df.columns if col.startswith('user_')]
print(len(filtered_cols))

🔧 Debug

advanced

What error does this code raise when standardizing columns?

Identify the error raised by this code snippet:

Pandas

import pandas as pd

df = pd.DataFrame(columns=['Name', 'Age'])
df.columns = df.columns.str.lower().replace(' ', '_')

A. AttributeError: 'Index' object has no attribute 'replace'

B. SyntaxError: invalid syntax

C. No error, columns are standardized correctly

D. TypeError: replace() missing 1 required positional argument

🚀 Application

advanced

Which option correctly standardizes columns to lowercase with underscores?

Choose the code snippet that correctly standardizes DataFrame columns by making them lowercase and replacing spaces with underscores.

A. df.columns = df.columns.str.replace(' ', '_').lower()

B. df.columns = df.columns.str.lower().str.replace(' ', '_')

C. df.columns = df.columns.replace(' ', '_').lower()

D. df.columns = df.columns.str.lower().replace(' ', '_')

🧠 Conceptual

expert

Why is standardizing column names important in data science projects?

Choose the best reason why standardizing column names is a crucial step in data science workflows.

A. It ensures consistent column naming to avoid errors in code and makes data easier to understand and merge.

B. It increases the size of the dataset for better model training.

C. It automatically fixes missing values in the dataset.

D. It encrypts the column names to protect sensitive data.
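
To make the question concrete, here is a minimal sketch of how column-name standardization interacts with a merge. The helper `standardize_columns` and the `orders` and `users` tables are hypothetical, invented for illustration only. The helper uses the same strip, lowercase, and underscore steps seen in the earlier challenges, with `regex=False` added as an assumption to force a literal match on the space character.

```python
import pandas as pd


def standardize_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with stripped, lowercased, underscore-separated column names."""
    out = df.copy()
    # regex=False treats ' ' as a literal character rather than a pattern.
    out.columns = out.columns.str.strip().str.lower().str.replace(" ", "_", regex=False)
    return out


# Two hypothetical tables that name the same key column differently.
orders = pd.DataFrame({"User ID": [1, 2], "Order Total": [9.5, 20.0]})
users = pd.DataFrame({"user_id": [1, 2], " User Name ": ["Ana", "Ben"]})

orders_std = standardize_columns(orders)
users_std = standardize_columns(users)

# Both frames now expose the key as "user_id", so they can be merged on it.
merged = orders_std.merge(users_std, on="user_id")
print(list(merged.columns))  # ['user_id', 'order_total', 'user_name']
```

Without the standardization step, `orders.merge(users, on="user_id")` would raise a `KeyError`, because `orders` only has a column named `User ID`.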
