
Why indexing matters in Pandas - Challenge Your Understanding

Challenge - 5 Problems
Predict Output
intermediate
Output of selecting rows with index labels

What is the output of this code snippet?

import pandas as pd

df = pd.DataFrame({
    'A': [10, 20, 30],
    'B': [100, 200, 300]
}, index=['x', 'y', 'z'])

result = df.loc[['y', 'z'], 'A']
print(result)
A. KeyError: "['y', 'z'] not in index"
B.
0    20
1    30
Name: A, dtype: int64
C.
x    10
y    20
Name: A, dtype: int64
D.
y    20
z    30
Name: A, dtype: int64
💡 Hint

Remember that loc uses the index labels, not integer positions.
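
To illustrate the distinction the hint draws, here is a minimal sketch on a separate, hypothetical DataFrame (`fruits` is not part of the challenge). It contrasts label-based `loc` with position-based `iloc`; the names and values are assumptions chosen only for illustration.

```python
import pandas as pd

# Hypothetical data, separate from the challenge above.
fruits = pd.DataFrame(
    {"price": [1.5, 0.5, 2.0]},
    index=["apple", "banana", "cherry"],
)

# loc selects by index label; passing a list of labels returns a Series
# whose index is made of those same labels.
by_label = fruits.loc[["apple", "cherry"], "price"]

# iloc selects by integer position instead of by label.
by_position = fruits.iloc[[0, 2]]["price"]

print(by_label.index.tolist())       # ['apple', 'cherry']
print(by_label.equals(by_position))  # True
```

Both selections return the same rows here only because positions 0 and 2 happen to hold the labels 'apple' and 'cherry'.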

Data Output
intermediate
Number of rows after resetting index

Given this DataFrame, what is the number of rows after resetting the index?

import pandas as pd

df = pd.DataFrame({
    'score': [88, 92, 95],
    'grade': ['B', 'A', 'A']
}, index=['s1', 's2', 's3'])

new_df = df.reset_index()
print(len(new_df))
A. 4
B. 3
C. 2
D. KeyError
💡 Hint

Resetting the index adds the old index as a column but does not change the number of rows.
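
The behaviour described in the hint can be sketched on a small, hypothetical DataFrame (`cities` and its values are assumptions, not part of the challenge). The sketch shows where the old index goes after `reset_index()` and what replaces it.

```python
import pandas as pd

# Hypothetical data, separate from the challenge above.
cities = pd.DataFrame({"population": [8.4, 3.9]}, index=["NYC", "LA"])

# reset_index() moves the old index into a regular column (named 'index'
# when the index has no name) and installs a default 0..n-1 index.
flat = cities.reset_index()

print(flat.columns.tolist())  # ['index', 'population']
print(flat.index.tolist())    # [0, 1]
```

Passing `drop=True` to `reset_index()` discards the old index instead of keeping it as a column.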

🔧 Debug
advanced
Why does this code raise a KeyError?

Consider this code:

import pandas as pd

df = pd.DataFrame({
    'value': [5, 10, 15]
}, index=[1, 2, 3])

print(df.loc[0])

Why does it raise a KeyError?

A. Because the DataFrame has no columns named 0
B. Because loc only accepts column names, not index labels
C. Because 0 is not in the DataFrame's index labels
D. Because loc requires integer positions, not labels
💡 Hint

Check the index labels of the DataFrame and what loc expects.
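
One way to follow the hint is to inspect the index and compare `loc` with `iloc` directly. The sketch below does this on a hypothetical DataFrame (`readings`, with assumed labels 10 and 20) rather than on the one from the challenge.

```python
import pandas as pd

# Hypothetical data with integer labels that do not start at 0.
readings = pd.DataFrame({"temp": [21, 23]}, index=[10, 20])

print(readings.index.tolist())  # [10, 20]

# iloc[0] means "the first row by position", whatever its label is.
first = readings.iloc[0]

# loc[0] means "the row labelled 0"; no such label exists in this index.
try:
    readings.loc[0]
    missing_label = False
except KeyError:
    missing_label = True

print(first["temp"])   # 21
print(missing_label)   # True
```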

🚀 Application
advanced
Selecting rows efficiently with index

You have a large DataFrame with a unique index of user IDs. Which method is the fastest to select multiple users by their IDs?

A. Use df.loc[list_of_user_ids] to select rows by index labels
B. Use df.iloc[list_of_user_ids] to select rows by integer position
C. Use df[df['user_id'].isin(list_of_user_ids)] to filter rows
D. Use df.query('user_id in @list_of_user_ids') to filter rows
💡 Hint

Think about how pandas uses the index for fast lookups.
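
To make two of the approaches concrete, here is a small sketch on hypothetical data (`users`, `indexed`, and `wanted` are assumed names). It only shows what each approach returns and does not measure speed; a timing comparison would need a large DataFrame and a tool such as `timeit`.

```python
import pandas as pd

# Hypothetical user table; 'user_id' starts out as a regular column.
users = pd.DataFrame(
    {"user_id": [101, 102, 103, 104], "plan": ["free", "pro", "free", "pro"]}
)
indexed = users.set_index("user_id")
wanted = [104, 102]

# Label lookup through the index: rows come back in the requested order.
via_index = indexed.loc[wanted]

# Boolean mask on a column: every row is tested, original order is kept.
via_mask = users[users["user_id"].isin(wanted)]

print(via_index.index.tolist())      # [104, 102]
print(via_mask["user_id"].tolist())  # [102, 104]
```

The two results contain the same users, but the mask evaluates a condition against every row of the column, while `loc` resolves each requested label through the index.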

🧠 Conceptual
expert
Why setting an index improves join performance

Why does setting a column as the index improve the performance of joining two DataFrames on that column?

A. Because index lookups are backed by hash tables (or binary search when the index is sorted), which makes finding matching rows faster during joins
B. Because setting an index compresses the data, reducing memory usage during joins
C. Because setting an index sorts the data, and joins require sorted data
D. Because joins only work on index columns, so setting the index is mandatory
💡 Hint

Think about how indexes help find matching rows quickly.
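
For reference, this is roughly what an index-based join looks like next to a column-based merge. It is a sketch on hypothetical data (`orders` and `profiles` are assumed names) that shows the two call styles only; it makes no claim about timings, which depend on data size and pandas version.

```python
import pandas as pd

# Hypothetical tables that share a 'user_id' key.
orders = pd.DataFrame({"user_id": [1, 2, 3], "total": [20, 35, 50]})
profiles = pd.DataFrame({"user_id": [1, 2, 3], "country": ["DE", "US", "JP"]})

# Column-based merge: the key is given by name on every call.
merged = orders.merge(profiles, on="user_id")

# Index-based join: set the key as the index once, then join on it.
orders_idx = orders.set_index("user_id")
profiles_idx = profiles.set_index("user_id")
joined = orders_idx.join(profiles_idx)

print(joined.loc[2, "country"])   # US
print(joined.columns.tolist())    # ['total', 'country']
```

When the same key is reused across several joins, setting it as the index once avoids naming it in each call.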