
Using appropriate dtypes in Pandas - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
Output of memory usage with different dtypes
What does the following code print? It compares the memory usage of two integer columns before and after downcasting. Assume the columns start out as int64 (NumPy's default integer width is platform dependent).
Pandas
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'A': np.random.randint(0, 100, size=1000),
    'B': np.random.randint(0, 100, size=1000)
})

mem_default = df.memory_usage(deep=True, index=False).sum()
df['A'] = df['A'].astype('int8')
df['B'] = df['B'].astype('int8')
mem_optimized = df.memory_usage(deep=True, index=False).sum()
print(mem_default, mem_optimized)
A. 16000 2000
B. 16000 4000
C. 8000 4000
D. 8000 2000
💡 Hint
Think about how changing from default int64 to int8 affects memory size.
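The hint comes down to bytes per element. Below is a minimal sketch, not part of the original challenge, that prints the element size of each fixed-width integer dtype and measures a single column. The name `n_rows` is illustrative, and the array is created with an explicit dtype so the result does not depend on the platform default.

```python
import numpy as np
import pandas as pd

n_rows = 1000  # illustrative size, matching the snippet above

# Bytes used by a single element of each integer dtype
for name in ("int8", "int16", "int32", "int64"):
    print(name, np.dtype(name).itemsize)

# One column of n_rows values occupies n_rows * itemsize bytes (index excluded)
s64 = pd.Series(np.zeros(n_rows, dtype="int64"))
s8 = s64.astype("int8")
bytes64 = s64.memory_usage(index=False)
bytes8 = s8.memory_usage(index=False)
print(bytes64, bytes8)
```

The same arithmetic applies per column, so a DataFrame's total is the sum over its columns.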
Data Output
intermediate
Resulting dtypes after conversion
Given the DataFrame below, what are the column dtypes after the conversion code runs?
Pandas
import pandas as pd

df = pd.DataFrame({
    'col1': [1, 2, 3],
    'col2': [0.1, 0.2, 0.3],
    'col3': ['a', 'b', 'c']
})

df['col1'] = df['col1'].astype('int16')
df['col2'] = df['col2'].astype('float32')
print(df.dtypes)  # show the dtype of every column after conversion
A. col1: int16, col2: float32, col3: object
B. col1: int64, col2: float64, col3: object
C. col1: int16, col2: float64, col3: string
D. col1: int32, col2: float32, col3: object
💡 Hint
Check the astype conversions applied to col1 and col2.
🔧 Debug
advanced
Identify the error in dtype conversion
What error, if any, does this code raise when it converts the 'age' column to 'int8'?
Pandas
import pandas as pd

df = pd.DataFrame({'age': [25, 300, 45]})
df['age'] = df['age'].astype('int8')
A. TypeError
B. ValueError
C. OverflowError
D. No error, conversion succeeds
💡 Hint
Check if values fit in int8 range (-128 to 127).
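The range check in the hint can be done in code before casting. This is a minimal sketch, assuming the same three ages as above; `fits_in` is a hypothetical helper written for this example, not a pandas function. It also shows `pd.to_numeric` with `downcast="integer"`, which selects the smallest signed integer dtype that holds the data.

```python
import numpy as np
import pandas as pd

def fits_in(series: pd.Series, dtype: str) -> bool:
    """Return True if every value lies within the integer dtype's range."""
    info = np.iinfo(dtype)
    return bool(series.min() >= info.min and series.max() <= info.max)

ages = pd.Series([25, 300, 45])

# Representable range of int8
print(np.iinfo("int8").min, np.iinfo("int8").max)

# Check the data against candidate dtypes before calling astype
print(fits_in(ages, "int8"))
print(fits_in(ages, "int16"))

# Let pandas choose the smallest signed integer dtype that holds every value
downcast = pd.to_numeric(ages, downcast="integer")
print(downcast.dtype)
```

Checking first makes the decision explicit instead of relying on how a particular cast treats out-of-range values.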
🧠 Conceptual
advanced
Best dtype for categorical data
Which dtype is the most memory-efficient and appropriate choice for a column of repeated string categories such as 'red', 'blue', 'green'?
A. object
B. category
C. string
D. int64
💡 Hint
Think about how pandas stores repeated categories internally.
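To see the internal storage the hint refers to, here is a small sketch with made-up data: a column of three colour labels repeated many times. The variable names are illustrative. It inspects the `.cat` accessor and compares the deep memory usage of the plain column with the categorical one.

```python
import pandas as pd

# Illustrative data: three labels repeated many times
colors = pd.Series(["red", "blue", "green", "red", "blue", "red"] * 1000)
as_category = colors.astype("category")

# Each distinct label is stored once, in the categories index
categories = list(as_category.cat.categories)
print(categories)

# Each row holds only a small integer code pointing at its label
codes = as_category.cat.codes
print(codes.dtype)
print(codes.head(3).tolist())

# Compare total bytes, including the string payload
plain_bytes = colors.memory_usage(deep=True, index=False)
category_bytes = as_category.memory_usage(deep=True, index=False)
print(plain_bytes, category_bytes)
```

The saving depends on how few distinct values there are relative to the number of rows; a column of mostly unique strings gains little from this layout.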
🚀 Application
expert
Optimize memory usage for mixed dtype DataFrame
Given a DataFrame with columns: 'id' (integers 0-100000), 'score' (floats 0-1), 'grade' (strings 'A', 'B', 'C'), which dtype conversions will minimize memory usage without losing data?
A. id: int8, score: float32, grade: category
B. id: int16, score: float64, grade: object
C. id: int32, score: float32, grade: category
D. id: int64, score: float16, grade: string
💡 Hint
Consider the range of 'id' and precision needed for 'score'.
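Both parts of the hint can be looked up with NumPy's type-information helpers. The sketch below is one way to do it, not the only one: `smallest_signed_int` is a hypothetical helper written for this example, and the ranges passed to it are illustrative.

```python
import numpy as np

def smallest_signed_int(lo: int, hi: int) -> str:
    """Return the narrowest signed integer dtype whose range covers [lo, hi]."""
    for name in ("int8", "int16", "int32", "int64"):
        info = np.iinfo(name)
        if info.min <= lo and hi <= info.max:
            return name
    raise ValueError("range does not fit in a 64-bit signed integer")

# Integer width follows from the value range
print(smallest_signed_int(0, 100))
print(smallest_signed_int(0, 100_000))

# Approximate number of reliable decimal digits for each float dtype
for name in ("float16", "float32", "float64"):
    print(name, np.finfo(name).precision)
```

Whether a narrower float is acceptable depends on how many significant digits the scores actually need, so that part remains a judgment call about the data.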