
Data type optimization in Data Analysis Python - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
Output of memory usage after optimization

Given a pandas DataFrame with two integer columns, what does the code below print after both columns are converted to int8?

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'A': np.random.randint(0, 100, 1000),
    'B': np.random.randint(0, 100, 1000)
})

before = df.memory_usage(deep=True).sum()
df['A'] = df['A'].astype('int8')
df['B'] = df['B'].astype('int8')
after = df.memory_usage(deep=True).sum()
print(after < before)
A. False
B. True
C. TypeError
D. MemoryError
💡 Hint

Think about how changing data types to smaller ones affects memory.
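
The arithmetic behind the hint can be checked directly. The following is a minimal sketch, assuming a column of 1000 integers like the one in the problem; the variable names are illustrative and not part of the original challenge. Each element of an int64 column takes 8 bytes, while each element of an int8 column takes 1 byte.

```python
import numpy as np
import pandas as pd

n_rows = 1000

# Illustrative column: the values do not matter here, only the dtype does.
wide = pd.Series(np.zeros(n_rows, dtype="int64"))
narrow = wide.astype("int8")

# Bytes per element for each dtype.
itemsize_wide = np.dtype("int64").itemsize    # 8
itemsize_narrow = np.dtype("int8").itemsize   # 1

# Memory used by the values alone (index excluded).
bytes_wide = wide.memory_usage(index=False)      # 1000 * 8 bytes
bytes_narrow = narrow.memory_usage(index=False)  # 1000 * 1 byte

print(bytes_wide, bytes_narrow)
```

The index is excluded so that only the column data is compared; `DataFrame.memory_usage()` includes the index by default.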

Data Output
intermediate
Resulting data types after optimization

What are the data types of the columns after applying the following optimization?

import pandas as pd

df = pd.DataFrame({
    'col1': [1, 2, 3],
    'col2': [1000, 2000, 3000],
    'col3': ['a', 'b', 'c']
})

df['col1'] = df['col1'].astype('int8')
df['col2'] = df['col2'].astype('int16')
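# Convert the dtype objects to their string names so the result matches the options
result = df.dtypes.astype(str).to_dict()
print(result)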
A. {'col1': 'int8', 'col2': 'int16', 'col3': 'object'}
B. {'col1': 'int64', 'col2': 'int64', 'col3': 'object'}
C. {'col1': 'float64', 'col2': 'float64', 'col3': 'object'}
D. {'col1': 'int8', 'col2': 'int8', 'col3': 'object'}
💡 Hint

Check the explicit astype conversions.
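
As a side note, the same explicit conversions can be written in a single call by passing a column-to-dtype mapping to `DataFrame.astype`. This is a sketch of an alternative spelling, not the form used in the problem; `converted` and `dtype_names` are illustrative names. Columns not listed in the mapping keep whatever dtype they already had.

```python
import pandas as pd

df = pd.DataFrame({
    "col1": [1, 2, 3],
    "col2": [1000, 2000, 3000],
    "col3": ["a", "b", "c"],
})

# astype with a dict converts only the listed columns and returns a new DataFrame.
converted = df.astype({"col1": "int8", "col2": "int16"})

# Map each column to the string name of its dtype for easy inspection.
dtype_names = {col: str(dtype) for col, dtype in converted.dtypes.items()}
print(dtype_names)
```

The dtype reported for the string column depends on the pandas version in use, so the sketch does not state it.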

🔧 Debug
advanced
Identify the error in data type conversion

What error, if any, does this code raise when converting the column to int8?

import pandas as pd

df = pd.DataFrame({'values': [0, 127, 128, 255]})
df['values'] = df['values'].astype('int8')
A. OverflowError
B. No error, conversion succeeds
C. ValueError
D. TypeError
💡 Hint

Check the range of int8 and the values in the column.
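
The range check suggested by the hint can be done in code before converting, rather than relying on what `astype` does with out-of-range values. The following is a minimal sketch: `fits_in` is a hypothetical helper written for this example, while `np.iinfo` and `pd.to_numeric(..., downcast=...)` are existing NumPy and pandas functions.

```python
import numpy as np
import pandas as pd

def fits_in(series, dtype):
    """Return True if every value in `series` lies within the range of the integer `dtype`."""
    info = np.iinfo(dtype)
    return bool(series.min() >= info.min and series.max() <= info.max)

values = pd.Series([0, 127, 128, 255])

# Representable range of a signed 8-bit integer.
int8_range = (np.iinfo(np.int8).min, np.iinfo(np.int8).max)  # (-128, 127)

fits_int8 = fits_in(values, np.int8)    # 128 and 255 are above the int8 maximum
fits_uint8 = fits_in(values, np.uint8)  # uint8 covers 0 to 255

# Let pandas choose the smallest unsigned integer type that holds the data.
downcast = pd.to_numeric(values, downcast="unsigned")
print(int8_range, fits_int8, fits_uint8, downcast.dtype)
```

Checking first, or using `pd.to_numeric` with `downcast`, keeps the stored values intact because the target type is chosen from the data instead of being forced.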

🚀 Application
advanced
Choosing optimal data types for mixed data

You have a DataFrame with three columns: age (0-120), income (0-1,000,000), and gender ('M' or 'F'). Which set of data types uses the least memory without losing information?

A. {'age': 'int8', 'income': 'int32', 'gender': 'category'}
B. {'age': 'int16', 'income': 'int16', 'gender': 'object'}
C. {'age': 'int8', 'income': 'float64', 'gender': 'object'}
D. {'age': 'int32', 'income': 'int64', 'gender': 'category'}
💡 Hint

Consider the value ranges and best types for categorical data.
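
One way to reason about the value ranges is to compare them with the limits of each integer type and let pandas pick the smallest type that fits. The sketch below uses hypothetical sample rows; only the ranges come from the question. It assumes `pd.to_numeric` with `downcast="integer"` for the numeric columns and `astype("category")` for the repeated labels.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows covering the extremes of each stated range.
df = pd.DataFrame({
    "age": [0, 35, 120],
    "income": [0, 52_000, 1_000_000],
    "gender": ["M", "F", "M"],
})

# Upper limits of the candidate signed integer types.
max_int8 = np.iinfo(np.int8).max    # 127
max_int16 = np.iinfo(np.int16).max  # 32767
max_int32 = np.iinfo(np.int32).max  # 2147483647

optimized = df.copy()
# downcast="integer" selects the smallest signed integer type that holds the values present.
optimized["age"] = pd.to_numeric(df["age"], downcast="integer")
optimized["income"] = pd.to_numeric(df["income"], downcast="integer")
# A column with few distinct labels can be stored as integer codes plus a lookup table.
optimized["gender"] = df["gender"].astype("category")

print(optimized.dtypes)
```

Note that `downcast` looks at the values actually present, not at the range the column is allowed to take, so an explicit `astype` based on the documented range is the safer choice when later data may contain larger values.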

🧠 Conceptual
expert
Impact of data type optimization on performance

Which statement best describes the impact of data type optimization on data processing performance?

A. Data type optimization has no effect on performance, only on memory usage.
B. Smaller data types always speed up computations because less memory is used.
C. Using larger data types improves performance by avoiding overflow errors.
D. Optimizing data types reduces memory usage but can sometimes slow down computations due to type conversions.
💡 Hint

Think about trade-offs between memory and CPU operations.
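
The trade-off is easiest to judge by measuring both sides on your own data. The following is a rough sketch, assuming a synthetic column of 100,000 integers and a simple `sum` as the workload; `best_time` is a hypothetical helper. The memory figures are deterministic, but the timings depend on hardware, library versions and the operation, so no particular speed result is claimed here.

```python
import timeit

import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
values = rng.integers(0, 100, size=100_000)

wide = pd.Series(values, dtype="int64")
narrow = wide.astype("int8")

def best_time(series, number=20, repeat=3):
    """Return the fastest of `repeat` timings of `number` calls to series.sum()."""
    return min(timeit.repeat(series.sum, number=number, repeat=repeat))

# Collect memory (values only, index excluded) and timing for each dtype.
report = {
    name: {"bytes": s.memory_usage(index=False), "seconds": best_time(s)}
    for name, s in (("int64", wide), ("int8", narrow))
}

for name, row in report.items():
    print(f"{name}: {row['bytes']} bytes, {row['seconds']:.6f} s")
```

Swapping `series.sum` for the operation that matters in a real pipeline gives a more representative comparison than any general rule.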