Why data I/O matters in Pandas - Challenge Your Understanding

Challenge - 5 Problems
Predict Output
intermediate
What is the output of this CSV read code?

Given a CSV file with columns name and age, what does this code output?

Pandas
import pandas as pd
from io import StringIO
csv_data = "name,age\nAlice,30\nBob,25"
df = pd.read_csv(StringIO(csv_data))
print(df.head(1))
A
    name  age
0  Alice   30
B
   name  age
1   Bob   25
C
    name   age
0  Alice  30.0
D
    name   age
0  Alice  '30'
💡 Hint

Look at the first row printed by head(1).
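A minimal sketch of how `read_csv` infers column types and how `head(n)` selects rows. The sample data and the names `raw`, `products`, and `first_two` are made up for illustration and are unrelated to the question above.

```python
import pandas as pd
from io import StringIO

# Hypothetical sample data, not the data from the question
raw = "product,price,units\nlamp,19.5,4\ndesk,120.0,1\nchair,45.25,6"

# read_csv infers a dtype per column from the text it parses
products = pd.read_csv(StringIO(raw))
print(products.dtypes)

# head(n) returns the first n rows and keeps their original index labels
first_two = products.head(2)
print(first_two)
```

Printing `dtypes` before relying on a column is a cheap way to confirm that numbers were parsed as numbers rather than as text.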

Data Output
intermediate
How many rows does this DataFrame have after reading JSON?

After reading this JSON string into a DataFrame, how many rows does it contain?

Pandas
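import pandas as pd
from io import StringIO
json_str = '[{"city": "NY", "pop": 8000}, {"city": "LA", "pop": 4000}]'
# Wrap the literal in StringIO: passing a raw JSON string directly is deprecated in recent pandas
df = pd.read_json(StringIO(json_str))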
print(len(df))
A. 0
B. 1
C. 2
D. Raises ValueError
💡 Hint

Count the number of objects in the JSON array.
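A short sketch of how a JSON array of objects maps onto a DataFrame, using made-up inventory data (`json_text` and `inventory` are illustrative names, not part of the question). The string is wrapped in `StringIO` so that `read_json` receives a file-like object.

```python
import pandas as pd
from io import StringIO

# Hypothetical JSON array; each object holds the same two keys
json_text = (
    '[{"sku": "a1", "stock": 5},'
    ' {"sku": "b2", "stock": 0},'
    ' {"sku": "c3", "stock": 9}]'
)

inventory = pd.read_json(StringIO(json_text))

# shape is (rows, columns): compare it with the structure of the JSON text
print(inventory.shape)
print(inventory.columns.tolist())
```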

🔧 Debug
advanced
What happens when this code reads a CSV with the wrong delimiter?

What error, if any, will this code raise?

Pandas
import pandas as pd
from io import StringIO
csv_data = "name;age\nAlice;30\nBob;25"
df = pd.read_csv(StringIO(csv_data))
A. TypeError: expected string or bytes-like object
B. No error, DataFrame with one column
C. ValueError: delimiter not supported
D. ParserError: Error tokenizing data
💡 Hint

What is the default delimiter for read_csv?
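As a sketch of how the separator can be stated explicitly, the example below passes `sep` to `read_csv`. The sample text and the names `raw` and `items` are hypothetical and differ from the question's data.

```python
import pandas as pd
from io import StringIO

# Hypothetical semicolon-separated sample, unrelated to the question's data
raw = "item;qty\npen;3\nbook;7"

# sep tells the parser which character separates the fields
items = pd.read_csv(StringIO(raw), sep=";")

print(items.columns.tolist())
print(items.shape)
```

Checking `columns` and `shape` right after loading is a quick way to notice a separator mismatch before it affects later steps.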

Visualization
advanced
Which plot shows the correct data after reading an Excel file?

Given this Excel data read into a DataFrame, which plot correctly shows the sales column?

Pandas
import pandas as pd
import matplotlib.pyplot as plt
from io import BytesIO

# Simulate an Excel file in memory (to_excel needs an Excel engine such as openpyxl installed)
excel_data = BytesIO()
df_original = pd.DataFrame({'month': ['Jan', 'Feb', 'Mar'], 'sales': [100, 150, 120]})
df_original.to_excel(excel_data, index=False)
excel_data.seek(0)

# Read the Excel data back into a DataFrame
df = pd.read_excel(excel_data)

plt.figure(figsize=(4,3))
plt.plot(df['month'], df['sales'], marker='o')
plt.title('Monthly Sales')
plt.xlabel('Month')
plt.ylabel('Sales')
plt.tight_layout()
plt.savefig('plot.png')
plt.close()
A. Line plot with months Jan, Feb, Mar on x-axis and sales 100, 150, 120 on y-axis
B. Bar plot with months Jan, Feb, Mar on x-axis and sales 100, 150, 120 on y-axis
C. Line plot with months 0, 1, 2 on x-axis and sales 100, 150, 120 on y-axis
D. Scatter plot with sales on x-axis and months on y-axis
💡 Hint

Look at the code: it calls plt.plot with month on the x-axis.

🧠 Conceptual
expert
Why is efficient data I/O critical in data science projects?

Choose the best reason why efficient data input/output (I/O) matters in data science workflows.

A. It allows data scientists to avoid learning programming languages.
B. It ensures data is always stored in CSV format for compatibility.
C. It guarantees that data will never have missing values after loading.
D. It reduces the time spent waiting for data to load or save, speeding up analysis and iteration.
💡 Hint

Think about how data size and format affect workflow speed.
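One way to explore the effect of format on I/O time is to measure it. The sketch below writes and reads the same synthetic DataFrame as CSV and as pickle, timing each step. The data size, the file names, and the helper `timed` are arbitrary choices for illustration, and the timings will vary by machine, so no particular numbers should be expected.

```python
import tempfile
import time
from pathlib import Path

import numpy as np
import pandas as pd

# Synthetic data; the size is an arbitrary choice for illustration
rng = np.random.default_rng(0)
frame = pd.DataFrame(rng.random((50_000, 5)), columns=list("abcde"))


def timed(action):
    """Run action() and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = action()
    return result, time.perf_counter() - start


with tempfile.TemporaryDirectory() as tmp:
    csv_path = Path(tmp) / "frame.csv"
    pkl_path = Path(tmp) / "frame.pkl"

    # Write the same data in two formats, then read each one back
    _, csv_write = timed(lambda: frame.to_csv(csv_path, index=False))
    _, pkl_write = timed(lambda: frame.to_pickle(pkl_path))
    from_csv, csv_read = timed(lambda: pd.read_csv(csv_path))
    from_pkl, pkl_read = timed(lambda: pd.read_pickle(pkl_path))

print(f"CSV    write {csv_write:.3f}s  read {csv_read:.3f}s")
print(f"pickle write {pkl_write:.3f}s  read {pkl_read:.3f}s")
```

Rerunning the script with a larger row count shows how the gap between formats changes as the data grows.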