Given a CSV file with columns name and age, what does this code output?
import pandas as pd
from io import StringIO

csv_data = "name,age\nAlice,30\nBob,25"
df = pd.read_csv(StringIO(csv_data))
print(df.head(1))
Look at the first row printed by head(1).
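The head(1) method returns the first row of the DataFrame (Alice, 30), and print displays it with the column header and the index label 0. The age column is parsed as an integer, so the value appears without a decimal point or quotes.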
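The explanation above can be checked with a short sketch. It reuses the CSV data from the question; the variable name first_row is introduced here only for illustration.

```python
import pandas as pd
from io import StringIO

csv_data = "name,age\nAlice,30\nBob,25"
df = pd.read_csv(StringIO(csv_data))

# head(1) returns a new one-row DataFrame; it does not print anything itself
first_row = df.head(1)
print(first_row)

# Inspect the inferred column types: age should be an integer type
print(df.dtypes)
```

Checking df.dtypes is a quick way to confirm how read_csv interpreted each column before relying on the printed output.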
After reading this JSON string into a DataFrame, how many rows does it contain?
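import pandas as pd
from io import StringIO

json_str = '[{"city": "NY", "pop": 8000}, {"city": "LA", "pop": 4000}]'
# Wrap the string in StringIO: passing a literal JSON string directly is deprecated in recent pandas versions
df = pd.read_json(StringIO(json_str))
print(len(df))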
Count the number of objects in the JSON array.
The JSON string contains two objects, so the DataFrame has two rows.
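As a rough cross-check of that reasoning, the sketch below counts the objects with the standard json module and compares the count with the DataFrame length. The names records and row_count are illustrative and not part of the original question.

```python
import json
from io import StringIO

import pandas as pd

json_str = '[{"city": "NY", "pop": 8000}, {"city": "LA", "pop": 4000}]'

# Parse the raw JSON to count the objects in the top-level array
records = json.loads(json_str)

# Each object in the array becomes one row of the DataFrame
df = pd.read_json(StringIO(json_str))
row_count = len(df)

print(len(records), row_count)
```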
What error will this code raise?
import pandas as pd
from io import StringIO

csv_data = "name;age\nAlice;30\nBob;25"
df = pd.read_csv(StringIO(csv_data))
What is the default delimiter for read_csv?
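No error is raised. The default delimiter for read_csv is a comma, and this data contains none, so pandas treats each line as a single field. The resulting DataFrame has one column named "name;age", and each row holds the whole line as a string (for example "Alice;30").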
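A minimal sketch of the difference, assuming the same semicolon-separated data as in the question. The names df_default and df_semicolon are introduced here for illustration.

```python
import pandas as pd
from io import StringIO

csv_data = "name;age\nAlice;30\nBob;25"

# Default delimiter is a comma, so each line is read as one field
df_default = pd.read_csv(StringIO(csv_data))
print(df_default.columns.tolist())  # ['name;age']

# Passing sep=";" splits the lines into the intended two columns
df_semicolon = pd.read_csv(StringIO(csv_data), sep=";")
print(df_semicolon.columns.tolist())  # ['name', 'age']
```

Because the mismatch does not raise an error, checking the column names or shape after loading is the practical way to catch a wrong delimiter.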
Given this Excel data read into a DataFrame, which plot correctly shows the sales column?
import pandas as pd
import matplotlib.pyplot as plt
from io import BytesIO
import numpy as np

# Simulate Excel file
excel_data = BytesIO()
df_original = pd.DataFrame({'month': ['Jan', 'Feb', 'Mar'], 'sales': [100, 150, 120]})
df_original.to_excel(excel_data, index=False)
excel_data.seek(0)

# Read Excel
df = pd.read_excel(excel_data)

plt.figure(figsize=(4,3))
plt.plot(df['month'], df['sales'], marker='o')
plt.title('Monthly Sales')
plt.xlabel('Month')
plt.ylabel('Sales')
plt.tight_layout()
plt.savefig('plot.png')
plt.close()
Look at the code: it uses plt.plot with month on x-axis.
The code creates a line plot with months as x-axis labels and sales as y-axis values.
Choose the best reason why efficient data input/output (I/O) matters in data science workflows.
Think about how data size and format affect workflow speed.
Efficient data I/O reduces waiting time, enabling faster data analysis and model development. Other options are incorrect because data format choice depends on context, missing values depend on data quality, and programming skills are still needed.