0
0
Pandasdata~20 mins

Reading Excel files with read_excel in Pandas - Practice Problems & Coding Challenges

Choose your learning style9 modes available
Challenge - 5 Problems
🎖️
Excel Reader Master
Get all challenges correct to earn this badge!
Test your skills under time pressure!
Predict Output
intermediate
2:00remaining
Output of reading specific sheet from Excel
What is the output of this code snippet when reading the 'Sales' sheet from an Excel file?
Pandas
import pandas as pd
from io import BytesIO
import numpy as np

# Create a sample Excel file in memory
excel_file = BytesIO()
with pd.ExcelWriter(excel_file, engine='xlsxwriter') as writer:
    pd.DataFrame({'Month': ['Jan', 'Feb'], 'Sales': [100, 150]}).to_excel(writer, sheet_name='Sales', index=False)
    pd.DataFrame({'Month': ['Jan', 'Feb'], 'Expenses': [80, 90]}).to_excel(writer, sheet_name='Expenses', index=False)

excel_file.seek(0)

# Read only the 'Sales' sheet
sales_df = pd.read_excel(excel_file, sheet_name='Sales')
print(sales_df)
A
  Month  Sales
0   Jan    100
1   Feb    150
B
  Month  Expenses
0   Jan        80
1   Feb        90
C
Empty DataFrame
Columns: []
Index: []
DRaises FileNotFoundError
Attempts:
2 left
💡 Hint
Check which sheet is being read by the read_excel function.
data_output
intermediate
2:00remaining
Number of rows read from Excel with skiprows
Given an Excel file with 5 rows of data, what is the number of rows in the DataFrame after reading with skiprows=2?
Pandas
import pandas as pd
from io import BytesIO

# Create Excel file with 5 rows
excel_file = BytesIO()
data = {'A': range(5), 'B': range(5, 10)}
df = pd.DataFrame(data)
with pd.ExcelWriter(excel_file, engine='xlsxwriter') as writer:
    df.to_excel(writer, index=False)
excel_file.seek(0)

# Read Excel skipping first 2 rows
read_df = pd.read_excel(excel_file, skiprows=2)
print(len(read_df))
A3
B2
C5
D0
Attempts:
2 left
💡 Hint
skiprows skips the first rows before reading data.
🔧 Debug
advanced
2:00remaining
Identify the error when reading Excel with wrong engine
What error will this code raise when trying to read an Excel file but the file is actually a CSV?
Pandas
import pandas as pd
from io import StringIO

csv_data = "A,B\n1,2\n3,4"
csv_file = StringIO(csv_data)

# Attempt to read CSV data as Excel
pd.read_excel(csv_file)
AXLRDError: Unsupported format, or corrupt file
BFileNotFoundError
CValueError: Excel file format cannot be determined, you must specify an engine manually.
DTypeError: read_excel() got an unexpected keyword argument 'engine'
Attempts:
2 left
💡 Hint
The input is not an Excel file but read_excel expects one.
visualization
advanced
2:00remaining
Plotting data read from Excel
After reading this Excel sheet, what will the plot show?
Pandas
import pandas as pd
import matplotlib.pyplot as plt
from io import BytesIO

excel_file = BytesIO()
with pd.ExcelWriter(excel_file, engine='xlsxwriter') as writer:
    pd.DataFrame({'Year': [2020, 2021, 2022], 'Revenue': [1000, 1500, 2000]}).to_excel(writer, index=False)
excel_file.seek(0)

df = pd.read_excel(excel_file)
plt.plot(df['Year'], df['Revenue'])
plt.xlabel('Year')
plt.ylabel('Revenue')
plt.title('Yearly Revenue')
plt.grid(True)
plt.show()
AA bar chart with Year on x-axis and Revenue values
BA line plot with Year on x-axis and Revenue increasing from 1000 to 2000
CA scatter plot with random points
DNo plot is shown due to error
Attempts:
2 left
💡 Hint
Look at the plt.plot function and the data columns used.
🧠 Conceptual
expert
2:00remaining
Effect of usecols parameter in read_excel
If an Excel file has columns 'A', 'B', 'C', 'D', what columns will be read with usecols='B:D'?
AColumns 'C' and 'D' only
BColumns 'A', 'B', and 'C'
COnly column 'B'
DColumns 'B', 'C', and 'D'
Attempts:
2 left
💡 Hint
usecols='B:D' means columns from B to D inclusive.