
Exploratory Data Analysis (EDA) template in Data Analysis Python - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
What is the output of this code snippet for missing value count?

Given a DataFrame df with some missing values, what does the following code output?

import pandas as pd
import numpy as np
data = {'A': [1, 2, np.nan, 4], 'B': [np.nan, 2, 3, 4]}
df = pd.DataFrame(data)
missing_counts = df.isnull().sum()
print(missing_counts)
A.
A    2
B    2
dtype: int64
B.
A    0
B    0
dtype: int64
C.
A    1
B    2
dtype: int64
D.
A    1
B    1
dtype: int64
💡 Hint

isnull() marks each missing value as True; sum() then counts those True values per column.
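
The two steps in the hint can be tried on a separate DataFrame. This is a minimal sketch: the `scores` frame and its column names are hypothetical and chosen so they do not reproduce the question's data.

```python
import numpy as np
import pandas as pd

# Hypothetical data, unrelated to the question's df.
scores = pd.DataFrame({
    "math": [90, np.nan, 75],
    "art": [np.nan, np.nan, 60],
})

# isnull() yields a boolean frame: True where a value is missing.
mask = scores.isnull()

# sum() adds up each column, counting every True as 1.
missing_per_column = mask.sum()
print(missing_per_column)
```

The result is a Series indexed by column name, which is why the printed output ends with a `dtype` line.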

Data Output
intermediate
What is the shape of the DataFrame after filtering?

Consider the DataFrame df below. After keeping only the rows where 'Age' is greater than 30, what is the shape of the resulting DataFrame?

import pandas as pd
data = {'Name': ['Alice', 'Bob', 'Charlie', 'David'], 'Age': [25, 35, 30, 40]}
df = pd.DataFrame(data)
filtered_df = df[df['Age'] > 30]
print(filtered_df.shape)
A. (3, 2)
B. (2, 2)
C. (1, 2)
D. (4, 2)
💡 Hint

Count rows where 'Age' > 30.
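
As a rough illustration of how boolean filtering affects shape, here is a sketch on a hypothetical `pets` frame (the names and values are invented and differ from the question's data).

```python
import pandas as pd

# Hypothetical data, unrelated to the question's df.
pets = pd.DataFrame({"name": ["Rex", "Tom", "Kiwi"], "weight_kg": [30, 4, 1]})

# The comparison builds a boolean Series with one entry per row.
is_heavy = pets["weight_kg"] > 10

# Indexing with that Series keeps only the rows marked True; columns are untouched.
heavy_pets = pets[is_heavy]

# shape is (number of rows, number of columns).
print(heavy_pets.shape)
```

Note that `>` is a strict comparison, so a row whose value equals the threshold is not kept.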

Visualization
advanced
Which plot shows the distribution of 'Salary' correctly?

You have a DataFrame df with a 'Salary' column. Which code snippet produces a histogram of 'Salary' with 10 bins?

A. df['Salary'].hist(bins=10)
B. df.hist(column='Salary', bins=10)
C. df['Salary'].plot(kind='hist', bins=10)
D. All of the above
💡 Hint

All these methods create histograms with 10 bins.
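
One way to check the hint is to draw each call on its own axes and count the bars. The sketch below assumes matplotlib is installed; the salary values are hypothetical, and the `Agg` backend is chosen only so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption for headless runs
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical salary data for illustration.
df = pd.DataFrame({"Salary": [30, 35, 40, 42, 50, 55, 61, 70, 85, 120]})

# One axes per call so the three histograms do not overlap.
fig, axes = plt.subplots(1, 3, figsize=(12, 3))
df["Salary"].hist(bins=10, ax=axes[0])
df.hist(column="Salary", bins=10, ax=axes[1])
df["Salary"].plot(kind="hist", bins=10, ax=axes[2])

# Each histogram bar is a rectangle patch on its axes.
bar_counts = [len(ax.patches) for ax in axes]
print(bar_counts)
```

The calls differ mainly in defaults such as grid lines and titles, not in how the bins are computed.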

🔧 Debug
advanced
What error does this code raise?

What error occurs when running this code?

import pandas as pd
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
df = pd.DataFrame(data)
mean_val = df['C'].mean()
A. KeyError
B. AttributeError
C. TypeError
D. No error, returns NaN
💡 Hint

Check if column 'C' exists in the DataFrame.
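
The check suggested in the hint could look like the sketch below. `safe_mean` is a hypothetical helper written for this example, not a pandas function; it tests for the column before touching it.

```python
import pandas as pd


def safe_mean(frame, column):
    """Hypothetical helper: mean of `column`, or None if the column is absent."""
    if column not in frame.columns:
        return None
    return frame[column].mean()


df = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6]})

print(safe_mean(df, "A"))  # column exists, so its mean is computed
print(safe_mean(df, "C"))  # column is missing, so the helper returns None
```

Returning None is just one design choice; raising a clearer error message or skipping the column are equally reasonable in an EDA script.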

🚀 Application
expert
Which code snippet correctly calculates the correlation matrix?

Given a DataFrame df with numeric columns, which code correctly computes the correlation matrix?

import pandas as pd
data = {'X': [1, 2, 3, 4], 'Y': [4, 3, 2, 1], 'Z': [10, 20, 30, 40]}
df = pd.DataFrame(data)
A.
corr_matrix = df.corr()
print(corr_matrix)
B.
corr_matrix = pd.corr(df)
print(corr_matrix)
C.
corr_matrix = df.corrcoef()
print(corr_matrix)
D.
corr_matrix = df.correlation()
print(corr_matrix)
💡 Hint

Use the pandas method for correlation matrix.
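
To see what a correlation matrix contains, independent of which pandas call the question is after, the sketch below computes one with NumPy's `np.corrcoef`. The columns `P`, `Q`, and `R` are hypothetical and do not come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: Q rises with P, R falls as P rises.
frame = pd.DataFrame({"P": [1, 2, 3, 4], "Q": [2, 4, 6, 8], "R": [4, 3, 2, 1]})

# rowvar=False tells NumPy that each column (not each row) is a variable.
corr = np.corrcoef(frame.to_numpy(), rowvar=False)

# Wrap the array so rows and columns carry the variable names.
corr_table = pd.DataFrame(corr, index=frame.columns, columns=frame.columns)
print(corr_table.round(2))
```

The matrix is symmetric with ones on the diagonal, since every variable correlates perfectly with itself; off-diagonal entries range from -1 to 1.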