Challenge - 5 Problems

Exploratory Inspection Mastery

🧠 Conceptual

intermediate

Why is exploratory data inspection important before analysis?

Which of the following best explains why we perform exploratory data inspection before starting formal analysis?

A. To finalize conclusions without looking at the data distribution.

B. To understand data quality, spot missing values, and detect unusual patterns that affect analysis.

C. To immediately apply machine learning models without checking data.

D. To skip data cleaning and jump directly to visualization.
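
As a rough illustration of what a first inspection pass can look like in pandas, the sketch below builds a small made-up DataFrame and checks its shape, column types, and missing-value counts. The column names and values are hypothetical and are not part of the challenge.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only.
df = pd.DataFrame({
    "age": [25, 30, None, 40, 28],
    "region": ["north", "south", "south", None, "north"],
})

n_rows, n_cols = df.shape          # size of the table
column_types = df.dtypes           # inferred type of each column
missing_counts = df.isna().sum()   # number of missing entries per column

print(n_rows, n_cols)
print(column_types)
print(missing_counts)
```

Checks like these are cheap, so they are usually worth running before any modelling or plotting step.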

❓ data_output

intermediate

Output of basic exploratory commands on a dataset

Given the dataset below, what is the output of df.describe()?

import pandas as pd
data = {'age': [25, 30, 22, 40, 28], 'income': [50000, 60000, 45000, 80000, 52000]}
df = pd.DataFrame(data)
print(df.describe())

A.

             age        income
count   5.000000      5.000000
mean   29.000000  57400.000000
std     6.855655  13740.451230
min    22.000000  45000.000000
25%    25.000000  50000.000000
50%    28.000000  52000.000000
75%    30.000000  60000.000000
max    40.000000  80000.000000

B.

             age        income
count   4.000000      4.000000
mean   29.000000  57400.000000
std     6.855655  13740.451230
min    22.000000  45000.000000
25%    25.000000  50000.000000
50%    28.000000  52000.000000
75%    30.000000  60000.000000
max    40.000000  80000.000000

C.

             age        income
count   5.000000      5.000000
mean   30.000000  57400.000000
std     6.855655  13740.451230
min    22.000000  45000.000000
25%    25.000000  50000.000000
50%    28.000000  52000.000000
75%    30.000000  60000.000000
max    40.000000  80000.000000

D.

             age        income
count   5.000000      5.000000
mean   29.000000  57000.000000
std     6.855655  13740.451230
min    22.000000  45000.000000
25%    25.000000  50000.000000
50%    28.000000  52000.000000
75%    30.000000  60000.000000
max    40.000000  80000.000000
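
One way to sanity-check a summary table like these is to recompute the mean and standard deviation by hand. The sketch below uses only the standard library; `statistics.stdev` divides by n - 1, which is the same sample convention pandas applies to `std` by default. The variable names are illustrative.

```python
import statistics

ages = [25, 30, 22, 40, 28]
incomes = [50000, 60000, 45000, 80000, 52000]

# statistics.stdev is the sample standard deviation (divides by n - 1),
# matching the default ddof=1 used by pandas for std.
age_mean = statistics.mean(ages)
age_std = statistics.stdev(ages)
income_mean = statistics.mean(incomes)
income_std = statistics.stdev(incomes)

print(age_mean, round(age_std, 6))
print(income_mean, round(income_std, 6))
```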

❓ visualization

advanced

Identifying outliers with boxplot visualization

Which boxplot below correctly shows an outlier in the dataset [10, 12, 12, 13, 14, 15, 100]?

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

data = [10, 12, 12, 13, 14, 15, 100]
df = pd.DataFrame({'values': data})
sns.boxplot(x='values', data=df)
plt.show()

A. A boxplot with whiskers extending beyond 100.

B. A boxplot showing multiple outliers below 10.

C. A boxplot with a single point far beyond the upper whisker, at 100, indicating an outlier.

D. A boxplot with no points outside the whiskers, with all data inside the range 10 to 100.
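
To reason about which points a boxplot draws as individual markers, it can help to compute the whisker limits directly. The sketch below assumes the common 1.5 × IQR rule, which is the default whisker setting in matplotlib and seaborn, and uses linearly interpolated quartiles from the standard library. The names `lower_fence` and `upper_fence` are illustrative.

```python
import statistics

data = [10, 12, 12, 13, 14, 15, 100]

# Quartiles with linear interpolation over the full range of the data.
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Points farther than 1.5 * IQR from the quartiles fall outside the whiskers.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print(q1, q3, iqr)
print(lower_fence, upper_fence)
print(outliers)
```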

🔧 Debug

advanced

Error in inspecting missing data with pandas

What error does the following code produce?

import pandas as pd
data = {'A': [1, 2, None], 'B': [4, None, 6]}
df = pd.DataFrame(data)
print(df.isnull().sum())
print(df.missing())

A. No error; prints the counts of missing values.

B. TypeError: unsupported operand type(s) for +: 'int' and 'NoneType'

C. KeyError: 'missing'

D. AttributeError: 'DataFrame' object has no attribute 'missing'
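
For reference, the sketch below shows pandas calls that do exist for inspecting missing data, plus a generic way to check whether an object exposes a given method before calling it. Using `hasattr` here is only an illustration, not something the challenge code does.

```python
import pandas as pd

df = pd.DataFrame({"A": [1, 2, None], "B": [4, None, 6]})

# isna() and isnull() are aliases; both return a boolean mask of missing cells.
missing_per_column = df.isna().sum()
same_result = df.isna().equals(df.isnull())

# hasattr() reports whether a method or attribute exists on the object.
has_isnull = hasattr(df, "isnull")
has_missing = hasattr(df, "missing")

print(missing_per_column)
print(same_result, has_isnull, has_missing)
```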

🚀 Application

expert

Choosing next steps after exploratory inspection

You have a dataset with many missing values in some columns and a few extreme outliers in others. After exploratory inspection, what is the best next step?

A. Remove or impute missing values, and consider transforming or removing outliers, before analysis.

B. Ignore missing values and outliers and proceed with the analysis as is.

C. Only visualize the data, without any cleaning or transformation.

D. Delete the entire dataset and start over with new data.
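
As a minimal sketch of what handling missing values and outliers can look like, the example below imputes a missing value with the column median and caps extreme values at the 1.5 × IQR fences. The data, the `income_capped` column name, and the choice of median imputation and capping are all assumptions made for illustration; the right treatment depends on the dataset and the analysis.

```python
import pandas as pd

# Hypothetical toy data: one missing income and one extreme value.
df = pd.DataFrame({
    "age": [25, 30, 22, 40, 28],
    "income": [50000, None, 45000, 900000, 52000],
})

# Impute the missing income with the column median, which is less
# sensitive to the extreme value than the mean would be.
median_income = df["income"].median()
df["income"] = df["income"].fillna(median_income)

# Cap values outside the 1.5 * IQR fences instead of dropping the rows.
q1, q3 = df["income"].quantile([0.25, 0.75])
iqr = q3 - q1
df["income_capped"] = df["income"].clip(lower=q1 - 1.5 * iqr, upper=q3 + 1.5 * iqr)

print(df)
```

Keeping the capped values in a separate column leaves the original data available for comparison.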
