Challenge - 5 Problems

NumPy-Pandas Integration Master

❓ Predict Output

intermediate

Output of NumPy array used in a Pandas DataFrame

What is the output of the following code snippet?

import numpy as np
import pandas as pd

arr = np.array([10, 20, 30, 40])
df = pd.DataFrame({'values': arr})
print(df['values'] * 2)

A.
0    20
1    40
2    60
3    80
Name: values, dtype: int64

B. [20 40 60 80]

C. [10 20 30 40]

D. TypeError: unsupported operand type(s) for *: 'DataFrame' and 'int'
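To check a prediction for this question, the snippet can be run with a few extra inspections. This is a minimal sketch; the name `doubled` is illustrative and not part of the original code. The exact integer dtype that gets printed can vary by platform and NumPy version.

```python
import numpy as np
import pandas as pd

arr = np.array([10, 20, 30, 40])
df = pd.DataFrame({'values': arr})

# Selecting a single column yields a Series, so `* 2` is applied
# element-wise and the result keeps the index and the column name.
doubled = df['values'] * 2  # `doubled` is a hypothetical helper name

print(type(doubled).__name__)
print(doubled.tolist())
print(doubled.name)
```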

❓ Data Output

intermediate

Shape of NumPy array extracted from DataFrame

Given the code below, what is the shape of the NumPy array 'arr'?

import pandas as pd
import numpy as np

df = pd.DataFrame({'A': range(3), 'B': range(3, 6)})
arr = df.to_numpy()

A. (3, 2)

B. (2, 3)

C. (3,)

D. (2,)
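A quick way to reason about the shape is that `to_numpy()` on a DataFrame lays out rows along the first axis and columns along the second, while a single column converts to a one-dimensional array. The sketch below assumes the same DataFrame as the question; `first_col` is an illustrative name.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': range(3), 'B': range(3, 6)})

# Whole DataFrame: one row per index entry, one column per DataFrame column.
arr = df.to_numpy()
print(arr.shape)

# A single column is a Series, which converts to a 1-D array.
first_col = df['A'].to_numpy()  # hypothetical helper name
print(first_col.shape)
```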

❓ Visualization

advanced

Visualizing correlation matrix using NumPy and Pandas

Which option correctly produces a heatmap visualization of the correlation matrix of the DataFrame 'df' using NumPy and Pandas?

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

df = pd.DataFrame({
    'X': np.random.randn(100),
    'Y': np.random.randn(100),
    'Z': np.random.randn(100)
})

A.
corr = df.corr()
plt.plot(corr)
plt.show()

B.
corr = np.corrcoef(df)
plt.imshow(corr, cmap='coolwarm')
plt.colorbar()
plt.show()

C.
corr = df.corr().values
plt.scatter(corr)
plt.show()

D.
corr = df.corr().to_numpy()
plt.imshow(corr, cmap='coolwarm')
plt.colorbar()
plt.show()
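When comparing the options, it may help to look at the shape of each candidate matrix before plotting it. `DataFrame.corr()` correlates columns, whereas `np.corrcoef` treats each row as a variable unless `rowvar=False` is passed. The sketch below skips the plotting step; the seeded generator and the names `by_columns`, `by_rows`, and `by_columns_np` are assumptions added for illustration.

```python
import numpy as np
import pandas as pd

# A seeded generator is used here only to make the sketch reproducible.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    'X': rng.standard_normal(100),
    'Y': rng.standard_normal(100),
    'Z': rng.standard_normal(100),
})

# pandas correlates columns: one row and one column per variable.
by_columns = df.corr().to_numpy()

# np.corrcoef defaults to rowvar=True, so each of the 100 rows
# is treated as a variable with only 3 observations.
by_rows = np.corrcoef(df.to_numpy())

# Passing rowvar=False makes NumPy correlate columns instead.
by_columns_np = np.corrcoef(df.to_numpy(), rowvar=False)

print(by_columns.shape, by_rows.shape, by_columns_np.shape)
```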

🔧 Debug

advanced

Identify the error in NumPy array assignment from DataFrame

What error will the following code produce?

import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
arr = np.array(df['A', 'B'])

A. IndexError: too many indices for array

B. KeyError: ('A', 'B')

C. No error, arr is a NumPy array with columns A and B

D. TypeError: unhashable type: 'slice'
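The behavior can be confirmed by catching the exception, and the usual way to select several columns is to pass a list inside the brackets. This is a minimal sketch; `error_name` is an illustrative name, and the double-bracket selection is shown as one common fix rather than the only one.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

# df['A', 'B'] looks up the single key ('A', 'B'), which is not a column.
error_name = None  # hypothetical helper name
try:
    np.array(df['A', 'B'])
except KeyError as exc:
    error_name = type(exc).__name__
    print(error_name, exc)

# Passing a list of labels selects both columns as a DataFrame.
arr = np.array(df[['A', 'B']])
print(arr.shape)
```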

🚀 Application

expert

Efficiently replace NaN values in DataFrame using NumPy

You have a DataFrame 'df' with some NaN values. Which option efficiently replaces all NaNs with the mean of their respective columns using NumPy?

import pandas as pd
import numpy as np

df = pd.DataFrame({
    'A': [1, np.nan, 3],
    'B': [4, 5, np.nan]
})

A.
for col in df.columns:
    df[col].fillna(df[col].mean(), inplace=True)

B. df.fillna(df.mean(), inplace=True)

C.
arr = df.to_numpy()
col_means = np.nanmean(arr, axis=0)
inds = np.where(np.isnan(arr))
arr[inds] = np.take(col_means, inds[1])
df.loc[:, :] = arr

D.
arr = df.values
arr[np.isnan(arr)] = 0
df[:] = arr
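The NumPy-based column-mean fill can be sketched as below. Two choices here are assumptions rather than part of the question: `to_numpy(copy=True)` is used so the array is always safe to modify in place, and the result is written to a new DataFrame named `filled` instead of back into `df`.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [1, np.nan, 3],
    'B': [4, 5, np.nan]
})

# Work on an explicit copy so the original DataFrame is left untouched.
arr = df.to_numpy(copy=True)

# Per-column means that ignore NaN entries.
col_means = np.nanmean(arr, axis=0)

# Row and column positions of every NaN.
rows, cols = np.where(np.isnan(arr))

# Replace each NaN with the mean of the column it sits in.
arr[rows, cols] = col_means[cols]

# `filled` is a hypothetical name for the resulting DataFrame.
filled = pd.DataFrame(arr, columns=df.columns, index=df.index)
print(filled)
```

The result can be compared against `df.fillna(df.mean())`, which performs the same column-wise fill using pandas alone.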
