
NumPy with Pandas integration - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
Output of NumPy array used in a Pandas DataFrame
What is the output of the following code snippet?
NumPy
import numpy as np
import pandas as pd

arr = np.array([10, 20, 30, 40])
df = pd.DataFrame({'values': arr})
print(df['values'] * 2)
A
0    20
1    40
2    60
3    80
Name: values, dtype: int64
B
[20 40 60 80]
C
[10 20 30 40]
D
TypeError: unsupported operand type(s) for *: 'DataFrame' and 'int'
💡 Hint
Remember that multiplying a Pandas Series by a number multiplies each element.
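To make the hint concrete, here is a minimal sketch contrasting the same multiplication on a pandas Series and on the underlying NumPy array. The variable names `doubled_series` and `doubled_array` are illustrative and not part of the original problem.

```python
import numpy as np
import pandas as pd

arr = np.array([10, 20, 30, 40])
df = pd.DataFrame({"values": arr})

# Multiplying a Series is element-wise and keeps the index and the column name.
doubled_series = df["values"] * 2

# Multiplying the raw array is also element-wise, but there is no index or name.
doubled_array = arr * 2

print(type(doubled_series).__name__)  # Series
print(doubled_series.name)            # values
print(doubled_array)                  # [20 40 60 80]
```

The printed form of a Series includes its index, name, and dtype, which is what separates it from the bare array output.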
Data Output
intermediate
Shape of NumPy array extracted from DataFrame
Given the code below, what is the shape of the NumPy array 'arr'?
NumPy
import pandas as pd
import numpy as np

df = pd.DataFrame({'A': range(3), 'B': range(3, 6)})
arr = df.to_numpy()
A
(3, 2)
B
(2, 3)
C
(3,)
D
(2,)
💡 Hint
Check how many rows and columns the DataFrame has.
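As a rough illustration of the hint, the sketch below compares `df.shape` with the shape of the converted array, and shows how a single column or a transpose would differ. The names `col` and `transposed` are added here for illustration only.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": range(3), "B": range(3, 6)})

# to_numpy() on a DataFrame keeps the (rows, columns) layout.
arr = df.to_numpy()
print(df.shape)   # (3, 2)
print(arr.shape)  # (3, 2)

# A single column converts to a 1-D array instead.
col = df["A"].to_numpy()
print(col.shape)  # (3,)

# Transposing swaps rows and columns.
transposed = arr.T
print(transposed.shape)  # (2, 3)
```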
Visualization
advanced
Visualizing correlation matrix using NumPy and Pandas
Which option correctly produces a heatmap visualization of the correlation matrix of the DataFrame 'df' using NumPy and Pandas?
NumPy
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

df = pd.DataFrame({
    'X': np.random.randn(100),
    'Y': np.random.randn(100),
    'Z': np.random.randn(100)
})
A
corr = df.corr()
plt.plot(corr)
plt.show()
B
corr = np.corrcoef(df)
plt.imshow(corr, cmap='coolwarm')
plt.colorbar()
plt.show()
C
corr = df.corr().values
plt.scatter(corr)
plt.show()
D
corr = df.corr().to_numpy()
plt.imshow(corr, cmap='coolwarm')
plt.colorbar()
plt.show()
💡 Hint
Use the DataFrame's built-in correlation method and a heatmap for visualization.
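One detail worth checking before plotting is which matrix is being computed. The sketch below omits the plotting calls, uses a fixed seed (an assumption for reproducibility, not part of the original problem), and compares `DataFrame.corr()` with `np.corrcoef` under both settings of `rowvar`.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed: an assumption, for repeatable results
df = pd.DataFrame({
    "X": rng.standard_normal(100),
    "Y": rng.standard_normal(100),
    "Z": rng.standard_normal(100),
})

# pandas correlates columns with each other: one row/column per variable.
corr_pandas = df.corr().to_numpy()

# np.corrcoef treats each ROW as a variable by default, so a (100, 3) input
# yields a 100 x 100 matrix of row-to-row correlations.
corr_numpy_rows = np.corrcoef(df.to_numpy())

# With rowvar=False, columns are the variables and the result matches pandas.
corr_numpy_cols = np.corrcoef(df.to_numpy(), rowvar=False)

print(corr_pandas.shape)      # (3, 3)
print(corr_numpy_rows.shape)  # (100, 100)
print(np.allclose(corr_pandas, corr_numpy_cols))  # True
```

Either 3 x 3 array could then be passed to `plt.imshow` to draw the heatmap.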
🔧 Debug
advanced
Identify the error in NumPy array assignment from DataFrame
What error will the following code produce?
NumPy
import pandas as pd
import numpy as np

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
arr = np.array(df['A', 'B'])
A
IndexError: too many indices for array
B
KeyError: ('A', 'B')
C
No error, arr is a NumPy array with columns A and B
D
TypeError: unhashable type: 'slice'
💡 Hint
Check how columns are accessed in a DataFrame.
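The difference between `df['A', 'B']` and `df[['A', 'B']]` can be sketched as follows. With a flat (non-MultiIndex) column index, the first form looks up the tuple `('A', 'B')` as a single column label, while the second selects a list of columns. The name `caught` is illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# df['A', 'B'] is the same as df[('A', 'B')]: a lookup of one tuple-valued label.
caught = None
try:
    np.array(df["A", "B"])
except KeyError as exc:
    caught = exc  # keep a reference; `exc` is cleared when the block ends
    print("KeyError raised for label:", exc)

# Passing a list of labels selects both columns.
arr = df[["A", "B"]].to_numpy()
print(arr.shape)  # (2, 2)
```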
🚀 Application
expert
Efficiently replace NaN values in DataFrame using NumPy
You have a DataFrame 'df' with some NaN values. Which option efficiently replaces all NaNs with the mean of their respective columns using NumPy?
NumPy
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'A': [1, np.nan, 3],
    'B': [4, 5, np.nan]
})
A
for col in df.columns:
    df[col].fillna(df[col].mean(), inplace=True)
B
df.fillna(df.mean(), inplace=True)
C
arr = df.to_numpy(copy=True)  # explicit copy so the array is writable
col_means = np.nanmean(arr, axis=0)
inds = np.where(np.isnan(arr))
arr[inds] = np.take(col_means, inds[1])
df.loc[:, :] = arr
D
arr = df.values
arr[np.isnan(arr)] = 0
df[:] = arr
💡 Hint
Use NumPy functions to compute means and replace NaNs in the array, then assign back.
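As one possible way to follow the hint, the sketch below computes per-column means with `np.nanmean` and fills the gaps with `np.where`, relying on broadcasting instead of index arrays. It builds a new DataFrame (`result`, a name introduced here) rather than assigning back in place, which is a design choice of this sketch and not necessarily the quiz's intended form.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "A": [1, np.nan, 3],
    "B": [4, 5, np.nan],
})

# Work on an explicit float copy so the original DataFrame is left untouched.
arr = df.to_numpy(dtype=float, copy=True)

# Column means that ignore NaN: A -> (1 + 3) / 2, B -> (4 + 5) / 2.
col_means = np.nanmean(arr, axis=0)

# col_means has shape (2,) and broadcasts across the rows of the (3, 2) array,
# so each NaN is replaced by the mean of its own column.
filled = np.where(np.isnan(arr), col_means, arr)

result = pd.DataFrame(filled, index=df.index, columns=df.columns)
print(result)
```

For comparison, `df.fillna(df.mean())` gives the same values using pandas alone.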