Challenge - 5 Problems
NumPy-Pandas Integration Master
❓ Predict Output
intermediate
Output of NumPy array used in a Pandas DataFrame
What is the output of the following code snippet?
NumPy
import numpy as np
import pandas as pd
arr = np.array([10, 20, 30, 40])
df = pd.DataFrame({'values': arr})
print(df['values'] * 2)
💡 Hint
Remember that multiplying a Pandas Series by a number multiplies each element.
The DataFrame column 'values' is a Pandas Series. Multiplying it by 2 doubles each element and preserves the index and name.
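The behaviour described above can be checked with a short sketch. It reuses the snippet's own data; the name `doubled` is introduced here only for illustration. Values, name, and index are printed separately because the integer dtype shown in the full Series printout can differ by platform.

```python
import numpy as np
import pandas as pd

arr = np.array([10, 20, 30, 40])
df = pd.DataFrame({'values': arr})

# Multiplying a Series by a scalar is element-wise and returns a new Series.
doubled = df['values'] * 2

print(doubled.tolist())       # [20, 40, 60, 80]
print(doubled.name)           # values
print(list(doubled.index))    # [0, 1, 2, 3]
print(df['values'].tolist())  # [10, 20, 30, 40] (the original column is unchanged)
```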
❓ data_output
intermediate
Shape of NumPy array extracted from DataFrame
Given the code below, what is the shape of the NumPy array 'arr'?
NumPy
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': range(3), 'B': range(3, 6)})
arr = df.to_numpy()
💡 Hint
Check how many rows and columns the DataFrame has.
The DataFrame has 3 rows and 2 columns, so the NumPy array has shape (3, 2).
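As a quick check of that reasoning, the sketch below prints the shape of both the DataFrame and the extracted array, using the same data as the snippet. Rows come first and columns second in both cases.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': range(3), 'B': range(3, 6)})
arr = df.to_numpy()

print(df.shape)      # (3, 2)
print(arr.shape)     # (3, 2): one row per DataFrame row, one column per DataFrame column
print(arr.tolist())  # [[0, 3], [1, 4], [2, 5]]
```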
❓ visualization
advanced
Visualizing correlation matrix using NumPy and Pandas
Which option correctly produces a heatmap visualization of the correlation matrix of the DataFrame 'df' using NumPy and Pandas?
NumPy
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame({
    'X': np.random.randn(100),
    'Y': np.random.randn(100),
    'Z': np.random.randn(100)
})
💡 Hint
Use the DataFrame's built-in correlation method and a heatmap for visualization.
Option D correctly computes the correlation matrix as a NumPy array and visualizes it as a heatmap with colorbar.
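The answer options are not reproduced here, so the following is only a minimal sketch of the approach the explanation describes, not the text of Option D. It assumes `df.corr()` for the correlation matrix, `to_numpy()` for the conversion, and matplotlib's `imshow` plus `colorbar` for the heatmap. The seeded generator, the `coolwarm` colormap, and the axis labels are illustrative choices, not part of the original question.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Seeded generator so the sketch is reproducible (an assumption; the
# original snippet uses np.random.randn without a seed).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    'X': rng.standard_normal(100),
    'Y': rng.standard_normal(100),
    'Z': rng.standard_normal(100),
})

# Pairwise Pearson correlations, converted to a plain NumPy array.
corr = df.corr().to_numpy()

fig, ax = plt.subplots()
# Fix the colour scale to [-1, 1], the full range of a correlation.
im = ax.imshow(corr, cmap='coolwarm', vmin=-1, vmax=1)
fig.colorbar(im, ax=ax)

# Label both axes with the column names.
ax.set_xticks(range(len(df.columns)))
ax.set_xticklabels(df.columns)
ax.set_yticks(range(len(df.columns)))
ax.set_yticklabels(df.columns)
ax.set_title('Correlation matrix')
```

In an interactive session, calling `plt.show()` afterwards displays the figure.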
🔧 Debug
advanced
Identify the error in NumPy array assignment from DataFrame
What error will the following code produce?
NumPy
import pandas as pd
import numpy as np
df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})
arr = np.array(df['A', 'B'])
💡 Hint
Check how columns are accessed in a DataFrame.
df['A', 'B'] tries to access a single key that is a tuple, which does not exist, causing a KeyError.
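To make the failure concrete, the sketch below reproduces the error and then shows the usual way to select several columns, which is to pass a list of labels. The variable `caught` is introduced here only to keep the exception for inspection.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

caught = None
try:
    # df['A', 'B'] is the same as df[('A', 'B')]: a lookup of one tuple key.
    arr = np.array(df['A', 'B'])
except KeyError as exc:
    caught = exc
    print(f"KeyError: {exc}")

# Selecting multiple columns needs a list of labels (note the double brackets).
arr = np.array(df[['A', 'B']])
print(arr.shape)     # (2, 2)
print(arr.tolist())  # [[1, 3], [2, 4]]
```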
🚀 Application
expert
Efficiently replace NaN values in DataFrame using NumPy
You have a DataFrame 'df' with some NaN values. Which option efficiently replaces all NaNs with the mean of their respective columns using NumPy?
NumPy
import pandas as pd
import numpy as np
df = pd.DataFrame({
    'A': [1, np.nan, 3],
    'B': [4, 5, np.nan]
})
💡 Hint
Use NumPy functions to compute means and replace NaNs in the array, then assign back.
Option C uses NumPy's nanmean and indexing to replace NaNs efficiently in the underlying array, then updates the DataFrame.
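The answer options are not reproduced here, so what follows is a sketch of the technique the explanation names rather than the text of Option C. It assumes `np.nanmean` for the per-column means and integer indexing from `np.where` to write them into the NaN positions. The names `col_means`, `rows`, `cols`, and `filled` are illustrative, and the sketch builds a new DataFrame instead of modifying `df` in place.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [1, np.nan, 3],
    'B': [4, 5, np.nan],
})

# Work on an explicit float copy so the original DataFrame is left untouched.
arr = df.to_numpy(dtype=float, copy=True)

# Column means that ignore NaN: A -> 2.0, B -> 4.5.
col_means = np.nanmean(arr, axis=0)

# Row/column positions of every NaN, then fill each with its column's mean.
rows, cols = np.where(np.isnan(arr))
arr[rows, cols] = col_means[cols]

# Rebuild a DataFrame with the original labels.
filled = pd.DataFrame(arr, index=df.index, columns=df.columns)

print(filled['A'].tolist())  # [1.0, 2.0, 3.0]
print(filled['B'].tolist())  # [4.0, 5.0, 4.5]
```

For comparison, the pandas-only expression `df.fillna(df.mean())` gives the same result on this data.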