
Vectorized operations vs loops in Pandas - Practice Questions

Challenge - 5 Problems
Predict Output
intermediate
Output of vectorized addition vs loop addition
Consider a pandas DataFrame with a single column of integers. What does the following code print?
Pandas
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 5]})

# Vectorized addition
result_vectorized = df['A'] + 10

# Loop addition
result_loop = []
for x in df['A']:
    result_loop.append(x + 10)

print(result_vectorized.tolist())
print(result_loop)
A. [11, 12, 13, 14, 15] and [11, 12, 13, 14, 15]
B. [11, 12, 13, 14, 15] and [1, 2, 3, 4, 5]
C. [11, 12, 13, 14, 15] and [10, 20, 30, 40, 50]
D. [1, 2, 3, 4, 5] and [11, 12, 13, 14, 15]
💡 Hint
Think about how vectorized operations apply the operation to each element automatically.
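To check the hint yourself, the following minimal sketch compares the two approaches element by element. It reuses the DataFrame from the question; the names `vectorized`, `looped`, and `same_values` are illustrative and not part of the original exercise.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3, 4, 5]})

# Vectorized: one expression, applied to every element of the column.
vectorized = df['A'] + 10

# Loop: the same arithmetic, one element at a time in Python.
looped = [x + 10 for x in df['A']]

# The values agree; only the container differs (Series vs. list).
same_values = vectorized.tolist() == looped
print(same_values)                                       # True
print(type(vectorized).__name__, type(looped).__name__)  # Series list
```

The vectorized result keeps the original index and dtype, which the plain list does not.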
data_output
intermediate
Performance difference in execution time
Which of the following statements correctly describes the typical performance difference between vectorized operations and loops in pandas?
A. Loops and vectorized operations have the same performance in pandas.
B. Vectorized operations are usually slower than loops because they add overhead.
C. Vectorized operations are usually faster than loops because they use optimized C code internally.
D. Loops are faster because they allow more control over each element.
💡 Hint
Think about how pandas and numpy are implemented under the hood.
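One way to measure the difference yourself is the standard-library `timeit` module. The sketch below assumes a 100,000-element integer Series; the helper names `add_vectorized` and `add_loop` are illustrative, and the actual timings depend on your machine and library versions.

```python
import timeit

import numpy as np
import pandas as pd

s = pd.Series(np.arange(100_000))

def add_vectorized():
    # A single call; the element-wise work happens in compiled code.
    return s + 1

def add_loop():
    # One Python-level addition per element.
    return [x + 1 for x in s]

# Take the best of several repeats to reduce noise from other processes.
vec_time = min(timeit.repeat(add_vectorized, number=10, repeat=3))
loop_time = min(timeit.repeat(add_loop, number=10, repeat=3))

print(f"vectorized: {vec_time:.4f} s, loop: {loop_time:.4f} s")
```

Taking the minimum over repeats is a common convention with `timeit`, since slower runs usually reflect interference from other processes rather than the code itself.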
🔧 Debug
advanced
Identify the error in loop vs vectorized operation
What error will the following code produce and why?
Pandas
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]})

# Incorrect loop to add 5
result = []
for x in df['A']:
    result.append(x + '5')

print(result)
A. ValueError: cannot add string to integer
B. TypeError: unsupported operand type(s) for +: 'int' and 'str'
C. No error, output: [6, 7, 8]
D. SyntaxError: invalid syntax
💡 Hint
Check the data types involved in the addition inside the loop.
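The sketch below reproduces the type mismatch and then shows one possible fix. It assumes the intent was to add the integer 5, as the comment in the snippet suggests; the names `loop_error` and `fixed` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3]})

# Reproduce the failure: each element is an integer, '5' is a string.
try:
    [x + '5' for x in df['A']]
    loop_error = None
except TypeError as exc:
    loop_error = type(exc).__name__

# Assumed intent: add the integer 5, written here as a vectorized expression.
fixed = (df['A'] + 5).tolist()

print(loop_error, fixed)  # TypeError [6, 7, 8]
```

Inspecting `df['A'].dtype` before mixing operands is a quick way to catch this kind of mismatch early.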
visualization
advanced
Visualizing speed difference between vectorized and loop operations
You run the following code to compare execution times. Which plot correctly shows the expected result?
Pandas
import pandas as pd
import numpy as np
import time
import matplotlib.pyplot as plt

size = 100000
s = pd.Series(np.arange(size))

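# time.perf_counter() is a high-resolution clock intended for measuring short durations
start = time.perf_counter()
result_vec = s + 1
vec_time = time.perf_counter() - start

start = time.perf_counter()
result_loop = []
for x in s:
    result_loop.append(x + 1)
loop_time = time.perf_counter() - start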

plt.bar(['Vectorized', 'Loop'], [vec_time, loop_time])
plt.ylabel('Time in seconds')
plt.title('Execution time comparison')
plt.show()
A. Bar plot showing Vectorized time much smaller than Loop time
B. Bar plot showing Vectorized time much larger than Loop time
C. Bar plot showing Vectorized and Loop times almost equal
D. Line plot showing Vectorized time increasing and Loop time decreasing
💡 Hint
Think about how pandas optimizes vectorized operations compared to Python loops.
🧠 Conceptual
expert
Why vectorized operations are preferred in pandas
Which of the following is NOT a reason why vectorized operations are preferred over loops in pandas?
A. Vectorized operations reduce the amount of Python code and improve readability.
B. Vectorized operations minimize overhead by avoiding explicit Python loops.
C. Vectorized operations leverage optimized C code for faster execution.
D. Vectorized operations automatically parallelize computations across multiple CPU cores.
💡 Hint
Consider what pandas does internally and what it does not do automatically.