
Creating interaction features in Data Analysis Python - Practice Exercises

Challenge - 5 Problems
Predict Output
intermediate
Output of interaction feature multiplication
Which option shows the DataFrame's contents (written as a dictionary of columns) after creating an interaction feature by multiplying columns 'A' and 'B'?
Data Analysis Python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
df['A_B'] = df['A'] * df['B']
print(df)
A. {'A': [1, 2, 3], 'B': [4, 5, 6], 'A_B': [4, 5, 6]}
B. {'A': [1, 2, 3], 'B': [4, 5, 6], 'A_B': [5, 7, 9]}
C. {'A': [1, 2, 3], 'B': [4, 5, 6], 'A_B': [1, 2, 3]}
D. {'A': [1, 2, 3], 'B': [4, 5, 6], 'A_B': [4, 10, 18]}
💡 Hint
Multiply each value in column 'A' by the corresponding value in column 'B'.
Data Output
intermediate
Number of unique interaction features created
Given a DataFrame with columns 'X', 'Y', and 'Z', how many unique pairwise interaction features can be created by multiplying two different columns?
Data Analysis Python
import pandas as pd

df = pd.DataFrame({'X': [1, 2], 'Y': [3, 4], 'Z': [5, 6]})

# Interaction features are created by multiplying pairs of different columns
pairs = [(a, b) for i, a in enumerate(df.columns) for b in df.columns[i + 1:]]
print(len(pairs))
A. 3
B. 6
C. 9
D. 2
💡 Hint
Count the unordered pairs of distinct columns: (X, Y) and (Y, X) are the same pair, and a column is never paired with itself.
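As a rough sketch of how the same pairing idea might be used to actually build the features, the standard library's itertools.combinations can enumerate the unordered pairs. The `X_Y`-style column names are an illustrative assumption, not a convention required by pandas.

```python
from itertools import combinations

import pandas as pd

df = pd.DataFrame({'X': [1, 2], 'Y': [3, 4], 'Z': [5, 6]})

# Freeze the list of base columns first so that newly added
# interaction columns are not paired again.
base_columns = list(df.columns)

# combinations(..., 2) yields each unordered pair of distinct columns once.
for a, b in combinations(base_columns, 2):
    # Hypothetical naming scheme: join the two source names with '_'.
    df[f'{a}_{b}'] = df[a] * df[b]

print(df.columns.tolist())
```

Copying the column list before the loop matters here, because the loop adds columns to the same DataFrame it is pairing.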
🔧 Debug
advanced
Identify the error in interaction feature creation
What happens when this code tries to create an interaction feature by multiplying the integer column 'A' by the string column 'B'?
Data Analysis Python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': ['4', '5', '6']})
df['A_B'] = df['A'] * df['B']
A. ValueError: invalid literal for int() with base 10
B. TypeError: unsupported operand type(s) for *: 'int' and 'str'
C. No error; the output is a column of repeated strings
D. TypeError: can't multiply sequence by non-int of type 'int'
💡 Hint
Check the data types of columns before multiplication.
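One way to act on this hint, sketched under the assumption that a numeric product is what you want: inspect the dtypes, then convert the digit strings in 'B' with pd.to_numeric before multiplying. Whether conversion is the right fix depends on your data; this is only an illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': ['4', '5', '6']})

# Inspect dtypes first: 'B' holds strings, not numbers.
print(df.dtypes)

# Convert explicitly. With the default errors='raise', a value that
# cannot be parsed as a number raises instead of passing silently.
df['B'] = pd.to_numeric(df['B'])

# Now the product is an ordinary element-wise numeric multiplication.
df['A_B'] = df['A'] * df['B']
print(df['A_B'].tolist())
```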
🚀 Application
advanced
Creating polynomial interaction features with scikit-learn
Which code snippet uses scikit-learn's PolynomialFeatures to create only the pairwise interaction terms (degree 2) from a numeric DataFrame, with no squared terms and no bias column?
Data Analysis Python
import pandas as pd
from sklearn.preprocessing import PolynomialFeatures

df = pd.DataFrame({'X': [1, 2], 'Y': [3, 4]})
poly = PolynomialFeatures(degree=2, interaction_only=True, include_bias=False)
X_poly = poly.fit_transform(df)
print(X_poly)
A.
poly = PolynomialFeatures(degree=2, interaction_only=False, include_bias=True)
X_poly = poly.fit_transform(df)
B.
poly = PolynomialFeatures(degree=3, interaction_only=True, include_bias=False)
X_poly = poly.fit_transform(df)
C.
poly = PolynomialFeatures(degree=2, interaction_only=True, include_bias=False)
X_poly = poly.fit_transform(df)
D.
poly = PolynomialFeatures(degree=2, interaction_only=True, include_bias=True)
X_poly = poly.fit_transform(df)
💡 Hint
Setting interaction_only=True excludes powers of a single feature (such as X^2), and include_bias=True adds a column of ones.
🧠 Conceptual
expert
Effect of interaction features on model complexity
How does adding interaction features between numeric variables affect the complexity of a linear regression model?
A. It increases the number of features, which can improve model flexibility but may cause overfitting if not controlled.
B. It decreases the number of features, simplifying the model and reducing overfitting risk.
C. It has no effect on model complexity because interaction features are ignored by linear regression.
D. It always improves model accuracy without any risk of overfitting.
💡 Hint
Think about how adding new features changes the model's ability to fit data.
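To get a feel for the scale involved, here is a small sketch that counts how many pairwise interaction columns would be added for a given number of original features. The function name count_pairwise_interactions is hypothetical and exists only for this illustration.

```python
from math import comb


def count_pairwise_interactions(n_features: int) -> int:
    """Number of unordered pairs of distinct features: n * (n - 1) / 2."""
    return comb(n_features, 2)


for n_features in (3, 10, 100):
    n_pairs = count_pairwise_interactions(n_features)
    # Total columns if every pairwise product is added to the originals.
    print(n_features, n_pairs, n_features + n_pairs)
```

The pair count grows quadratically with the number of original features, so the number of coefficients a linear model has to fit can rise quickly.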