Complete the code to read a CSV file into a DataFrame using pandas.
import pandas as pd
data = pd.read_csv([1])
When a CSV file is given by name, the path must be passed as a string in quotes; an unquoted name is treated as Python code, not as a filename.
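A minimal sketch of the call, assuming a small made-up file: the name `data.csv` and its `age` and `income` columns are hypothetical and are written to a temporary directory first so the example runs on its own.

```python
import os
import tempfile

import pandas as pd

# Create a tiny CSV so the example is self-contained (hypothetical data).
tmp_dir = tempfile.mkdtemp()
csv_path = os.path.join(tmp_dir, "data.csv")
with open(csv_path, "w") as f:
    f.write("age,income\n25,40000\n32,52000\n")

# The path is passed as a string; read_csv returns a DataFrame.
data = pd.read_csv(csv_path)
print(data.shape)  # (2, 2)
```

The header row becomes the column names, so the two data rows give a DataFrame with two rows and two columns.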
Complete the code to split data into training and testing sets using scikit-learn.
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=[1], random_state=42)
test_size=0.2 holds out 20% of the data for testing and leaves 80% for training, which is a common choice.
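To see what the split produces, here is a small sketch on made-up arrays. The ten-row `X` and `y` below are assumptions for illustration, not data from the exercise.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Ten hypothetical samples with two features each, and binary labels.
X = np.arange(20).reshape(10, 2)
y = np.arange(10) % 2

# test_size=0.2 reserves 20% of the rows (2 of 10) for testing.
# random_state fixes the shuffle so the split is reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

print(len(X_train), len(X_test))  # 8 2
```

Features and labels are split with the same row selection, so each row of `X_test` still lines up with its label in `y_test`.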
Complete the code to scale the training data with StandardScaler.
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_scaled = scaler.[1](X_train)
fit_transform fits the scaler and transforms the data in one step.
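A short sketch of what that one step does, using made-up numbers (the arrays below are assumptions). It also shows the usual follow-up: the test data is scaled with `transform` only, so it reuses the statistics learned from the training data.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical training and test data with two features.
X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
X_test = np.array([[2.0, 40.0]])

scaler = StandardScaler()

# fit_transform learns each column's mean and standard deviation
# from X_train, then applies the scaling to X_train.
X_scaled = scaler.fit_transform(X_train)

# transform reuses the training statistics; it does not refit.
X_test_scaled = scaler.transform(X_test)

print(X_scaled.mean(axis=0))  # each column is centered near 0
print(X_scaled.std(axis=0))   # each column has unit standard deviation
```

Calling `fit_transform` on the test set would recompute the mean and standard deviation from the test data, which leaks information and makes the two sets inconsistent.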
Fill both blanks to create a pipeline that scales data and fits a logistic regression model.
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LogisticRegression
pipeline = Pipeline([('scaler', [1]()), ('model', [2]())])
The pipeline first scales data with StandardScaler, then fits LogisticRegression.
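The step order can be checked on a fitted pipeline. This is a sketch under assumptions: the synthetic data from `make_classification` stands in for a real dataset, and `StandardScaler` is imported explicitly because the pipeline needs it.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data stands in for a real dataset (illustration only).
X, y = make_classification(n_samples=100, n_features=4, random_state=42)

# Steps run in list order: scale first, then fit the classifier.
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("model", LogisticRegression()),
])
pipeline.fit(X, y)

step_names = [name for name, _ in pipeline.steps]
print(step_names)  # ['scaler', 'model']
```

Because scaling lives inside the pipeline, a single `fit` call fits the scaler and the model together, and `predict` applies the same scaling automatically.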
Fill all three blanks to automate training, prediction, and accuracy calculation.
from sklearn.metrics import accuracy_score

pipeline.fit([1], [2])
predictions = pipeline.predict([3])
accuracy = accuracy_score(y_test, predictions)
Train with X_train and y_train, predict on X_test, then compare predictions to y_test.
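Putting the pieces together, one possible end-to-end run looks like the sketch below. The dataset is synthetic (an assumption made so the example runs without a file), so the printed accuracy says nothing about any real problem.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data stands in for a real dataset (illustration only).
X, y = make_classification(n_samples=200, n_features=4, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("model", LogisticRegression()),
])

# Train on the training split only.
pipeline.fit(X_train, y_train)

# Predict on rows the model has not seen.
predictions = pipeline.predict(X_test)

# Accuracy is the fraction of test labels predicted correctly.
accuracy = accuracy_score(y_test, predictions)
print(f"accuracy: {accuracy:.2f}")
```

Scoring on `X_test` instead of `X_train` is what makes the number an estimate of performance on new data.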