Regularization helps a model avoid overfitting by keeping it simple. It adds a small penalty to large model weights to make predictions more reliable on new data.
Regularization (Ridge and Lasso) in Python with scikit-learn
from sklearn.linear_model import Ridge, Lasso

ridge_model = Ridge(alpha=1.0)
lasso_model = Lasso(alpha=1.0)

ridge_model.fit(X_train, y_train)
lasso_model.fit(X_train, y_train)

ridge_predictions = ridge_model.predict(X_test)
lasso_predictions = lasso_model.predict(X_test)
The alpha parameter controls the strength of the penalty; a higher alpha means stronger regularization.
Ridge applies an L2 penalty (the sum of squared weights), while Lasso applies an L1 penalty (the sum of absolute weights).
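The two penalties can be computed by hand to make the difference concrete. The weight vector and alpha below are made-up values for illustration only:

```python
import numpy as np

# Hypothetical weight vector and penalty strength, chosen for illustration
w = np.array([0.5, -2.0, 1.5])
alpha = 1.0

l2_penalty = alpha * np.sum(w ** 2)     # Ridge: alpha * sum of squared weights
l1_penalty = alpha * np.sum(np.abs(w))  # Lasso: alpha * sum of absolute weights

print(l2_penalty)  # 6.5
print(l1_penalty)  # 4.0
```

Note how the squared penalty punishes the large weight (-2.0) far more than the small one (0.5), which is why Ridge shrinks big weights aggressively but rarely drives any weight exactly to zero.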
ridge = Ridge(alpha=0.5)
ridge.fit(X_train, y_train)

lasso = Lasso(alpha=0.1)
lasso.fit(X_train, y_train)

ridge_predictions = ridge.predict(X_test)
print(ridge_predictions[:5])
This program creates a simple regression dataset, splits it, trains Ridge and Lasso models, then prints their mean squared errors and coefficients. You can see how regularization affects the model weights.
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge, Lasso
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Create sample data
X, y = make_regression(n_samples=100, n_features=5, noise=10, random_state=42)

# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create models
ridge = Ridge(alpha=1.0)
lasso = Lasso(alpha=1.0)

# Train models
ridge.fit(X_train, y_train)
lasso.fit(X_train, y_train)

# Predict
ridge_pred = ridge.predict(X_test)
lasso_pred = lasso.predict(X_test)

# Calculate errors
ridge_mse = mean_squared_error(y_test, ridge_pred)
lasso_mse = mean_squared_error(y_test, lasso_pred)

print(f"Ridge MSE: {ridge_mse:.2f}")
print(f"Lasso MSE: {lasso_mse:.2f}")
print(f"Ridge Coefficients: {ridge.coef_}")
print(f"Lasso Coefficients: {lasso.coef_}")
Ridge keeps all features but shrinks their weights.
Lasso can shrink some weights exactly to zero, effectively selecting features.
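This feature-selection effect is easy to demonstrate by raising alpha on a dataset where only a few features matter. The dataset parameters and alpha values below are arbitrary choices for illustration:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# Synthetic data: 10 features, but only 3 actually influence y
X, y = make_regression(n_samples=100, n_features=10, n_informative=3,
                       noise=5, random_state=0)

zeros = {}
for alpha in [0.1, 1.0, 10.0]:
    lasso = Lasso(alpha=alpha).fit(X, y)
    zeros[alpha] = int((lasso.coef_ == 0).sum())
    print(f"alpha={alpha}: {zeros[alpha]} of {len(lasso.coef_)} coefficients are zero")
```

As alpha grows, Lasso drives more of the uninformative features' weights exactly to zero, effectively pruning them from the model.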
Choosing the right alpha is important; use cross-validation to find the best value.
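scikit-learn provides RidgeCV and LassoCV, which run this cross-validation search for you. The candidate alpha grid below is an arbitrary example:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import RidgeCV, LassoCV

X, y = make_regression(n_samples=100, n_features=5, noise=10, random_state=42)

# Each estimator tries every candidate alpha via cross-validation
# and keeps the one with the best validation score
alphas = [0.01, 0.1, 1.0, 10.0]
ridge_cv = RidgeCV(alphas=alphas).fit(X, y)
lasso_cv = LassoCV(alphas=alphas, cv=5, random_state=42).fit(X, y)

print(f"Best Ridge alpha: {ridge_cv.alpha_}")
print(f"Best Lasso alpha: {lasso_cv.alpha_}")
```

After fitting, the chosen value is stored in the alpha_ attribute, and the estimator is already refit on the full data with that value, so it can be used for prediction directly.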
Regularization helps prevent overfitting by adding a penalty to large weights.
Ridge uses L2 penalty and shrinks weights but keeps all features.
Lasso uses L1 penalty and can remove less important features by setting weights to zero.