Content-based filtering recommends items by looking at the features of items a user already likes and finding other items with similar features.
Content-based filtering in ML Python
Introduction
Content-based filtering is a good fit in situations like these:
Suggesting movies similar to ones a user has watched.
Recommending books based on the genres a user prefers.
Suggesting songs that match the style of a user's favorite tracks.
Showing products in an online store similar to what a customer viewed.
Recommending news articles related to what a reader has already read.
Syntax
ML Python
1. Collect features of items (like genre, keywords).
2. Represent items as vectors using these features.
3. For a user, find items similar to those they liked by comparing vectors.
4. Recommend the most similar items.
Features can be simple tags or complex descriptions.
Similarity is often measured with cosine similarity or with a distance metric such as Euclidean distance.
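The four steps above can be sketched with plain NumPy. This is a minimal illustration, not a full recommender: the movie names and the four genre columns are made up for the example, and `cosine` is a small helper written here rather than a library function.

```python
import numpy as np

# Hypothetical catalog: each row is an item, each column a genre tag
# (action, adventure, romance, thriller); 1 means the item has the tag.
item_names = ["Movie A", "Movie B", "Movie C"]
item_vectors = np.array([
    [1, 1, 0, 0],  # Movie A: action, adventure
    [1, 0, 0, 1],  # Movie B: action, thriller
    [0, 0, 1, 0],  # Movie C: romance
], dtype=float)


def cosine(u, v):
    """Cosine similarity: dot product divided by the product of the norms."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0


liked = 0  # assume the user liked Movie A

# Score every other item against the liked one
scores = {
    name: cosine(item_vectors[liked], vec)
    for i, (name, vec) in enumerate(zip(item_names, item_vectors))
    if i != liked
}

# Recommend the item with the highest score
best = max(scores, key=scores.get)
print(scores)
print("Recommend:", best)
```

Movie B shares one of two tags with Movie A, so it scores about 0.5, while Movie C shares none and scores 0.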
Examples
We compare items by counting common keywords to find similarity.
ML Python
# Example: item features as keywords
item1 = ['action', 'adventure']
item2 = ['action', 'thriller']

# Compare similarity based on shared keywords
shared = set(item1) & set(item2)
print(shared, len(shared))  # {'action'} 1
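A raw count of shared keywords favors items with long keyword lists. One common way to normalize it is Jaccard similarity, which divides the shared keywords by all distinct keywords. The sketch below assumes that choice; `jaccard` and `item3` are names introduced here for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity: shared keywords divided by all distinct keywords."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


item1 = ['action', 'adventure']
item2 = ['action', 'thriller']
item3 = ['romance', 'drama']  # hypothetical third item for contrast

sim_12 = jaccard(item1, item2)  # 1 shared keyword out of 3 distinct
sim_13 = jaccard(item1, item3)  # no shared keywords
print(sim_12, sim_13)
```

The score always falls between 0 (nothing in common) and 1 (identical keyword sets), which makes items with different numbers of keywords comparable.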
Text descriptions are converted into number vectors to compare items.
ML Python
# Example: using TF-IDF vectors for item descriptions
from sklearn.feature_extraction.text import TfidfVectorizer

items = ['fast car racing', 'slow romantic movie']
vectorizer = TfidfVectorizer()
vectors = vectorizer.fit_transform(items)
Sample Model
This program shows how to recommend items similar to one the user liked using content-based filtering. It uses text descriptions and cosine similarity.
ML Python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Sample items with descriptions
items = [
    'action adventure hero',
    'romantic comedy love',
    'action thriller spy',
    'romantic drama relationship'
]

# User liked item index 0 (action adventure hero)
user_liked_index = 0

# Convert item descriptions to vectors
vectorizer = TfidfVectorizer()
vectors = vectorizer.fit_transform(items)

# Calculate similarity of all items to the liked item
similarities = cosine_similarity(vectors[user_liked_index], vectors).flatten()

# Get indices of items sorted by similarity (excluding the liked item itself)
recommended_indices = similarities.argsort()[::-1][1:3]

# Print recommended items
print('User liked:', items[user_liked_index])
print('Recommended items:')
for i in recommended_indices:
    print(f'- {items[i]} (similarity: {similarities[i]:.2f})')
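The program above skips the liked item by slicing off the first sorted index. In this data only one other description shares a word with the liked item, so the second slot goes to an item with similarity 0. A possible variation, sketched below, excludes liked items explicitly, drops items with no overlap, and accepts several liked items by averaging their vectors into one profile. The `recommend` function and its parameters are hypothetical names for this example, and converting to a dense array is an assumption that only suits small catalogs.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity


def recommend(items, liked_indices, top_n=2):
    """Return (index, score) pairs for the items most similar to the liked ones."""
    # Dense array keeps the sketch simple; fine for a handful of items
    vectors = TfidfVectorizer().fit_transform(items).toarray()

    # Average the liked items' vectors into a single user profile
    profile = vectors[liked_indices].mean(axis=0).reshape(1, -1)
    scores = cosine_similarity(profile, vectors).flatten()

    # Walk items from most to least similar, skipping items the user
    # already liked and items that share no features with the profile
    liked = set(liked_indices)
    picks = [
        (int(i), float(scores[i]))
        for i in np.argsort(scores)[::-1]
        if int(i) not in liked and scores[i] > 0
    ]
    return picks[:top_n]


items = [
    'action adventure hero',
    'romantic comedy love',
    'action thriller spy',
    'romantic drama relationship'
]

recs = recommend(items, liked_indices=[0])
for index, score in recs:
    print(f'- {items[index]} (similarity: {score:.2f})')
```

With the same four descriptions, this returns only the spy thriller, because it is the only item that shares a word with the liked one.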
Important Notes
Content-based filtering needs item features and the user's own history; it does not need data from other users.
It works well when item features are clear and descriptive.
Because it only recommends items that resemble what the user already liked, it can limit the diversity of recommendations.
Summary
Content-based filtering recommends items by comparing their features.
It uses item descriptions or tags to find similar items.
This method personalizes recommendations based on what the user liked before.