
Hybrid approaches in NLP - Practice Problems & Coding Challenges

Challenge - 5 Problems
🧠 Conceptual
intermediate
Understanding Hybrid Models in NLP

Which of the following best describes a hybrid approach in Natural Language Processing (NLP)?

A. Combining rule-based methods with machine learning models to improve text understanding.
B. Using only deep learning models without any handcrafted rules.
C. Applying unsupervised learning exclusively for text classification.
D. Relying solely on dictionary lookups for language translation.
💡 Hint

Think about mixing different techniques to get better results.

Model Choice
intermediate
Choosing Models for a Hybrid Sentiment Analysis System

You want to build a sentiment analysis system that uses both a lexicon-based method and a machine learning classifier. Which combination below fits a hybrid approach?

A. Use only a pre-trained transformer model without any lexicon.
B. Use a sentiment dictionary to score words and a logistic regression model trained on labeled reviews.
C. Apply k-means clustering on unlabeled text data.
D. Use a rule-based system that assigns sentiment based on fixed patterns only.
💡 Hint

Look for a mix of dictionary and machine learning.

Predict Output
advanced
Output of Hybrid Text Classification Pipeline

What is the output of the following Python code that combines TF-IDF features with a rule-based keyword count for classification?

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
import numpy as np

texts = ["I love sunny days", "I hate rain", "Sunny weather is great", "Rainy days are gloomy"]
labels = [1, 0, 1, 0]

# Rule-based feature: count of positive words
positive_words = {'love', 'sunny', 'great'}
rule_features = np.array([[sum(word in positive_words for word in text.lower().split())] for text in texts])

# TF-IDF features
vectorizer = TfidfVectorizer()
tfidf_features = vectorizer.fit_transform(texts).toarray()

# Combine features
X = np.hstack((tfidf_features, rule_features))

model = LogisticRegression().fit(X, labels)
predictions = model.predict(X)
print(predictions.tolist())
A. [0, 0, 0, 0]
B. [0, 1, 0, 1]
C. [1, 0, 1, 0]
D. [1, 1, 1, 1]
💡 Hint

Check how the rule-based feature and TF-IDF features help the logistic regression model.

Hyperparameter
advanced
Tuning Hybrid Model Parameters

In a hybrid NLP model combining a rule-based sentiment score and a neural network, which hyperparameter adjustment is most likely to improve the balance between the two components?

A. Increase the number of epochs for training the neural network only.
B. Use a smaller batch size without changing the model architecture.
C. Remove the rule-based component to simplify the model.
D. Adjust the weight given to the rule-based score in the final prediction layer.
💡 Hint

Think about how to control the influence of each part in the combined output.
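
To make the idea of controlling each component's influence concrete, here is a minimal sketch. It assumes both components output a score in [0, 1] and that they are mixed by a fixed weight chosen by hand; the name `blend_scores`, the `rule_weight` parameter, and the example scores are all hypothetical. In a real network the weight might instead be a learned parameter of the final layer.

```python
def blend_scores(rule_score: float, nn_prob: float, rule_weight: float = 0.3) -> float:
    """Weighted average of a rule-based score and a model probability.

    Both inputs are assumed to lie in [0, 1]; rule_weight is the share
    of the final score that comes from the rule-based component.
    """
    if not 0.0 <= rule_weight <= 1.0:
        raise ValueError("rule_weight must be between 0 and 1")
    return rule_weight * rule_score + (1.0 - rule_weight) * nn_prob


# Hypothetical scores for a single review
rule_score = 0.9  # the lexicon says strongly positive
nn_prob = 0.4     # the network leans negative

# Sweep the weight to see how the balance shifts
for w in (0.0, 0.3, 0.8):
    print(w, round(blend_scores(rule_score, nn_prob, w), 2))
```

With `rule_weight=0.0` the result is the network's probability alone, and raising the weight moves the blended score toward the rule-based score. In practice such a weight would be tuned on a validation set like any other hyperparameter.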

🔧 Debug
expert
Debugging a Hybrid NLP Pipeline Error

Consider this hybrid NLP pipeline code snippet that combines a rule-based feature with a machine learning model. It raises a ValueError: shapes (4,5) and (4,1) not aligned. What is the cause?

import numpy as np
from sklearn.linear_model import LogisticRegression

texts = ["happy day", "sad night", "joyful morning", "gloomy evening"]
labels = [1, 0, 1, 0]

# Rule-based feature: count of positive words
positive_words = {'happy', 'joyful'}
rule_features = np.array([[sum(word in positive_words for word in text.split())] for text in texts])

# Dummy TF-IDF features: 4 documents x 5 terms
tfidf_features = np.random.rand(4, 5)

# Incorrect feature combination
X = np.dot(tfidf_features, rule_features)

model = LogisticRegression().fit(X, labels)
A. Using np.dot to combine features with incompatible shapes causes the error.
B. The rule-based feature calculation is incorrect and returns empty arrays.
C. LogisticRegression cannot be trained on combined features.
D. Labels array length does not match feature rows.
💡 Hint

Check how features are combined and their shapes.
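
A quick way to follow this hint is to print the shape of each feature block before combining them. The sketch below uses the same dimensions as the snippet (4 documents, 5 dummy TF-IDF columns, 1 rule column) and shows column-wise concatenation with `np.hstack`, the same call used in the earlier TF-IDF challenge. The seeded random generator is an assumption added only to make the run repeatable.

```python
import numpy as np

rng = np.random.default_rng(0)
tfidf_features = rng.random((4, 5))             # 4 documents, 5 term weights each
rule_features = np.array([[1], [0], [1], [0]])  # 4 documents, 1 rule count each

# Both blocks must have the same number of rows (one per document).
print(tfidf_features.shape, rule_features.shape)  # (4, 5) (4, 1)

# Column-wise concatenation keeps one row per document
# and appends the rule count as an extra feature column.
X = np.hstack((tfidf_features, rule_features))
print(X.shape)  # (4, 6)
```

Concatenation only requires matching row counts, whereas a matrix product requires the column count of the first array to equal the row count of the second.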

Practice

1. What is the main benefit of using hybrid approaches in NLP?
easy
A. They ignore language context to simplify processing.
B. They rely only on large datasets for training.
C. They use only handcrafted rules without learning.
D. They combine rules and machine learning to improve understanding.

Solution

  1. Step 1: Understand hybrid approach components

    Hybrid approaches mix handcrafted rules and machine learning models.
  2. Step 2: Identify the benefit

    This mix improves language understanding by using strengths of both methods.
  3. Final Answer:

    They combine rules and machine learning to improve understanding. -> Option D
  4. Quick Check:

    Hybrid = rules + ML [OK]
Hint: Hybrid means mixing rules and learning for better results [OK]
Common Mistakes:
  • Thinking hybrid uses only rules
  • Assuming hybrid approaches depend only on huge datasets
  • Believing hybrid ignores language context
2. Which of the following is a common way to combine the outputs of the rule-based and machine learning components in a hybrid NLP system?
easy
A. Combine outputs by voting or weighted averaging.
B. Apply rules first, then use machine learning on the filtered data.
C. Use only the machine learning output and ignore rules.
D. Run rules and machine learning separately without combining results.

Solution

  1. Step 1: Understand output combination methods

    Hybrid systems combine rule and ML outputs to improve accuracy.
  2. Step 2: Identify correct combination method

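    Voting or weighted averaging merges the two predictions into a single output. Applying rules first and then ML on the filtered data is a valid hybrid pipeline, but it filters inputs rather than combining outputs.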
  3. Final Answer:

    Combine outputs by voting or weighted averaging. -> Option A
  4. Quick Check:

    Combine outputs = voting/averaging [OK]
Hint: Combine outputs smartly using voting or weights [OK]
Common Mistakes:
  • Ignoring rule outputs
  • Not combining results at all
  • Applying rules after ML without filtering
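
As an illustration of voting, here is a minimal sketch of weighted voting over 0/1 predictions. The function name `weighted_vote`, the third source `lexicon_pred`, and the weights are hypothetical; equal weights with an odd number of sources reduce to a plain majority vote.

```python
def weighted_vote(label_lists, weights, threshold=0.5):
    """Combine 0/1 predictions from several sources by weighted voting.

    For each item, the weighted share of sources voting 1 is compared
    against the threshold.
    """
    total = sum(weights)
    combined = []
    for votes in zip(*label_lists):
        score = sum(w * v for w, v in zip(weights, votes)) / total
        combined.append(int(score >= threshold))
    return combined


rule_pred = [1, 0, 1, 1]
ml_pred = [1, 1, 0, 1]
lexicon_pred = [0, 0, 1, 0]  # hypothetical third source

# Equal weights across three sources: plain majority vote
majority = weighted_vote([rule_pred, ml_pred, lexicon_pred], weights=[1, 1, 1])
# Two sources, with the ML model trusted three times as much as the rules
ml_heavy = weighted_vote([rule_pred, ml_pred], weights=[1, 3])

print(majority)  # [1, 0, 1, 1]
print(ml_heavy)  # [1, 1, 0, 1]
```

With only two equally weighted sources, ties land exactly on the threshold, so the choice of `>=` versus `>` decides them; unequal weights or an odd number of sources avoid that ambiguity.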
3. Consider this Python code snippet combining rule and ML predictions:
rule_pred = [1, 0, 1, 1]
ml_pred = [1, 1, 0, 1]
combined = [int(r or m) for r, m in zip(rule_pred, ml_pred)]
print(combined)
What is the output?
medium
A. [0, 1, 1, 0]
B. [1, 0, 0, 1]
C. [1, 1, 1, 1]
D. [1, 1, 0, 0]

Solution

  1. Step 1: Understand the logic of combining predictions

    The code uses logical OR between rule_pred and ml_pred elements.
  2. Step 2: Calculate each combined element

    Positions: 1 or 1 = 1, 0 or 1 = 1, 1 or 0 = 1, 1 or 1 = 1.
  3. Final Answer:

    [1, 1, 1, 1] -> Option C
  4. Quick Check:

    OR operation on lists = [1,1,1,1] [OK]
Hint: OR means if either is 1, result is 1 [OK]
Common Mistakes:
  • Confusing OR with AND
  • Mixing up list positions
  • Assuming or returns True/False here (with 0/1 operands it returns one of the integers)
4. This code is meant to flag an item as positive when either the rule or the ML model predicts positive, but it has a bug:
rule_pred = [True, False, True]
ml_pred = [False, False, True]
combined = [r and m for r, m in zip(rule_pred, ml_pred)]
print(combined)
What is the bug, and how do you fix it?
medium
A. Bug: Using AND drops some positives; fix by using OR instead.
B. Bug: Lists have different lengths; fix by padding shorter list.
C. Bug: Using booleans instead of integers; fix by casting to int.
D. Bug: zip is incorrect; fix by using enumerate instead.

Solution

  1. Step 1: Analyze the logical operation used

    The code uses AND, which requires both to be True to get True.
  2. Step 2: Identify why this causes a problem

    AND drops items where only one source predicts True, so positives caught by just one component are lost.
  3. Step 3: Suggest fix

    Using OR keeps an item as positive if either source predicts True, which improves recall, possibly at some cost in precision.
  4. Final Answer:

    Bug: Using AND drops some positives; fix by using OR instead. -> Option A
  5. Quick Check:

    AND drops positives; OR fixes [OK]
Hint: Use OR to keep positives from either source [OK]
Common Mistakes:
  • Thinking zip causes error
  • Confusing booleans with integers
  • Ignoring logical operation impact
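
The effect of the logical operation can be checked numerically. The sketch below assumes a small made-up set of ground-truth labels and predictions (none of these values come from the question) and computes precision and recall by hand for the AND and OR combinations.

```python
def precision_recall(truth, pred):
    """Return (precision, recall) for binary labels, computed from raw counts."""
    tp = sum(1 for t, p in zip(truth, pred) if t and p)
    fp = sum(1 for t, p in zip(truth, pred) if not t and p)
    fn = sum(1 for t, p in zip(truth, pred) if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical labels and predictions, for illustration only
truth = [1, 0, 1, 1, 0]
rule_pred = [1, 0, 1, 0, 1]
ml_pred = [0, 0, 1, 1, 0]

and_pred = [int(r and m) for r, m in zip(rule_pred, ml_pred)]
or_pred = [int(r or m) for r, m in zip(rule_pred, ml_pred)]

print(and_pred, precision_recall(truth, and_pred))  # [0, 0, 1, 0, 0] (1.0, 0.333...)
print(or_pred, precision_recall(truth, or_pred))    # [1, 0, 1, 1, 1] (0.75, 1.0)
```

On this toy data AND keeps precision at 1.0 but finds only one of the three true positives, while OR finds all three and admits one false positive. Which trade-off is preferable depends on the task.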
5. You have a small dataset and want to build an NLP system for sentiment analysis. Which hybrid approach is best to improve accuracy?
hard
A. Train a deep neural network only, ignoring rules.
B. Use handcrafted rules to catch key sentiment words, then train a simple ML model on remaining data.
C. Use only handcrafted rules without any machine learning.
D. Randomly guess sentiment labels to save time.

Solution

  1. Step 1: Consider dataset size and approach

    Small data limits deep learning effectiveness; rules help catch key patterns.
  2. Step 2: Combine rules and ML effectively

    Use rules for important sentiment words, then train ML on leftover data for better coverage.
  3. Final Answer:

    Use handcrafted rules to catch key sentiment words, then train a simple ML model on remaining data. -> Option B
  4. Quick Check:

    Small data + rules + ML = best hybrid [OK]
Hint: Use rules for key words, ML for rest on small data [OK]
Common Mistakes:
  • Relying only on deep learning with little data
  • Ignoring machine learning completely
  • Guessing randomly instead of using data
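
The rules-first, ML-for-the-rest strategy from this solution can be sketched as a two-stage cascade. Everything specific here is an assumption for illustration: the word lists, the tiny dataset, the helper names `rule_label` and `predict`, and the choice of a bag-of-words logistic regression from scikit-learn as the "simple ML model".

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical key sentiment words
POSITIVE = {"love", "great", "excellent"}
NEGATIVE = {"hate", "terrible", "awful"}


def rule_label(text):
    """Return 1 or 0 when a key sentiment word decides the label, else None."""
    words = set(text.lower().split())
    has_pos = bool(words & POSITIVE)
    has_neg = bool(words & NEGATIVE)
    if has_pos and not has_neg:
        return 1
    if has_neg and not has_pos:
        return 0
    return None  # no rule fired, or the rules conflict


# Hypothetical small labeled dataset
train_texts = [
    "I love this phone",
    "terrible battery life",
    "works as expected",
    "arrived on time and works well",
    "stopped working after a week",
    "screen cracked on day one",
]
train_labels = [1, 0, 1, 1, 0, 0]

# Train the ML model only on examples the rules cannot decide
remaining = [(t, y) for t, y in zip(train_texts, train_labels) if rule_label(t) is None]
model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit([t for t, _ in remaining], [y for _, y in remaining])


def predict(text):
    """Use the rule when it fires; otherwise fall back to the ML model."""
    label = rule_label(text)
    return label if label is not None else int(model.predict([text])[0])


print(predict("I hate the new update"))  # decided by a rule
print(predict("works well"))             # decided by the ML model
```

The fallback model sees only the examples that the rules leave undecided, so the remaining data must still contain both classes. With a dataset this small its predictions are illustrative at best.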