NLP applications in real world - ML Experiment: Train & Evaluate

Experiment - NLP applications in real world
Problem: You have a text classification model that categorizes customer reviews into positive or negative sentiment. The model currently performs well on training data but poorly on new reviews, showing signs of overfitting.
Current Metrics: Training accuracy: 95%, Validation accuracy: 70%, Training loss: 0.15, Validation loss: 0.65
Issue: The model overfits the training data, causing low validation accuracy and poor generalization to new reviews.
Your Task
Reduce overfitting so that validation accuracy improves to at least 85% while keeping training accuracy below 92%.
You can only modify the model architecture and training parameters.
You cannot change the dataset or add more data.
Solution
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping

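# Sample data placeholders: replace each ... with your own data loading.
# The model below expects padded integer token sequences and 0/1 sentiment labels.
# Left as ..., these two lines raise a TypeError, so supply real data before running.
X_train, y_train = ...  # training data
X_val, y_val = ...      # validation data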

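model = Sequential([
    # 10,000-word vocabulary mapped to 64-dimensional vectors; inputs are padded to length 100.
    # The input_length argument is omitted because it is deprecated in Keras 3.
    Embedding(input_dim=10000, output_dim=64),
    LSTM(64, return_sequences=False),  # keep only the final LSTM state
    Dropout(0.5),                      # randomly zero half the activations during training
    Dense(1, activation='sigmoid')     # probability of positive sentiment
])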

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss='binary_crossentropy',
              metrics=['accuracy'])

early_stop = EarlyStopping(monitor='val_loss', patience=3, restore_best_weights=True)

history = model.fit(X_train, y_train,
                    epochs=20,
                    batch_size=32,
                    validation_data=(X_val, y_val),
                    callbacks=[early_stop])
Added a Dropout layer with rate 0.5 after the LSTM layer to reduce overfitting.
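Implemented an EarlyStopping callback that stops training once validation loss has not improved for 3 epochs (patience=3) and restores the best weights seen so far.
Kept the Adam learning rate at 0.001, which is also the Keras default, for stable training.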
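The patience rule behind early stopping can be sketched in plain Python. This is a simplified illustration, not the Keras implementation: the function name early_stopping_epoch and the list of validation losses are hypothetical and invented for the example.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return (stop_epoch, best_epoch), both 0-based, for per-epoch validation losses."""
    best_loss = float("inf")
    best_epoch = -1
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch  # stop here, keep weights from best_epoch
    return len(val_losses) - 1, best_epoch  # ran out of epochs without triggering

# Hypothetical validation losses, invented for illustration only
losses = [0.65, 0.52, 0.44, 0.40, 0.42, 0.43, 0.45, 0.47]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=3)
print(stop_epoch, best_epoch)  # prints: 6 3
```

With restore_best_weights=True, the model kept after training corresponds to the best epoch (index 3 in this made-up run), not the epoch at which training stopped.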
Results Interpretation

Before: Training accuracy 95%, Validation accuracy 70%, Training loss 0.15, Validation loss 0.65

After: Training accuracy 90%, Validation accuracy 87%, Training loss 0.30, Validation loss 0.40

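Adding dropout and early stopping reduces overfitting: the gap between training and validation accuracy narrows from 25 percentage points (95% vs. 70%) to 3 (90% vs. 87%), and validation loss falls from 0.65 to 0.40, so the model generalizes better to new reviews.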
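A quick way to check the result against the task targets is to compute the train/validation gap directly. The sketch below uses only the accuracy figures reported above; the helper names generalization_gap and meets_task_targets are hypothetical.

```python
# Accuracy figures (in percent) copied from the Before/After lines above.
before = {"train_acc": 95, "val_acc": 70}
after = {"train_acc": 90, "val_acc": 87}

def generalization_gap(metrics):
    """Training accuracy minus validation accuracy, in percentage points."""
    return metrics["train_acc"] - metrics["val_acc"]

def meets_task_targets(metrics):
    """Task targets: validation accuracy of at least 85%, training accuracy below 92%."""
    return metrics["val_acc"] >= 85 and metrics["train_acc"] < 92

for name, metrics in (("before", before), ("after", after)):
    print(name, generalization_gap(metrics), meets_task_targets(metrics))
# prints:
# before 25 False
# after 3 True
```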
Bonus Experiment
Try using a pretrained language model like BERT for the same classification task to see if it improves accuracy further.
💡 Hint
Use transfer learning with a pretrained BERT model and fine-tune it on your dataset.

Practice

1. Which of the following is a common real-world application of NLP?
easy
A. Calculating the area of a circle
B. Sorting numbers in ascending order
C. Translating text from one language to another
D. Storing data in a database

Solution

  1. Step 1: Understand what NLP does

    NLP helps computers understand and work with human language.
  2. Step 2: Match application to NLP

    Translating text involves understanding language, so it is an NLP task.
  3. Final Answer:

    Translating text from one language to another -> Option C
  4. Quick Check:

    NLP application = Translation [OK]
Hint: NLP deals with language tasks like translation [OK]
Common Mistakes:
  • Confusing data sorting with language processing
  • Thinking math calculations are NLP
  • Mixing database tasks with NLP
2. Which syntax correctly represents a chatbot response function in Python?
easy
A. function chatbot_response(user_input) { return 'Hello!'; }
B. def chatbot_response user_input: return 'Hello!'
C. chatbot_response = (user_input) => 'Hello!';
D. def chatbot_response(user_input): return 'Hello! How can I help?'

Solution

  1. Step 1: Identify Python function syntax

    Python functions start with 'def', have parentheses around parameters, and a colon.
  2. Step 2: Check each option

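    def chatbot_response(user_input): return 'Hello! How can I help?' matches Python syntax correctly. Option A is a JavaScript function declaration, option C is a JavaScript arrow function, and option B is missing the parentheses around the parameter.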
  3. Final Answer:

    def chatbot_response(user_input): return 'Hello! How can I help?' -> Option D
  4. Quick Check:

    Python function syntax = def chatbot_response(user_input): return 'Hello! How can I help?' [OK]
Hint: Python functions start with def and parentheses [OK]
Common Mistakes:
  • Using JavaScript syntax in Python
  • Missing parentheses or colon in function definition
  • Incorrect arrow function syntax in Python
3. What will be the output of this Python code snippet for sentiment analysis?
def analyze_sentiment(text):
    if 'happy' in text:
        return 'Positive'
    elif 'sad' in text:
        return 'Negative'
    else:
        return 'Neutral'

print(analyze_sentiment('I am very happy today'))
medium
A. Negative
B. Positive
C. Neutral
D. Error

Solution

  1. Step 1: Check if 'happy' is in the input text

    The input text is 'I am very happy today', which contains 'happy'.
  2. Step 2: Return sentiment based on condition

    Since 'happy' is found, the function returns 'Positive'.
  3. Final Answer:

    Positive -> Option B
  4. Quick Check:

    Text contains 'happy' = Positive sentiment [OK]
Hint: Look for keywords in text to decide sentiment [OK]
Common Mistakes:
  • Confusing 'happy' with 'sad'
  • Assuming default Neutral without checking conditions
  • Thinking code will cause error
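The keyword check in question 3 is a substring test, so it is case-sensitive and also matches inside longer words. The sketch below reproduces the original function and adds a hypothetical word-level variant, analyze_sentiment_words, to show the difference. It is still a toy: neither version handles negation such as "not happy".

```python
import re

def analyze_sentiment(text):
    """Original function from the question: plain substring checks."""
    if 'happy' in text:
        return 'Positive'
    elif 'sad' in text:
        return 'Negative'
    else:
        return 'Neutral'

def analyze_sentiment_words(text):
    """Hypothetical variant: lowercase the text and compare whole words only."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    if 'happy' in words:
        return 'Positive'
    if 'sad' in words:
        return 'Negative'
    return 'Neutral'

print(analyze_sentiment('I am unhappy'))          # prints: Positive ('happy' is inside 'unhappy')
print(analyze_sentiment('Happy birthday'))        # prints: Neutral (capital H does not match)
print(analyze_sentiment_words('I am unhappy'))    # prints: Neutral
print(analyze_sentiment_words('Happy birthday'))  # prints: Positive
```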
4. Does this Python code for summarizing text contain an error?
def summarize(text):
    sentences = text.split('. ')
    summary = sentences[0]
    return summary

print(summarize('This is sentence one. This is sentence two.'))
medium
A. The code correctly returns the first sentence as summary
B. The code will cause an IndexError
C. The split should use ',' instead of '. '
D. The return statement is missing

Solution

  1. Step 1: Understand the split method

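    Splitting by '. ' divides the text into ['This is sentence one', 'This is sentence two.']. The separator is removed, so the first sentence loses its trailing period, but no error occurs.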
  2. Step 2: Check the summary assignment and return

    Assigning the first sentence to summary and returning it is valid.
  3. Final Answer:

    The code correctly returns the first sentence as summary -> Option A
  4. Quick Check:

    Splitting and returning first sentence = Correct summary [OK]
Hint: Splitting text by '. ' extracts sentences [OK]
Common Mistakes:
  • Thinking split delimiter is wrong
  • Expecting error when none occurs
  • Missing return statement confusion
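Running the code from question 4 shows that it works but drops the period from the returned sentence. The sketch below reproduces the original function and adds a hypothetical variant, summarize_keep_punctuation, that splits after sentence-ending punctuation with a regular expression. It is a rough heuristic and would still mis-split abbreviations such as "Dr. Smith".

```python
import re

def summarize(text):
    """Original function from the question: split on '. ' and keep the first piece."""
    sentences = text.split('. ')
    summary = sentences[0]
    return summary

def summarize_keep_punctuation(text):
    """Hypothetical variant: split on whitespace that follows '.', '!' or '?'."""
    sentences = re.split(r'(?<=[.!?])\s+', text.strip())
    return sentences[0]

text = 'This is sentence one. This is sentence two.'
print(summarize(text))                   # prints: This is sentence one
print(summarize_keep_punctuation(text))  # prints: This is sentence one.
```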
5. You want to build a chatbot that understands user questions and replies correctly. Which combination of NLP techniques is best to start with?
hard
A. Tokenization + intent recognition + response generation
B. Image recognition + speech synthesis
C. Text summarization + translation
D. Speech recognition + sentiment analysis

Solution

  1. Step 1: Identify chatbot core tasks

    A chatbot needs to understand text (tokenization), detect user intent, and generate replies.
  2. Step 2: Match techniques to chatbot needs

    Tokenization breaks text into words, intent recognition identifies what the user wants, and response generation creates the answer.
  3. Final Answer:

    Tokenization + intent recognition + response generation -> Option A
  4. Quick Check:

    Chatbot basics = Tokenize + Intent + Response [OK]
Hint: Chatbots need understanding + intent + reply steps [OK]
Common Mistakes:
  • Confusing speech tasks with text understanding
  • Choosing unrelated NLP tasks like summarization
  • Mixing image tasks with NLP
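The three steps from question 5 can be sketched as a tiny rule-based pipeline. This is only an illustration under stated assumptions: the intents, keywords, and replies are invented for the example, and a real chatbot would normally use a trained intent classifier rather than keyword overlap.

```python
import re

# Hypothetical intents, keywords, and replies, invented for illustration only
INTENT_KEYWORDS = {
    "greeting": {"hello", "hi", "hey"},
    "order_status": {"order", "delivery", "shipping"},
}
RESPONSES = {
    "greeting": "Hello! How can I help?",
    "order_status": "Please share your order number so I can check it.",
    "unknown": "Sorry, I did not understand that.",
}

def tokenize(text):
    """Step 1: lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def recognize_intent(tokens):
    """Step 2: pick the intent whose keywords overlap most with the tokens."""
    token_set = set(tokens)
    scores = {intent: len(keywords & token_set)
              for intent, keywords in INTENT_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"

def chatbot_response(user_input):
    """Step 3: look up a canned reply for the recognized intent."""
    return RESPONSES[recognize_intent(tokenize(user_input))]

print(chatbot_response("Hi there"))            # prints: Hello! How can I help?
print(chatbot_response("Where is my order?"))  # prints: Please share your order number so I can check it.
```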