
First neural network in TensorFlow - Model Metrics & Evaluation

Metrics & Evaluation - First neural network
Which metric matters for this concept and WHY

When training your first neural network, the main metric to watch is accuracy: the fraction of predictions the model gets right. It is simple to interpret, which makes it a good starting point for beginners. Also track loss, the quantity the optimizer minimizes, to see whether training is still making progress. If your data has many more examples of one class than another, accuracy alone can mislead, so check precision and recall as well.

Confusion matrix or equivalent visualization (ASCII)
      Confusion Matrix Example:

                  Predicted
                   0    1
                 ---------
      Actual 0 | 50 | 10 |
                 ---------
      Actual 1 |  5 | 35 |
                 ---------

    Here:
    - True Positives (TP) = 35 (correctly predicted 1)
    - True Negatives (TN) = 50 (correctly predicted 0)
    - False Positives (FP) = 10 (wrongly predicted 1)
    - False Negatives (FN) = 5  (missed 1)
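
The four counts above are enough to compute the usual summary metrics. The following is a minimal sketch in plain Python; the variable names are illustrative and not part of any library.

```python
# Counts read from the confusion matrix above (rows = actual, columns = predicted).
tp, tn, fp, fn = 35, 50, 10, 5

total = tp + tn + fp + fn
accuracy = (tp + tn) / total     # share of all predictions that were right
precision = tp / (tp + fp)       # of the predicted 1s, how many were really 1
recall = tp / (tp + fn)          # of the actual 1s, how many were found

print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
# prints: accuracy=0.850 precision=0.778 recall=0.875
```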
    
Precision vs Recall tradeoff with concrete examples

Imagine your first neural network is a spam detector:

  • Precision means: When the model says "spam", how often is it really spam? High precision means fewer good emails get marked as spam.
  • Recall means: Of all the spam emails, how many did the model catch? High recall means fewer spam emails sneak into your inbox.

If you want to avoid missing spam, focus on recall. If you want to avoid losing good emails, focus on precision. Your first neural network might need tuning to find the right balance.
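
One common way to shift this balance is to move the decision threshold applied to the model's output probability. The sketch below uses made-up scores and labels (they are assumptions for illustration, not results from a real model) to show precision falling and recall rising as the threshold is lowered.

```python
# Hypothetical spam scores (estimated probability of spam) and true labels.
scores = [0.95, 0.85, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10]
labels = [1,    1,    0,    1,    1,    0,    0,    0]   # 1 = spam, 0 = not spam

def precision_recall(threshold):
    """Return (precision, recall) when scores >= threshold are flagged as spam."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 1)
    fp = sum(1 for p, y in zip(preds, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(preds, labels) if p == 0 and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

for t in (0.8, 0.5, 0.25):
    p, r = precision_recall(t)
    print(f"threshold={t:.2f} precision={p:.2f} recall={r:.2f}")
# prints:
# threshold=0.80 precision=1.00 recall=0.50
# threshold=0.50 precision=0.75 recall=0.75
# threshold=0.25 precision=0.67 recall=1.00
```

A strict threshold flags only the most certain spam (no good emails lost, but half the spam gets through); a lenient one catches all spam at the cost of flagging some good emails.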

What "good" vs "bad" metric values look like for this use case

For a simple first neural network on balanced data:

  • Good: Accuracy above 80%, loss steadily decreasing, precision and recall both above 75%.
  • Bad: Accuracy near 50% (like random guessing), loss not improving, very low precision or recall (below 50%).

Good metrics mean your network is learning patterns. Bad metrics mean it might be guessing or stuck.

Metrics pitfalls
  • Accuracy paradox: High accuracy can be misleading if one class dominates the data.
  • Data leakage: If test data leaks into training, metrics look unrealistically good.
  • Overfitting indicators: Training accuracy very high but test accuracy low means the model memorizes training data but fails on new data.
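
The overfitting indicator can be checked mechanically by comparing the last training accuracy with the last held-out accuracy. This is a rough sketch: the dictionary below is hand-written with invented numbers, shaped like the `history.history` dict Keras returns from `fit` when `metrics=['accuracy']` and validation data are supplied. Validation accuracy stands in for test accuracy here, and the 0.10 cutoff is an arbitrary assumption, not a standard value.

```python
# Hypothetical per-epoch values, shaped like Keras's `history.history` dict.
history = {
    "accuracy":     [0.70, 0.85, 0.93, 0.97, 0.99],   # training accuracy
    "val_accuracy": [0.68, 0.78, 0.80, 0.79, 0.78],   # held-out accuracy
}

def overfitting_gap(hist, max_gap=0.10):
    """Return (gap, flagged) for the last epoch; max_gap is an arbitrary cutoff."""
    gap = hist["accuracy"][-1] - hist["val_accuracy"][-1]
    return gap, gap > max_gap

gap, flagged = overfitting_gap(history)
print(f"final train/validation gap = {gap:.2f}, possible overfitting: {flagged}")
# prints: final train/validation gap = 0.21, possible overfitting: True
```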
Self-check question

Your first neural network has 98% accuracy but only 12% recall on the positive class (e.g., fraud). Is it good for production? Why or why not?

Answer: No, it is not good. The model misses most positive cases (only 12% recall), which is critical in fraud detection. High accuracy is misleading because most data is negative. You need to improve recall to catch more fraud.
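
To see how 98% accuracy and 12% recall can coexist, it helps to write out one set of counts that produces them. The dataset below is hypothetical, chosen only so the arithmetic matches the question; it also compares against a baseline that never flags fraud at all.

```python
# Hypothetical fraud dataset: 10,000 transactions, 200 of them fraudulent.
tp, fn = 24, 176        # fraud caught vs. fraud missed
fp, tn = 24, 9776       # legitimate flagged vs. legitimate passed

total = tp + fn + fp + tn
accuracy = (tp + tn) / total
recall = tp / (tp + fn)

# Baseline that labels everything as legitimate: all 200 fraud cases are missed,
# yet every legitimate transaction counts as a correct prediction.
baseline_accuracy = (fp + tn) / total

print(f"accuracy={accuracy:.2f} recall={recall:.2f} baseline_accuracy={baseline_accuracy:.2f}")
# prints: accuracy=0.98 recall=0.12 baseline_accuracy=0.98
```

With these counts, a model that does nothing scores the same accuracy, which is why accuracy alone cannot justify a production decision here.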

Key Result
Accuracy is key for first neural networks but watch precision and recall to understand true performance.

Practice

1. What is the main purpose of the compile method in a TensorFlow neural network model?
easy
A. To set the optimizer, loss function, and metrics for training
B. To add layers to the model
C. To train the model on data
D. To make predictions on new data

Solution

  1. Step 1: Understand the role of compile

    The compile method prepares the model for training by specifying how it learns, including the optimizer, loss function, and metrics.
  2. Step 2: Differentiate from other methods

    Adding layers is done before compiling, training is done with fit, and predictions use predict.
  3. Final Answer:

    To set the optimizer, loss function, and metrics for training -> Option A
  4. Quick Check:

    compile sets training details = A [OK]
Hint: Compile sets how the model learns before training [OK]
Common Mistakes:
  • Confusing compile with fit (training)
  • Thinking compile adds layers
  • Mixing compile with prediction
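
The division of labor between adding layers, compile, fit, and predict can be sketched end to end. This is an illustrative example only: the random data, layer sizes, and the choice of `binary_crossentropy` are assumptions made so the script runs on its own, and it requires TensorFlow and NumPy to be installed.

```python
import numpy as np
import tensorflow as tf

# Tiny random dataset, purely illustrative: 32 samples, 4 features, binary labels.
rng = np.random.default_rng(0)
x_train = rng.random((32, 4)).astype("float32")
y_train = rng.integers(0, 2, size=(32,)).astype("float32")

# 1) Add layers: this defines the architecture.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])

# 2) compile: choose optimizer, loss, and metrics. No training happens here.
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# 3) fit: train the model on data.
history = model.fit(x_train, y_train, epochs=2, verbose=0)

# 4) predict: run the trained model on inputs.
predictions = model.predict(x_train[:5], verbose=0)
print(predictions.shape)  # one probability per input row
```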
2. Which of the following is the correct way to add a dense hidden layer with 10 neurons and ReLU activation in TensorFlow?
easy
A. model.add(tf.keras.Dense(10, activation='relu'))
B. model.add(Dense(activation='relu', 10))
C. model.add(tf.keras.layers.Dense(10, activation='relu'))
D. model.add(tf.layers.Dense(activation='relu', units=10))

Solution

  1. Step 1: Recall correct TensorFlow syntax for adding layers

    The correct way is to use tf.keras.layers.Dense with units first, then activation as a named argument.
  2. Step 2: Check each option

    Option C, model.add(tf.keras.layers.Dense(10, activation='relu')), matches the correct syntax. Option A, tf.keras.Dense, omits layers from the module path. Option B places a positional argument after a keyword argument, which is a Python syntax error. Option D uses tf.layers, the TensorFlow 1.x API, which is not available as tf.layers in TensorFlow 2.
  3. Final Answer:

    model.add(tf.keras.layers.Dense(10, activation='relu')) -> Option C
  4. Quick Check:

    Correct layer syntax = C [OK]
Hint: Use tf.keras.layers.Dense(units, activation='relu') [OK]
Common Mistakes:
  • Wrong argument order in Dense layer
  • Using deprecated tf.layers instead of tf.keras.layers
  • Missing 'layers' in the import path
3. What will be the output shape of the model after adding these layers?
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(5, input_shape=(3,), activation='relu'))
model.add(tf.keras.layers.Dense(2, activation='softmax'))
print(model.output_shape)
medium
A. (None, 5)
B. (None, 2)
C. (None, 3)
D. (3, 2)

Solution

  1. Step 1: Understand input and output shapes

    The input shape is (3,), first layer outputs 5 units, second layer outputs 2 units.
  2. Step 2: Determine final output shape

    The model output shape is (None, 2) where None is batch size, 2 is output units.
  3. Final Answer:

    (None, 2) -> Option B
  4. Quick Check:

    Output units = 2 means shape (None, 2) [OK]
Hint: Output shape matches last layer units with batch size None [OK]
Common Mistakes:
  • Confusing input shape with output shape
  • Ignoring batch size dimension None
  • Mixing layer units and input dimensions
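
The shape reasoning above can be checked directly. The sketch below declares the input with `tf.keras.Input` instead of the `input_shape` argument, which is a stylistic substitution on my part rather than the question's code; it assumes TensorFlow is installed.

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(3,)),                       # 3 input features
    tf.keras.layers.Dense(5, activation="relu"),      # produces (None, 5)
    tf.keras.layers.Dense(2, activation="softmax"),   # produces (None, 2)
])

# Parameters per Dense layer: inputs * units weights, plus one bias per unit.
for layer in model.layers:
    print(layer.name, layer.count_params())

print(model.output_shape)  # batch dimension is None, last layer has 2 units
```

The first Dense layer holds 3 * 5 + 5 = 20 parameters and the second 5 * 2 + 2 = 12, which confirms that each layer's input size is the previous layer's unit count.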
4. Identify the error in this code snippet for creating a simple neural network:
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(10, activation='relu'))
model.compile(optimizer='adam', loss='mse')
model.summary()
model.fit(x_train, y_train, epochs=5)
medium
A. Optimizer 'adam' is not supported
B. Loss function 'mse' is invalid
C. fit method requires batch_size argument
D. Missing input shape in the first layer

Solution

  1. Step 1: Check layer definition

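    The first Dense layer has no input shape, so Keras cannot create the layer's weights until it sees data. model.summary() is called before fit, while the model is still unbuilt; depending on the Keras version, this either raises a ValueError or prints a summary with unknown shapes and parameter counts.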
  2. Step 2: Verify other parts

    Loss 'mse' and optimizer 'adam' are valid. Batch size is optional in fit.
  3. Final Answer:

    Missing input shape in the first layer -> Option D
  4. Quick Check:

    Input shape needed in first layer = D [OK]
Hint: Always specify input shape in first layer [OK]
Common Mistakes:
  • Skipping input_shape in first layer
  • Thinking batch_size is mandatory in fit
  • Confusing loss and optimizer names
5. You want to build a neural network to classify images into 3 categories. Which statement about the following model setup is correct?
model = tf.keras.Sequential([
  tf.keras.layers.Flatten(input_shape=(28,28)),
  tf.keras.layers.Dense(64, activation='relu'),
  tf.keras.layers.Dense(3, activation='softmax')
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
hard
A. Correct setup for multi-class classification
B. Use sigmoid activation in last layer instead of softmax
C. Use mean squared error loss for classification
D. Missing Flatten layer before Dense layers

Solution

  1. Step 1: Analyze model layers

    Flatten converts 2D image to 1D, Dense with 64 units and ReLU is hidden layer, final Dense with 3 units and softmax outputs class probabilities.
  2. Step 2: Check compile settings

    Optimizer 'adam' is good, loss 'sparse_categorical_crossentropy' fits multi-class with integer labels, metrics include accuracy.
  3. Final Answer:

    Correct setup for multi-class classification -> Option A
  4. Quick Check:

    Softmax + sparse_categorical_crossentropy = A [OK]
Hint: Use softmax and sparse_categorical_crossentropy for multi-class [OK]
Common Mistakes:
  • Using sigmoid for multi-class output
  • Using MSE loss for classification
  • Skipping Flatten for image input
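
To make the pairing of softmax with sparse categorical cross-entropy concrete, here is a plain-Python sketch of the arithmetic for a single sample. The function names mirror the Keras loss name for readability, but they are hand-written helpers, not library calls, and the logits are invented.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def sparse_categorical_crossentropy(label, probs):
    """Loss for one sample; label is an integer class index, not a one-hot vector."""
    return -math.log(probs[label])

logits = [2.0, 1.0, 0.1]                  # hypothetical raw scores for 3 classes
probs = softmax(logits)

loss_if_class_0 = sparse_categorical_crossentropy(0, probs)
loss_if_class_2 = sparse_categorical_crossentropy(2, probs)
print([round(p, 3) for p in probs], round(loss_if_class_0, 3), round(loss_if_class_2, 3))
```

The loss is small when the true label is the class the model favors (class 0 here) and large when the true label is a class it considers unlikely (class 2), which is the signal training uses to adjust the weights.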