
Freezing and unfreezing layers in TensorFlow - Deep Dive

Overview - Freezing and unfreezing layers
What is it?
Freezing and unfreezing layers is a technique in machine learning where some parts of a neural network are kept fixed (frozen) during training, while others are allowed to change (unfrozen). This helps control which parts of the model learn from new data and which parts keep their previous knowledge. It is often used when adapting a pre-trained model to a new task. By freezing layers, we save time and avoid losing useful information already learned.
Why it matters
Without freezing layers, training a large model from scratch every time would be slow and require lots of data. Freezing lets us reuse knowledge from earlier training, making learning faster and more efficient. It also helps prevent forgetting important features when adapting to new tasks. This technique is key in transfer learning, which powers many practical AI applications like image recognition and language understanding.
Where it fits
Before learning freezing and unfreezing layers, you should understand neural networks, layers, and basic training concepts like backpropagation. After this, you can explore transfer learning, fine-tuning strategies, and advanced model optimization techniques.
Mental Model
Core Idea
Freezing layers means stopping some parts of a model from learning so that only selected parts update during training.
Think of it like...
Imagine a recipe book where some recipes are perfect and should not change (frozen), while others are new and need experimenting (unfrozen). You only rewrite the new recipes, keeping the good ones intact.
Model Layers
┌────────────────────┐
│ Layer 1 (Frozen)   │
├────────────────────┤
│ Layer 2 (Frozen)   │
├────────────────────┤
│ Layer 3 (Unfrozen) │
├────────────────────┤
│ Layer 4 (Unfrozen) │
└────────────────────┘

Training updates weights only in unfrozen layers.
Build-Up - 7 Steps
1
Foundation: What are model layers?
Concept: Understanding what layers are in a neural network and their role.
A neural network is made of layers stacked one after another. Each layer transforms input data into a more useful form. Layers have weights that the model learns during training to make better predictions.
Result
You know that layers are building blocks of models and have weights that change during training.
Knowing layers are the parts that learn helps you understand why freezing some layers affects learning.
2
Foundation: How training updates layers
Concept: Training changes layer weights using data and feedback to improve predictions.
During training, the model compares its predictions to true answers and calculates errors. It then adjusts weights in all layers to reduce errors using a method called backpropagation.
Result
You understand that training changes weights in every layer by default.
Recognizing that all layers update by default sets the stage for why freezing some layers is useful.
3
Intermediate: What does freezing layers mean?
🤔 Before reading on: do you think freezing layers means stopping their weights from changing or deleting those layers? Commit to your answer.
Concept: Freezing layers means preventing their weights from updating during training.
When you freeze a layer, you tell the training process not to change its weights. This keeps the knowledge in that layer intact. In TensorFlow, this is done by setting layer.trainable = False.
Result
Frozen layers keep their learned features unchanged during training.
Understanding freezing as stopping weight updates helps control which parts of the model learn new things.
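This can be checked directly in Keras: freezing a layer moves its weights out of the model's trainable set. A minimal sketch (the layer sizes here are arbitrary):

```python
import tensorflow as tf

# A small model: two Dense layers, each holding a kernel and a bias.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(4, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
print(len(model.trainable_weights))      # 4: kernel + bias for both layers

# Freeze the first layer: its weights leave the trainable set.
model.layers[0].trainable = False
print(len(model.trainable_weights))      # 2: only the second layer's weights
print(len(model.non_trainable_weights))  # 2: the frozen layer's weights
```

The frozen layer still participates in the forward pass; only the optimizer ignores it.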
4
Intermediate: Why freeze layers in transfer learning?
🤔 Before reading on: do you think freezing layers helps speed up training or slows it down? Commit to your answer.
Concept: Freezing layers helps reuse learned features and speeds up adapting models to new tasks.
In transfer learning, you start with a model trained on a large dataset. Freezing early layers keeps general features like edges or shapes, while you retrain later layers to learn new task-specific details.
Result
Training is faster and requires less data because only some layers learn.
Knowing freezing preserves useful features prevents losing valuable knowledge when adapting models.
5
Intermediate: How to unfreeze layers for fine-tuning
🤔 Before reading on: do you think unfreezing layers means making them trainable again or removing them? Commit to your answer.
Concept: Unfreezing layers means allowing their weights to update again during training.
After freezing some layers and training others, you can unfreeze some frozen layers to fine-tune the whole model. This lets the model adjust all weights slightly for better performance.
Result
The model can improve by refining all layers, not just the last ones.
Knowing when and how to unfreeze layers helps balance preserving knowledge and adapting to new data.
6
Advanced: Freezing layers in TensorFlow code
🤔 Before reading on: do you think setting layer.trainable = False before or after compiling the model matters? Commit to your answer.
Concept: How to freeze and unfreeze layers properly in TensorFlow with correct order.
In TensorFlow, you freeze layers by setting layer.trainable = False before compiling the model. Changing trainable after compiling has no effect until you recompile. Example:

```python
import tensorflow as tf

base_model = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False)
for layer in base_model.layers:
    layer.trainable = False  # freeze the pre-trained base

model = tf.keras.Sequential([
    base_model,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
```
Result
Frozen layers do not update during training; only unfrozen layers learn.
Understanding the compile step's role prevents common bugs where freezing seems ignored.
7
Expert: Surprises in freezing BatchNorm layers
🤔 Before reading on: do you think BatchNormalization layers behave like normal layers when frozen? Commit to your answer.
Concept: BatchNormalization layers hold moving statistics that follow different rules from ordinary weights, so freezing them needs care.
BatchNormalization layers track moving averages of the mean and variance of their inputs, and these statistics are updated by the forward pass rather than by gradients. In TensorFlow 2, setting trainable = False on a BatchNorm layer also switches it to inference mode, so the statistics stop updating; in older standalone Keras, they kept updating even when the layer was frozen, a classic source of bugs. Unfreezing BatchNorm layers during fine-tuning restarts the statistic updates on the new data, so the common advice is to keep them frozen and call the base model with training=False.
Result
How frozen BatchNorm layers behave depends on the Keras version, and unfreezing them restarts their internal statistic updates.
Knowing BatchNorm quirks helps avoid subtle bugs and improves fine-tuning results.
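The behavior is easy to probe. In TensorFlow 2, a BatchNormalization layer updates its moving mean while running in training mode, but once trainable = False it is forced into inference mode and the statistics stop moving (in older standalone Keras they kept updating). A small sketch:

```python
import numpy as np
import tensorflow as tf

bn = tf.keras.layers.BatchNormalization()
# Inputs with mean ~5 so the moving mean has somewhere to move.
x = tf.constant(np.random.normal(5.0, 1.0, size=(64, 4)), dtype=tf.float32)

bn(x, training=True)                 # training mode: statistics update
mean_after_update = bn.moving_mean.numpy().copy()

bn.trainable = False
bn(x, training=True)                 # frozen: TF2 forces inference mode
mean_after_freeze = bn.moving_mean.numpy()

print(np.allclose(mean_after_update, mean_after_freeze))  # True in TF2
```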
Under the Hood
When training a model, gradients are calculated for each trainable weight to decide how to update it. Freezing a layer removes its weights from the trainable set, so the optimizer never applies updates to them; gradients still flow through the frozen layer to reach earlier unfrozen layers. Some layers also hold non-trainable state: BatchNormalization's moving statistics are updated by the forward pass rather than by gradients, so they follow their own rules under freezing.
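A custom training step makes this concrete: gradients are taken only with respect to model.trainable_variables, which excludes frozen weights, so the optimizer step cannot touch them. A minimal sketch (layer sizes and data are arbitrary):

```python
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation='relu'),
    tf.keras.layers.Dense(1),
])
model.layers[0].trainable = False        # freeze the first layer
frozen_kernel = model.layers[0].kernel.numpy().copy()

x = tf.random.normal((16, 4))
y = tf.random.normal((16, 1))
optimizer = tf.keras.optimizers.SGD(0.1)

with tf.GradientTape() as tape:
    loss = tf.reduce_mean((model(x, training=True) - y) ** 2)

# Gradients are computed only for the unfrozen weights.
grads = tape.gradient(loss, model.trainable_variables)
optimizer.apply_gradients(zip(grads, model.trainable_variables))

print(np.allclose(frozen_kernel, model.layers[0].kernel.numpy()))  # True
```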
Why designed this way?
Freezing layers was introduced to enable transfer learning, where models trained on large datasets can be adapted efficiently to new tasks. It avoids retraining all weights, saving time and data. The design balances flexibility and efficiency by allowing selective training.
Training Flow
┌────────────────┐
│ Input Data     │
└───────┬────────┘
        │
┌───────▼────────┐
│ Forward Pass   │
│ (all layers)   │
└───────┬────────┘
        │
┌───────▼────────┐
│ Compute Loss   │
└───────┬────────┘
        │
┌───────▼────────┐
│ Backpropagation│
│ (gradients)    │
└───────┬────────┘
        │
┌───────▼────────┐
│ Update Weights │
│ (only unfrozen │
│ layers)        │
└────────────────┘
Myth Busters - 4 Common Misconceptions
Quick: does setting layer.trainable = False after compiling the model freeze the layer immediately? Commit yes or no.
Common Belief: Setting layer.trainable = False anytime freezes the layer immediately.
Reality: In TensorFlow, you must set trainable = False before compiling; otherwise, the change has no effect until you recompile.
Why it matters: Not recompiling after changing trainable leaves layers trainable, wasting time and causing unexpected training behavior.
Quick: do frozen layers never change any internal state during training? Commit yes or no.
Common Belief: Frozen layers do not change at all during training.
Reality: For ordinary layers this holds, but BatchNormalization keeps moving statistics that are updated by the forward pass rather than by gradients. In TensorFlow 2, freezing a BatchNorm layer also stops those updates, but in older Keras versions it did not, and unfreezing BatchNorm during fine-tuning restarts them.
Why it matters: Ignoring this can lead to subtle bugs and degraded model performance during fine-tuning.
Quick: does freezing layers always improve model performance on new tasks? Commit yes or no.
Common Belief: Freezing layers always helps improve transfer learning performance.
Reality: Freezing too many layers, or the wrong layers, can hurt performance by preventing necessary adaptation.
Why it matters: Blindly freezing layers can cause poor results; careful selection and unfreezing are needed.
Quick: is freezing layers the same as removing them from the model? Commit yes or no.
Common Belief: Freezing layers means removing them from the model.
Reality: Freezing keeps layers in the model but stops their weights from updating.
Why it matters: Confusing freezing with removal can lead to incorrect model design and errors.
Expert Zone
1
BatchNormalization layers require special handling because their moving statistics are updated by the forward pass rather than by gradients; unfreezing them during fine-tuning lets those statistics drift on the new data, which can hurt stability.
2
The order of freezing and compiling matters; changing trainable flags after compiling requires recompilation to take effect.
3
Partial freezing (freezing some layers but not others) can be combined with learning rate scheduling for better fine-tuning control.
When NOT to use
Freezing layers is not ideal when you have a large, task-specific dataset and want the model to fully adapt. In such cases, fine-tuning all layers, training from scratch, or using a smaller model may work better.
Production Patterns
In production, freezing early layers of pre-trained models is common to speed up training and reduce overfitting. Later, selective unfreezing and fine-tuning improve accuracy. Automated pipelines often freeze base models and add custom heads for specific tasks.
Connections
Transfer Learning
Freezing layers is a core technique used in transfer learning to reuse knowledge from pre-trained models.
Understanding freezing layers clarifies how transfer learning efficiently adapts models to new tasks without full retraining.
Gradient Descent Optimization
Freezing layers stops gradients from updating certain weights during gradient descent.
Knowing how freezing blocks gradients deepens understanding of optimization control in neural networks.
Software Version Control
Freezing layers is like locking parts of code to prevent changes, similar to version control locking files.
This cross-domain link shows how controlling change is a universal concept in managing complexity.
Common Pitfalls
#1 Changing layer.trainable after compiling without recompiling.
Wrong approach:
```python
for layer in model.layers:
    layer.trainable = False
# No recompilation: the change does not take effect
model.fit(data)
```
Correct approach:
```python
for layer in model.layers:
    layer.trainable = False
model.compile(optimizer='adam', loss='binary_crossentropy')
model.fit(data)
```
Root cause: Not realizing that trainable flags are captured at compile time.
#2 Unfreezing BatchNormalization layers during fine-tuning without special care.
Wrong approach:
```python
# Unfreeze everything, including BatchNorm layers
for layer in model.layers:
    layer.trainable = True
```
Correct approach:
```python
# Unfreeze the model but keep BatchNorm layers frozen
for layer in model.layers:
    layer.trainable = not isinstance(layer, tf.keras.layers.BatchNormalization)
```
Root cause: Not knowing that unfrozen BatchNorm layers recompute their moving statistics on the new data, which can destroy pre-trained features.
#3 Freezing all layers and expecting the model to learn a new task well.
Wrong approach:
```python
for layer in model.layers:
    layer.trainable = False
model.compile(...)
model.fit(new_data)
```
Correct approach:
```python
for layer in model.layers[:-3]:   # freeze the early layers
    layer.trainable = False
for layer in model.layers[-3:]:   # keep the last layers trainable
    layer.trainable = True
model.compile(...)
model.fit(new_data)
```
Root cause: Assuming a fully frozen model can still adapt to a new task.
Key Takeaways
Freezing layers stops their weights from updating during training, preserving learned features.
It is essential in transfer learning to reuse knowledge and speed up training on new tasks.
In TensorFlow, set layer.trainable before compiling the model to freeze layers correctly.
BatchNormalization layers behave differently from ordinary layers and are usually best kept frozen while fine-tuning the rest of the model.
Careful selection of which layers to freeze and unfreeze balances preserving knowledge and adapting to new data.