
Freezing and unfreezing layers in TensorFlow - Deep Dive

Overview - Freezing and unfreezing layers
What is it?
Freezing and unfreezing layers is a technique in machine learning where some parts of a neural network are kept fixed (frozen) during training, while others are allowed to change (unfrozen). This helps control which parts of the model learn from new data and which parts keep their previous knowledge. It is often used when adapting a pre-trained model to a new task. By freezing layers, we save time and avoid losing useful information already learned.
Why it matters
Without freezing layers, training a large model from scratch every time would be slow and require lots of data. Freezing lets us reuse knowledge from earlier training, making learning faster and more efficient. It also helps prevent forgetting important features when adapting to new tasks. This technique is key in transfer learning, which powers many practical AI applications like image recognition and language understanding.
Where it fits
Before learning freezing and unfreezing layers, you should understand neural networks, layers, and basic training concepts like backpropagation. After this, you can explore transfer learning, fine-tuning strategies, and advanced model optimization techniques.
Mental Model
Core Idea
Freezing layers means stopping some parts of a model from learning so that only selected parts update during training.
Think of it like...
Imagine a recipe book where some recipes are perfect and should not change (frozen), while others are new and need experimenting (unfrozen). You only rewrite the new recipes, keeping the good ones intact.
Model Layers
┌────────────────────┐
│ Layer 1 (Frozen)   │
├────────────────────┤
│ Layer 2 (Frozen)   │
├────────────────────┤
│ Layer 3 (Unfrozen) │
├────────────────────┤
│ Layer 4 (Unfrozen) │
└────────────────────┘

Training updates weights only in unfrozen layers.
Build-Up - 7 Steps
1
Foundation: What are model layers?
Concept: Understanding what layers are in a neural network and their role.
A neural network is made of layers stacked one after another. Each layer transforms input data into a more useful form. Layers have weights that the model learns during training to make better predictions.
Result
You know that layers are building blocks of models and have weights that change during training.
Knowing layers are the parts that learn helps you understand why freezing some layers affects learning.
2
Foundation: How training updates layers
Concept: Training changes layer weights using data and feedback to improve predictions.
During training, the model compares its predictions to true answers and calculates errors. It then adjusts weights in all layers to reduce errors using a method called backpropagation.
Result
You understand that training changes weights in every layer by default.
Recognizing that all layers update by default sets the stage for why freezing some layers is useful.
3
Intermediate: What does freezing layers mean?
🤔 Before reading on: do you think freezing layers means stopping their weights from changing or deleting those layers? Commit to your answer.
Concept: Freezing layers means preventing their weights from updating during training.
When you freeze a layer, you tell the training process not to change its weights. This keeps the knowledge in that layer intact. In TensorFlow, this is done by setting layer.trainable = False.
Result
Frozen layers keep their learned features unchanged during training.
Understanding freezing as stopping weight updates helps control which parts of the model learn new things.
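This can be checked directly in Keras: freezing a layer moves its weights out of the model's trainable set. A minimal sketch (the layer sizes here are arbitrary):

```python
import tensorflow as tf

# A small model: two Dense layers, each holding a kernel and a bias.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(8,)),
    tf.keras.layers.Dense(4, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
print(len(model.trainable_weights))      # 4: kernel + bias for both layers

# Freeze the first layer: its weights leave the trainable set.
model.layers[0].trainable = False
print(len(model.trainable_weights))      # 2: only the second layer's weights
print(len(model.non_trainable_weights))  # 2: the frozen layer's weights
```

The frozen layer still participates in the forward pass; only the optimizer ignores it.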
4
Intermediate: Why freeze layers in transfer learning?
🤔 Before reading on: do you think freezing layers helps speed up training or slows it down? Commit to your answer.
Concept: Freezing layers helps reuse learned features and speeds up adapting models to new tasks.
In transfer learning, you start with a model trained on a large dataset. Freezing early layers keeps general features like edges or shapes, while you retrain later layers to learn new task-specific details.
Result
Training is faster and requires less data because only some layers learn.
Knowing freezing preserves useful features prevents losing valuable knowledge when adapting models.
5
Intermediate: How to unfreeze layers for fine-tuning
🤔 Before reading on: do you think unfreezing layers means making them trainable again or removing them? Commit to your answer.
Concept: Unfreezing layers means allowing their weights to update again during training.
After freezing some layers and training others, you can unfreeze some frozen layers to fine-tune the whole model. This lets the model adjust all weights slightly for better performance.
Result
The model can improve by refining all layers, not just the last ones.
Knowing when and how to unfreeze layers helps balance preserving knowledge and adapting to new data.
6
Advanced: Freezing layers in TensorFlow code
🤔 Before reading on: do you think setting layer.trainable = False before or after compiling the model matters? Commit to your answer.
Concept: How to freeze and unfreeze layers properly in TensorFlow with correct order.
In TensorFlow, you freeze layers by setting layer.trainable = False before compiling the model. Changing trainable after compiling has no effect until you recompile. Example:

```python
import tensorflow as tf

base_model = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False)
for layer in base_model.layers:
    layer.trainable = False  # freeze the pre-trained base

model = tf.keras.Sequential([
    base_model,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
```
Result
Frozen layers do not update during training; only unfrozen layers learn.
Understanding the compile step's role prevents common bugs where freezing seems ignored.
7
Expert: Surprises in freezing BatchNorm layers
🤔 Before reading on: do you think BatchNormalization layers behave like normal layers when frozen? Commit to your answer.
Concept: BatchNormalization layers hold moving statistics that follow different rules from ordinary weights, so freezing them needs care.
BatchNormalization layers track moving averages of the mean and variance of their inputs, and these statistics are updated by the forward pass rather than by gradients. In TensorFlow 2, setting trainable = False on a BatchNorm layer also switches it to inference mode, so the statistics stop updating; in older standalone Keras, they kept updating even when the layer was frozen, a classic source of bugs. Unfreezing BatchNorm layers during fine-tuning restarts the statistic updates on the new data, so the common advice is to keep them frozen and call the base model with training=False.
Result
How frozen BatchNorm layers behave depends on the Keras version, and unfreezing them restarts their internal statistic updates.
Knowing BatchNorm quirks helps avoid subtle bugs and improves fine-tuning results.
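The behavior is easy to probe. In TensorFlow 2, a BatchNormalization layer updates its moving mean while running in training mode, but once trainable = False it is forced into inference mode and the statistics stop moving (in older standalone Keras they kept updating). A small sketch:

```python
import numpy as np
import tensorflow as tf

bn = tf.keras.layers.BatchNormalization()
# Inputs with mean ~5 so the moving mean has somewhere to move.
x = tf.constant(np.random.normal(5.0, 1.0, size=(64, 4)), dtype=tf.float32)

bn(x, training=True)                 # training mode: statistics update
mean_after_update = bn.moving_mean.numpy().copy()

bn.trainable = False
bn(x, training=True)                 # frozen: TF2 forces inference mode
mean_after_freeze = bn.moving_mean.numpy()

print(np.allclose(mean_after_update, mean_after_freeze))  # True in TF2
```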
Under the Hood
When training a model, gradients are calculated for each trainable weight to decide how to update it. Freezing a layer removes its weights from the trainable set, so the optimizer never applies updates to them; gradients still flow through the frozen layer to reach earlier unfrozen layers. Some layers also hold non-trainable state: BatchNormalization's moving statistics are updated by the forward pass rather than by gradients, so they follow their own rules under freezing.
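A custom training step makes this concrete: gradients are taken only with respect to model.trainable_variables, which excludes frozen weights, so the optimizer step cannot touch them. A minimal sketch (layer sizes and data are arbitrary):

```python
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation='relu'),
    tf.keras.layers.Dense(1),
])
model.layers[0].trainable = False        # freeze the first layer
frozen_kernel = model.layers[0].kernel.numpy().copy()

x = tf.random.normal((16, 4))
y = tf.random.normal((16, 1))
optimizer = tf.keras.optimizers.SGD(0.1)

with tf.GradientTape() as tape:
    loss = tf.reduce_mean((model(x, training=True) - y) ** 2)

# Gradients are computed only for the unfrozen weights.
grads = tape.gradient(loss, model.trainable_variables)
optimizer.apply_gradients(zip(grads, model.trainable_variables))

print(np.allclose(frozen_kernel, model.layers[0].kernel.numpy()))  # True
```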
Why designed this way?
Freezing layers was introduced to enable transfer learning, where models trained on large datasets can be adapted efficiently to new tasks. It avoids retraining all weights, saving time and data. The design balances flexibility and efficiency by allowing selective training.
Training Flow
┌────────────────┐
│ Input Data     │
└───────┬────────┘
        │
┌───────▼────────┐
│ Forward Pass   │
│ (all layers)   │
└───────┬────────┘
        │
┌───────▼────────┐
│ Compute Loss   │
└───────┬────────┘
        │
┌───────▼────────┐
│ Backpropagation│
│ (gradients)    │
└───────┬────────┘
        │
┌───────▼────────┐
│ Update Weights │
│ (only unfrozen │
│ layers)        │
└────────────────┘
Myth Busters - 4 Common Misconceptions
Quick: does setting layer.trainable = False after compiling the model freeze the layer immediately? Commit yes or no.
Common Belief: Setting layer.trainable = False anytime freezes the layer immediately.
Reality: In TensorFlow, you must set trainable = False before compiling; otherwise, the change has no effect until you recompile.
Why it matters: Not recompiling after changing trainable leaves layers trainable, wasting time and causing unexpected training behavior.
Quick: do frozen layers never change any internal state during training? Commit yes or no.
Common Belief: Frozen layers do not change at all during training.
Reality: For ordinary layers this holds, but BatchNormalization keeps moving statistics that are updated by the forward pass rather than by gradients. In TensorFlow 2, freezing a BatchNorm layer also stops those updates, but in older Keras versions it did not, and unfreezing BatchNorm during fine-tuning restarts them.
Why it matters: Ignoring this can lead to subtle bugs and degraded model performance during fine-tuning.
Quick: does freezing layers always improve model performance on new tasks? Commit yes or no.
Common Belief: Freezing layers always helps improve transfer learning performance.
Reality: Freezing too many layers, or the wrong layers, can hurt performance by preventing necessary adaptation.
Why it matters: Blindly freezing layers can cause poor results; careful selection and unfreezing are needed.
Quick: is freezing layers the same as removing them from the model? Commit yes or no.
Common Belief: Freezing layers means removing them from the model.
Reality: Freezing keeps layers in the model but stops their weights from updating.
Why it matters: Confusing freezing with removal can lead to incorrect model design and errors.
Expert Zone
1
BatchNormalization layers require special handling because their moving statistics are updated by the forward pass rather than by gradients; unfreezing them during fine-tuning lets those statistics drift on the new data, which can hurt stability.
2
The order of freezing and compiling matters; changing trainable flags after compiling requires recompilation to take effect.
3
Partial freezing (freezing some layers but not others) can be combined with learning rate scheduling for better fine-tuning control.
When NOT to use
Freezing layers is not ideal when you have a large, task-specific dataset and want the model to fully adapt. In such cases, fine-tuning all layers, training from scratch, or using a smaller model may work better.
Production Patterns
In production, freezing early layers of pre-trained models is common to speed up training and reduce overfitting. Later, selective unfreezing and fine-tuning improve accuracy. Automated pipelines often freeze base models and add custom heads for specific tasks.
Connections
Transfer Learning
Freezing layers is a core technique used in transfer learning to reuse knowledge from pre-trained models.
Understanding freezing layers clarifies how transfer learning efficiently adapts models to new tasks without full retraining.
Gradient Descent Optimization
Freezing layers stops gradients from updating certain weights during gradient descent.
Knowing how freezing blocks gradients deepens understanding of optimization control in neural networks.
Software Version Control
Freezing layers is like locking parts of code to prevent changes, similar to version control locking files.
This cross-domain link shows how controlling change is a universal concept in managing complexity.
Common Pitfalls
#1 Changing layer.trainable after compiling without recompiling.
Wrong approach:
```python
for layer in model.layers:
    layer.trainable = False
# No recompilation: the change does not take effect
model.fit(data)
```
Correct approach:
```python
for layer in model.layers:
    layer.trainable = False
model.compile(optimizer='adam', loss='binary_crossentropy')
model.fit(data)
```
Root cause: Not realizing that trainable flags are captured at compile time.
#2 Unfreezing BatchNormalization layers during fine-tuning without special care.
Wrong approach:
```python
# Unfreeze everything, including BatchNorm layers
for layer in model.layers:
    layer.trainable = True
```
Correct approach:
```python
# Unfreeze the model but keep BatchNorm layers frozen
for layer in model.layers:
    layer.trainable = not isinstance(layer, tf.keras.layers.BatchNormalization)
```
Root cause: Not knowing that unfrozen BatchNorm layers recompute their moving statistics on the new data, which can destroy pre-trained features.
#3 Freezing all layers and expecting the model to learn a new task well.
Wrong approach:
```python
for layer in model.layers:
    layer.trainable = False
model.compile(...)
model.fit(new_data)
```
Correct approach:
```python
for layer in model.layers[:-3]:   # freeze the early layers
    layer.trainable = False
for layer in model.layers[-3:]:   # keep the last layers trainable
    layer.trainable = True
model.compile(...)
model.fit(new_data)
```
Root cause: Assuming a fully frozen model can still adapt to a new task.
Key Takeaways
Freezing layers stops their weights from updating during training, preserving learned features.
It is essential in transfer learning to reuse knowledge and speed up training on new tasks.
In TensorFlow, set layer.trainable before compiling the model to freeze layers correctly.
BatchNormalization layers behave differently from ordinary layers and are usually best kept frozen while fine-tuning the rest of the model.
Careful selection of which layers to freeze and unfreeze balances preserving knowledge and adapting to new data.