When inspecting model parameters, the key metrics are parameter count and parameter distribution. These tell us how complex the model is and whether the weights reflect useful learned patterns. We look at parameter values to check whether they are too large, too small, or stuck at zero; extreme or all-zero values can signal problems such as exploding weights, dead layers, or a model that underfits.
Unlike classification metrics, parameter inspection does not use a confusion matrix. Instead, we use parameter histograms or summary tables that report counts and statistics such as the mean and standard deviation of weights and biases.
Parameter summary example:
Layer: Conv2d(3, 16, kernel=3x3)
- Total params: 432 (weights only; the bias adds 16)
- Mean weight: 0.012
- Std weight: 0.045
- Min weight: -0.12
- Max weight: 0.15
Layer: Linear(128, 10)
- Total params: 1290 (1280 weights + 10 biases)
- Mean weight: 0.005
- Std weight: 0.03
- Min weight: -0.08
- Max weight: 0.09
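A summary like the one above can be produced with a short loop over model.named_parameters(). The sketch below uses a small hypothetical model (the Conv2d layer mirrors the example); the summarize_parameters helper is not a PyTorch function, just an illustration:

```python
import torch
import torch.nn as nn

# Hypothetical model for illustration; matches the Conv2d example above.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3),
)

def summarize_parameters(model):
    """Return per-parameter statistics: count, mean, std, min, max."""
    rows = []
    for name, p in model.named_parameters():
        t = p.detach()
        rows.append({
            "name": name,
            "count": t.numel(),
            "mean": t.mean().item(),
            "std": t.std().item(),
            "min": t.min().item(),
            "max": t.max().item(),
        })
    return rows

for row in summarize_parameters(model):
    print(f"{row['name']}: count={row['count']}, "
          f"mean={row['mean']:.4f}, std={row['std']:.4f}, "
          f"min={row['min']:.4f}, max={row['max']:.4f}")
```

The conv layer reports 432 weights plus 16 biases, matching the hand-computed table (3 input channels x 16 output channels x 3x3 kernel = 432).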
A model with too few parameters may not learn well (underfitting). Too many parameters can memorize training data but fail on new data (overfitting). Inspecting parameters helps find a balance. For example, a small model with 1,000 parameters might be fast but less accurate, while a large model with 1 million parameters might be accurate but slow and prone to overfitting.
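Counting parameters is a one-liner in PyTorch. A minimal sketch, assuming a small two-layer model chosen purely for illustration:

```python
import torch.nn as nn

# Hypothetical model; the layer sizes are arbitrary.
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

# numel() gives the number of elements in each parameter tensor.
total = sum(p.numel() for p in model.parameters())
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"Total: {total}, trainable: {trainable}")
```

Comparing the total against a hand calculation (128x64 + 64 + 64x10 + 10 = 8906 here) is a quick sanity check that the model was built as designed.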
Good: Parameters have varied values, not all zeros or extremely large numbers. Weight distributions look roughly normal (bell-shaped), as expected after standard initialization. The parameter count matches the model design.
Bad: Parameters are all zeros or very close to zero (a dead layer). Parameters have extremely large values (exploding weights). The parameter count is unexpectedly low or high, indicating a bug in the model definition.
- Ignoring parameter scale: Very large or very small weights can cause training issues.
- Confusing parameter count with model quality: More parameters do not always mean better models.
- Not checking biases separately: Biases affect output shifts and need inspection.
- Overlooking frozen parameters: Parameters with requires_grad=False never receive gradient updates; if a layer is frozen unintentionally, that part of the model cannot learn.
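The last two pitfalls are easy to check programmatically. A minimal sketch, using a hypothetical two-layer model with one layer deliberately frozen to show what the check reports:

```python
import torch.nn as nn

# Hypothetical model; we freeze the first layer to simulate the pitfall.
model = nn.Sequential(nn.Linear(32, 16), nn.Linear(16, 4))
for p in model[0].parameters():
    p.requires_grad = False

# List parameters that will not be updated by the optimizer.
frozen = [n for n, p in model.named_parameters() if not p.requires_grad]

# Inspect biases separately from weights, as recommended above.
bias_stats = {n: p.detach().abs().mean().item()
              for n, p in model.named_parameters() if n.endswith("bias")}

print("Frozen parameters:", frozen)
print("Mean |bias| per layer:", bias_stats)
```

Running a check like this before training catches accidentally frozen layers early, rather than discovering them after a failed training run.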
Your model has 10,000 parameters, but after training, all weights in one layer are zero. Is this good? Why or why not?
Answer: No, this is not good. All-zero weights mean that layer contributes nothing and is not learning. Possible causes include a bug in the model definition, an unintentionally frozen or disconnected layer, or overly aggressive regularization driving weights to zero. You should investigate why the weights are stuck at zero.
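The dead-layer situation from the exercise can be detected with a simple scan. A minimal sketch, using a hypothetical model whose first layer is zeroed out to reproduce the scenario:

```python
import torch
import torch.nn as nn

# Hypothetical model; zero the first layer's weights to simulate a dead layer.
model = nn.Sequential(nn.Linear(8, 8), nn.Linear(8, 2))
with torch.no_grad():
    model[0].weight.zero_()

# A weight tensor whose largest absolute value is 0 is entirely zero.
dead = [n for n, p in model.named_parameters()
        if n.endswith("weight") and p.detach().abs().max().item() == 0.0]

print("Dead layers:", dead)
```

In practice you might use a small threshold instead of an exact zero comparison, since weights near (but not exactly) zero indicate the same problem.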