Dynamic computation graph advantage in PyTorch - Model Metrics & Evaluation
When using dynamic computation graphs, the key metrics to consider are training iteration time and memory efficiency. Dynamic graphs build the model structure on the fly with every forward pass, which buys flexibility but can affect speed and memory use. Monitoring training speed and GPU memory usage tells you whether the dynamic-graph advantage is actually realized in your task.
There is no confusion matrix for evaluating dynamic computation graphs themselves, but we can compare the resource usage of the two approaches:
+----------------------+----------------------+----------------------+
| Metric               | Static Graph Model   | Dynamic Graph Model  |
+----------------------+----------------------+----------------------+
| Training Time (sec)  | 120                  | 130                  |
| Memory Usage (MB)    | 4000                 | 3500                 |
| Flexibility          | Low                  | High                 |
+----------------------+----------------------+----------------------+
This table (with illustrative numbers) shows that dynamic graphs may use less memory but can take slightly longer per iteration, because the graph is rebuilt on every forward pass.
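To collect numbers like those in the table for your own model, you can time a few training iterations directly. Below is a minimal sketch using a toy model chosen purely for illustration; on a GPU you would additionally call torch.cuda.synchronize() before reading the clock and torch.cuda.max_memory_allocated() for peak memory.

```python
import time
import torch
from torch import nn

# Toy model and data, purely for illustration.
model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 10))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()
x = torch.randn(32, 64)
y = torch.randint(0, 10, (32,))

start = time.perf_counter()
for _ in range(10):
    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()   # the graph built by this forward pass is freed here
    optimizer.step()
elapsed = time.perf_counter() - start
print(f"avg iteration time: {elapsed / 10 * 1000:.2f} ms")
```

Averaging over several iterations (and discarding the first, which includes warm-up costs) gives a more stable estimate than timing a single step.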
For dynamic computation graphs, the tradeoff is between flexibility and performance:
- Flexibility: Dynamic graphs let you change model structure during training, useful for variable input sizes or complex models.
- Performance: Static graphs can be faster and more memory efficient because the graph is fixed and optimized ahead of time.
Example: If you want to experiment with different model layers on the fly, dynamic graphs help. But if you want maximum speed for a fixed model, static graphs might be better.
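The flexibility side of the tradeoff is easiest to see in code. The toy module below (an illustrative example, not a real architecture) changes its depth per input using ordinary Python control flow; PyTorch records whatever actually ran, so each call can produce a different graph:

```python
import torch
from torch import nn

class DynamicDepthNet(nn.Module):
    """Toy model whose depth varies per input (illustrative only)."""
    def __init__(self):
        super().__init__()
        self.layer = nn.Linear(16, 16)
        self.head = nn.Linear(16, 2)

    def forward(self, x):
        # Plain Python control flow decides the graph on each call:
        # inputs with larger magnitude pass through the shared layer more times.
        depth = 1 if x.abs().mean() < 0.5 else 3
        for _ in range(depth):
            x = torch.relu(self.layer(x))
        return self.head(x)

model = DynamicDepthNet()
out_small = model(torch.zeros(4, 16))      # takes the depth-1 path
out_large = model(torch.ones(4, 16) * 2)   # takes the depth-3 path
print(out_small.shape, out_large.shape)
```

A static graph would need both paths declared up front; here backpropagation simply follows whichever path was executed.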
Good values for dynamic computation graph advantage:
- Training iteration time close to static graph baseline (e.g., within 10-15%)
- Lower or comparable memory usage due to on-demand graph building
- Ability to handle variable input sizes or dynamic model changes without errors
Bad values:
- Significantly slower training (e.g., 30%+ slower than static graph)
- Memory usage spikes or leaks due to graph rebuilding
- Errors or crashes when model structure changes dynamically
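A common cause of the memory-growth failure mode above is accumulating loss tensors instead of Python numbers: each stored tensor keeps its entire computation graph alive. A minimal sketch of the leaky pattern and its fix:

```python
import torch
from torch import nn

model = nn.Linear(8, 1)
losses_leaky, losses_ok = [], []
for _ in range(5):
    loss = (model(torch.randn(4, 8)) ** 2).mean()
    losses_leaky.append(loss)        # keeps each iteration's graph alive -> memory grows
    losses_ok.append(loss.item())    # converts to a plain float -> graph can be freed
```

Logging `loss.item()` (or `loss.detach()`) lets autograd release each iteration's graph as soon as `backward()` has run.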
Common pitfalls when evaluating dynamic computation graphs include:
- Ignoring overhead: Not accounting for extra time spent rebuilding graphs each iteration can mislead you about training speed.
- Memory leaks: Dynamic graphs can cause memory to grow if not properly cleared, leading to crashes.
- Comparing apples to oranges: Comparing dynamic graph models with different architectures or batch sizes without controlling variables can confuse results.
- Overfitting to speed: Optimizing only for speed might reduce model flexibility and accuracy.
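To avoid the apples-to-oranges pitfall, hold everything except the graph strategy constant. Here is a sketch with a hypothetical helper, `timed_run`, that fixes the seed, batch size, and data so two variants are compared like-for-like:

```python
import time
import torch
from torch import nn

def timed_run(make_model, iters=20, batch_size=32, seed=0):
    """Hypothetical helper: average per-iteration time under fixed conditions."""
    torch.manual_seed(seed)          # identical init and data for every variant
    model = make_model()
    opt = torch.optim.SGD(model.parameters(), lr=0.01)
    x = torch.randn(batch_size, 32)
    y = torch.randn(batch_size, 1)
    start = time.perf_counter()
    for _ in range(iters):
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(x), y)
        loss.backward()
        opt.step()
    return (time.perf_counter() - start) / iters

make = lambda: nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 1))
eager_t = timed_run(make)  # plain dynamic (eager) execution
# On PyTorch 2.x you could compare against a compiled variant:
# compiled_t = timed_run(lambda: torch.compile(make()))
print(f"eager: {eager_t * 1000:.2f} ms/iter")
```

Only after this kind of controlled run does a "dynamic is X% slower" claim mean anything.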