When deploying models on Jetson Nano, key metrics include inference speed (latency), accuracy, and power consumption.
Inference speed matters because Jetson Nano has limited computing power, so the model must run fast enough for real-time use.
Accuracy ensures the model makes correct predictions.
Power consumption is important since Jetson Nano is often used in low-power or mobile setups.
Jetson Nano deployment in Computer Vision - Model Metrics & Evaluation
Actual \ Predicted | Positive | Negative
-------------------|----------|---------
Positive | 80 | 20
Negative | 10 | 90
This shows 80 true positives (TP), 20 false negatives (FN), 10 false positives (FP), and 90 true negatives (TN).
From this, precision = 80 / (80 + 10) ≈ 0.89 and recall = 80 / (80 + 20) = 0.80.
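The same arithmetic can be written as a short script. This is a minimal sketch that uses only the four counts from the confusion matrix above; the variable names are illustrative.

```python
# Counts read from the confusion matrix above.
tp, fn = 80, 20  # actual positives: predicted positive / predicted negative
fp, tn = 10, 90  # actual negatives: predicted positive / predicted negative

# Precision: of everything predicted positive, how much was really positive.
precision_value = tp / (tp + fp)

# Recall: of everything really positive, how much was found.
recall_value = tp / (tp + fn)

# Accuracy: share of all predictions that were correct.
accuracy_value = (tp + tn) / (tp + fn + fp + tn)

print(f"precision = {precision_value:.2f}")  # 0.89
print(f"recall    = {recall_value:.2f}")     # 0.80
print(f"accuracy  = {accuracy_value:.2f}")   # 0.85
```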
For example, if Jetson Nano runs a security camera model detecting intruders:
- High recall means catching most intruders (few missed detections).
- High precision means few false alarms (few false intruder alerts).
If recall is low, intruders might be missed, which is bad.
If precision is low, many false alarms waste attention.
Depending on use, you may prioritize recall (security) or precision (reduce false alarms).
Good metrics:
- Accuracy above 85% for reliable predictions.
- Inference latency under 100 milliseconds for smooth real-time use.
- Power consumption low enough to run on battery or limited power supply.
Bad metrics:
- Accuracy below 70%, causing many wrong predictions.
- Inference latency over 500 milliseconds, causing lag.
- High power use, draining battery quickly or overheating device.
- Ignoring latency: A model with high accuracy but slow speed is unusable in real-time.
- Overfitting: Model performs well on training data but poorly on real Jetson Nano inputs.
- Data leakage: Test data that overlaps with or closely duplicates the training data inflates accuracy falsely.
- Power spikes: If power use is never measured, spikes can go unnoticed until the device overheats or shuts down.
- Not testing in real environment: Metrics from desktop may not reflect Jetson Nano performance.
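Latency can be measured with a small timing harness run on the device itself. The sketch below is a minimal illustration, not a Jetson-specific benchmark: `measure_latency_ms` and `fake_infer` are hypothetical names, and `fake_infer` only sleeps to stand in for a real model call. With real GPU inference, any asynchronous work would need to finish before the timer stops, otherwise the numbers come out too low.

```python
import statistics
import time


def measure_latency_ms(infer, sample, warmup=5, runs=50):
    """Return (mean, worst) latency in milliseconds for infer(sample)."""
    # Warm-up calls are not timed; the first calls are often slower.
    for _ in range(warmup):
        infer(sample)

    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        infer(sample)
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(timings), max(timings)


def fake_infer(sample):
    # Hypothetical stand-in for a real model call: just waits about 2 ms.
    time.sleep(0.002)
    return sample


mean_ms, worst_ms = measure_latency_ms(fake_infer, sample=None, warmup=2, runs=10)
print(f"mean latency:  {mean_ms:.1f} ms")
print(f"worst latency: {worst_ms:.1f} ms")
```

Reporting the worst case alongside the mean helps because a real-time system is limited by its slowest frames, not its average ones.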
Your Jetson Nano model has 98% accuracy but only 12% recall on detecting intruders. Is it good for production? Why or why not?
Answer: No, it is not good. Although accuracy is high, recall is very low, meaning the model misses most intruders. For security, missing intruders is dangerous, so recall must be much higher.
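High accuracy and low recall can occur together when intruders are rare. The counts below are hypothetical, chosen only so that they reproduce 98% accuracy and 12% recall; they are not measurements from a real model.

```python
# Hypothetical test set: 10,000 frames, only 200 of which contain an intruder.
tp = 24    # intruders detected
fn = 176   # intruders missed
fp = 24    # false alarms
tn = 9776  # empty frames correctly ignored

total = tp + fn + fp + tn
accuracy_value = (tp + tn) / total
recall_value = tp / (tp + fn)

print(f"accuracy = {accuracy_value:.2f}")  # 0.98
print(f"recall   = {recall_value:.2f}")    # 0.12
print(f"missed intruders: {fn} of {tp + fn}")
```

Because almost every frame is negative, the true negatives dominate the accuracy figure and hide the 176 missed intruders.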
Practice
Solution
Step 1: Understand Jetson Nano's purpose
Jetson Nano is designed to run AI models locally on a small device, enabling offline use.
Step 2: Compare options
Options A, B, and D are incorrect because Jetson Nano does not require cloud servers, supports inference, and primarily uses Python and C++, not Java.
Final Answer:
It allows running AI models locally without needing an internet connection. -> Option C
Quick Check:
Local AI inference = C [OK]
- Thinking Jetson Nano needs cloud servers
- Confusing training with inference capabilities
- Assuming it only supports Java
Solution
Step 1: Identify the library for TensorRT
The 'tensorrt' Python library is specifically designed to load and run TensorRT models on Jetson Nano.
Step 2: Eliminate other options
'tensorflow' is for TensorFlow models, 'scikit-learn' is for classical ML, and 'matplotlib' is for plotting, not model loading.
Final Answer:
tensorrt -> Option B
Quick Check:
TensorRT model loading = tensorrt [OK]
- Choosing tensorflow instead of tensorrt
- Confusing plotting libraries with model libraries
- Using scikit-learn for deep learning models
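import tensorrt as trt
# Logger that TensorRT uses to report warnings and errors
TRT_LOGGER = trt.Logger()
# Read the serialized engine file as raw bytes
with open('model.engine', 'rb') as f:
    engine_data = f.read()
# Deserialize the bytes into an engine object and show its type
runtime = trt.Runtime(TRT_LOGGER)
engine = runtime.deserialize_cuda_engine(engine_data)
print(type(engine))
Solution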
Step 1: Understand deserialization output
The 'deserialize_cuda_engine' method returns an ICudaEngine object representing the TensorRT engine.
Step 2: Check print statement output
Printing type(engine) shows the ICudaEngine class, for example <class 'tensorrt.ICudaEngine'> (the exact module path in the output can vary by TensorRT version), indicating successful engine loading.
Final Answer:
<class 'tensorrt.ICudaEngine'> -> Option D
Quick Check:
deserialize_cuda_engine returns ICudaEngine [OK]
- Expecting TensorFlow graph type
- Assuming None is returned
- Confusing syntax error with runtime output
RuntimeError: CUDA out of memory. What is the best way to fix this?
Solution
Step 1: Understand CUDA out of memory error
This error means the GPU memory is full and cannot allocate more for model inference.
Step 2: Choose the best fix
Reducing the batch size lowers memory usage, which fixes the error. Increasing the learning rate does not help, since it is a training setting that has no effect on memory. Using a larger model increases memory use. Disabling CUDA slows inference drastically.
Final Answer:
Reduce the batch size during inference. -> Option C
Quick Check:
CUDA memory error fix = reduce batch size [OK]
- Increasing learning rate to fix memory issues
- Using bigger models without memory check
- Disabling CUDA without considering speed impact
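Reducing the batch size usually means feeding the same inputs in smaller chunks. The sketch below shows only the chunking step in plain Python; `iter_batches` is a hypothetical helper, the integers stand in for frames, and it assumes the model or engine accepts the smaller batch size.

```python
def iter_batches(items, batch_size):
    """Yield consecutive slices of items, each with at most batch_size elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


frames = list(range(10))  # placeholder for 10 input frames

# A batch of 10 may not fit in memory; batches of 4 need less at any one time.
batches = list(iter_batches(frames, batch_size=4))
print(batches)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Each batch would then be passed to the model in turn, so peak memory depends on the batch size rather than on the total number of frames.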
Solution
Step 1: Understand deployment workflow
First, train the model on a powerful machine, then convert it to a TensorRT engine for optimized inference on Jetson Nano.
Step 2: Load and run inference
After conversion, load the TensorRT engine on Jetson Nano using the tensorrt library and run inference.
Final Answer:
Train model -> Convert to TensorRT engine -> Load engine with tensorrt -> Run inference -> Option A
Quick Check:
Correct deployment order = A [OK]
- Trying to run inference before conversion
- Converting before training the model
- Loading engine before training
