The Python computer vision (CV) ecosystem makes it easy to work with images and video. Its libraries let you read, edit, and analyze pictures for projects such as face detection or photo filters.
Python CV ecosystem (OpenCV, PIL, torchvision) in Computer Vision
import cv2
from PIL import Image
import torchvision.transforms as transforms
OpenCV (cv2) is great for fast image and video processing.
PIL (Pillow) is simple for opening and editing images.
torchvision helps prepare images for deep learning models.
import cv2

img = cv2.imread('photo.jpg')
cv2.imshow('Image', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
from PIL import Image

img = Image.open('photo.jpg')
img = img.resize((100, 100))
img.show()
from PIL import Image
import torchvision.transforms as transforms

img = Image.open('photo.jpg')
transform = transforms.Compose([
    transforms.Resize((128, 128)),
    transforms.ToTensor()
])
tensor_img = transform(img)
This program shows how to open an image with PIL, convert it to OpenCV format, resize it, convert back to PIL, and finally transform it to a tensor using torchvision. It prints the size and shape at each step.
import cv2
import numpy as np
from PIL import Image
import torchvision.transforms as transforms

# Open image with PIL; convert to RGB so the array always has 3 channels
img_pil = Image.open('sample.jpg').convert('RGB')
print(f'PIL image size: {img_pil.size}')

# Convert PIL image to OpenCV format (RGB -> BGR)
img_cv = cv2.cvtColor(np.array(img_pil), cv2.COLOR_RGB2BGR)
print(f'OpenCV image shape: {img_cv.shape}')

# Resize image using OpenCV
img_cv_resized = cv2.resize(img_cv, (64, 64))
print(f'Resized OpenCV image shape: {img_cv_resized.shape}')

# Convert back to PIL (BGR -> RGB)
img_pil_resized = Image.fromarray(cv2.cvtColor(img_cv_resized, cv2.COLOR_BGR2RGB))

# Use torchvision to transform image to tensor
transform = transforms.ToTensor()
tensor_img = transform(img_pil_resized)
print(f'Tensor shape: {tensor_img.shape}')
OpenCV uses BGR color order, while PIL uses RGB. Remember to convert colors when switching.
torchvision transforms are useful to prepare images for deep learning models.
Always check image shapes and sizes after transformations to avoid errors.
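The BGR/RGB note above can be illustrated without any image file. The following is a minimal sketch that uses a synthetic NumPy array in place of a loaded image (the array and its values are made up for illustration). For a 3-channel image, reversing the last axis should give the same channel order as cv2.cvtColor with cv2.COLOR_BGR2RGB.

```python
import numpy as np

# Synthetic 2x2 "image" in OpenCV's BGR order: every pixel is pure blue.
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 0] = 255  # channel 0 is blue in BGR

# Reversing the last axis swaps B and R, giving RGB order.
# ascontiguousarray makes a real copy instead of a negatively strided view.
rgb = np.ascontiguousarray(bgr[..., ::-1])

print(bgr.shape, rgb.shape)  # the shape is unchanged by the swap
print(bgr[0, 0].tolist())    # blue is first in BGR
print(rgb[0, 0].tolist())    # blue is last in RGB
```

Forgetting this swap does not raise an error; the image simply displays with red and blue exchanged, which is why checking is worthwhile.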
OpenCV, PIL, and torchvision are key tools for working with images in Python.
Use OpenCV for fast image/video processing, PIL for easy image editing, and torchvision for ML image preparation.
Converting between these libraries is common and requires attention to color formats and shapes.
Practice
Solution
Step 1: Understand library purposes
OpenCV is designed for fast image and video processing, widely used in computer vision.
Step 2: Compare with other libraries
PIL is mainly for image editing, torchvision is for ML image datasets, matplotlib is for plotting.
Final Answer:
OpenCV -> Option B
Quick Check:
Fast image/video processing = OpenCV [OK]
- Confusing PIL as the fastest for video processing
- Thinking torchvision handles video processing
- Assuming matplotlib is for image processing
Solution
Step 1: Identify OpenCV image reading syntax
OpenCV uses cv2.imread() to load images from files.
Step 2: Differentiate from other libraries
PIL uses Image.open(), torchvision uses torchvision.io.read_image(), matplotlib uses plt.imread().
Final Answer:
img = cv2.imread('image.jpg') -> Option A
Quick Check:
OpenCV image read = cv2.imread() [OK]
- Using Image.open() which is from PIL, not OpenCV
- Using plt.imread() which is for plotting, not OpenCV
- Confusing torchvision's read_image with OpenCV
import cv2
img = cv2.imread('image.jpg')
print(img.shape)
Solution
Step 1: Understand OpenCV image shape
OpenCV loads images as NumPy arrays with shape (height, width, channels).
Step 2: Know OpenCV color format
OpenCV uses BGR color order by default, not RGB.
Final Answer:
(height, width, 3) with BGR color order -> Option D
Quick Check:
OpenCV shape = (H, W, 3), color = BGR [OK]
- Assuming RGB color order instead of BGR
- Swapping width and height in shape
- Thinking OpenCV loads grayscale by default
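The width/height mix-up listed above is easy to reproduce. Here is a small sketch, assuming Pillow and NumPy are installed and using a synthetic blank image rather than a file: PIL reports size as (width, height), while the NumPy array that OpenCV works with has shape (height, width, channels).

```python
import numpy as np
from PIL import Image

# Synthetic image: 320 pixels wide, 240 pixels tall.
img_pil = Image.new('RGB', (320, 240))
arr = np.array(img_pil)

print(img_pil.size)  # PIL order: (width, height)
print(arr.shape)     # NumPy/OpenCV order: (height, width, channels)

# Unpack in NumPy order to avoid swapping the two dimensions.
height, width, channels = arr.shape
```

Note that cv2.resize takes its target size as (width, height), the opposite of the array's shape order.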
from PIL import Image
import numpy as np
img_pil = Image.open('image.jpg')
img_cv = np.array(img_pil)
What is the likely cause and fix?
Solution
Step 1: Identify PIL image mode issue
A PIL image is not necessarily in RGB mode; it may be in 'P' (palette) or 'L' (grayscale) mode, in which case np.array returns an array with an unexpected shape.
Step 2: Fix by converting to RGB mode
Use img_pil.convert('RGB') to ensure 3 color channels before converting to a NumPy array.
Final Answer:
Use img_pil.convert('RGB') before np.array() to ensure 3 channels -> Option C
Quick Check:
PIL to NumPy needs RGB mode [OK]
- Assuming np.array always works without convert()
- Ignoring color channel order differences
- Trying to convert to grayscale unnecessarily
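To see why the mode matters, the sketch below builds small synthetic images in several modes and compares array shapes before and after convert('RGB'). The helper name to_rgb_array is hypothetical and not part of Pillow.

```python
import numpy as np
from PIL import Image

def to_rgb_array(img):
    """Hypothetical helper: return an (H, W, 3) array for any PIL image."""
    return np.array(img.convert('RGB'))

gray = Image.new('L', (4, 2))                      # 8-bit grayscale
rgba = Image.new('RGBA', (4, 2))                   # RGB plus alpha channel
palette = Image.new('RGB', (4, 2)).convert('P')    # palette-indexed

for img in (gray, rgba, palette):
    # Raw shape depends on the mode; the converted shape is always (2, 4, 3).
    print(img.mode, np.array(img).shape, to_rgb_array(img).shape)
```

Grayscale and palette images produce 2-D arrays and RGBA produces four channels, so code that expects three channels (such as a cv2.COLOR_RGB2BGR conversion) fails unless the image is converted first.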
Solution
Step 1: Understand torchvision transform pipeline
To prepare images for PyTorch models, convert PIL image to tensor with ToTensor(), which scales pixels to [0,1].
Step 2: Normalize tensor with mean and std
Use Normalize() with pretrained model's mean and std to standardize input.
Final Answer:
Use torchvision.transforms.ToTensor() then torchvision.transforms.Normalize(mean, std) -> Option A
Quick Check:
ToTensor + Normalize = correct PyTorch prep [OK]
- Trying to normalize PIL images directly
- Using OpenCV images without conversion
- Skipping normalization step
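The arithmetic behind the two steps can be sketched in plain NumPy. This is an illustration of what ToTensor() and Normalize() compute, not the torchvision implementation itself. The mean and std values are an assumption: they are the ImageNet channel statistics commonly used with pretrained models, and the image is synthetic.

```python
import numpy as np

# Assumed values: ImageNet channel statistics often used with pretrained models.
mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
std = np.array([0.229, 0.224, 0.225], dtype=np.float32)

# Synthetic all-white RGB image as np.array(pil_img) would give: (H, W, C), uint8.
img = np.full((4, 6, 3), 255, dtype=np.uint8)

# ToTensor-like step: scale to [0, 1] and move channels first -> (C, H, W).
scaled = img.astype(np.float32) / 255.0
chw = scaled.transpose(2, 0, 1)

# Normalize-like step: per-channel (x - mean) / std.
normalized = (chw - mean[:, None, None]) / std[:, None, None]

print(chw.shape)            # channels first
print(normalized[:, 0, 0])  # one pixel after normalization
```

The order matters: Normalize() expects a float tensor with channels first, which is why it comes after ToTensor() and cannot be applied to a PIL image directly.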
