A computer cannot work with a picture directly; it works with the grid of numbers that encodes the picture. Representing images as numbers is what lets us teach machines to see and recognize things.
Image as numerical data (pixels, channels) in Computer Vision
Introduction
Treating images as numerical data comes up in situations such as:
When building a program to recognize faces in photos.
When creating an app that sorts pictures by content.
When training a robot to identify objects by looking.
When analyzing medical images like X-rays or MRIs.
When enhancing photos by adjusting colors or brightness.
Syntax
image_array = [
    [[R, G, B], [R, G, B], ...],  # row 1
    [[R, G, B], [R, G, B], ...],  # row 2
    ...
]
# R, G, B are numbers from 0 to 255 representing colors
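Color images are typically stored as 3D arrays with shape height x width x channels. Grayscale images have a single channel and are often stored as 2D arrays (height x width).
In an RGB image, each pixel holds three values: the intensities of the Red, Green, and Blue channels.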
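To make the height x width x channels layout concrete, here is a minimal sketch. It assumes NumPy rather than the plain nested lists shown in the syntax above, and the pixel values and variable names are made up for illustration.

```python
import numpy as np

# A hypothetical image with 2 rows and 3 columns (values chosen for illustration).
image = np.array([
    [[255, 0, 0], [0, 255, 0], [0, 0, 255]],
    [[10, 20, 30], [40, 50, 60], [70, 80, 90]],
], dtype=np.uint8)

# The shape is ordered (height, width, channels).
height, width, channels = image.shape
print(height, width, channels)  # 2 3 3

# Indexing is image[row, column]; the result is that pixel's [R, G, B] values.
print(image[1, 0].tolist())  # [10, 20, 30]

# Slicing the last axis extracts one channel as a 2D (height x width) array.
red = image[:, :, 0]
print(red.shape)  # (2, 3)
```

Note that the row index comes first, so image[1, 0] is the pixel in the second row and first column.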
Examples
This pixel is pure red because red is 255 and green and blue are 0.
pixel = [255, 0, 0] # bright red pixel
This is a tiny 2x2 image with black, white, gray, and dark gray pixels.
image = [
    [[0, 0, 0], [255, 255, 255]],
    [[128, 128, 128], [64, 64, 64]],
]  # 2x2 image
This creates a 100 by 100 black image using a NumPy array filled with zeros.
import numpy as np

image_np = np.zeros((100, 100, 3), dtype=np.uint8)  # 100x100 black image
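Once an image is an array, changing pixels is ordinary array assignment. The sketch below starts from the same kind of black canvas and colors a region with slicing; the region coordinates and the name `canvas` are arbitrary choices for illustration.

```python
import numpy as np

# An all-black canvas: 100 rows, 100 columns, 3 channels.
canvas = np.zeros((100, 100, 3), dtype=np.uint8)

# Slices select rows first, then columns. This paints rows 20-39 and
# columns 10-59 pure red by assigning one [R, G, B] triple to the region.
canvas[20:40, 10:60] = [255, 0, 0]

# A third index selects a channel: set only blue for the pixel at row 0, column 99.
canvas[0, 99, 2] = 255

print(canvas[25, 30].tolist())  # [255, 0, 0]
print(canvas[0, 99].tolist())   # [0, 0, 255]
print(canvas[50, 50].tolist())  # [0, 0, 0]
```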
Sample Model
This code creates a small 3x3 image with different colors. It shows how to check the image size, get a pixel's color, and find the average color of the whole image.
import numpy as np

# Create a 3x3 image with 3 color channels (RGB)
image = np.array([
    [[255, 0, 0], [0, 255, 0], [0, 0, 255]],        # red, green, blue
    [[255, 255, 0], [0, 255, 255], [255, 0, 255]],  # yellow, cyan, magenta
    [[0, 0, 0], [128, 128, 128], [255, 255, 255]],  # black, gray, white
], dtype=np.uint8)

# Print shape of image
print(f"Image shape: {image.shape}")

# Access pixel at row 1, column 2
pixel = image[1, 2]
print(f"Pixel at (1,2): {pixel}")

# Calculate average color of the image
avg_color = image.mean(axis=(0, 1))
print(f"Average color (RGB): {avg_color.astype(int)}")
Output
Image shape: (3, 3, 3)
Pixel at (1,2): [255   0 255]
Average color (RGB): [127 127 127]
Important Notes
For 8-bit images, pixel values range from 0 to 255 in each color channel. Other bit depths and floating-point images use different ranges.
Images can have more channels, like an alpha channel for transparency.
Converting images to numbers is the first step before feeding them to machine learning models.
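Two of the notes above have practical consequences that a short sketch can show. It assumes 8-bit NumPy arrays (`np.uint8`); the brightness offset of 100 and the variable names are illustrative. The first part shows that arithmetic on 8-bit values wraps around rather than stopping at 255, and one common way to avoid that. The second part appends an alpha channel to an RGB image.

```python
import numpy as np

pixel = np.array([200, 100, 50], dtype=np.uint8)

# uint8 arithmetic wraps around modulo 256: 200 + 100 becomes 44, not 300.
wrapped = pixel + np.uint8(100)
print(wrapped.tolist())  # [44, 200, 150]

# One way to brighten safely: compute in a wider type, clip to 0-255, convert back.
brightened = np.clip(pixel.astype(np.int16) + 100, 0, 255).astype(np.uint8)
print(brightened.tolist())  # [255, 200, 150]

# Adding an alpha channel: a 2x2 black RGB image plus a fully opaque alpha plane.
rgb = np.zeros((2, 2, 3), dtype=np.uint8)
alpha = np.full((2, 2, 1), 255, dtype=np.uint8)  # 255 = fully opaque
rgba = np.concatenate([rgb, alpha], axis=2)
print(rgba.shape)  # (2, 2, 4)
```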
Summary
Images are stored as numbers in arrays with height, width, and color channels.
Each pixel has values for red, green, and blue colors.
Understanding this helps us prepare images for machine learning tasks.
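As a rough illustration of what preparing an image for a machine learning model can look like, the sketch below reuses the 2x2 example image from earlier. The specific steps are assumptions rather than requirements: dividing by 255 is one common scaling convention, and the channels-first layout with a leading batch axis is what some frameworks expect while others keep channels last.

```python
import numpy as np

# The 2x2 example image: black, white, gray, dark gray.
image = np.array([
    [[0, 0, 0], [255, 255, 255]],
    [[128, 128, 128], [64, 64, 64]],
], dtype=np.uint8)

# Scale 0-255 integers to floats in the range [0, 1].
scaled = image.astype(np.float32) / 255.0
print(scaled.min(), scaled.max())  # 0.0 1.0

# Reorder axes from (height, width, channels) to (channels, height, width).
# Assumption: the target model wants channels first.
channels_first = np.transpose(scaled, (2, 0, 1))

# Add a leading batch axis so a single image looks like a batch of one.
batch = channels_first[np.newaxis, ...]
print(batch.shape)  # (1, 3, 2, 2)
```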