
Image as numerical data (pixels, channels) in Computer Vision

Introduction

We turn images into numbers so computers can understand and work with them. This helps us teach machines to see and recognize things.

Representing images as numbers is the starting point for tasks such as:
Building a program that recognizes faces in photos.
Creating an app that sorts pictures by content.
Training a robot to identify objects by looking at them.
Analyzing medical images like X-rays or MRIs.
Enhancing photos by adjusting colors or brightness.
Syntax
image_array = [[[R, G, B], [R, G, B], ...],  # row 1
               [[R, G, B], [R, G, B], ...],  # row 2
               ...]

# R, G, B are integers from 0 to 255 giving the intensity of red, green, and blue

Color images are typically stored as 3D arrays with shape (height, width, channels): rows first, then columns, then color channels.

Each pixel has values for Red, Green, and Blue channels.
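As a minimal sketch of the height x width x channels layout, the snippet below builds a tiny array with NumPy and pulls out a single channel. The names `tiny` and `red_channel` are illustrative only and the pixel values are made up.

```python
import numpy as np

# A hypothetical 2x3 image: 2 rows (height), 3 columns (width), 3 channels (RGB)
tiny = np.array([
    [[255, 0, 0], [0, 255, 0], [0, 0, 255]],
    [[10, 20, 30], [40, 50, 60], [70, 80, 90]],
], dtype=np.uint8)

height, width, channels = tiny.shape
print(height, width, channels)  # 2 3 3

# Index order is [row, column, channel]
red_channel = tiny[:, :, 0]     # 2D array holding only the red values
print(red_channel.shape)        # (2, 3)
print(int(tiny[1, 2, 1]))       # green value of the pixel at row 1, column 2 -> 80
```

Note that the row index comes before the column index, so `tiny[1, 2]` means "second row, third column", not "x = 1, y = 2".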

Examples
This pixel is pure red because red is 255 and green and blue are 0.
pixel = [255, 0, 0]  # bright red pixel
This is a tiny 2x2 image with black, white, gray, and dark gray pixels.
image = [[[0, 0, 0], [255, 255, 255]],
         [[128, 128, 128], [64, 64, 64]]]  # 2x2 image
This creates a 100 by 100 black image as a NumPy array filled with zeros.
import numpy as np
image_np = np.zeros((100, 100, 3), dtype=np.uint8)  # 100x100 black image
Sample Program

This code creates a small 3x3 image with different colors. It shows how to check the image size, get a pixel's color, and find the average color of the whole image.

import numpy as np

# Create a 3x3 image with 3 color channels (RGB)
image = np.array([
    [[255, 0, 0], [0, 255, 0], [0, 0, 255]],  # red, green, blue
    [[255, 255, 0], [0, 255, 255], [255, 0, 255]],  # yellow, cyan, magenta
    [[0, 0, 0], [128, 128, 128], [255, 255, 255]]  # black, gray, white
], dtype=np.uint8)

# Print shape of image
print(f"Image shape: {image.shape}")

# Access the pixel at row 1, column 2 (zero-based: second row, third column)
pixel = image[1, 2]
print(f"Pixel at (1,2): {pixel}")

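# Calculate the average color over all rows and columns (one value per channel)
avg_color = image.mean(axis=(0, 1))
# Round before converting; astype(int) alone would truncate the decimals
print(f"Average color (RGB): {avg_color.round().astype(int)}")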
Important Notes

With 8-bit storage (dtype uint8), pixel values range from 0 to 255 for each color channel. Other bit depths, such as 16-bit, allow a larger range.
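The 0 to 255 limit matters as soon as you do arithmetic on pixels, for example when brightening a photo. The sketch below is one possible way to handle it, not the only one. It shows that adding to a `uint8` array wraps around silently, and that widening the type and clipping avoids this. The pixel values and the brightness offset of 100 are arbitrary.

```python
import numpy as np

pixel = np.array([200, 100, 50], dtype=np.uint8)

# Adding directly in uint8 wraps around: 200 + 100 = 300, stored as 300 - 256 = 44
wrapped = pixel + np.uint8(100)
print(wrapped.tolist())   # [44, 200, 150]

# Widen to a larger integer type, add, clip to [0, 255], then convert back
brighter = np.clip(pixel.astype(np.int16) + 100, 0, 255).astype(np.uint8)
print(brighter.tolist())  # [255, 200, 150]
```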

Images can have more channels, like a fourth alpha channel for transparency (RGBA), or fewer: a grayscale image needs only one value per pixel.
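To make the channel count concrete, here is a small sketch that adds an alpha channel to an RGB array and also reduces it to a single gray value per pixel. The unweighted channel average used for the gray image is a simplifying assumption for illustration. Image libraries commonly use a weighted formula instead.

```python
import numpy as np

# Hypothetical 2x2 RGB image in which every pixel is pure red
rgb = np.zeros((2, 2, 3), dtype=np.uint8)
rgb[:, :, 0] = 255

# Append a fully opaque alpha channel (255) to get RGBA
alpha = np.full((2, 2, 1), 255, dtype=np.uint8)
rgba = np.concatenate([rgb, alpha], axis=2)
print(rgba.shape)            # (2, 2, 4)
print(rgba[0, 0].tolist())   # [255, 0, 0, 255]

# One value per pixel: average the three channels (simple, unweighted)
gray = rgb.mean(axis=2).astype(np.uint8)
print(gray.shape)            # (2, 2)
```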

Converting images to numbers is the first step before feeding them to machine learning models.
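A common next step, sketched below under stated assumptions, is to scale the integer pixel values to floats between 0 and 1 before passing them to a model. The choice of `float32` and the channels-first reordering are assumptions. What a given model or framework expects varies, so check its documentation.

```python
import numpy as np

# Hypothetical 8x8 RGB image: white everywhere except a black top row
image = np.full((8, 8, 3), 255, dtype=np.uint8)
image[0] = 0

# Scale to floats in [0, 1]; float32 is assumed here as a typical choice
scaled = image.astype(np.float32) / 255.0
print(scaled.dtype, float(scaled.min()), float(scaled.max()))  # float32 0.0 1.0

# Some frameworks expect channels first: (C, H, W) instead of (H, W, C)
channels_first = np.transpose(scaled, (2, 0, 1))
print(channels_first.shape)  # (3, 8, 8)
```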

Summary

Images are stored as numbers in arrays with height, width, and color channels.

In an RGB image, each pixel has one value each for red, green, and blue.

Understanding this helps us prepare images for machine learning tasks.