
Image as numerical data (pixels, channels) in Computer Vision - Cheat Sheet & Quick Revision

Recall & Review
beginner
What is an image represented as in machine learning?
An image is represented as numerical data made up of pixels arranged in a grid. Each pixel has values that describe its color or brightness.
beginner
What does a pixel represent in an image?
A pixel is the smallest unit of an image that holds information about color or brightness at a specific point.
beginner
What are channels in an image, and how do they relate to pixels?
Channels are layers of data for each pixel that represent color components. For example, in an RGB image, there are 3 channels: Red, Green, and Blue.
beginner
How is a grayscale image different from a color image in terms of channels?
A grayscale image has only one channel representing brightness, while a color image usually has multiple channels (like 3 for RGB) representing different colors.
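To make the channel difference concrete, here is a minimal NumPy sketch. The array names, the 2x2 size, and the 8-bit (0 to 255) value range are assumptions chosen for illustration; they are not part of the flashcards.

```python
import numpy as np

# Hypothetical 2x2 grayscale image: one brightness value per pixel.
gray = np.array([[0, 128],
                 [200, 255]], dtype=np.uint8)

# Hypothetical 2x2 RGB image: three values (red, green, blue) per pixel.
rgb = np.array([[[255, 0, 0], [0, 255, 0]],
                [[0, 0, 255], [255, 255, 255]]], dtype=np.uint8)

print(gray.shape)  # (2, 2)     -> height, width
print(rgb.shape)   # (2, 2, 3)  -> height, width, channels

# The top-left pixel of the color image holds its three channel values.
top_left = rgb[0, 0]
```

The grayscale array has no channel axis at all, while the color array carries an extra axis of length 3.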
beginner
Why do machine learning models use numerical pixel values instead of images directly?
Models need numbers to perform calculations. Converting images to numbers (pixels and channels) allows models to learn patterns and make predictions.
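As a rough illustration of "numbers a model can calculate with", the sketch below rescales pixel values to floats. It assumes an 8-bit image (values 0 to 255) and a made-up 2x2 array; rescaling to the range 0 to 1 is a common preprocessing choice, not something this cheat sheet prescribes.

```python
import numpy as np

# Hypothetical 2x2 grayscale image with 8-bit values (0-255).
pixels = np.array([[0, 51],
                   [102, 255]], dtype=np.uint8)

# Convert to floats and rescale to the range [0, 1].
scaled = pixels.astype(np.float32) / 255.0

# Once the image is numeric, ordinary arithmetic applies,
# for example the average brightness of the whole image.
mean_brightness = scaled.mean()
print(scaled.dtype, mean_brightness)
```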
What does each pixel in a color image usually contain?
A. Text information
B. Only brightness value
C. Values for multiple color channels
D. Sound data
How many channels does a typical RGB image have?
A. 3
B. 2
C. 4
D. 1
What is the main reason to convert images into numerical pixel data for machine learning?
A. To add sound effects
B. To make images colorful
C. To reduce image size
D. To allow models to perform calculations
Which of these is true about grayscale images?
A. They have three color channels
B. They have one channel representing brightness
C. They contain sound data
D. They have no pixels
If an image has a size of 100x100 pixels and 3 channels, how many numerical values represent it?
A. 30,000
B. 300
C. 10,000
D. 100
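The count asked for above is the product of height, width, and number of channels. A small sketch of that arithmetic, with variable names chosen only for this example:

```python
import numpy as np

height, width, channels = 100, 100, 3

# One number per pixel per channel.
total_values = height * width * channels
print(total_values)  # 30000

# Cross-check against an actual array of that shape.
image = np.zeros((height, width, channels), dtype=np.uint8)
print(image.size)    # 30000
```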
Explain how an image is represented as numerical data for machine learning.
Think about how each tiny dot in the image holds numbers for colors.
Describe the difference between grayscale and color images in terms of channels.
Consider how many layers of color information each image type has.

      Practice

      (1/5)
      1. What does each pixel in a color image usually represent?
      easy
      A. A single number representing brightness only
      B. A sound wave frequency
      C. A text label describing the image
      D. A set of numbers for red, green, and blue colors

      Solution

      1. Step 1: Understand pixel representation in color images

        Each pixel stores values for red, green, and blue channels to show color.
      2. Step 2: Compare options to pixel data

        Only option D, "A set of numbers for red, green, and blue colors", describes a pixel as a set of RGB values. The other options describe brightness alone, sound, or text.
      3. Final Answer:

        A set of numbers for red, green, and blue colors -> Option D
      4. Quick Check:

        Pixel = RGB values [OK]
      Hint: Pixels hold RGB numbers, not text or sound [OK]
      Common Mistakes:
      • Thinking pixels store text labels
      • Assuming a color pixel stores brightness only
      • Assuming pixels represent sound
      2. Which Python code correctly creates a 3x3 image with 3 color channels filled with zeros?
      easy
      A. image = np.zeros((3, 3, 3))
      B. image = np.zeros(3, 3, 3)
      C. image = np.zeros[3, 3, 3]
      D. image = zeros((3, 3, 3))

      Solution

      1. Step 1: Recall numpy zeros syntax

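        np.zeros takes the shape as its first argument: an int, or a tuple such as (3, 3, 3) for a multi-dimensional array.
      2. Step 2: Check each option's syntax

        image = np.zeros((3, 3, 3)) passes the shape as one tuple in a normal function call. The others fail when run: np.zeros(3, 3, 3) passes the dimensions as separate arguments, so the second 3 is read as the dtype and a TypeError is raised; np.zeros[3, 3, 3] uses square brackets instead of calling the function; zeros((3, 3, 3)) omits the np. prefix, so the name is undefined unless zeros was imported separately.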
      3. Final Answer:

        image = np.zeros((3, 3, 3)) -> Option A
      4. Quick Check:

        np.zeros((3,3,3)) creates 3x3 RGB image [OK]
      Hint: Use np.zeros with shape tuple inside parentheses [OK]
      Common Mistakes:
      • Passing multiple arguments instead of a tuple
      • Using square brackets instead of parentheses
      • Forgetting np. prefix
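The following sketch runs the correct call and one of the mistaken ones, so the difference can be seen directly. The variable separate_args_error is a name introduced here for illustration only.

```python
import numpy as np

# Correct call: the whole shape is passed as one tuple.
image = np.zeros((3, 3, 3))
print(image.shape)  # (3, 3, 3)
print(image.dtype)  # float64, the default dtype of np.zeros

# Separate arguments fail: the second positional argument is the dtype.
try:
    np.zeros(3, 3, 3)
    separate_args_error = None
except TypeError as exc:
    separate_args_error = exc
    print("TypeError:", exc)
```

Note that the result holds floats by default; for an 8-bit image one would typically pass dtype=np.uint8 as well.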
      3. Given this code:
      import numpy as np
      image = np.array([[[255, 0, 0], [0, 255, 0]],
                        [[0, 0, 255], [255, 255, 0]]])
      print(image.shape)

      What is the output?
      medium
      A. (2, 3, 2)
      B. (3, 2, 2)
      C. (2, 2, 3)
      D. (3, 3, 3)

      Solution

      1. Step 1: Analyze the array structure

        The array has 2 rows, each row has 2 pixels, and each pixel has 3 color values (RGB).
      2. Step 2: Determine shape order

        Shape is (height=2, width=2, channels=3), so (2, 2, 3).
      3. Final Answer:

        (2, 2, 3) -> Option C
      4. Quick Check:

        Shape = (rows, cols, channels) = (2, 2, 3) [OK]
      Hint: Shape is (height, width, channels) in that order [OK]
      Common Mistakes:
      • Mixing up dimensions order
      • Counting channels as first dimension
      • Assuming square shape without checking
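A short sketch using the same array shows how the shape unpacks and how a single pixel or a single channel value is indexed. The names pixel and blue_value are illustrative.

```python
import numpy as np

image = np.array([[[255, 0, 0], [0, 255, 0]],
                  [[0, 0, 255], [255, 255, 0]]])

# Shape unpacks in (height, width, channels) order.
height, width, channels = image.shape
print(height, width, channels)  # 2 2 3

# Index as image[row, column] to get one pixel's three values,
# or image[row, column, channel] to get a single number.
pixel = image[1, 0]
blue_value = image[1, 0, 2]
print(blue_value)  # 255
```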
      4. What is wrong with this code snippet for accessing the green channel of an image?
      green_channel = image[:, :, 1:2]
      medium
      A. It returns a 3D array instead of 2D
      B. It causes an index error
      C. It accesses the red channel instead
      D. It modifies the original image

      Solution

      1. Step 1: Understand slicing with 1:2

        Slicing with 1:2 keeps the channel dimension, returning shape (height, width, 1).
      2. Step 2: Compare with expected 2D array

        To get a 2D array, use index 1 without slice, like image[:, :, 1].
      3. Final Answer:

        It returns a 3D array instead of 2D -> Option A
      4. Quick Check:

        Slicing with 1:2 keeps channel dim [OK]
      Hint: Use single index, not slice, for 2D channel array [OK]
      Common Mistakes:
      • Using a slice, which keeps an extra dimension
      • Confusing channel indices
      • Assuming it changes original image
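The difference between slicing and indexing the channel axis can be checked on a small test array. The 4x5 size and the value 200 below are arbitrary choices for this sketch.

```python
import numpy as np

# Hypothetical 4x5 RGB image with the green channel set to 200.
image = np.zeros((4, 5, 3), dtype=np.uint8)
image[:, :, 1] = 200

# A slice keeps the channel axis, so the result is still 3D.
green_3d = image[:, :, 1:2]
print(green_3d.shape)  # (4, 5, 1)

# A single index drops the channel axis, giving a 2D array.
green_2d = image[:, :, 1]
print(green_2d.shape)  # (4, 5)

# The 3D result can be reduced to 2D by removing the length-1 axis.
squeezed = green_3d.squeeze(axis=2)
print(squeezed.shape)  # (4, 5)
```

Neither expression changes the original image; both only read from it.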
      5. You have a grayscale image stored as a 2D array with shape (100, 100). You want to convert it to a 3-channel RGB image by repeating the grayscale values across all channels. Which code correctly does this?
      hard
      A. rgb_image = np.repeat(gray_image, 3)
      B. rgb_image = np.stack([gray_image]*3, axis=2)
      C. rgb_image = gray_image.reshape(100, 100, 3)
      D. rgb_image = np.concatenate(gray_image, 3)

      Solution

      1. Step 1: Understand the goal

        We want to create a 3D array where each pixel's grayscale value repeats in 3 channels.
      2. Step 2: Check each method

        np.stack([gray_image]*3, axis=2) stacks three copies of the grayscale image along a new channel axis, giving shape (100, 100, 3). np.repeat(gray_image, 3) has no axis argument, so it flattens the input and returns a 1D array of 30,000 values. gray_image.reshape(100, 100, 3) raises an error because 10,000 values cannot fill 30,000 slots. np.concatenate(gray_image, 3) also fails: concatenate joins arrays along an existing axis, and axis 3 does not exist here.
      3. Final Answer:

        rgb_image = np.stack([gray_image]*3, axis=2) -> Option B
      4. Quick Check:

        Stack repeats grayscale across channels [OK]
      Hint: Use np.stack with axis=2 to add channels [OK]
      Common Mistakes:
      • Using np.repeat without axis
      • Reshaping without adding channel dimension
      • Wrong function syntax for concatenation
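The sketch below applies the np.stack approach to a randomly generated grayscale array and compares it with two alternatives. The random test image and the np.repeat variant with an explicit axis are additions for illustration, not part of the original question.

```python
import numpy as np

# Hypothetical 100x100 grayscale image filled with random 8-bit values.
rng = np.random.default_rng(0)
gray_image = rng.integers(0, 256, size=(100, 100), dtype=np.uint8)

# Stack three copies along a new last axis.
rgb_image = np.stack([gray_image] * 3, axis=2)
print(rgb_image.shape)  # (100, 100, 3)

# np.repeat without an axis flattens the data first.
flat = np.repeat(gray_image, 3)
print(flat.shape)       # (30000,)

# np.repeat can give the same result as np.stack if a channel axis
# is added first and the repeat is done along that axis.
alt_image = np.repeat(gray_image[:, :, np.newaxis], 3, axis=2)
print(np.array_equal(rgb_image, alt_image))  # True
```

Every channel of rgb_image equals the original grayscale array, so the image still looks gray when displayed as RGB.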