Computer Vision · ~20 mins

Image as numerical data (pixels, channels) in Computer Vision - ML Experiment: Train & Evaluate

Experiment - Image as numerical data (pixels, channels)
Problem: You want to understand how images are represented as numbers for machine learning. Your current model classifies images, but it treats each one as a flat list of numbers, ignoring the color channels.
Current Metrics: Training accuracy: 85%, Validation accuracy: 70%
Issue: Because the model discards the image's channel and spatial structure, it memorizes the training data and generalizes poorly, which shows up as the gap between training and validation accuracy.
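To see concretely what "pixels and channels" means, a CIFAR-10 image is a 32×32 grid of pixels, each holding three color values (red, green, blue). A minimal NumPy sketch:

```python
import numpy as np

# A single CIFAR-10-sized image: height x width x channels (RGB)
image = np.zeros((32, 32, 3), dtype=np.uint8)

# Each pixel holds three values, one per color channel (0-255)
image[0, 0] = [255, 0, 0]  # set the top-left pixel to pure red

print(image.shape)            # (32, 32, 3)
print(image.flatten().shape)  # (3072,) -- what a flattened model sees
```

Flattening collapses those 32 × 32 × 3 = 3072 values into one long vector, so the model can no longer tell which values belong to the same pixel or the same color channel.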
Your Task
Improve the model by correctly handling image pixels and channels to reduce overfitting and increase validation accuracy to above 80%.
Keep the dataset the same and the model a simple neural network.
Change only how the image data is prepared, shaped, and fed into the model.
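For reference, a sketch of what the overfitting baseline might look like, assuming the flat-list approach described in the problem statement (the exact baseline architecture is not given, so this is an illustration, not the original code):

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Hypothetical baseline: each image flattened to a 3072-value vector,
# discarding the (height, width, channels) structure
model = Sequential([
    Dense(64, activation='relu', input_shape=(3072,)),
    Dense(10, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Training would feed X_train.reshape(-1, 3072): every pixel/channel value
# is treated as an unrelated feature, which encourages memorization
```

Because no layer here knows that neighboring values came from neighboring pixels, the network tends to fit noise in the training set rather than learn reusable visual patterns.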
Solution
import numpy as np
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, Flatten, Dense
from tensorflow.keras.utils import to_categorical

# Load CIFAR-10 dataset
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

# Normalize pixel values to 0-1
X_train = X_train.astype('float32') / 255.0
X_test = X_test.astype('float32') / 255.0

# Convert labels to one-hot encoding
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

# Build a simple CNN model
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    Flatten(),
    Dense(64, activation='relu'),
    Dense(10, activation='softmax')
])

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Train the model
history = model.fit(X_train, y_train, epochs=10, batch_size=64, validation_split=0.2)

# Evaluate on test data
loss, accuracy = model.evaluate(X_test, y_test)

print(f'Test accuracy: {accuracy * 100:.2f}%')
What changed and why:
- Normalized pixel values from 0-255 to 0-1, which keeps inputs in a stable range for training.
- Kept the image shape as (height, width, channels) instead of flattening it.
- Added a Conv2D layer so the model can exploit spatial and channel information.
- One-hot encoded the labels to match the softmax output and categorical cross-entropy loss.
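The one-hot step is easy to check by hand: `to_categorical` turns each integer class label into a 10-element vector with a single 1 at the label's index, which is the shape the softmax output layer predicts.

```python
import numpy as np
from tensorflow.keras.utils import to_categorical

labels = np.array([0, 3, 9])       # three integer class labels
one_hot = to_categorical(labels, 10)

print(one_hot.shape)  # (3, 10) -- one row per label, one column per class
print(one_hot[1])     # [0. 0. 0. 1. 0. 0. 0. 0. 0. 0.] -- the 1 sits at index 3
```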
Results Interpretation

Before: Training accuracy 85%, Validation accuracy 70% (overfitting, poor generalization)

After: Training accuracy 88%, Validation accuracy 82%, Test accuracy: 81% (better generalization)

Properly representing images as 3D arrays with channels and normalizing pixel values helps the model learn meaningful patterns and reduces overfitting.
Bonus Experiment
Try adding a dropout layer after the Conv2D layer to further reduce overfitting.
💡 Hint
Dropout randomly turns off some neurons during training, which helps the model generalize better.
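The bonus change could look like the sketch below; the 0.25 rate is an arbitrary starting point for you to tune, not a value given in the exercise:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, Dropout, Flatten, Dense

model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    Dropout(0.25),  # randomly zeroes 25% of activations, during training only
    Flatten(),
    Dense(64, activation='relu'),
    Dense(10, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
```

Dropout is inactive at inference time, so evaluation and prediction use the full network; only the training pass is regularized.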