What is Loading and inference in TensorFlow?

Introduction

Loading a saved model lets you reuse it without retraining. Inference means using the model to make predictions on new data.

You trained a model and want to use it later without retraining.

You want to predict if an email is spam or not using a saved model.

You have a saved image classifier and want to identify new pictures.

You want to deploy a model in an app to make real-time predictions.

Syntax

TensorFlow

import tensorflow as tf

# Load the saved model (a file written earlier with model.save())
model = tf.keras.models.load_model('path_to_model.keras')

# Prepare input data (example: numpy array)
input_data = ...

# Make prediction
predictions = model.predict(input_data)

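Use tf.keras.models.load_model() to load a model that was saved with model.save(). Recent Keras versions (Keras 3, bundled with TensorFlow 2.16 and later) expect the path to end in .keras or .h5.

Input data shape must match what the model expects, including the leading batch dimension.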
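One way to guard against shape mismatches is to compare the data against the model's expected input shape before predicting. The sketch below is illustrative only: `ensure_batch` is a hypothetical helper, and the `(None, 4)` tuple stands in for what `model.input_shape` would report for a model with 4 input features (`None` marks the variable batch dimension).

```python
import numpy as np

def ensure_batch(data, expected_shape):
    """Return `data` as a float32 array shaped (batch, *features).

    `expected_shape` mimics Keras's model.input_shape, e.g. (None, 4).
    Hypothetical helper for illustration only.
    """
    feature_shape = tuple(expected_shape[1:])
    arr = np.asarray(data, dtype=np.float32)
    # A single sample lacks the batch axis, so add one in front.
    if arr.shape == feature_shape:
        arr = arr[np.newaxis, ...]
    if arr.shape[1:] != feature_shape:
        raise ValueError(
            f"expected per-sample shape {feature_shape}, got {arr.shape[1:]}"
        )
    return arr

expected_shape = (None, 4)  # assumed: what model.input_shape would return
single = ensure_batch([5.1, 3.5, 1.4, 0.2], expected_shape)
batch = ensure_batch([[5.1, 3.5, 1.4, 0.2], [6.2, 3.4, 5.4, 2.3]], expected_shape)
print(single.shape)  # (1, 4)
print(batch.shape)   # (2, 4)
```

With a real model, passing `model.input_shape` as the second argument would perform the same check against the loaded model.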

Examples

Load a model and predict on one sample with 4 features.

TensorFlow

import numpy as np
import tensorflow as tf

model = tf.keras.models.load_model('my_model.keras')

# One sample with 4 features; the outer brackets add the batch dimension
input_data = np.array([[5.1, 3.5, 1.4, 0.2]])
predictions = model.predict(input_data)
print(predictions)

Predict on a batch of two samples at once.

TensorFlow

import numpy as np
import tensorflow as tf

model = tf.keras.models.load_model('saved_model.keras')

# Two samples, 4 features each: shape (2, 4)
batch_data = np.array([[5.1, 3.5, 1.4, 0.2], [6.2, 3.4, 5.4, 2.3]])
preds = model.predict(batch_data)
print(preds)
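For a classifier, the array returned for a batch has one row per input sample and one column per class. Here is a minimal sketch of reading such an array. It uses made-up probabilities in place of real `model.predict()` output; both the values and the class names are assumptions for illustration only.

```python
import numpy as np

# Stand-in for model.predict(batch_data): 2 samples x 3 classes.
# These numbers are invented for illustration, not real model output.
preds = np.array([
    [0.80, 0.15, 0.05],
    [0.10, 0.20, 0.70],
])
class_names = ["class_a", "class_b", "class_c"]  # hypothetical labels

predicted_idx = preds.argmax(axis=1)  # index of the most likely class per row
confidence = preds.max(axis=1)        # probability assigned to that class

for i, (idx, conf) in enumerate(zip(predicted_idx, confidence)):
    print(f"sample {i}: {class_names[idx]} (p={conf:.2f})")
```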

Sample Model

This program trains a small model on dummy data, saves it, loads it back, and makes a prediction on new data.

TensorFlow

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dense, Input
from tensorflow.keras.models import Sequential

# Create and train a simple model: 4 input features, 3 output classes
model = Sequential([
    Input(shape=(4,)),
    Dense(10, activation='relu'),
    Dense(3, activation='softmax')
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Dummy data: 5 samples, 4 features each
x_train = np.array([
    [5.1, 3.5, 1.4, 0.2],
    [6.2, 3.4, 5.4, 2.3],
    [5.9, 3.0, 4.2, 1.5],
    [6.9, 3.1, 5.1, 2.3],
    [5.5, 2.3, 4.0, 1.3]
])

# Labels for 3 classes
y_train = np.array([0, 2, 1, 2, 1])

# Train briefly
model.fit(x_train, y_train, epochs=3, verbose=1)

# Save the model (the .keras extension selects the native Keras format)
model.save('my_simple_model.keras')

# Load the model
loaded_model = tf.keras.models.load_model('my_simple_model.keras')

# New data for inference
x_new = np.array([[6.0, 3.0, 4.8, 1.8]])

# Predict
predictions = loaded_model.predict(x_new)
print('Predictions:', predictions)

# Show predicted class
predicted_class = predictions.argmax(axis=1)
print('Predicted class:', predicted_class[0])

Important Notes

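Make sure the input data matches the model's input shape: every dimension except the batch size must agree, so a single sample still needs a leading batch dimension.

Loading a model restores its architecture and weights and, if the model was compiled before saving, its optimizer state.

Inference is usually done with model.predict(), which returns a NumPy array of the model's outputs (class probabilities when the last layer is a softmax).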
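If a loaded model is only used for inference, the optimizer state is not needed. The following is a minimal sketch, assuming a TensorFlow version that supports the `.keras` format: `load_model()` accepts `compile=False`, which skips restoring the compile configuration. The tiny untrained model and the temporary file name are placeholders for illustration.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Tiny untrained model, used only to have something to save.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(3, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

x_new = np.array([[6.0, 3.0, 4.8, 1.8]], dtype=np.float32)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "inference_demo.keras")  # placeholder name
    model.save(path)
    # compile=False: load architecture and weights, skip the optimizer.
    loaded = tf.keras.models.load_model(path, compile=False)

original = model.predict(x_new, verbose=0)
restored = loaded.predict(x_new, verbose=0)
print(restored.shape)
print(np.allclose(original, restored))
```

A model loaded this way can still call predict(), but it would need a fresh compile() call before any further training.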

Summary

Load saved models with tf.keras.models.load_model().

Use model.predict() to get predictions on new data.

Ensure input data shape matches what the model expects.