GPU vs CPU tensor placement in TensorFlow - Experiment Comparison

Experiment - GPU vs CPU tensor placement

Problem: You want to understand how placing tensors on GPU or CPU affects computation speed in TensorFlow.

Current Metrics: No timing metrics collected yet.

Issue: You do not know how to measure and compare the speed difference between GPU and CPU tensor operations.

Your Task

Measure and compare the time taken to perform a large matrix multiplication on CPU and GPU tensors. Show that GPU placement speeds up the operation.

Use TensorFlow 2.x with eager execution.

Use the same matrix size for both CPU and GPU operations.

Measure time accurately using Python's time module.
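
The timing requirement above can be sketched as a small helper built on time.perf_counter. This is a minimal illustration, not part of the exercise's solution: the name average_seconds and its runs and warmup parameters are hypothetical, and the workload is a pure-Python stand-in. When timing GPU work, the function being measured should also wait for its result (for example by converting it to a NumPy value), otherwise only the launch cost may be captured.

```python
import time

def average_seconds(fn, runs=5, warmup=1):
    """Return the mean wall-clock seconds per call of fn over `runs` calls."""
    for _ in range(warmup):
        fn()  # warm-up calls are executed but not timed
    start = time.perf_counter()
    for _ in range(runs):
        fn()
    return (time.perf_counter() - start) / runs

# Stand-in workload (hypothetical); replace it with the operation to measure.
elapsed = average_seconds(lambda: sum(i * i for i in range(100_000)))
print(f"Average time per call: {elapsed:.6f} seconds")
```

time.perf_counter is preferred over time.time here because it is a monotonic, high-resolution clock intended for measuring short durations.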

Solution

TensorFlow

import tensorflow as tf
import time

# Check if GPU is available
if not tf.config.list_physical_devices('GPU'):
    raise RuntimeError('No GPU found. Please run on a machine with GPU.')

# Matrix size
matrix_size = 3000

# Create random matrices on CPU
with tf.device('/CPU:0'):
    a_cpu = tf.random.uniform((matrix_size, matrix_size), dtype=tf.float32)
    b_cpu = tf.random.uniform((matrix_size, matrix_size), dtype=tf.float32)

# Create random matrices on GPU
with tf.device('/GPU:0'):
    a_gpu = tf.random.uniform((matrix_size, matrix_size), dtype=tf.float32)
    b_gpu = tf.random.uniform((matrix_size, matrix_size), dtype=tf.float32)

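# Function to time matrix multiplication on an explicit device
def time_matmul(a, b, device):
    # Run the op inside tf.device so it executes where intended. Without this,
    # eager mode may place matmul on the GPU even when the inputs live on the CPU.
    with tf.device(device):
        # Warm-up run; .numpy() blocks until the result is ready
        _ = tf.reduce_sum(tf.matmul(a, b)).numpy()

        # Run multiple times
        runs = 5
        start = time.perf_counter()
        for _ in range(runs):
            c = tf.matmul(a, b)
        # GPU kernels can run asynchronously, so fetch a scalar to wait for completion
        _ = tf.reduce_sum(c).numpy()
        end = time.perf_counter()
    avg_time = (end - start) / runs
    print(f"Average matmul time on {device}: {avg_time:.4f} seconds")
    return avg_time

# Time CPU matmul
cpu_time = time_matmul(a_cpu, b_cpu, '/CPU:0')

# Time GPU matmul
gpu_time = time_matmul(a_gpu, b_gpu, '/GPU:0')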

# Print summary
print(f"Speedup (CPU time / GPU time): {cpu_time / gpu_time:.2f}x")

Added explicit tensor placement on CPU and GPU using tf.device.

Created large random matrices of the same size on both devices.

Measured average time of matrix multiplication over multiple runs.

Printed timing results and speedup ratio.

Results Interpretation

Before: No timing data, unsure about performance difference.

After: In an example run, CPU matmul takes about 3.5 seconds and GPU matmul about 0.3 seconds, so the GPU is roughly 12 times faster for this operation. Exact timings depend on the hardware used.

Placing tensors and operations on GPU can greatly speed up heavy computations like matrix multiplication compared to CPU. TensorFlow allows explicit control of device placement to optimize performance.
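
As a small illustration of explicit device control, the sketch below picks a device, creates tensors and runs a matmul inside tf.device, then inspects where the result lives. It assumes TensorFlow 2.x; the fallback to '/CPU:0' when no GPU is visible is an addition for convenience and is not part of the experiment above, which requires a GPU.

```python
import tensorflow as tf

# Pick the GPU when one is visible, otherwise fall back to the CPU.
device = '/GPU:0' if tf.config.list_physical_devices('GPU') else '/CPU:0'

with tf.device(device):
    a = tf.random.uniform((256, 256), dtype=tf.float32)
    b = tf.random.uniform((256, 256), dtype=tf.float32)
    c = tf.matmul(a, b)

# Each eager tensor reports the device that holds it.
print("Requested device:", device)
print("Result lives on:", c.device)
```

The device string printed for the result is a fully qualified name, so it is longer than the short form passed to tf.device but ends with the same device type and index.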

Bonus Experiment

Try running the same experiment with smaller matrices (e.g., 500x500) and observe how the speedup changes.

💡 Hint

Smaller matrices may reduce GPU advantage because of overhead; measure times carefully and compare.
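
Why overhead matters more for small matrices can be sketched with a toy cost model. Multiplying two n x n matrices takes about 2n^3 floating-point operations, so a rough time estimate is a fixed overhead plus 2n^3 divided by throughput. Every constant below (throughputs and overheads) is a made-up assumption chosen only to show the trend; none of them are measurements, and real results should come from timing the actual operation.

```python
def estimated_seconds(n, flops_per_second, overhead_seconds):
    """Toy model: fixed per-call overhead plus compute time for an n x n matmul."""
    flops = 2 * n ** 3  # approximate multiply-add count
    return overhead_seconds + flops / flops_per_second

# Hypothetical constants, for illustration only (not measured values).
CPU_FLOPS, CPU_OVERHEAD = 1e11, 5e-6
GPU_FLOPS, GPU_OVERHEAD = 5e12, 1e-4

def modeled_speedup(n):
    cpu = estimated_seconds(n, CPU_FLOPS, CPU_OVERHEAD)
    gpu = estimated_seconds(n, GPU_FLOPS, GPU_OVERHEAD)
    return cpu / gpu

for n in (500, 3000):
    print(f"n={n}: modeled speedup {modeled_speedup(n):.1f}x")
```

In this model the fixed GPU overhead is a larger share of the total time at n=500 than at n=3000, so the modeled speedup is smaller for the smaller matrix.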

Practice

1. (easy) What is the main reason to use tf.device() in TensorFlow when working with GPUs and CPUs?

A. To change the data type of a tensor

B. To save the model to disk

C. To initialize variables automatically

D. To specify whether a tensor or operation runs on CPU or GPU

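Solution

Step 1: Understand the purpose of tf.device()

tf.device() is a context manager that sets the device (for example '/CPU:0' or '/GPU:0') on which the tensors and operations created inside it are placed.

Step 2: Compare options with the function's purpose

Changing data types, saving models, and initializing variables are unrelated to device placement, so only option D matches.

Final Answer: D. To specify whether a tensor or operation runs on CPU or GPU

Quick Check: tf.device() controls device placement, which is option D.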
