
GPU vs CPU tensor placement in TensorFlow - Experiment Comparison

Experiment - GPU vs CPU tensor placement
Problem: You want to understand how placing tensors on GPU or CPU affects computation speed in TensorFlow.
Current Metrics: No timing metrics collected yet.
Issue: You do not know how to measure and compare the speed difference between GPU and CPU tensor operations.
Your Task
Measure and compare the time taken to perform a large matrix multiplication on CPU and GPU tensors. Show that GPU placement speeds up the operation.
Use TensorFlow 2.x with eager execution.
Use the same matrix size for both CPU and GPU operations.
Measure time accurately using Python's time module.
Solution
TensorFlow
import tensorflow as tf
import time

# Check if GPU is available
if not tf.config.list_physical_devices('GPU'):
    raise RuntimeError('No GPU found. Please run on a machine with GPU.')

# Matrix size
matrix_size = 3000

# Create random matrices on CPU
with tf.device('/CPU:0'):
    a_cpu = tf.random.uniform((matrix_size, matrix_size), dtype=tf.float32)
    b_cpu = tf.random.uniform((matrix_size, matrix_size), dtype=tf.float32)

# Create random matrices on GPU
with tf.device('/GPU:0'):
    a_gpu = tf.random.uniform((matrix_size, matrix_size), dtype=tf.float32)
    b_gpu = tf.random.uniform((matrix_size, matrix_size), dtype=tf.float32)

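# Function to time matrix multiplication on an explicit device
def time_matmul(a, b, device):
    # Without a tf.device scope, eager TensorFlow may run the op on the GPU
    # even when the inputs live on the CPU, so pin the op explicitly
    with tf.device(device):
        # Warm-up run (reading one element waits for the work to finish)
        _ = tf.matmul(a, b)[0, 0].numpy()

        # Run multiple times
        runs = 5
        start = time.perf_counter()
        for _ in range(runs):
            result = tf.matmul(a, b)
        # GPU kernels can be launched asynchronously; reading one element
        # back blocks until the queued multiplications are done
        _ = result[0, 0].numpy()
        end = time.perf_counter()
    avg_time = (end - start) / runs
    print(f"Average matmul time on {device}: {avg_time:.4f} seconds")
    return avg_time

# Time CPU matmul
cpu_time = time_matmul(a_cpu, b_cpu, '/CPU:0')

# Time GPU matmul
gpu_time = time_matmul(a_gpu, b_gpu, '/GPU:0')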

# Print summary
print(f"Speedup (CPU time / GPU time): {cpu_time / gpu_time:.2f}x")
Added explicit tensor placement on CPU and GPU using tf.device.
Created large random matrices of the same size on both devices.
Measured average time of matrix multiplication over multiple runs.
Printed timing results and speedup ratio.
Results Interpretation

Before: No timing data, unsure about performance difference.

After: In the example run, CPU matmul takes about 3.5 seconds and GPU matmul takes about 0.3 seconds, so the GPU is roughly 12 times faster for this operation. Exact numbers depend on the hardware.

Placing tensors and operations on GPU can greatly speed up heavy computations like matrix multiplication compared to CPU. TensorFlow allows explicit control of device placement to optimize performance.
Bonus Experiment
Try running the same experiment with smaller matrices (e.g., 500x500) and observe how the speedup changes.
💡 Hint
Smaller matrices may reduce GPU advantage because of overhead; measure times carefully and compare.
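
One way to run this bonus experiment is sketched below. It pins each multiplication to a device with tf.device and reads one element back so that queued work finishes before the clock stops. The helper name time_matmul_on, the chosen sizes, and the run count are illustrative assumptions, not part of the original solution. The GPU measurements are skipped on machines without a GPU.

```python
import time
import tensorflow as tf

def time_matmul_on(device, size, runs=3):
    """Return the average seconds per (size x size) matmul on `device`."""
    with tf.device(device):
        a = tf.random.uniform((size, size), dtype=tf.float32)
        b = tf.random.uniform((size, size), dtype=tf.float32)
        # Warm-up run so one-time setup cost is not timed
        _ = tf.matmul(a, b)[0, 0].numpy()
        start = time.perf_counter()
        for _ in range(runs):
            result = tf.matmul(a, b)
        # Reading one element back waits for queued work to finish
        _ = result[0, 0].numpy()
        end = time.perf_counter()
    return (end - start) / runs

# Always measure the CPU; add the GPU only if one is present
devices = ['/CPU:0']
if tf.config.list_physical_devices('GPU'):
    devices.append('/GPU:0')

timings = {}
for size in (250, 500, 1000):
    for device in devices:
        timings[(device, size)] = time_matmul_on(device, size)
        print(f"{device} {size}x{size}: {timings[(device, size)]:.4f} s")
```

Comparing the rows for each size shows how the CPU-to-GPU ratio changes as the matrices shrink; larger sizes such as 3000 can be added to the tuple to match the main experiment.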

Practice

(1/5)
1. What is the main reason to use tf.device() in TensorFlow when working with GPUs and CPUs?
easy
A. To change the data type of a tensor
B. To save the model to disk
C. To initialize variables automatically
D. To specify whether a tensor or operation runs on CPU or GPU

Solution

  1. Step 1: Understand the purpose of tf.device()

    tf.device() is a context manager that tells TensorFlow where to place the tensors and operations created inside it, for example on the CPU or on a specific GPU.
  2. Step 2: Compare options with the function's purpose

    Changing data types, initializing variables, or saving models are unrelated to device placement.
  3. Final Answer:

    To specify whether a tensor or operation runs on CPU or GPU -> Option D
  4. Quick Check:

    tf.device() controls device placement = D [OK]
Hint: tf.device() sets CPU or GPU for tensors [OK]
Common Mistakes:
  • Confusing device placement with data type changes
  • Thinking tf.device() initializes variables
  • Assuming tf.device() saves models
2. Which of the following is the correct syntax to place a tensor on GPU device 0 in TensorFlow?
easy
A. with tf.device('/GPU:0'): x = tf.constant([1, 2, 3])
B. with tf.device('device:GPU0'): x = tf.constant([1, 2, 3])
C. with tf.device('GPU0'): x = tf.constant([1, 2, 3])
D. with tf.device('/CPU:0'): x = tf.constant([1, 2, 3])

Solution

  1. Step 1: Recall TensorFlow device naming conventions

    TensorFlow uses '/GPU:0' to refer to the first GPU device.
  2. Step 2: Check each option's device string

    The correct format for GPU device 0 is with tf.device('/GPU:0'): x = tf.constant([1, 2, 3]). The strings 'device:GPU0' and 'GPU0' are not valid device names, and '/CPU:0' is valid but selects the CPU rather than the GPU.
  3. Final Answer:

    with tf.device('/GPU:0'): x = tf.constant([1, 2, 3]) -> Option A
  4. Quick Check:

    Correct GPU device string = A [OK]
Hint: Use '/GPU:0' to specify first GPU device [OK]
Common Mistakes:
  • Using 'GPU0' without slash and colon
  • Confusing CPU and GPU device strings
  • Missing the 'with' context for tf.device
3. What will be the output device placement of the tensor x in the following code if a GPU is available?
with tf.device('/CPU:0'):
    x = tf.constant([1, 2, 3])
print(x.device)
medium
A. It will show a GPU device string like '/job:localhost/replica:0/task:0/device:GPU:0'
B. It will show a CPU device string like '/job:localhost/replica:0/task:0/device:CPU:0'
C. It will raise an error because GPU is available
D. It will show an empty string

Solution

  1. Step 1: Analyze the device context used

    The code uses with tf.device('/CPU:0'), so the tensor x is forced to be on CPU.
  2. Step 2: Understand device string output

    Printing x.device will show the full device string indicating CPU, regardless of GPU availability.
  3. Final Answer:

    It will show a CPU device string like '/job:localhost/replica:0/task:0/device:CPU:0' -> Option B
  4. Quick Check:

    Device context forces CPU = B [OK]
Hint: Device context overrides default device placement [OK]
Common Mistakes:
  • Assuming GPU is used automatically if available
  • Expecting error when CPU is forced
  • Thinking device string can be empty
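
The placement described in this question can be checked directly. The short sketch below pins a tensor to the CPU and prints the device of both the tensor and the result of an operation created in the same scope; the variable y is added here only for illustration.

```python
import tensorflow as tf

# Pin the tensor to the CPU, whether or not a GPU is present
with tf.device('/CPU:0'):
    x = tf.constant([1, 2, 3])
    y = x * 2  # operations created inside the scope are placed on the CPU too

# Both lines print a full device string ending in 'device:CPU:0'
print(x.device)
print(y.device)
```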
4. Identify the error in this TensorFlow code snippet that tries to place a tensor on GPU:
with tf.device('/GPU:1'):
    x = tf.constant([4, 5, 6])
print(x.device)
Assuming the system has only one GPU device.
medium
A. Syntax error in tf.device string
B. No error, code runs fine on GPU 1
C. Error because GPU device '/GPU:1' does not exist
D. TensorFlow automatically switches to CPU without error

Solution

  1. Step 1: Check available GPU devices

    The system has only one GPU, which is '/GPU:0'. Trying to use '/GPU:1' refers to a non-existent second GPU.
  2. Step 2: Understand TensorFlow behavior on invalid device

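    TensorFlow raises an error for a device that does not exist, unless soft device placement is enabled with tf.config.set_soft_device_placement(True), in which case it falls back to an available device.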
  3. Final Answer:

    Error because GPU device '/GPU:1' does not exist -> Option C
  4. Quick Check:

    Invalid GPU index causes error = C [OK]
Hint: Check GPU count before using device index [OK]
Common Mistakes:
  • Assuming GPU indices start at 1
  • Expecting automatic fallback to CPU
  • Ignoring device existence errors
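
The hint about checking the GPU count can be sketched as a small guard. The helper gpu_device_string is a hypothetical name introduced for this example; it only validates the index against the number of GPUs reported by tf.config.list_physical_devices and does not depend on how TensorFlow itself reacts to a missing device.

```python
import tensorflow as tf

def gpu_device_string(index, num_gpus):
    """Return '/GPU:<index>' if that GPU exists, otherwise raise ValueError."""
    if not 0 <= index < num_gpus:
        raise ValueError(
            f"GPU index {index} requested, but only {num_gpus} GPU(s) found"
        )
    return f"/GPU:{index}"

# GPU indices start at 0, so a single-GPU machine only has '/GPU:0'
num_gpus = len(tf.config.list_physical_devices('GPU'))
print(f"Visible GPUs: {num_gpus}")

try:
    device = gpu_device_string(1, num_gpus)
    with tf.device(device):
        x = tf.constant([4, 5, 6])
    print(x.device)
except ValueError as err:
    # Reached on machines with fewer than two GPUs
    print(err)
```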
5. You want to speed up a large matrix multiplication in TensorFlow using GPU if available, but fall back to CPU if no GPU exists. Which code snippet correctly implements this logic?
hard
A. if tf.config.list_physical_devices('GPU'):
       with tf.device('/GPU:0'):
           result = tf.matmul(a, b)
   else:
       with tf.device('/CPU:0'):
           result = tf.matmul(a, b)
B. with tf.device('/GPU:0'): result = tf.matmul(a, b)
C. result = tf.matmul(a, b) # TensorFlow auto-chooses device
D. with tf.device('/CPU:0'): result = tf.matmul(a, b)

Solution

  1. Step 1: Check for GPU availability

    Use tf.config.list_physical_devices('GPU') to detect if GPU exists.
  2. Step 2: Use conditional device placement

    If GPU exists, place operation on '/GPU:0', else place on '/CPU:0' to ensure fallback.
  3. Step 3: Verify other options

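    Forcing GPU without checking availability risks errors if no GPU exists. Auto-placement usually picks the GPU when one is available, but it does not express the fallback explicitly, which is what the question asks for. Forcing CPU ignores an available GPU.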
  4. Final Answer:

    if tf.config.list_physical_devices('GPU'): with tf.device('/GPU:0'): result = tf.matmul(a, b) else: with tf.device('/CPU:0'): result = tf.matmul(a, b) -> Option A
  5. Quick Check:

    Conditional device placement with fallback = A [OK]
Hint: Check GPU presence before device placement [OK]
Common Mistakes:
  • Not handling fallback when GPU missing
  • Assuming TensorFlow always picks GPU
  • Forcing CPU even if GPU is available
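
Written out with normal indentation, the conditional placement from Option A looks like the sketch below. The matrices a and b and their 256x256 size are placeholders chosen for this example.

```python
import tensorflow as tf

# Placeholder inputs; any two compatible matrices work
a = tf.random.uniform((256, 256), dtype=tf.float32)
b = tf.random.uniform((256, 256), dtype=tf.float32)

if tf.config.list_physical_devices('GPU'):
    # At least one GPU is visible, so use the first one
    with tf.device('/GPU:0'):
        result = tf.matmul(a, b)
else:
    # No GPU found, so fall back to the CPU
    with tf.device('/CPU:0'):
        result = tf.matmul(a, b)

print(result.shape, result.device)
```

The printed device string shows which branch was taken on the current machine.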