tf.data.Dataset creation in TensorFlow - Practice Problems & Coding Challenges

Challenge - 5 Problems
Predict Output
intermediate
Output of Dataset from_tensor_slices with nested lists
What is the output of the following code snippet when iterated over?
import tensorflow as tf

data = [[1, 2], [3, 4], [5, 6]]
dataset = tf.data.Dataset.from_tensor_slices(data)
output = [element.numpy().tolist() for element in dataset]
print(output)
A. [[1], [2], [3], [4], [5], [6]]
B. [1, 2, 3, 4, 5, 6]
C. [[1, 3, 5], [2, 4, 6]]
D. [[1, 2], [3, 4], [5, 6]]
💡 Hint
Remember that from_tensor_slices slices the first dimension, so each element corresponds to one sublist.
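As a minimal sketch of the slicing rule in this hint, the snippet below uses a made-up 3-D tensor (the name `data` and its shape are assumptions for illustration only) and inspects `element_spec` and `cardinality()` rather than iterating. It suggests that the leading axis becomes the element count while the remaining axes become each element's shape.

```python
import tensorflow as tf

# Hypothetical input with shape (2, 3, 4); not taken from the problem above.
data = tf.zeros([2, 3, 4])
dataset = tf.data.Dataset.from_tensor_slices(data)

# The first axis (size 2) is sliced away, so each element has shape (3, 4).
element_shape = dataset.element_spec.shape
num_elements = int(dataset.cardinality().numpy())

print(element_shape, num_elements)
```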
data_output
intermediate
Number of elements in Dataset from_generator
Given the following generator and dataset creation code, how many elements does the dataset contain?
import tensorflow as tf

def gen():
    for i in range(5):
        yield i * 2

dataset = tf.data.Dataset.from_generator(gen, output_signature=tf.TensorSpec(shape=(), dtype=tf.int32))
count = sum(1 for _ in dataset)
print(count)
A. 0
B. 10
C. 5
D. 1
💡 Hint
Count how many times the generator yields values.
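One reason counting by iteration matters for generator-backed datasets: TensorFlow cannot know in advance how many values a Python generator will yield. The sketch below uses a hypothetical generator named `countdown` (not the one from the problem) and compares `cardinality()` for a generator dataset with that of a dataset built from a list.

```python
import tensorflow as tf

def countdown():
    # Hypothetical generator used only for this illustration.
    for i in (3, 2, 1):
        yield i

gen_ds = tf.data.Dataset.from_generator(
    countdown,
    output_signature=tf.TensorSpec(shape=(), dtype=tf.int32),
)
list_ds = tf.data.Dataset.from_tensor_slices([3, 2, 1])

# A generator dataset reports an unknown size; a sliced list reports its length.
gen_unknown = bool((gen_ds.cardinality() == tf.data.UNKNOWN_CARDINALITY).numpy())
list_size = int(list_ds.cardinality().numpy())

# Iterating is the reliable way to see what the generator actually yields.
values = [int(x.numpy()) for x in gen_ds]
print(gen_unknown, list_size, values)
```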
🔧 Debug
advanced
Identify the error in Dataset creation from dict
What error does the following code raise when executed?
import tensorflow as tf

data = {'a': [1, 2], 'b': [3, 4, 5]}
dataset = tf.data.Dataset.from_tensor_slices(data)
for element in dataset:
    print(element)
A. TypeError: Expected list or tuple
B. ValueError: All components must have the same size
C. No error, prints elements
D. AttributeError: 'dict' object has no attribute 'numpy'
💡 Hint
Check if all lists in the dictionary have the same length.
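For contrast, here is a small sketch of the well-formed case, assuming a hypothetical dictionary whose lists all have the same length. Each dataset element is then itself a dictionary holding one value per key.

```python
import tensorflow as tf

# Hypothetical columns of equal length (three values each).
data = {"a": [1, 2, 3], "b": [4, 5, 6]}
dataset = tf.data.Dataset.from_tensor_slices(data)

# Each element is a dict of scalar tensors; convert them to plain ints for display.
rows = [{key: int(value.numpy()) for key, value in element.items()}
        for element in dataset]
print(rows)
```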
visualization
advanced
Visualize Dataset elements after map transformation
What is the output list after applying the map function to the dataset?
import tensorflow as tf

data = [1, 2, 3, 4]
dataset = tf.data.Dataset.from_tensor_slices(data)
dataset = dataset.map(lambda x: x * x)
output = [element.numpy() for element in dataset]
print(output)
A. [1, 4, 9, 16]
B. [1, 2, 3, 4]
C. [2, 4, 6, 8]
D. [1, 8, 27, 64]
💡 Hint
The map function squares each element.
🚀 Application
expert
Creating a Dataset from multiple numpy arrays with different shapes
You have two numpy arrays:
import numpy as np
import tensorflow as tf

arr1 = np.array([[1, 2], [3, 4], [5, 6]])
arr2 = np.array([10, 20, 30])

Which code snippet correctly creates a tf.data.Dataset that yields tuples of corresponding elements from arr1 and arr2?
A. dataset = tf.data.Dataset.from_tensor_slices((arr1, arr2))
B. dataset = tf.data.Dataset.from_tensor_slices(arr1).concatenate(tf.data.Dataset.from_tensor_slices(arr2))
C. dataset = tf.data.Dataset.from_tensor_slices(arr1).zip(tf.data.Dataset.from_tensor_slices(arr2))
D. dataset = tf.data.Dataset.from_tensor_slices(arr1 + arr2)
💡 Hint
from_tensor_slices can take a tuple of arrays with matching first dimension.
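The hint can be sketched with two hypothetical arrays, `features` and `labels` (names and values are assumptions, not the arrays from the problem). The snippet also shows `tf.data.Dataset.zip` called with a tuple of datasets, which appears to give an equivalent pairing when the two datasets are built separately.

```python
import numpy as np
import tensorflow as tf

# Hypothetical arrays sharing the same first dimension (2 rows each).
features = np.array([[7, 8], [9, 10]])
labels = np.array([0, 1])

# Passing a tuple slices both arrays in lockstep along axis 0.
paired = tf.data.Dataset.from_tensor_slices((features, labels))
pairs = [(x.numpy().tolist(), int(y.numpy())) for x, y in paired]

# Alternative: build two datasets and zip them via the static method.
zipped = tf.data.Dataset.zip((
    tf.data.Dataset.from_tensor_slices(features),
    tf.data.Dataset.from_tensor_slices(labels),
))
zipped_pairs = [(x.numpy().tolist(), int(y.numpy())) for x, y in zipped]

print(pairs)
```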

Practice

1. What is the main purpose of tf.data.Dataset in TensorFlow?
easy
A. To compile TensorFlow models
B. To create neural network layers
C. To visualize data in graphs
D. To manage and prepare data efficiently for TensorFlow models

Solution

  1. Step 1: Understand the role of tf.data.Dataset

    tf.data.Dataset is designed to handle data input pipelines, making data loading and preprocessing easier for TensorFlow models.
  2. Step 2: Differentiate from other TensorFlow components

    Creating layers, visualization, and compiling models are handled by other TensorFlow modules, not tf.data.Dataset.
  3. Final Answer:

    To manage and prepare data efficiently for TensorFlow models -> Option D
  4. Quick Check:

    tf.data.Dataset = data management [OK]
Hint: Remember: Dataset is for data, not model building [OK]
Common Mistakes:
  • Confusing dataset with model layers
  • Thinking it visualizes data
  • Assuming it compiles models
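To make "manage and prepare data" a little more concrete, here is a minimal sketch of an input pipeline. The specific steps (doubling values, batches of 4, prefetching) are illustrative assumptions rather than a recommended recipe.

```python
import tensorflow as tf

# Hypothetical pipeline: source -> transform -> batch -> prefetch.
dataset = (
    tf.data.Dataset.range(10)       # elements 0..9
    .map(lambda x: x * 2)           # per-element preprocessing
    .batch(4)                       # group elements into batches of up to 4
    .prefetch(tf.data.AUTOTUNE)     # overlap data preparation with consumption
)

batches = [batch.numpy().tolist() for batch in dataset]
print(batches)
```

A model would typically consume such a dataset directly, for example by passing it to `model.fit`, instead of receiving the whole array at once.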
2. Which of the following is the correct way to create a tf.data.Dataset from a Python list [1, 2, 3]?
easy
A. dataset = tf.data.Dataset.from_tensor_slices([1, 2, 3])
B. dataset = tf.data.Dataset.from_list([1, 2, 3])
C. dataset = tf.data.Dataset.create([1, 2, 3])
D. dataset = tf.data.Dataset.make([1, 2, 3])

Solution

  1. Step 1: Recall correct Dataset creation methods

    The method from_tensor_slices is the standard way to create a dataset from a list or tensor by slicing elements.
  2. Step 2: Identify incorrect method names

    tf.data.Dataset has no methods named from_list, create, or make, so options B, C, and D would each raise an AttributeError.
  3. Final Answer:

    dataset = tf.data.Dataset.from_tensor_slices([1, 2, 3]) -> Option A
  4. Quick Check:

    Use from_tensor_slices for lists [OK]
Hint: Use from_tensor_slices to convert lists to datasets [OK]
Common Mistakes:
  • Using non-existent methods like from_list
  • Confusing Dataset creation with model creation
  • Trying to call Dataset directly
3. What will be the output of the following code?
import tensorflow as tf
list_data = [10, 20, 30]
dataset = tf.data.Dataset.from_tensor_slices(list_data)
for item in dataset:
    print(item.numpy())
medium
A. Tensor objects printed
B. [10, 20, 30]
C. 10 20 30 (each on a new line)
D. Error: Cannot iterate dataset

Solution

  1. Step 1: Understand from_tensor_slices behavior

    This method creates a dataset where each element is one item from the list, so iteration yields 10, then 20, then 30.
  2. Step 2: Analyze the loop and print statement

    Calling item.numpy() converts each scalar tensor to a NumPy scalar, so print shows the bare value, and each print call puts it on its own line.
  3. Final Answer:

    10 20 30 (each on a new line) -> Option C
  4. Quick Check:

    Iterate dataset prints each element [OK]
Hint: from_tensor_slices yields one element per iteration [OK]
Common Mistakes:
  • Expecting a list printed at once
  • Not calling .numpy() to get values
  • Thinking iteration causes error
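As an alternative to calling `.numpy()` on every element, a dataset can hand back NumPy values directly through `as_numpy_iterator()`. The sketch below reuses the same three values and is meant only as an illustration of that method.

```python
import tensorflow as tf

dataset = tf.data.Dataset.from_tensor_slices([10, 20, 30])

# as_numpy_iterator() yields NumPy values instead of tf.Tensor objects.
values = list(dataset.as_numpy_iterator())

for value in values:
    print(value)
```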
4. Identify the error in the following code snippet:
import tensorflow as tf
list_data = [1, 2, 3]
dataset = tf.data.Dataset.from_tensor(list_data)
medium
A. Method from_tensor does not exist
B. list_data should be a tensor, not a list
C. Dataset cannot be created from lists
D. Missing parentheses in Dataset call

Solution

  1. Step 1: Check Dataset API methods

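    There is no method called from_tensor in the tf.data.Dataset API, so the call raises an AttributeError. The similarly named from_tensors (plural) does exist, but it wraps the whole input as a single element.
  2. Step 2: Correct method usage

    To create a dataset with one element per list item, use from_tensor_slices.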
  3. Final Answer:

    Method from_tensor does not exist -> Option A
  4. Quick Check:

    Use from_tensor_slices, not from_tensor [OK]
Hint: Check method names carefully in Dataset API [OK]
Common Mistakes:
  • Using non-existent methods
  • Confusing from_tensor_slices with from_tensor
  • Assuming Dataset accepts lists directly without slicing
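Because the names are easy to mix up, the following sketch contrasts the two methods that do exist, `from_tensors` and `from_tensor_slices`, on the same list. The variable names are hypothetical.

```python
import tensorflow as tf

list_data = [1, 2, 3]

# from_tensors: the whole list becomes ONE element of shape (3,).
whole = tf.data.Dataset.from_tensors(list_data)
whole_elements = [element.numpy().tolist() for element in whole]

# from_tensor_slices: the list is sliced into THREE scalar elements.
sliced = tf.data.Dataset.from_tensor_slices(list_data)
sliced_elements = [element.numpy().tolist() for element in sliced]

print(whole_elements)
print(sliced_elements)
```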
5. You want to create a tf.data.Dataset from a generator function that yields tuples of (features, label). Which of the following is the correct way to create this dataset?
hard
A. dataset = tf.data.Dataset.from_tensors(generator_func)
B. dataset = tf.data.Dataset.from_generator(generator_func, output_types=(tf.float32, tf.int32))
C. dataset = tf.data.Dataset.from_tensor_slices(generator_func())
D. dataset = tf.data.Dataset.from_list(generator_func)

Solution

  1. Step 1: Understand dataset creation from generators

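    Use from_generator to create a dataset from a Python generator function, telling TensorFlow what each yielded value looks like. Option B does this with output_types; recent TensorFlow 2.x releases deprecate that argument in favor of output_signature, but B is still the only option that calls the right method.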
  2. Step 2: Analyze other options

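    from_tensor_slices expects tensors or array-like data, and the generator object returned by generator_func() cannot be converted to a tensor; from_tensors builds a single-element dataset from tensor-like data and cannot take a function; from_list is not a method of tf.data.Dataset.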
  3. Final Answer:

    dataset = tf.data.Dataset.from_generator(generator_func, output_types=(tf.float32, tf.int32)) -> Option B
  4. Quick Check:

    Use from_generator with output_types for generators [OK]
Hint: Use from_generator with output_types for generator functions [OK]
Common Mistakes:
  • Using from_tensor_slices on generator functions
  • Calling non-existent from_list method
  • Not specifying output_types with from_generator
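Since `output_types` is deprecated in recent TensorFlow 2.x releases, a version using `output_signature` may be the safer pattern in new code. The sketch below assumes a hypothetical `generator_func` that yields a 2-value feature list and an integer label; the shapes and values are illustrative only.

```python
import tensorflow as tf

def generator_func():
    # Hypothetical generator: yields (features, label) pairs.
    for i in range(3):
        yield [float(i), float(i) + 0.5], i

dataset = tf.data.Dataset.from_generator(
    generator_func,
    # One TensorSpec per yielded component: features first, then label.
    output_signature=(
        tf.TensorSpec(shape=(2,), dtype=tf.float32),
        tf.TensorSpec(shape=(), dtype=tf.int32),
    ),
)

examples = [(features.numpy().tolist(), int(label.numpy()))
            for features, label in dataset]
print(examples)
```

Unlike `output_types`, the signature also records each component's shape, so shape mismatches in the generator's output can surface earlier.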