Challenge - 5 Problems
TensorFlow Dataset Master
❓ Predict Output
Difficulty: intermediate
Output of loading text files with tf.data
What is the output of this code snippet that loads text lines from two files using TensorFlow's tf.data API?
TensorFlow
import tensorflow as tf

# Create two small text files
with open('file1.txt', 'w') as f:
    f.write('apple\nbanana')
with open('file2.txt', 'w') as f:
    f.write('cherry\ndate')

# Load dataset from files
files = ['file1.txt', 'file2.txt']
dataset = tf.data.TextLineDataset(files)

# Collect all lines into a list
lines = list(dataset.as_numpy_iterator())
print(lines)
💡 Hint
tf.data.TextLineDataset reads each line from all files in order.
Explanation
TextLineDataset reads the files in the order they are listed and yields each line, without its trailing newline, as a byte string. The printed list is therefore [b'apple', b'banana', b'cherry', b'date'].
❓ Data Output
Difficulty: intermediate
Number of elements in a dataset from multiple CSV files
Given three CSV files each with 2 rows, what is the number of elements in the dataset created by tf.data.experimental.CsvDataset loading all files?
TensorFlow
import tensorflow as tf

# Assume files: data1.csv, data2.csv, data3.csv each with 2 rows
files = ['data1.csv', 'data2.csv', 'data3.csv']
dataset = tf.data.experimental.CsvDataset(files, [tf.float32, tf.int32])

count = 0
for _ in dataset:
    count += 1
print(count)
💡 Hint
Each file has 2 rows, and the dataset reads all rows from all files.
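Explanation
CsvDataset reads every row from every file and yields one element per row. Assuming the files have no header row (the default is header=False), 3 files * 2 rows each = 6 elements.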
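The count can be checked end to end with a small self-contained sketch. It assumes headerless files with one float column and one integer column; the temporary directory and the row values are made up for illustration.

```python
import os
import tempfile

import tensorflow as tf

# Hypothetical data: three headerless CSV files, two rows each.
tmp_dir = tempfile.mkdtemp()
files = []
for i in range(1, 4):
    path = os.path.join(tmp_dir, f"data{i}.csv")
    with open(path, "w") as f:
        f.write(f"{i}.5,{i}\n{i}.75,{i * 10}")
    files.append(path)

# Passing dtypes as record_defaults marks both columns as required.
dataset = tf.data.experimental.CsvDataset(files, [tf.float32, tf.int32])

# Each element is one row: a tuple of (float32 scalar, int32 scalar).
count = sum(1 for _ in dataset)
print(count)
```

With three files of two rows each, this prints 6.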
🔧 Debug
Difficulty: advanced
Error raised when loading non-existent files
What error does this code raise when it iterates over a TextLineDataset whose file does not exist?
TensorFlow
import tensorflow as tf

files = ['missing_file.txt']
dataset = tf.data.TextLineDataset(files)

for line in dataset:
    print(line.numpy())
💡 Hint
TensorFlow raises its own error type for missing files.
Explanation
TensorFlow raises tf.errors.NotFoundError for a missing file. Because tf.data opens files lazily, the error surfaces when the dataset is iterated, not when the TextLineDataset object is constructed.
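One way to handle this case is to wrap the iteration in a try/except block. The sketch below assumes eager execution (the TensorFlow 2 default) and uses a path inside a fresh temporary directory so the file is certain to be absent; the variable names are illustrative.

```python
import os
import tempfile

import tensorflow as tf

# A path that cannot exist: a new, empty temporary directory plus a file name.
missing_path = os.path.join(tempfile.mkdtemp(), "missing_file.txt")

# Building the dataset only records the file name; nothing is opened yet.
dataset = tf.data.TextLineDataset([missing_path])

caught = None
try:
    # The file is opened here, during iteration.
    for line in dataset:
        print(line.numpy())
except tf.errors.NotFoundError as err:
    caught = err
    print("Caught:", type(err).__name__)
```

Catching the specific tf.errors.NotFoundError type, not a bare Exception, keeps unrelated pipeline errors visible.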
🚀 Application
Difficulty: advanced
Creating a dataset from image files with labels
You have a folder with images and a CSV file mapping image filenames to labels. Which code snippet correctly creates a tf.data.Dataset yielding (image_tensor, label) pairs?
💡 Hint
CsvDataset is designed to read CSV files with typed columns.
Explanation
tf.data.experimental.CsvDataset reads CSV files with typed columns and yields each row as a tuple of tensors, here (filename, label). A subsequent map call reads and decodes each image file and pairs the resulting image tensor with its label.
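A minimal sketch of such a pipeline follows. The answer options are not reproduced here, so this is one possible arrangement, not the quiz's reference answer. It assumes PNG images, a labels.csv with a header row and two columns (filename, label), and it generates tiny placeholder images so it can run on its own; all file names and the helper load_example are hypothetical.

```python
import os
import tempfile

import tensorflow as tf

# Hypothetical data: two 4x4 RGB PNG files and a labels.csv with a header row.
image_dir = tempfile.mkdtemp()
for name in ("cat.png", "dog.png"):
    png_bytes = tf.io.encode_png(tf.zeros([4, 4, 3], dtype=tf.uint8))
    tf.io.write_file(os.path.join(image_dir, name), png_bytes)

csv_path = os.path.join(image_dir, "labels.csv")
with open(csv_path, "w") as f:
    f.write("filename,label\ncat.png,0\ndog.png,1")

# Typed columns: a string file name and an int32 label; skip the header row.
rows = tf.data.experimental.CsvDataset(
    csv_path, record_defaults=[tf.string, tf.int32], header=True
)


def load_example(filename, label):
    # Build the full path, read the raw bytes, and decode them to a uint8 tensor.
    path = tf.strings.join([image_dir, filename], separator="/")
    image = tf.io.decode_png(tf.io.read_file(path), channels=3)
    return image, label


dataset = rows.map(load_example)

examples = list(dataset.as_numpy_iterator())
for image, label in examples:
    print(image.shape, label)
```

For real data, the decode call would need to match the image format, and a resize step is usually required before batching images of different sizes.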
🧠 Conceptual
Difficulty: expert
Effect of interleave on dataset from multiple files
What is the main difference between tf.data.TextLineDataset(files) and tf.data.Dataset.from_tensor_slices(files).interleave(tf.data.TextLineDataset, cycle_length=2) when reading multiple text files?
💡 Hint
Interleave cycles through datasets to mix their elements.
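Explanation
tf.data.TextLineDataset(files) reads every line of the first file, then every line of the second, and so on. The interleave version with cycle_length=2 keeps two files open at once and, with the default block_length of 1, takes one line from each in turn, so lines from different files alternate in the output.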
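The difference in ordering can be seen with the two small files from the first challenge. This sketch writes them to a temporary directory (an assumption made so it runs anywhere) and uses the default deterministic, non-parallel interleave.

```python
import os
import tempfile

import tensorflow as tf

# Recreate the two small text files in a temporary directory.
tmp_dir = tempfile.mkdtemp()
contents = {"file1.txt": "apple\nbanana", "file2.txt": "cherry\ndate"}
files = []
for name, text in contents.items():
    path = os.path.join(tmp_dir, name)
    with open(path, "w") as f:
        f.write(text)
    files.append(path)

# Sequential: all of file1, then all of file2.
sequential = list(tf.data.TextLineDataset(files).as_numpy_iterator())

# Interleaved: one line at a time from each of the two open files.
interleaved_ds = tf.data.Dataset.from_tensor_slices(files).interleave(
    lambda path: tf.data.TextLineDataset(path),
    cycle_length=2,
)
interleaved = list(interleaved_ds.as_numpy_iterator())

print(sequential)   # [b'apple', b'banana', b'cherry', b'date']
print(interleaved)  # [b'apple', b'cherry', b'banana', b'date']
```

Setting num_parallel_calls together with deterministic=False would allow the interleaved order to vary between runs; the defaults used here keep it fixed.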