What are batching and shuffling in TensorFlow?

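Batching groups samples into fixed-size sets so that each training step processes one group at a time, which makes efficient use of memory and hardware. Shuffling randomizes the order of the samples so that the model does not pick up patterns from the order in which the data is stored.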

Practice

1. What is the main purpose of batching data in TensorFlow during training?

easy

A. To group data into smaller sets for faster and efficient training

B. To randomly mix data to avoid bias

C. To increase the size of the dataset

D. To convert data into images

Solution

Step 1: Understand batching concept
Batching means grouping data into smaller sets instead of using all data at once.
Step 2: Identify batching benefit
This grouping helps speed up training and uses memory efficiently.
Final Answer:
To group data into smaller sets for faster and efficient training -> Option A
Quick Check:
Batching = grouping data for efficiency [OK]
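
As a small illustration of the grouping described above, the sketch below batches a toy dataset. The dataset (the integers 0 to 9) and the batch size of 4 are assumptions chosen for illustration, not values from the question.

```python
import tensorflow as tf

# Hypothetical toy dataset: the integers 0..9 stand in for real samples.
ds = tf.data.Dataset.range(10)

# Group consecutive elements into batches of 4.
batched = ds.batch(4)

# Convert each batch to a plain Python list for inspection.
batches = [batch.numpy().tolist() for batch in batched]
print(batches)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

The dataset still contains the same 10 values; batching only changes how they are grouped.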

Hint: Batching groups data; shuffling mixes data [OK]

Common Mistakes:

Confusing batching with shuffling
Thinking batching increases dataset size
Believing batching changes data type

2. Which of the following is the correct way to shuffle and batch a TensorFlow dataset named ds with batch size 32?

easy

A. ds.batch(100).shuffle(32)

B. ds.batch(32).shuffle(100)

C. ds.shuffle(32).batch(100)

D. ds.shuffle(100).batch(32)

Solution

Step 1: Recall correct order of operations
To mix individual samples, shuffle the dataset first and then batch it.
Step 2: Match batch size and shuffle buffer
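The argument to shuffle() is the buffer size and the argument to batch() is the batch size. The question asks for batch size 32, so batch(32) is required; shuffle(100) uses a buffer larger than one batch, which mixes the data more thoroughly.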
Final Answer:
ds.shuffle(100).batch(32) -> Option D
Quick Check:
Shuffle before batch = ds.shuffle().batch() [OK]
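
A minimal sketch of the pattern in Option D, assuming a toy dataset of 100 integers in place of real data. The dataset size and the seed are illustrative assumptions; only the shuffle(100).batch(32) call comes from the question.

```python
import tensorflow as tf

# Hypothetical toy dataset: the integers 0..99 stand in for real samples.
ds = tf.data.Dataset.range(100)

# Shuffle individual elements with a 100-element buffer,
# then group the shuffled elements into batches of 32.
ds = ds.shuffle(100, seed=0).batch(32)

# Record how many elements each batch holds.
batch_sizes = [int(batch.shape[0]) for batch in ds]
print(batch_sizes)  # [32, 32, 32, 4]
```

The final batch holds only the 4 leftover elements because 100 is not a multiple of 32.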

Hint: Shuffle first, then batch with correct sizes [OK]

Common Mistakes:

Batching before shuffling
Using a shuffle buffer smaller than the batch size
Swapping the batch size and shuffle buffer arguments

3. What will be the output shape of batches if you run the following code on a dataset of 100 samples with shape (28, 28, 1)?

batched_ds = ds.batch(20)
for batch in batched_ds:
    print(batch.shape)

medium

A. (20, 28, 28) for all batches

B. (20, 28, 28, 1) for all batches

C. (100, 28, 28, 1) for all batches

D. (28, 28, 1) for all batches

Solution

Step 1: Understand batch size effect on shape
Batching groups samples; each batch has shape (batch_size, sample_shape).
Step 2: Calculate batch shapes for 100 samples with batch size 20
100 / 20 = 5 batches. Because 100 is divisible by 20, every batch, including the last, holds 20 samples.
Final Answer:
(20, 28, 28, 1) for all batches -> Option B
Quick Check:
Batch shape = (batch_size, sample_shape) [OK]
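
The shape rule above can be checked with a short sketch. The all-zero tensor is a hypothetical stand-in for the question's dataset; any 100 samples of shape (28, 28, 1) would behave the same way.

```python
import tensorflow as tf

# Hypothetical stand-in data: 100 all-zero samples of shape (28, 28, 1).
images = tf.zeros([100, 28, 28, 1])
ds = tf.data.Dataset.from_tensor_slices(images)

# Same call as in the question.
batched_ds = ds.batch(20)

# Collect the shape of every batch as a plain tuple.
shapes = [tuple(batch.shape.as_list()) for batch in batched_ds]
print(len(shapes), shapes[0])  # 5 (20, 28, 28, 1)
```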

Hint: Batch shape adds batch size as first dimension [OK]

Common Mistakes:

Ignoring batch dimension in shape
Assuming last batch is smaller when divisible
Confusing sample shape with batch shape

4. You wrote this code but the dataset is not shuffled properly:

ds = tf.data.Dataset.range(10)
ds = ds.batch(2).shuffle(5)

What is the main issue?

medium

A. Shuffle should be called before batch to mix individual elements

B. Shuffle buffer size is too large

C. Batch size must be 1 for shuffle to work

D. Dataset.range(10) cannot be shuffled

Solution

Step 1: Analyze order of shuffle and batch
Shuffling after batching shuffles batches, not individual elements.
Step 2: Correct order for proper shuffling
Shuffle should be called before batch to mix individual data points.
Final Answer:
Shuffle should be called before batch to mix individual elements -> Option A
Quick Check:
Shuffle before batch for proper mixing [OK]
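
The difference between the two orderings can be seen in a short sketch. The seeds and the buffer size of 10 in the second pipeline are assumptions added for illustration; the first pipeline matches the code in the question.

```python
import tensorflow as tf

# Batch first, then shuffle: whole batches are reordered,
# but each batch still holds the same consecutive pair.
batch_then_shuffle = tf.data.Dataset.range(10).batch(2).shuffle(5, seed=0)
pairs = [batch.numpy().tolist() for batch in batch_then_shuffle]
still_adjacent = all(a % 2 == 0 and b == a + 1 for a, b in pairs)
print(still_adjacent)  # True

# Shuffle first, then batch: individual elements are mixed
# before they are grouped, so pairs are no longer fixed.
shuffle_then_batch = tf.data.Dataset.range(10).shuffle(10, seed=0).batch(2)
mixed = [batch.numpy().tolist() for batch in shuffle_then_batch]
print(mixed)
```

In the first pipeline every batch is one of [0, 1], [2, 3], and so on, regardless of the order in which the batches arrive. In the second, each batch can contain any two elements.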

Hint: Shuffle before batch to mix single items [OK]

Common Mistakes:

Calling shuffle after batch
Using a shuffle buffer that is too small
Thinking batch size must be 1

5. You have a dataset with 103 samples. You want to shuffle it with a buffer size of 50 and batch it with size 20. How many batches will you get and what will be the size of the last batch if you use:

ds.shuffle(50).batch(20)

hard

A. 6 batches; last batch size 20

B. 5 batches; last batch size 20

C. 6 batches; last batch size 3

D. 5 batches; last batch size 3

Solution

Step 1: Calculate number of batches
103 samples divided by batch size 20 gives 5 full batches (20*5=100) plus 1 partial batch with 3 samples.
Step 2: Understand shuffle effect on batch count
Shuffling changes only the order of the samples, not their number, so there are still 6 batches and the last one holds 3 samples.
Final Answer:
6 batches; last batch size 3 -> Option C
Quick Check:
103/20 = 5 full + 1 partial batch [OK]
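
The arithmetic above can be written out in plain Python. The helper batch_sizes is a hypothetical function written for this illustration and is not part of TensorFlow; it only mirrors how batch() splits a finite dataset, including the effect of the drop_remainder argument.

```python
def batch_sizes(num_samples, batch_size, drop_remainder=False):
    """Return the size of each batch for a dataset of num_samples elements.

    Hypothetical helper for illustration only; it reproduces the counting
    rule and does not call TensorFlow.
    """
    full, leftover = divmod(num_samples, batch_size)
    sizes = [batch_size] * full
    # A partial final batch is kept unless drop_remainder is set.
    if leftover and not drop_remainder:
        sizes.append(leftover)
    return sizes


sizes = batch_sizes(103, 20)
print(len(sizes), sizes[-1])  # 6 3

# With drop_remainder=True the 3 leftover samples are discarded.
print(len(batch_sizes(103, 20, drop_remainder=True)))  # 5
```

Shuffling does not appear in the helper because it affects only the order of the samples, not how many there are.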

Hint: Divide samples by batch size; last batch may be smaller [OK]

Common Mistakes:

Ignoring last partial batch
Assuming shuffle changes batch count
Miscounting batches as 5 instead of 6
