Bird
Raised Fist0
Interview Prepdp-knapsackmediumAmazonGoogle

Minimum Subset Sum Difference

Choose your preparation mode4 modes available

Start learning this pattern below

Jump into concepts and practice - no test required

or
Recommended
Test this pattern10 questions across easy, medium, and hard to know if this pattern is strong
Steps
setup

Calculate total sum and initialize DP array

Calculate the total sum of the input array elements (1+6+11+5=23). Initialize a boolean DP array of length 24 with dp[0] = True and others False.

💡 Initializing dp[0] = True means sum 0 is always achievable (empty subset). This sets the base for building achievable sums.
Line:total_sum = sum(arr) dp = [False] * (total_sum + 1) dp[0] = True
💡 The DP array length depends on total sum; dp[0] = True is the base case for subset sums.
📊
Minimum Subset Sum Difference - Watch the Algorithm Execute, Step by Step
Watching the step-by-step updates of the DP array reveals how subset sums accumulate and why checking sums up to half the total sum is sufficient to find the minimal difference.
Step 1/10
·Active fillAnswer cell
Initializing dp array
i\w01234567891011121314151617181920212223
i=0TrueFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalse
dp[0] = True (base)
Item 0 - wt:1 val:1
i\w01234567891011121314151617181920212223
i=0TrueTrueFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalse
dp[0] True base
Item 1 - wt:6 val:6
i\w01234567891011121314151617181920212223
i=0TrueTrueFalseFalseFalseFalseTrueTrueFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalseFalse
dp[0] True base
Item 2 - wt:11 val:11
i\w01234567891011121314151617181920212223
i=0TrueTrueFalseFalseFalseFalseTrueTrueFalseFalseFalseTrueTrueFalseFalseFalseFalseTrueTrueFalseFalseFalseFalseFalse
dp[0] True base
Item 3 - wt:5 val:5
i\w01234567891011121314151617181920212223
i=0TrueTrueFalseFalseFalseTrueTrueTrueFalseFalseFalseTrueTrueFalseFalseFalseTrueTrueTrueFalseFalseFalseTrueTrue
dp[0] True base
Initializing dp array
i\w01234567891011121314151617181920212223
i=0TrueTrueFalseFalseFalseTrueTrueTrueFalseFalseFalseTrueTrueFalseFalseFalseTrueTrueTrueFalseFalseFalseTrueTrue
half = 11
Initializing dp array
i\w01234567891011121314151617181920212223
i=0TrueTrueFalseFalseFalseTrueTrueTrueFalseFalseFalseTrueTrueFalseFalseFalseTrueTrueTrueFalseFalseFalseTrueTrue
Checking dp[11]
Initializing dp array
i\w01234567891011121314151617181920212223
i=0TrueTrueFalseFalseFalseTrueTrueTrueFalseFalseFalseTrueTrueFalseFalseFalseTrueTrueTrueFalseFalseFalseTrueTrue
dp[11] True - chosen sum
Initializing dp array
i\w01234567891011121314151617181920212223
i=0TrueTrueFalseFalseFalseTrueTrueTrueFalseFalseFalseTrueTrueFalseFalseFalseTrueTrueTrueFalseFalseFalseTrueTrue
Answer cell
Initializing dp array
i\w01234567891011121314151617181920212223
i=0TrueTrueFalseFalseFalseTrueTrueTrueFalseFalseFalseTrueTrueFalseFalseFalseTrueTrueTrueFalseFalseFalseTrueTrue
Answer cell

Key Takeaways

The DP array tracks which subset sums are achievable as elements are processed.

This insight is hard to see from code alone because the boolean array updates are subtle and cumulative.

Backward iteration over sums prevents reuse of the same element multiple times in one iteration.

Visualizing the backward update clarifies why this order is necessary for correctness.

The minimal difference is found by scanning achievable sums only up to half the total sum.

This reduces the search space and is a key optimization that is not obvious without seeing the DP array.

Practice

(1/5)
1. Consider the following Python code implementing the space-optimized 0/1 Knapsack solution. What is the output when weights = [10, 20, 30], values = [60, 100, 120], and W = 50?
def knapsack_space_optimized(weights, values, W):
    n = len(weights)
    dp = [0] * (W + 1)
    for i in range(n):
        for w in range(W, weights[i] - 1, -1):
            dp[w] = max(dp[w], dp[w - weights[i]] + values[i])
    return dp[W]

weights = [10, 20, 30]
values = [60, 100, 120]
W = 50
print(knapsack_space_optimized(weights, values, W))
easy
A. 240
B. 180
C. 160
D. 220

Solution

  1. Step 1: Trace dp array updates for each item

    For item 0 (weight=10, value=60), dp[w] updated for w=50 down to 10, dp[10..50]=60. For item 1 (weight=20, value=100), dp[30]=max(dp[30], dp[10]+100)=160, dp[50]=max(dp[50], dp[30]+100)=220. For item 2 (weight=30, value=120), dp[50]=max(dp[50], dp[20]+120)=max(220, 160+120)=280 but dp[20] is 60, so dp[50]=max(220, 180)=220.
  2. Step 2: Final dp[50] value

    The maximum value achievable with capacity 50 is 220.
  3. Final Answer:

    Option D -> Option D
  4. Quick Check:

    Manual dp tracing confirms max value 220 [OK]
Hint: Trace dp updates backward to avoid overcounting [OK]
Common Mistakes:
  • Forwards iteration over w causes overcounting items.
2. You are given a list of positive integers and need to determine if it can be split into two subsets with equal sums. Which algorithmic approach guarantees an optimal solution for this problem?
easy
A. Dynamic programming using a subset-sum style 0/1 knapsack approach to check for a subset with sum equal to half the total
B. Divide and conquer by recursively splitting the array into halves and checking sums
C. Sorting the array and using two pointers to find two subsets with equal sums
D. Greedy algorithm that picks the largest elements first until half the total sum is reached

Solution

  1. Step 1: Understand the problem requires partitioning into two equal sum subsets

    Since the problem asks if the array can be split into two subsets with equal sums, it reduces to finding a subset with sum equal to half the total sum.
  2. Step 2: Identify the algorithm that checks subset sums efficiently

    The 0/1 knapsack dynamic programming approach can determine if such a subset exists by building a DP table or array that tracks achievable sums up to half the total.
  3. Final Answer:

    Option A -> Option A
  4. Quick Check:

    Greedy and sorting approaches fail on many inputs; DP guarantees correctness [OK]
Hint: Partition reduces to subset sum with half total [OK]
Common Mistakes:
  • Thinking greedy or sorting solves partition optimally
3. Consider the following buggy code for lastStoneWeightII. Which line contains the subtle bug that causes incorrect results on some inputs?
medium
A. Line 4: dp[0] = true is missing
B. Line 7: Outer loop over stones
C. Line 8: Inner loop iterates backwards
D. Line 10: Return statement calculation

Solution

  1. Step 1: Identify dp initialization

    dp[0] must be true to represent sum zero achievable with no stones; missing this means no sums are marked achievable.
  2. Step 2: Consequence of missing dp[0] = true

    Without dp[0] = true, dp array remains false, so no sums can be formed, causing the function to fail to find any valid partition.
  3. Final Answer:

    Option A -> Option A
  4. Quick Check:

    Initializing dp[0] is critical base case for subset sum DP [OK]
Hint: dp[0] = true is base case for subset sums [OK]
Common Mistakes:
  • Forgetting dp[0] initialization
  • Iterating forward in inner loop causing double counting
  • Miscomputing return value
4. Suppose the integer break problem is modified so that you can reuse the same integer parts multiple times (unbounded splits), and you want to find the maximum product. Which modification to the DP approach below correctly handles this variant?
def integer_break(n):
    dp = [0] * (n + 1)
    dp[1] = 1
    for i in range(2, n + 1):
        max_product = 0
        for j in range(1, i):
            max_product = max(max_product, max(j, dp[j]) * max(i - j, dp[i - j]))
        dp[i] = max_product
    return dp[n]
Options:
hard
A. Use a nested loop where for each i, iterate j from 1 to i and update dp[i] = max(dp[i], j * dp[i-j]) to allow reuse
B. Sort possible parts and greedily pick the largest part repeatedly to maximize product
C. Keep the same code but add memoization to avoid recomputation
D. Change inner loop to iterate over all j ≤ i and update dp[i] as max(dp[i], dp[i-j] * dp[j]) without max(j, dp[j])

Solution

  1. Step 1: Understand reuse requirement

    Allowing reuse means dp[i] depends on dp[i-j] multiplied by j (or dp[j]) where j can be reused multiple times.
  2. Step 2: Modify DP update

    Updating dp[i] = max(dp[i], j * dp[i-j]) correctly models unbounded splits by combining current part j and best product for remaining i-j.
  3. Step 3: Evaluate other options

    Change inner loop to iterate over all j ≤ i and update dp[i] as max(dp[i], dp[i-j] * dp[j]) without max(j, dp[j]) removes max(j, dp[j]) incorrectly; Keep the same code but add memoization to avoid recomputation adds memoization but doesn't change logic; Sort possible parts and greedily pick the largest part repeatedly to maximize product is greedy and fails on some inputs.
  4. Final Answer:

    Option A -> Option A
  5. Quick Check:

    Unbounded knapsack style DP uses dp[i] = max(dp[i], j * dp[i-j]) [OK]
Hint: Unbounded knapsack uses dp[i] = max(dp[i], j * dp[i-j]) [OK]
Common Mistakes:
  • Using greedy approach
  • Not adjusting DP recurrence for reuse
5. Suppose the problem is modified so that jobs can be scheduled multiple times (i.e., unlimited reuse of the same job), still without overlapping. Which modification to the optimal DP approach is correct?
hard
A. Replace binary search with linear search to find compatible jobs and keep DP unchanged
B. Use a 1D DP array indexed by time and for each time, consider all jobs that can start then, allowing reuse
C. Keep the same DP with binary search but remove the skip option to always take the current job
D. Sort jobs by start time and use memoized recursion without binary search

Solution

  1. Step 1: Understand unlimited reuse impact

    Allowing multiple uses of the same job means the problem resembles unbounded interval scheduling with no overlaps.
  2. Step 2: Modify DP approach

    Use a 1D DP array indexed by time (e.g., dp[t] = max profit up to time t). For each time, consider all jobs starting at or after t, allowing reuse by updating dp[end_i] = max(dp[end_i], dp[start_i] + profit_i).
  3. Final Answer:

    Option B -> Option B
  4. Quick Check:

    Binary search and skip logic do not handle reuse; time-indexed DP handles multiple scheduling [OK]
Hint: Use time-indexed DP for unlimited job reuse [OK]
Common Mistakes:
  • Trying to reuse binary search DP without changes
  • Removing skip option breaks correctness
  • Sorting by start time only