Complete the code to initialize the beam width for beam search decoding.
beam_width = 10
The beam width controls how many candidate sequences are kept at each step. A typical value is 10.
Complete the code to select the top scoring sequences at each decoding step.
top_sequences = sorted(all_candidates, key=lambda x: x.score, reverse=True)[:beam_width]
We sort candidates in descending order of score, so reverse=True is needed.
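The selection step above can be sketched with a minimal runnable example. This assumes a hypothetical `Candidate` container with `.sequence` and `.score` attributes, matching the attributes used in the exercises; the candidate lists and scores are made up for illustration.

```python
from collections import namedtuple

# Hypothetical container; the exercises assume one with
# .sequence and .score attributes.
Candidate = namedtuple("Candidate", ["sequence", "score"])

beam_width = 2  # keep the two best candidates

# Toy candidate pool with log-probability scores (illustrative values).
all_candidates = [
    Candidate(["the", "cat"], -1.2),
    Candidate(["the", "dog"], -0.7),
    Candidate(["a", "cat"], -2.5),
]

# Sort descending by score, then truncate to the beam width.
top_sequences = sorted(all_candidates, key=lambda x: x.score, reverse=True)[:beam_width]
print([c.sequence for c in top_sequences])  # → [['the', 'dog'], ['the', 'cat']]
```

Because the scores are log-probabilities (higher is better), `reverse=True` puts the best candidates first before the `[:beam_width]` slice truncates the list.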
Fix the error in the code that updates the beam with new candidates.
beam = top_sequences
The beam should be updated to the top scoring sequences only, not all candidates or an empty list.
Fill both blanks to complete the loop that expands sequences and applies beam search.
for step in range(max_length):
    all_candidates = []
    for seq in beam:
        next_tokens = model.predict(seq.sequence)
        for token, score in next_tokens.items():
            candidate = seq.sequence + [token]
            candidate_score = seq.score + score
            all_candidates.append(Candidate(candidate, candidate_score))
    beam = sorted(all_candidates, key=lambda x: x.score, reverse=True)[:beam_width]
We add the new token score to the sequence score, and sort descending (reverse=True) to keep best sequences.
Fill all three blanks to complete the beam search decoding function.
def beam_search_decode(model, start_token, beam_width, max_length):
    beam = [Candidate([start_token], 0.0)]
    for _ in range(max_length):
        all_candidates = []
        for seq in beam:
            next_tokens = model.predict(seq.sequence)
            for token, score in next_tokens.items():
                candidate = seq.sequence + [token]
                candidate_score = seq.score + score
                all_candidates.append(Candidate(candidate, candidate_score))
        beam = sorted(all_candidates, key=lambda x: x.score, reverse=True)[:beam_width]
    return beam
We add the predicted token to the sequence, add scores, and sort descending to keep best sequences.
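The completed function can be exercised end to end with a toy model. The `ToyModel` class below is a hypothetical stand-in for `model.predict`: it returns hard-coded next-token log-probabilities keyed on the last token, which is an assumption made purely so the sketch is self-contained and runnable.

```python
from collections import namedtuple

Candidate = namedtuple("Candidate", ["sequence", "score"])

class ToyModel:
    """Hypothetical stand-in for model.predict: returns log-probabilities
    of next tokens given the sequence so far (here, just its last token)."""
    def predict(self, sequence):
        table = {
            "<s>": {"the": -0.4, "a": -1.1},
            "the": {"cat": -0.5, "dog": -0.9},
            "a":   {"cat": -0.3, "dog": -1.2},
            "cat": {"sat": -0.2, "ran": -1.5},
            "dog": {"sat": -1.0, "ran": -0.6},
        }
        return table.get(sequence[-1], {"</s>": 0.0})

def beam_search_decode(model, start_token, beam_width, max_length):
    beam = [Candidate([start_token], 0.0)]
    for _ in range(max_length):
        all_candidates = []
        for seq in beam:
            next_tokens = model.predict(seq.sequence)
            for token, score in next_tokens.items():
                candidate = seq.sequence + [token]
                # Log-probabilities are summed, not multiplied.
                candidate_score = seq.score + score
                all_candidates.append(Candidate(candidate, candidate_score))
        # Keep only the beam_width best-scoring sequences.
        beam = sorted(all_candidates, key=lambda x: x.score, reverse=True)[:beam_width]
    return beam

best = beam_search_decode(ToyModel(), "<s>", beam_width=2, max_length=3)
print(best[0].sequence)  # → ['<s>', 'the', 'cat', 'sat']
```

With a beam width of 2, the search keeps the two best partial sequences at each step, so lower-scoring continuations such as `["<s>", "a", ...]` are pruned after the first expansion.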