Experiment - Chain-of-thought prompting
Problem: You want to improve a language model's reasoning ability on multi-step math problems using chain-of-thought prompting.
Current Metrics: Accuracy on multi-step math problems: 60%
Issue: The model gives short answers without showing reasoning steps, leading to lower accuracy on complex problems.
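One way to elicit step-by-step reasoning is few-shot chain-of-thought prompting: prepend a worked example whose answer spells out intermediate steps, so the model imitates that format. A minimal sketch is below; the exemplar problem, the `build_cot_prompt` helper, and the `extract_answer` regex are illustrative assumptions, not part of the original experiment setup.

```python
import re
from typing import Optional

# Illustrative worked example: the solution shows intermediate steps and
# ends with a fixed "The answer is N." pattern for easy extraction.
COT_EXEMPLAR = (
    "Q: Roger has 5 tennis balls. He buys 2 cans of 3 tennis balls each. "
    "How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 balls each is 6 balls. "
    "5 + 6 = 11. The answer is 11.\n\n"
)

def build_cot_prompt(question: str) -> str:
    """Prepend the worked example so the model mimics step-by-step reasoning."""
    return COT_EXEMPLAR + f"Q: {question}\nA:"

def extract_answer(completion: str) -> Optional[str]:
    """Pull the final numeric answer from a 'The answer is N.' completion."""
    match = re.search(r"The answer is (-?\d+(?:\.\d+)?)", completion)
    return match.group(1) if match else None
```

The prompt from `build_cot_prompt` would be sent to the model as-is; `extract_answer` then parses the completion so accuracy can be scored automatically, which is what makes the "answer ends with a fixed phrase" convention in the exemplar useful.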