Introduction
Imagine telling a helpful robot exactly what you want, only for someone else to sneak in and change your instructions without you noticing. The same problem arises with AI systems that follow prompts: an attacker slips in their own instructions to trick the AI into doing something harmful or unexpected.