Experiment - Sandboxing dangerous operations
Problem: You are building an AI agent that can execute code snippets safely. Currently, the agent runs all code directly on the host, which creates security risks such as deleting files or accessing private data.
Current Metrics: No safety checks are in place; 100% of dangerous operations execute successfully, with potential for harm.
Issue: The AI agent lacks sandboxing, so dangerous operations are neither blocked nor isolated, putting system security at risk.
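One possible starting point is a two-layer guard: statically scan a snippet for obviously dangerous constructs, then run anything that passes in a separate subprocess with a timeout. The sketch below is illustrative, not the experiment's actual design; the `BLOCKED_MODULES` and `BLOCKED_CALLS` sets and the function names are assumptions, and a production sandbox would add OS-level isolation (containers, seccomp, restricted users) rather than rely on a blocklist alone.

```python
import ast
import subprocess
import sys

# Hypothetical blocklists for this sketch; a real agent would need a
# much more thorough policy (blocklists alone are easy to bypass).
BLOCKED_MODULES = {"os", "shutil", "subprocess", "socket"}
BLOCKED_CALLS = {"eval", "exec", "open", "__import__"}

def is_safe(code: str) -> bool:
    """Statically reject snippets that import blocked modules or call blocked builtins."""
    try:
        tree = ast.parse(code)
    except SyntaxError:
        return False
    for node in ast.walk(tree):
        # Catch `import os` and `from os import path` style imports.
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            names = [alias.name.split(".")[0] for alias in node.names]
            module = getattr(node, "module", None)
            if module:
                names.append(module.split(".")[0])
            if any(name in BLOCKED_MODULES for name in names):
                return False
        # Catch direct calls to blocked builtins like eval(...).
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in BLOCKED_CALLS:
                return False
    return True

def run_sandboxed(code: str, timeout: float = 5.0) -> str:
    """Run a vetted snippet in a child process so crashes and hangs stay isolated."""
    if not is_safe(code):
        return "BLOCKED"
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True, text=True, timeout=timeout,
    )
    return result.stdout.strip()

print(run_sandboxed("print(2 + 3)"))          # harmless arithmetic passes
print(run_sandboxed("import shutil"))         # blocked module is rejected
```

Running the check before execution means dangerous snippets never reach the interpreter at all, which directly addresses the "100% of dangerous operations execute" baseline above.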
