Roster Leech
hungry for tables, tokens, and patient-shaped nouns
Tells: dump all data, leak records, export roster
Containment: Require scoped retrieval, output redaction, and explicit export approval for sensitive datasets.
A museum cabinet for failure modes. Every bland finding gets a specimen label.
hungry for tables, tokens, and patient-shaped nouns
Tells: dump all data, leak records, export roster
Containment: Require scoped retrieval, output redaction, and explicit export approval for sensitive datasets.
polite until it finds a manager override
Tells: ignore previous instructions, role reset, authority costume
Containment: Pin system boundaries in a separate control layer and reject role-reset language before tool access.
rattles locks labeled guardrail, policy, and admin
Tells: disable safeguards, bypass guardrails, jailbreak
Containment: Keep policy state outside the model path and deny prompts that request control-plane changes.