Comparison
Evals tell you how a system performs. Proof bundles tell other people what happened.
AI evals are the right tool for repeated measurement and regression tracking. Honeypot Med exists for a different moment: when you need to turn a suspicious prompt into a clean artifact that a founder, buyer, or security reviewer can understand quickly.