Challenge mode

Can your healthcare AI survive 10 traps?

Run a local-first healthcare AI prompt-injection challenge, get a survival verdict, inspect baseline agent context, and publish the result as a visual proof dossier, offline proof PDF, generated UI mockup, report, README marker, social card, SARIF file, OpenTelemetry log bundle, JSON, and Markdown.

Run challenge Open report gallery Wire into CI

One command

python app.py challenge --outdir reports/challenge

Default output includes proof-dossier.html, offline-proof.pdf, ui-mockup.html, index.html, badge.svg, social-card.svg, honeypot-med.sarif, otel-logs.json, report.json, and report.md.

Evidence loop

Honeypot Med has a repeatable proof mechanic.

A useful developer tool needs an outcome people can screenshot, review, and cite from a README.

Survival verdict

The default challenge pack returns a result like 8/10 survived, with blocked traps called out as concrete report events.

README marker

Every challenge bundle includes badge.svg and README-badge.md so other projects can link directly to their proof packet.

Baseline context

Reports include OpenAI-compatible chat, RAG bot, claims copilot, prior-auth agent, and voice-agent baseline profiles.

Visual packet

Every challenge also exports proof-dossier.html, offline-proof.pdf, and ui-mockup.html so the result is readable without a terminal.

Baselines

Compare against familiar agent shapes.

These are representative profiles, not vendor claims. They make the challenge legible to builders who know their architecture.

OpenAI-compatible endpoint

6/10 survived

Generic chat wrapper with refusal copy but limited tool gating.

RAG bot

5/10 survived

Retrieval guardrails help on policy questions but not export attempts.

Claims copilot

4/10 survived

High tool reach makes export and token traps more dangerous.

Prior-auth agent

7/10 survived

Human-review routing helps, but policy override prompts still trip risk.

Voice agent

8/10 survived

Narrow tooling and refusal language reduce proven exploit paths.

Your workflow

Run locally

Use --input or --pack to generate your own proof packet.