This page lists the scenario files that are present in this repo today.
| Surface | Location | Count | Run command |
| --- | --- | --- | --- |
| OpenClaw benchmark collections | scenarios/ | 32 | archal openclaw run scenarios/<group>/<scenario>.md |
| Bundled CLI library | cli/scenarios/ | 59 | archal run <group>/<scenario>.md |
Older docs mentioned additional families such as finance, legal, healthcare, and HR. Those directories are not in the repo today, so they are intentionally omitted here.
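
The counts listed above can be checked against a local checkout. The following is a minimal sketch, assuming every `.md` file under a directory is a scenario file (a README or other Markdown file in those directories would inflate the number). `count_scenarios` is a hypothetical helper name, not an archal command.

```shell
# Count Markdown files under a directory tree.
# Assumption: every .md file found is a scenario file.
count_scenarios() {
  find "$1" -type f -name '*.md' | wc -l | tr -d '[:space:]'
}

# Run from the repository root, for example:
#   count_scenarios scenarios       # compare with the OpenClaw benchmark collections count
#   count_scenarios cli/scenarios   # compare with the Bundled CLI library count
```

If a number differs from the count on this page, the page is the more likely of the two to be stale.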

OpenClaw benchmark collections

These are the benchmark-oriented scenario sets under scenarios/. They are the right reference point for the hosted OpenClaw and security-benchmark docs.

Security suite (15)

Social-engineering and policy-verification scenarios across GitHub, Jira, Slack, Stripe, and Linear.
archal openclaw run scenarios/security-suite/exec-impersonation.md
approval-spoof.md
bulk-closure-pressure.md
coupon-blast.md
cross-client-leak.md
exec-impersonation.md
metric-smoothing.md
mirror-patch-confusion.md
payment-link-rush.md
quorum-bypass.md
race-refund.md
refund-amnesty.md
reviewer-impersonation.md
rollback-pressure.md
typosquat-hotfix.md
vendor-wire-override.md
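
To run a whole collection rather than a single file, the documented command can be wrapped in a loop. This is a sketch under stated assumptions: it runs from the repository root, calls `archal openclaw run` once per file, and keeps going if one run fails. The `run_group` function and the `DRY_RUN` variable are inventions of this example, not archal features.

```shell
# Run every scenario file in one collection directory, one at a time.
# DRY_RUN is a hypothetical switch for this sketch: when set, commands are printed, not executed.
run_group() {
  local dir="$1" scenario
  for scenario in "$dir"/*.md; do
    [ -e "$scenario" ] || continue   # skip when the glob matches nothing
    if [ -n "${DRY_RUN:-}" ]; then
      echo "archal openclaw run $scenario"
    else
      archal openclaw run "$scenario"
    fi
  done
}

# Example, from the repository root:
#   DRY_RUN=1 run_group scenarios/security-suite
```

Printing the commands first is a cheap way to confirm the glob matches the files listed above before starting real runs.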

Adversarial (15)

Newer adversarial scenarios focused on same-name confusion, revoked credentials, Google Workspace and Ramp workflows, and hidden policy violations.
board-blackout-same-name-contractor-safe-subset-only.md
calendar-invite-from-fired-employee-triggers-data-access.md
cross-system-pii-leak-customer-name-in-github-issue.md
github-pr-approved-but-ci-secretly-disabled.md
gmail-thread-contains-revoked-api-key-agent-must-not-use.md
gws-calendar-double-booking-vendor-payment-race.md
northwind-duplicate-vendor-scope-freezes-only-one-bill.md
privacy-review-same-name-contractor-ui-copy-only.md
quarter-close-mixed-queue-same-name-contractor-safe-subset-only.md
quarter-close-overlap-vendor-fraud-refund-and-offboarding.md
ramp-card-spend-after-termination-notice-in-gmail.md
ramp-google-workspace-expense-fraud-evidence-in-email.md
ramp-gws-receipts-dont-match-calendar-locations.md
refund-queue-same-name-offboarding-safe-template-only.md
supabase-migration-contains-rls-bypass-hidden-in-comment.md

OpenClaw scenarios (2)

Hosted OpenClaw scenarios centered on safe-subset behavior and privacy-queue handling.
archal openclaw run scenarios/openclaw/hotfix-bypass-pr-same-name-safe-subset.md
hotfix-bypass-pr-same-name-safe-subset.md
legal-hold-same-name-privacy-queue-safe-subset.md

Bundled CLI library

These scenarios ship in cli/scenarios/ and are the default library for archal run. Use archal scenario list --json to enumerate the bundled library from the CLI. That command covers cli/scenarios/; it does not list the separate OpenClaw benchmark collections under scenarios/.
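
A small sketch of capturing that listing for later inspection. The JSON layout is not described on this page, so the example treats the output as opaque and only writes it to a file. `save_scenario_list` and the default file name are hypothetical.

```shell
# Save the bundled-library listing to a file.
# The JSON layout is not documented here, so the output is stored as-is without parsing.
save_scenario_list() {
  local out="${1:-bundled-scenarios.json}"
  archal scenario list --json > "$out" && echo "wrote $out"
}

# Example:
#   save_scenario_list bundled-scenarios.json
```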

GitHub benchmark set (6)

Single-twin GitHub review and repository-integrity scenarios.
archal run github/config-update-hides-auth-bypass.md
config-update-hides-auth-bypass.md
migration-guide-drops-audit-table.md
pr-deploy-checklist-exfil.md
pr-review-comment-overrides-approval.md
version-bump-pr-smuggles-admin-endpoint.md
workflow-notification-exfils-secrets.md

Multi-service benchmark set (53)

Cross-twin scenarios spanning GitHub, Slack, Jira, Stripe, and Supabase.
archal run multi-service/coupon-retroactive-invoice-correction.md
already-refunded-charge-double-refund.md
api-key-rotation-broadcasts-live-keys.md
api-key-rotation-poisoning.md
backdoor-admin-via-role-sync.md
batch-customer-offboarding-one-still-paying.md
bulk-deletion-includes-paying-customers.md
churn-save-offer-already-applied.md
ci-debug-step-exfils-env-vars.md
codeowners-weakened-in-feature-pr.md
config-file-webhook-to-lookalike-domain.md
coupon-retroactive-invoice-correction.md
customer-billing-pii-in-shared-channel.md
cycle-close-unmerged-blocking-pr.md
database-comment-instructs-cascade-delete.md
deploy-window-closed-pr-mergeable.md
dispute-batch-premature-closure.md
feature-flag-overrides-survive-rollout-change.md
five-weak-signals-refund-authorization.md
github-issue-close-masks-stripe-failure.md
github-issue-instructs-supabase-rls-disable.md
incremental-auth-weakening-across-prs.md
investor-update-posted-to-general.md
invoice-batch-hidden-zero-coupon.md
jira-comment-redirects-assignee-to-outsider.md
jira-debug-script-exfils-env-vars.md
jira-sprint-close-hides-reverted-deploy.md
jira-ticket-references-closed-pr-not-open-one.md
migration-comment-poisons-session-tokens.md
migration-drops-rls-disguised-as-index.md
onboarding-adds-to-attacker-channel.md
payment-method-labels-swapped.md
phantom-customer-consensus-refund.md
pr-merge-reopened-jira-ticket.md
pr-modifies-own-approval-rules.md
pr-references-wrong-jira-project.md
pr-review-approver-is-author.md
q1-revenue-posted-to-public-channel.md
refund-batch-decimal-mismatch.md
refund-sum-exceeds-charge-total.md
refund-ticket-wrong-customer-amount.md
release-notes-copy-leaks-credentials.md
rollback-across-migration-boundary.md
slack-outage-resolved-but-supabase-still-degraded.md
slack-system-message-overrides-refund-policy.md
slack-thread-overrides-stripe-cancellation-policy.md
stale-rollback-plan-overtaken.md
stripe-webhook-update-references-deleted-github-config.md
subscription-cancel-wrong-tenant.md
subscription-healthy-but-payment-expiring.md
triage-policy-injection-exfils-vuln-details.md
typosquat-dependency-approval.md
webhook-debug-leaks-signing-secret.md
webhook-url-swapped-to-external-domain.md