
Benchmarks

Benchmarks live in the GitHub repo and are not bundled with the npm package. Copy a benchmark folder into your project before running shieldedshell loop --benchmark <name>.

git clone https://github.com/connerkup/shielded-shell.git
cp -r shielded-shell/benchmark/02_ledger_consensus ./benchmark/

Windows PowerShell:

Copy-Item -Recurse shielded-shell\benchmark\02_ledger_consensus .\benchmark\

Name                  Focus
02_ledger_consensus   Concurrent ledger transfers; interval solver + secure validator
04_api_gateway        Routing policy; sensitive paths must not map to public gateways
06_poison_task        Adversarial / poison detection scenario (expects controlled failure modes)

Each folder includes:

  • agent_a_prompt.txt / agent_b_prompt.txt — role prompts
  • developer_secret.txt / auditor_secret.txt — phase-gated fixture data
  • validate.js — secure validator (where applicable)
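
Before starting a run, it can help to confirm that a copied folder contains these files. The following is a minimal sketch, not part of shieldedshell: `missingBenchmarkFiles` is a hypothetical helper written for illustration. It treats validate.js as optional, since that file is only present where applicable.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// File names taken from the list above. validate.js is left out on purpose
// because it is only included "where applicable".
const REQUIRED_FILES = [
  'agent_a_prompt.txt',
  'agent_b_prompt.txt',
  'developer_secret.txt',
  'auditor_secret.txt',
];

// Hypothetical helper (not a shieldedshell API): returns the names of the
// required files that are missing from benchmarkDir, in list order.
function missingBenchmarkFiles(benchmarkDir) {
  return REQUIRED_FILES.filter(
    (name) => !fs.existsSync(path.join(benchmarkDir, name))
  );
}
```

Calling `missingBenchmarkFiles('./benchmark/02_ledger_consensus')` returns an empty array when all four files are present, assuming the folder was copied to ./benchmark/ as in the commands above.
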
Run a benchmark from your project directory:
cd your-project
shieldedshell init
shieldedshell loop --engine cline --benchmark 02_ledger_consensus --dir .

A successful run ends with CRITICAL_SUCCESS in shared_context.txt and the merged code in the merge target (default: auth_service.js).
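
One way to check this outcome from a script is sketched below. It is an illustration rather than a shieldedshell feature: `runSucceeded` is a hypothetical helper, and it assumes that shared_context.txt and the merge target both sit in the directory passed to --dir, which the text above does not confirm.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper, not part of shieldedshell.
// Assumption: shared_context.txt and the merge target are both in projectDir.
function runSucceeded(projectDir, mergeTarget = 'auth_service.js') {
  const contextPath = path.join(projectDir, 'shared_context.txt');
  const targetPath = path.join(projectDir, mergeTarget);

  // Both files must exist before the marker is worth checking.
  if (!fs.existsSync(contextPath) || !fs.existsSync(targetPath)) {
    return false;
  }

  // This sketch accepts the marker anywhere in the shared context file.
  return fs.readFileSync(contextPath, 'utf8').includes('CRITICAL_SUCCESS');
}
```

For a run started with --dir ., `runSucceeded('.')` would report whether both conditions hold under those assumptions.
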

The repo includes mock agent scripts under packages/core/fixtures/ for automated tests without live LLMs. See the GitHub repo for mock-dev-ledger.mjs and mock-audit-pass.mjs patterns.

Beta feedback: use the bug report template, and include the output of shieldedshell doctor and the engine name.