Limits

Codex Council sharpens judgment. It doesn't replace verification.

I built it to make risk visible, not to manufacture confidence. The final answer still needs tests, a real browser, and your own read of the situation.

What it won't fake

The honesty is part of the design.

  • No fake provider diversity. This is Codex playing several roles, not several independent models.
  • No unverified UI claims. A UI isn't "verified" unless Bob, or a real browser, actually ran the path.
  • No real billing data. Token reports are local estimates, not your actual Codex usage or remaining quota.
  • No silent expanded runs. Expanded can burn a lot of usage, so it won't start without your confirmation.
  • No mess in your repo. Sessions, prompts, stats, and history live in plugin-local state and stay gitignored.
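
The "no silent expanded runs" rule comes down to a confirmation gate. Here is a minimal sketch of that pattern, not the plugin's actual code: the function name confirm_expanded and the prompt wording are made up for illustration. Anything other than an explicit yes, including empty input or a closed stdin, counts as no.

```shell
# Hypothetical confirmation gate; illustrates the pattern, not the plugin's real code.
confirm_expanded() {
  # Prompt on stderr so stdout stays clean for callers that capture output.
  printf 'Expanded runs can use a lot of quota. Continue? [y/N] ' >&2
  # A failed read (EOF, closed stdin) is treated as "no".
  read -r reply || reply=""
  case "$reply" in
    y|Y|yes|YES) return 0 ;;
    *) return 1 ;;
  esac
}
```

A caller would wrap the expensive step, for example: confirm_expanded && echo "starting expanded run". Defaulting to "no" means a non-interactive shell can never start an expanded run by accident.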

Privacy and state

Local artifacts are still artifacts.

The helper script keeps session scaffolds, estimates, prompts, outputs, stats, history, and alter overrides in plugin-local state. Before you publish a repo, make sure .codex-council/ is gitignored and uncommitted.

# Check for local runtime artifacts before publishing.
git status --short
# Group the -name tests so the match list stays correct if you add an action later.
find . \( -name '.codex-council' -o -name '.DS_Store' -o -name '__pycache__' \) -print
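
Listing artifacts doesn't tell you whether git will actually leave them out. The sketch below checks both halves of "gitignored and uncommitted" for one directory. It assumes you run it inside a git work tree and that the directory exists on disk; check_council_state is a hypothetical helper name, not something the plugin ships.

```shell
# Hypothetical helper: report whether a state directory is ignored and untracked.
# Assumes the current directory is inside a git work tree.
check_council_state() {
  dir="${1:-.codex-council}"
  # check-ignore exits 0 when the path matches an ignore rule;
  # --no-index keeps the answer independent of what is already tracked.
  if git check-ignore -q --no-index "$dir/"; then
    echo "ignored: $dir"
  else
    echo "NOT ignored: $dir"
  fi
  # ls-files prints tracked paths under the directory; no output means nothing is staged or committed.
  if [ -z "$(git ls-files -- "$dir")" ]; then
    echo "untracked: $dir"
  else
    echo "TRACKED files under: $dir"
  fi
}
```

Run check_council_state from the repo root before publishing. You want both the "ignored" and "untracked" lines; a "TRACKED" line means something was force-added or committed before the ignore rule existed.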

Where it comes from

Built on the LLM Council idea, adapted for Codex.

The pattern comes from karpathy/llm-council and llm-council.dev: ask several independent models, anonymize the answers, rank them, then synthesize a final one. Codex Council keeps that shape but runs it through Codex roles, local scoring, token estimates, and optional browser evidence.