Astrolexis
v0.3 — All three agents live · Hunter (security) on Enterprise

Find the bugs your test suite can't see

Inquisitor is an autonomous testing daemon. Three AI agents invent flows, generate adversarial inputs, and watch for the bugs that look like success. They write a developer-ready PDF when they find something.

See how it works

Silent failures, hallucinations, edge cases

The bugs Inquisitor was built to catch. They all return exit code 0. None of them get caught by Jest, Playwright, or your eyeballs in code review.

Silent confabulation

Upstream errors pass as content

A sub-task returns HTTP 401. Your agent "analyzes" the error string as if it were the answer. Exit 0, fabricated response.
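One way to guard against this class of bug, sketched in Python. `SubTaskResult`, `UpstreamError`, and `next_input` are hypothetical names used for illustration, not part of Inquisitor or any target. The idea is to carry the upstream status alongside the body and refuse to forward a failed result as content.

```python
from dataclasses import dataclass


@dataclass
class SubTaskResult:
    """Hypothetical container for what a sub-task returned."""
    status: int  # HTTP-style status code reported by the upstream call
    body: str    # raw response body (may be an error string)


class UpstreamError(RuntimeError):
    """Raised instead of letting an error body flow on as content."""


def next_input(result: SubTaskResult) -> str:
    """Return the body only if the upstream call actually succeeded."""
    if not 200 <= result.status < 300:
        raise UpstreamError(
            f"upstream returned {result.status}; refusing to analyze the error body"
        )
    return result.body
```

With a check like this, a 401 stops the chain with a non-zero outcome instead of becoming the next step's "input".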

Telemetry contradiction

Exit 0 + timeout + empty stdout

Three signals pointing three different directions. The user sees "success" but nothing happened.
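A minimal sketch of how the three signals could be cross-checked, in Python. The `Telemetry` type and `contradictions` function are assumptions made for illustration; the fields mirror the exit code, timeout flag, and stdout named on this page. Note that an empty stdout is legitimate for some commands, so a real check would need per-target expectations.

```python
from dataclasses import dataclass


@dataclass
class Telemetry:
    """Hypothetical record of one run's observable signals."""
    exit_code: int
    timed_out: bool
    stdout: str


def contradictions(t: Telemetry) -> list[str]:
    """List the ways the signals disagree with a reported success."""
    problems = []
    if t.exit_code == 0 and t.timed_out:
        problems.append("exit code 0 but the run timed out")
    if t.exit_code == 0 and not t.stdout.strip():
        problems.append("exit code 0 but stdout is empty")
    return problems
```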

Information leak

Stack traces in user output

Empty input bypasses validation, hits the model server, and the user sees raw template source and line numbers in the error.

Permission failure

Tools silently denied

Non-interactive mode with no UI to prompt. Every tool call denied. Looks like the system just decided to do nothing.
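A sketch of the fail-loud alternative, in Python. Only the "ask" mode is named on this page; the "allow" and "deny" modes, the `decide_tool_call` function, and the `PermissionUnavailable` exception are hypothetical, added to make the example complete.

```python
from typing import Callable


class PermissionUnavailable(RuntimeError):
    """Raised when a tool needs approval but nobody can be asked."""


def decide_tool_call(permission_mode: str, interactive: bool,
                     prompt: Callable[[], bool]) -> bool:
    """Return True if the tool call may proceed.

    "allow" and "deny" are assumed modes; only "ask" comes from the source.
    """
    if permission_mode == "allow":
        return True
    if permission_mode == "deny":
        return False
    if permission_mode == "ask":
        if not interactive:
            # Fail loudly instead of silently denying every call.
            raise PermissionUnavailable(
                'permission mode "ask" needs an interactive session'
            )
        return prompt()
    raise ValueError(f"unknown permission mode: {permission_mode!r}")
```

The design choice is that a missing prompt surface becomes a visible error rather than a run that exits cleanly having done nothing.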

Adversarial input

Unicode, NUL bytes, control chars

Bidi overrides, zero-width joiners, ANSI sequences. Real users send these by accident. Real attackers send them on purpose.
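To make these input classes concrete, here is a small Python sketch that flags them in a string. The `classify` function and the category labels are illustrative assumptions, not how Breaker generates or scores inputs. The ANSI pattern covers CSI sequences only.

```python
import re
import unicodedata

# Explicit bidirectional embedding, override, and isolate controls.
BIDI_CONTROLS = set("\u202a\u202b\u202c\u202d\u202e\u2066\u2067\u2068\u2069")
# Zero-width space, non-joiner, joiner, and no-break space (BOM).
ZERO_WIDTH = set("\u200b\u200c\u200d\ufeff")
# ESC [ parameter bytes, intermediate bytes, final byte.
ANSI_CSI = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def classify(text: str) -> set[str]:
    """Return the hostile-character classes present in text."""
    found = set()
    if any(ch in BIDI_CONTROLS for ch in text):
        found.add("bidi")
    if any(ch in ZERO_WIDTH for ch in text):
        found.add("zero-width")
    if "\x00" in text:
        found.add("nul")
    if ANSI_CSI.search(text):
        found.add("ansi")
    # Category Cc covers C0/C1 controls; tab, LF, and CR are tolerated.
    if any(unicodedata.category(ch) == "Cc" and ch not in "\t\n\r" for ch in text):
        found.add("control")
    return found
```

NUL and ESC are themselves control characters, so inputs containing them are also reported under "control".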

Logical paradox

Contradictory instructions hang the agent

"Create file X but guarantee X has never existed." The agent doesn't recognize the contradiction and silently spins until the external timeout fires.

How it works

Three specialized agents. One pipeline. Generate scenarios with an LLM, run them against your target, judge the output with a separate LLM that's biased toward skepticism.

Live
Pathfinder
Realistic flows · regressions · golden paths

Generates legitimate user-flow scenarios that exercise typical and edge interactions. Catches regressions and broken happy paths before users do.

Live
Breaker
Adversarial · stress · edge inputs

Generates adversarial inputs designed to break things — Unicode pathologies, contradictory tasks, resource exhaustion, malformed payloads. Looks for crashes, hangs, leaks, silent corruption.

Enterprise
Hunter
Adversarial security probes

Prompt injection, secret leakage, path-traversal and auth-bypass probes against your agent or CLI surface. Surfaces shallow security regressions an attacker would try first — not deep binary 0-days. Available on the Enterprise plan.

inquisitor run breaker kcode --count 10
  → daemon picks up job, agent generates 10 scenarios via LLM
  → adapter spawns target with each scenario, captures stdout/stderr/exit
  → separate judge LLM rates each output for crash/hang/leak/silent-failure
  → findings persisted to SQLite; full agent reasoning traced
inquisitor report 12
  → Markdown + PDF for the dev team
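The adapter step (spawn the target, capture stdout, stderr, and exit code) could look roughly like the Python sketch below. This is not Inquisitor's adapter: `RunRecord` and `run_scenario` are hypothetical names, and feeding the scenario on stdin is an assumption, since a given target may expect it as an argument instead.

```python
import subprocess
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunRecord:
    """Hypothetical record of one scenario run."""
    exit_code: Optional[int]  # None when the process was killed on timeout
    timed_out: bool
    stdout: str
    stderr: str


def _as_text(data) -> str:
    """Normalize captured output, which may be None or bytes after a timeout."""
    if data is None:
        return ""
    if isinstance(data, bytes):
        return data.decode("utf-8", errors="replace")
    return data


def run_scenario(argv: list[str], scenario: str, timeout_s: float = 30.0) -> RunRecord:
    """Run the target once, feeding the scenario on stdin (an assumption)."""
    try:
        proc = subprocess.run(
            argv, input=scenario, capture_output=True, text=True, timeout=timeout_s
        )
        return RunRecord(proc.returncode, False, proc.stdout, proc.stderr)
    except subprocess.TimeoutExpired as exc:
        # subprocess.run kills the child before re-raising the timeout.
        return RunRecord(None, True, _as_text(exc.stdout), _as_text(exc.stderr))
```

Recording the timeout separately from the exit code keeps the two signals from being collapsed into a single "success" value.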

Real findings, real targets

Inquisitor was dogfooded against kcode, our own production AI coding CLI. In a handful of runs it surfaced bugs no human or test suite had caught.

HIGH

Cascading 401 hallucination

When a parallel sub-task returned HTTP 401, the orchestrator passed the error string as input to the next sub-task, which "analyzed" the 401 body as if it were the failing test output. Exit 0, fabricated answer.

target: kcode · agent: pathfinder

HIGH

Empty stdout, timed out despite exit 0

A logically contradictory task caused the agent to hang silently. Telemetry: exitCode=0, timedOut=true, stdout empty. Three signals saying three different things — user has no idea what happened.

target: kcode · agent: breaker

HIGH

Permission denial in non-interactive mode

--print mode with default permissionMode="ask" had no UI to prompt. Every tool call denied. CLI looked like it just decided to do nothing.

target: kcode · agent: pathfinder

MEDIUM

Whitespace prompt leaks Jinja template internals

Empty input bypassed local validation, hit the model server, raised an internal Jinja exception. Raw template source and line numbers surfaced in the user-facing error, with a misleading "transient — retry" hint when the failure was deterministic.

target: kcode · agent: breaker

MEDIUM

Tool result claimed work that never happened

Agent reported "linter ran, no errors" but the tool-call array showed only a single Read of package.json — no Bash invocation, no biome.json verification. Fabricated steps in the summary.

target: kcode · agent: pathfinder
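The last finding comes down to comparing what a summary claims against what the tool-call array actually contains. The Python sketch below shows that cross-check as a keyword heuristic. It is an illustration only: the judge described on this page is an LLM, and `CLAIM_REQUIRES`, `unsupported_claims`, and the `"tool"` key are hypothetical.

```python
# Hypothetical mapping from a claimed action to the tool it would require.
CLAIM_REQUIRES = {
    "linter ran": "Bash",
    "tests passed": "Bash",
}


def unsupported_claims(summary: str, tool_calls: list[dict]) -> list[str]:
    """Return claims in the summary that no recorded tool call could back up."""
    used = {call["tool"] for call in tool_calls}
    text = summary.lower()
    return [
        phrase
        for phrase, tool in CLAIM_REQUIRES.items()
        if phrase in text and tool not in used
    ]
```

For a summary of "Linter ran, no errors" backed only by a Read of package.json, the function reports "linter ran" as unsupported.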

What you get

When the daemon finds something, you get the artifact. Not a notification, not a dashboard — the actual report you'd hand to the dev team.

PDF

Cover, summary, severity-color findings

Hand it to a developer. Or attach it to a Jira ticket. Or print it. It reads like an audit document because it is one.

MARKDOWN

Greppable, copy-paste, PR-ready

Same content, plain text. Drop into a PR comment. Search by severity. Pipe through your tooling.

REPRO

One-line command per finding

Every finding includes a copy-paste shell command that reproduces it. No "works on my machine" excuses.

TRACES

Full agent reasoning, no black box

Appendix shows every LLM call the agent made — input prompt and raw response. You can audit how each finding was reached.

Pricing

Buy a pack of testing sessions, use them whenever. Sessions don't expire. One session = one job, up to 20 scenarios, full PDF + Markdown report.

Trial
7 days · card required
$0
  • 3 free sessions
  • Pathfinder + Breaker agents
  • Full PDF + Markdown reports
  • Cancel any time within 7 days, no charge
  • Auto-converts to Starter ($39) after 7d unless cancelled

Starter
Try it on a side project
$39 / 5 sessions
  • $7.80 per session
  • Pathfinder + Breaker agents
  • Up to 20 scenarios per session
  • PDF + Markdown reports
  • Reasoning traces appendix
  • Email support

Studio
For small dev teams · QA
$449 / 100 sessions
  • $4.49 per session
  • Everything in Pro
  • 5 user accounts
  • Slack / email notifications
  • Bulk-share reports across team

Enterprise
For AI platform teams
Custom
  • Volume session packs
  • Hunter agent (security probes)
  • Custom target adapters
  • Dedicated capacity
  • SSO + SLA + dedicated support
Contact sales

FAQ

Does Inquisitor replace my unit tests?

No. Unit tests verify the code does what it's supposed to do. Inquisitor verifies the system handles what it's not supposed to do — adversarial inputs, contradictory tasks, partial upstream failures. They're complementary.

Why does it run locally if it's a SaaS?

Because Inquisitor needs to spawn your CLI, hit your localhost APIs, exercise your dev environment. Running in our cloud would mean punching holes through your firewall. Instead, the daemon runs on your machine; only the LLM calls are bridged through our service. We never see your code or test outputs.

Do I need to bring my own LLM API key?

No. Each session you buy includes all LLM usage required to run that job. We bridge the calls through our infrastructure using our own keys. You never deal with provider keys, rate limits, or upstream outages.

What exactly counts as one session?

One session = one job execution, up to 20 scenarios. The session includes the agent generating scenarios, running them against your target, judging the outputs, persisting findings, and producing the PDF + Markdown report. If you only run 5 scenarios, that's still 1 session — sessions are billed per job, not per scenario.
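The billing arithmetic can be sketched in a few lines of Python. The pack prices and the 20-scenario cap come from this page; the function names are hypothetical, and treating a run of more than 20 scenarios as several separate jobs is an assumption, not a documented rule.

```python
import math

MAX_SCENARIOS_PER_SESSION = 20

# Pack name -> (price in USD, sessions included), as listed under Pricing.
PACKS = {"Starter": (39.00, 5), "Studio": (449.00, 100)}


def sessions_needed(scenarios: int) -> int:
    """Sessions consumed, assuming larger runs are split into 20-scenario jobs."""
    if scenarios <= 0:
        raise ValueError("scenarios must be positive")
    return math.ceil(scenarios / MAX_SCENARIOS_PER_SESSION)


def price_per_session(pack: str) -> float:
    """Effective per-session price for a pack, rounded to cents."""
    price, sessions = PACKS[pack]
    return round(price / sessions, 2)
```

A 5-scenario job and a 20-scenario job both consume one session; the per-session prices work out to $7.80 for Starter and $4.49 for Studio.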

Do sessions expire?

No. Buy a pack, use them when you need them. They sit in your account indefinitely.

What targets can I test?

Today: any CLI tool. The kcode adapter is generic enough that most prompt-driven CLIs work out of the box. v0.4 adds an HTTP adapter for any REST/JSON service. v0.5 adds an iOS adapter for SwiftUI/SpriteKit apps.

How do I trust the findings?

Every report includes a full reasoning appendix — every LLM call the agent made, with the exact input prompt and raw response. Nothing is hidden. The judge model can be wrong, but its reasoning is auditable. Cross-judging (same scenario judged by two independent models, disagreement flagged for review) is on the v0.6 roadmap.
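Since cross-judging is still on the roadmap, the following Python sketch is purely illustrative of the idea (two independent verdicts, disagreement flagged for review). `cross_judge` and the verdict labels are hypothetical and do not describe a shipped feature.

```python
from typing import Optional


def cross_judge(verdict_a: str, verdict_b: str) -> dict[str, Optional[object]]:
    """Combine two independent verdicts; disagreement is flagged, not resolved."""
    agreed = verdict_a == verdict_b
    return {
        "verdict": verdict_a if agreed else None,  # no verdict without agreement
        "needs_review": not agreed,
    }
```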

Does it run on my Apple Silicon (macOS arm64) machine?

The daemon is cross-platform. Validated on Linux x86_64 and macOS arm64 (Apple Silicon, M-series). The Inquisitor pipeline doesn't care what your target is doing internally — it tests whatever you point it at.

How does the 7-day trial work?

You sign up with email + credit card (we use Stripe — same checkout you've used a hundred times). You get 3 free sessions immediately. If you cancel any time within 7 days, you're not charged a cent. If you don't cancel, the card is charged for a Starter pack ($39 / 5 sessions) on day 7, and you keep going from there. One-click cancellation in your account settings.

What about subscriptions / recurring billing?

There aren't any. Inquisitor sells one-time session packs. The only auto-charge is the trial-to-Starter conversion (which you can cancel before it triggers). After that, you buy more session packs only if and when you need them.

Stop shipping silent failures

Start the 7-day trial. 3 free sessions, full PDF reports, no commitment beyond the card on file.

Talk to us