
Judge Configuration

Configure the LLM judge — models, timeouts, fallback chain, rate limiting

The judge is the LLM that evaluates your assertions. Configure it globally via playwright.config.ts or per-test with test.use().

Fallback chain

OPENAI_API_KEY set?
  Yes → GPT-5.4-mini (primary)
         Success → return score
         Fail/timeout → try fallback
  No  → skip to fallback

ANTHROPIC_API_KEY set? (@anthropic-ai/sdk installed?)
  Yes → Claude Haiku (fallback)
         Success → return score
         Fail/timeout → inconclusive
  No  → inconclusive

Provider outages or timeouts result in inconclusive (score: null) — the test passes. Your CI is never blocked by a judge API outage.

Rate limit errors (429) from the primary model get 3 retries with exponential backoff before falling through to the fallback.
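The chain above can be sketched in plain TypeScript. The names here (withRetries, judge) and the error shape ({ status: 429 }) are illustrative, not the library's actual internals:

```typescript
// Sketch of the judge call flow: primary with 429 retries, then fallback,
// then inconclusive. Names and error shapes are assumptions for illustration.

type JudgeResult = { score: number | null };

async function withRetries(
  fn: () => Promise<number>,
  maxRetries = 3,
  baseDelayMs = 1000,
): Promise<number> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err: any) {
      // Only rate-limit errors (429) are retried, with exponential backoff.
      if (err?.status !== 429 || attempt >= maxRetries) throw err;
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** attempt));
    }
  }
}

async function judge(
  callPrimary: (() => Promise<number>) | null, // null = no OPENAI_API_KEY
  callFallback: (() => Promise<number>) | null, // null = no ANTHROPIC_API_KEY
): Promise<JudgeResult> {
  if (callPrimary) {
    try {
      return { score: await withRetries(callPrimary) };
    } catch {
      // Primary failed or timed out: fall through to the fallback model.
    }
  }
  if (callFallback) {
    try {
      return { score: await callFallback() };
    } catch {
      // Fallback also failed: fall through to inconclusive.
    }
  }
  // No provider succeeded: inconclusive (score: null), and the test passes.
  return { score: null };
}
```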

JudgeConfig fields

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| primaryModel | string | 'gpt-5.4-mini' | Primary judge model |
| fallbackModel | string | 'claude-3-5-haiku-20241022' | Fallback judge model |
| timeout | number | 10000 | Timeout in ms. Exceeded = inconclusive, not fail |
| openaiApiKey | string | process.env.OPENAI_API_KEY | OpenAI API key |
| anthropicApiKey | string | process.env.ANTHROPIC_API_KEY | Anthropic API key |
| maxInputChars | number | 500000 | Max combined input character length |
| inputHandling | 'reject' \| 'truncate' | 'reject' | How to handle oversized inputs |
| pricing | Record<string, {...}> | Built-in table | Custom per-token pricing (USD) |
| rateLimit | { requestsPerMinute, burstCapacity } | Disabled | Per-worker rate limiting |
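For orientation, a judgeConfig that sets these fields explicitly might look like the sketch below. The values are illustrative, every field is optional, and pricing is omitted since its exact shape is not spelled out here:

```typescript
// Illustrative only — each field falls back to the defaults in the table above.
const judgeConfig = {
  primaryModel: "gpt-5.4-mini",
  fallbackModel: "claude-3-5-haiku-20241022",
  timeout: 10000, // ms; exceeding it yields inconclusive, not a failed test
  openaiApiKey: process.env.OPENAI_API_KEY,
  anthropicApiKey: process.env.ANTHROPIC_API_KEY,
  maxInputChars: 500000,
  inputHandling: "reject" as const, // or "truncate" to clip oversized inputs
  rateLimit: { requestsPerMinute: 30, burstCapacity: 5 }, // disabled unless set
};
```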

Global configuration

Add judgeConfig to the use block in your playwright.config.ts:

playwright.config.ts
import { defineConfig } from "@playwright/test";

export default defineConfig({
  use: {
    judgeConfig: {
      primaryModel: "gpt-5.4-mini",
      timeout: 5000,
    },
  },
  reporter: [
    ["list"],
    [
      "@llmassert/playwright/reporter",
      {
        projectSlug: "my-project",
        apiKey: process.env.LLMASSERT_API_KEY,
      },
    ],
  ],
});

Per-test override

Override judge settings for a specific describe block with test.use():

import { test, expect } from "@llmassert/playwright";

test.describe("high-stakes evaluations", () => {
  test.use({ judgeConfig: { timeout: 15000 } });

  test("critical response is grounded", async () => {
    const response = "We process refunds within 5 business days.";
    const context = "Refunds are processed in 3-5 business days.";

    await expect(response).toBeGroundedIn(context, { threshold: 0.95 });
  });
});

Rate limiting

For CI environments with parallel Playwright workers, configure rate limiting to avoid hitting provider limits:

judgeConfig: {
  rateLimit: {
    requestsPerMinute: 30,  // per worker
    burstCapacity: 5,
  },
}

Rate limiting is per-worker. With 4 parallel workers at 30 RPM each, total throughput is 120 RPM. Set requestsPerMinute to Math.floor(your_provider_limit / worker_count).
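That arithmetic in one line, with hypothetical numbers (a 500 RPM provider limit split across 8 workers):

```typescript
// Hypothetical: divide a 500 RPM provider limit across 8 parallel CI workers.
const providerLimitRpm = 500;
const workerCount = 8; // matches Playwright's `workers` setting
const requestsPerMinute = Math.floor(providerLimitRpm / workerCount);
console.log(requestsPerMinute); // 62 → 8 workers × 62 RPM = 496 RPM, under the limit
```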
