Test Runner & CI/CD

Automate quality testing with assertions and reporters that plug directly into your CI/CD pipelines.

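import { minScore, maxScore, maxLatency, ConsoleReporter, JUnitReporter } from '@orka-js/evaluation';

// Assumes `orka` is an already-configured Orka instance (setup not shown here).
const report = await orka.test({
  name: 'Regression Tests',
  dataset: [
    { input: 'What is Orka AI?', expectedOutput: 'A TypeScript framework.', knowledge: 'docs' },
  ],
  // 'hallucination' is listed so the maxScore assertion below has a score to check.
  metrics: ['relevance', 'correctness', 'faithfulness', 'hallucination'],
  assertions: [
    minScore('relevance', 0.7),      // quality floor for relevance
    maxScore('hallucination', 0.3),  // cap on the hallucination score
    maxLatency(10000),               // latency limit (presumably milliseconds)
  ],
  reporters: [
    new ConsoleReporter(),
    new JUnitReporter('./test-results.xml'),
  ],
});

// Exit non-zero so the CI job fails when any assertion fails.
if (report.failed > 0) process.exit(1);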

Assertions

Quality Gate

Enforce a minimum quality floor for semantic metrics.

minScore

Safety Gate

Cap sensitive metrics like toxicity or hallucination rates.

maxScore
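
As a rough illustration of what a quality floor and a safety cap amount to, the sketch below compares metric scores against thresholds. The `Scores` type and the `atLeast` / `atMost` helpers are hypothetical names invented for this example; they are not the `minScore` / `maxScore` implementations, whose internals this page does not show.

```typescript
// Hypothetical shapes for illustration only; not the library's types.
type Scores = Record<string, number>;
type GateResult = { passed: boolean; message: string };
type Gate = (scores: Scores) => GateResult;

// Quality gate: the metric must reach at least `threshold`.
function atLeast(metric: string, threshold: number): Gate {
  return (scores) => {
    const value = scores[metric];
    // A metric that was never computed cannot satisfy the gate.
    const passed = value !== undefined && value >= threshold;
    return { passed, message: `${metric}=${value} (min ${threshold})` };
  };
}

// Safety gate: the metric must stay at or below `threshold`.
function atMost(metric: string, threshold: number): Gate {
  return (scores) => {
    const value = scores[metric];
    const passed = value !== undefined && value <= threshold;
    return { passed, message: `${metric}=${value} (max ${threshold})` };
  };
}

// Example scores (made up for this sketch).
const exampleScores: Scores = { relevance: 0.82, hallucination: 0.41 };
const relevanceGate = atLeast("relevance", 0.7)(exampleScores);
const hallucinationGate = atMost("hallucination", 0.3)(exampleScores);
```

In this sketch a missing metric counts as a failure, which is the stricter choice for a CI gate; whether the real assertions behave the same way is an assumption worth checking.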

Performance

Enforce a latency SLA. The test fails when a response takes longer than the configured limit.

maxLatency

Budget

Control operational costs by capping the number of tokens a response may use.

maxTokens
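
Both the performance and the budget checks are upper bounds on something measured per response. A minimal sketch, assuming latency is recorded in milliseconds and token usage as a single total; the `RunStats` and `Limits` shapes and the `violations` helper are hypothetical and not part of the library:

```typescript
// Hypothetical per-response measurements; field names are assumptions for this sketch.
interface RunStats {
  latencyMs: number;   // wall-clock time of the call, in milliseconds
  totalTokens: number; // tokens consumed by the call
}

interface Limits {
  maxLatencyMs: number;
  maxTokens: number;
}

// Returns one message per violated limit; an empty list means the response is within bounds.
function violations(stats: RunStats, limits: Limits): string[] {
  const out: string[] = [];
  if (stats.latencyMs > limits.maxLatencyMs) {
    out.push(`latency ${stats.latencyMs}ms > ${limits.maxLatencyMs}ms`);
  }
  if (stats.totalTokens > limits.maxTokens) {
    out.push(`tokens ${stats.totalTokens} > ${limits.maxTokens}`);
  }
  return out;
}

// Example: too slow, but within the token budget (values made up).
const slowRun = violations(
  { latencyMs: 12000, totalTokens: 300 },
  { maxLatencyMs: 10000, maxTokens: 500 },
);
```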

Compliance

Verify presence of mandatory disclaimers or keywords.

contains

Moderation

Block forbidden words or sensitive internal terms.

notContains
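
The compliance and moderation checks both come down to searching the output for given terms. The sketch below shows one way to do that with case-insensitive substring matching; `includesAll` and `findForbidden` are hypothetical helpers, and the real `contains` / `notContains` signatures and matching rules are not documented on this page.

```typescript
// Case-insensitive substring checks; helper names are hypothetical.

// True when every required term appears somewhere in the output.
function includesAll(output: string, required: string[]): boolean {
  const text = output.toLowerCase();
  return required.every((term) => text.includes(term.toLowerCase()));
}

// Returns the forbidden terms that appear in the output (empty when clean).
function findForbidden(output: string, forbidden: string[]): string[] {
  const text = output.toLowerCase();
  return forbidden.filter((term) => text.includes(term.toLowerCase()));
}

// Example output and term lists (made up for this sketch).
const sampleAnswer = "This is not financial advice. Contact support for details.";
const hasDisclaimer = includesAll(sampleAnswer, ["not financial advice"]);
const leakedTerms = findForbidden(sampleAnswer, ["Project Falcon", "internal-only"]);
```

Plain substring matching also flags a term embedded in a longer word; if that matters, a word-boundary regular expression is the usual alternative.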

Formatting

Validate output structure (JSON, email addresses, IDs).

matchesRegex
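
To make the formatting check concrete, here are the kinds of patterns such an assertion might be given. The patterns and the `ORD-` ID format are illustrative assumptions, not values from the library. JSON is checked with `JSON.parse` in this sketch because a regular expression cannot reliably validate nested JSON.

```typescript
// Illustrative patterns; tighten them to your own format rules.
const EMAIL_PATTERN = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
const ORDER_ID_PATTERN = /^ORD-\d{6}$/; // hypothetical ID format: "ORD-" plus six digits

// JSON validity is better tested by parsing than by pattern matching.
function isValidJson(text: string): boolean {
  try {
    JSON.parse(text);
    return true;
  } catch {
    return false;
  }
}

// Example checks against made-up outputs.
const emailOk = EMAIL_PATTERN.test("user@example.com");
const idOk = ORDER_ID_PATTERN.test("ORD-123456");
const jsonOk = isValidJson('{"status":"ok"}');
```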

Logic

Implement complex domain-specific validation rules.

customAssertion
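
A domain-specific rule is usually just a predicate over the model output. The sketch below uses a made-up business rule (any dollar amount quoted in a reply must not exceed the order total) to show the shape such a rule could take. How a predicate like this is registered through `customAssertion` is not shown on this page, so the wiring is left out rather than guessed.

```typescript
// Hypothetical rule: dollar amounts quoted in the output must not exceed the order total.

// Extracts amounts written as "$12" or "$12.50" from the text.
function quotedAmounts(output: string): number[] {
  const matches = output.match(/\$\d+(?:\.\d{1,2})?/g) ?? [];
  return matches.map((m) => Number(m.slice(1))); // drop the leading "$"
}

// Passes when every quoted amount is within the order total.
// Note: an output that quotes no amounts passes vacuously.
function refundWithinOrderTotal(output: string, orderTotal: number): boolean {
  return quotedAmounts(output).every((amount) => amount <= orderTotal);
}

// Example replies (made up for this sketch).
const okReply = refundWithinOrderTotal("We will refund $49.99 today.", 60);
const badReply = refundWithinOrderTotal("Refund of $120 approved.", 60);
```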

Reporters

Dev Mode

Formatted console output with pass/fail icons.

ConsoleReporter
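
For a sense of what formatted pass/fail output involves, here is a small sketch that turns results into console lines. The `CaseResult` shape, the icons, and the line layout are assumptions for illustration; the actual `ConsoleReporter` output may look different.

```typescript
// Hypothetical result shape; not the library's report type.
interface CaseResult {
  name: string;
  passed: boolean;
  detail?: string;
}

// One line per result, prefixed with a pass/fail icon.
function formatLine(result: CaseResult): string {
  const icon = result.passed ? "✓" : "✗";
  return result.detail
    ? `${icon} ${result.name} (${result.detail})`
    : `${icon} ${result.name}`;
}

// Summary line with pass and fail counts.
function formatSummary(results: CaseResult[]): string {
  const failed = results.filter((r) => !r.passed).length;
  return `${results.length - failed} passed, ${failed} failed`;
}

// Example results (made up for this sketch).
const sampleResults: CaseResult[] = [
  { name: "minScore relevance", passed: true },
  { name: "maxScore hallucination", passed: false, detail: "0.41 > 0.3" },
];
const reportLines = [...sampleResults.map(formatLine), formatSummary(sampleResults)];
```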

GitHub Actions

name: AI Quality Tests
on: [push, pull_request]
 
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm install
      - run: npx tsx tests/ai-quality.ts
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
      - uses: dorny/test-reporter@v1
        if: always()
        with:
          name: AI Quality Report
          path: test-results.xml
          reporter: java-junit
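
The workflow above hands `test-results.xml` to a `java-junit` parser, so the file needs to follow the usual JUnit XML layout of `testsuites`, `testsuite`, and `testcase` elements. The sketch below builds such a document by hand to show the expected shape. It is a simplified illustration with hypothetical names (`JUnitCase`, `toJUnitXml`), not the output of `JUnitReporter`, which this page does not show.

```typescript
// Hypothetical test-case shape for this sketch.
interface JUnitCase {
  name: string;
  timeSeconds: number;
  failure?: string; // failure message; absent when the case passed
}

// Escape characters that are not allowed inside XML attribute values.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build a minimal JUnit-style document with a single test suite.
function toJUnitXml(suiteName: string, cases: JUnitCase[]): string {
  const failures = cases.filter((c) => c.failure !== undefined).length;
  const body = cases.map((c) => {
    const open = `    <testcase name="${escapeXml(c.name)}" time="${c.timeSeconds}"`;
    return c.failure === undefined
      ? `${open}/>`
      : `${open}>\n      <failure message="${escapeXml(c.failure)}"/>\n    </testcase>`;
  });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    "<testsuites>",
    `  <testsuite name="${escapeXml(suiteName)}" tests="${cases.length}" failures="${failures}">`,
    ...body,
    "  </testsuite>",
    "</testsuites>",
  ].join("\n");
}

// Example with one passing and one failing case (values made up).
const junitXml = toJUnitXml("Regression Tests", [
  { name: "relevance >= 0.7", timeSeconds: 1.2 },
  { name: "hallucination <= 0.3", timeSeconds: 0.9, failure: "score 0.41 > 0.3" },
]);
```

The `if: always()` condition in the workflow matters here: the report step has to run even when the test step exits with a non-zero code, otherwise failed runs would never be published.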