LLM Providers
Configure the Orka JS adapters for OpenAI, Anthropic, Mistral, Ollama, Google AI, Cohere, and Replicate.
## OpenAI

GPT-4o, GPT-4o-mini, embeddings

```typescript
new OpenAIAdapter({
  apiKey: process.env.OPENAI_API_KEY!,
  model: 'gpt-4o-mini',
  embeddingModel: 'text-embedding-3-small',
})
```

## Anthropic

Claude 3.5 Sonnet, Claude 3 Opus

```typescript
new AnthropicAdapter({
  apiKey: process.env.ANTHROPIC_API_KEY!,
  model: 'claude-3-5-sonnet-20241022',
})
```

## Mistral

Mistral Small, Medium, Large

```typescript
new MistralAdapter({
  apiKey: process.env.MISTRAL_API_KEY!,
  model: 'mistral-small-latest',
})
```

## Ollama

Local models, no API key

```typescript
new OllamaAdapter({
  model: 'llama3.2',
  baseURL: 'http://localhost:11434',
})
```

## Google AI

Gemini 1.5 Flash, Pro

```typescript
new GoogleAdapter({
  apiKey: process.env.GOOGLE_API_KEY!,
  model: 'gemini-1.5-flash',
})
```

## Cohere

Command R+, embeddings

```typescript
new CohereAdapter({
  apiKey: process.env.COHERE_API_KEY!,
  model: 'command-r-plus',
})
```

## Replicate

Open source models, Llama, etc.

```typescript
new ReplicateAdapter({
  apiKey: process.env.REPLICATE_API_TOKEN!,
  model: 'meta/llama-2-70b-chat',
})
```

# Adapter Configuration
Each adapter accepts specific configuration options. Here are the full options for each provider:
## OpenAIAdapter

```typescript
new OpenAIAdapter({
  apiKey: string,          // Required: OpenAI API key
  model?: string,          // Default: 'gpt-4o-mini'
  embeddingModel?: string, // Default: 'text-embedding-3-small'
  baseURL?: string,        // Custom API endpoint (for Azure, proxies)
  timeoutMs?: number,      // Request timeout in milliseconds
  maxRetries?: number,     // Retry attempts on failure
})
```

- `generate(prompt, options)`: Generate a text completion
- `embed(texts)`: Generate embeddings for texts
- `chat(messages, options)`: Multi-turn conversation

## AnthropicAdapter
```typescript
new AnthropicAdapter({
  apiKey: string,     // Required: Anthropic API key
  model?: string,     // Default: 'claude-3-5-sonnet-20241022'
  maxTokens?: number, // Default max tokens for responses
  timeoutMs?: number, // Request timeout in milliseconds
})
```

Note: Anthropic does not provide embeddings. Use OpenAIAdapter for embeddings when using Anthropic for generation.
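One way to pair the two is a small wrapper that routes generation to Claude and embeddings to OpenAI. Here is a hedged, self-contained sketch: `HybridAdapter` is not an Orka JS export, and the adapter shapes are reduced to the two methods involved, with stubs standing in for the real adapters.

```typescript
// Minimal shapes for the two capabilities involved (abridged)
interface Generator {
  generate(prompt: string): Promise<{ content: string }>;
}
interface Embedder {
  embed(texts: string | string[]): Promise<number[][]>;
}

// Hypothetical wrapper: Claude handles reasoning, OpenAI handles
// embeddings, behind one object the rest of the app can hold.
class HybridAdapter implements Generator, Embedder {
  constructor(private gen: Generator, private emb: Embedder) {}
  generate(prompt: string) { return this.gen.generate(prompt); }
  embed(texts: string | string[]) { return this.emb.embed(texts); }
}

// Stubs standing in for AnthropicAdapter / OpenAIAdapter
const claudeStub: Generator = {
  async generate(prompt) { return { content: `claude:${prompt}` }; },
};
const openaiStub: Embedder = {
  async embed(texts) {
    const list = Array.isArray(texts) ? texts : [texts];
    return list.map(() => [0.1, 0.2, 0.3]);
  },
};

const hybrid = new HybridAdapter(claudeStub, openaiStub);
```

In a real application the stubs would be replaced by the actual `AnthropicAdapter` and `OpenAIAdapter` instances shown above.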
## MistralAdapter
```typescript
new MistralAdapter({
  apiKey: string,          // Required: Mistral API key
  model?: string,          // Default: 'mistral-small-latest'
  embeddingModel?: string, // Default: 'mistral-embed'
  timeoutMs?: number,      // Request timeout in milliseconds
})
```

## OllamaAdapter
```typescript
new OllamaAdapter({
  model: string,           // Required: Model name (e.g., 'llama3.2', 'mistral')
  baseURL?: string,        // Default: 'http://localhost:11434'
  embeddingModel?: string, // Model for embeddings (e.g., 'nomic-embed-text')
})
```

Ollama runs locally with no API key required. Install it from ollama.ai and pull models with `ollama pull llama3.2`.
# LLMAdapter Interface
All adapters implement the LLMAdapter interface, ensuring consistent behavior:
```typescript
interface LLMAdapter {
  // Generate text from a prompt
  generate(prompt: string, options?: GenerateOptions): Promise<GenerateResult>;

  // Generate embeddings for one or more texts
  embed(texts: string | string[]): Promise<number[][]>;

  // Multi-turn chat conversation
  chat(messages: ChatMessage[], options?: ChatOptions): Promise<ChatResult>;
}

interface GenerateOptions {
  temperature?: number;     // 0-1, controls randomness
  maxTokens?: number;       // Maximum response length
  systemPrompt?: string;    // System message for context
  stopSequences?: string[]; // Stop generation at these sequences
}

interface GenerateResult {
  content: string;   // Generated text
  usage: TokenUsage; // Token consumption
  latencyMs: number; // Response time
}
```

# Switching Providers
One of Orka JS's core strengths is provider portability: your application code stays the same regardless of which LLM you use.
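Portability also extends to test doubles: anything that satisfies the LLMAdapter shape is interchangeable, so a canned stub can stand in for a real provider in unit tests. A self-contained sketch, assuming an abridged version of the interface above; `CannedAdapter` is illustrative, not an Orka JS export.

```typescript
// Abridged from the LLMAdapter interface above
interface GenerateResult {
  content: string;
  latencyMs: number;
}
interface LLMAdapter {
  generate(prompt: string): Promise<GenerateResult>;
  embed(texts: string | string[]): Promise<number[][]>;
}

// Hypothetical adapter returning canned answers: useful in tests,
// where neither a cloud key nor a local Ollama is available.
class CannedAdapter implements LLMAdapter {
  constructor(private answers: Record<string, string>) {}

  async generate(prompt: string): Promise<GenerateResult> {
    const start = Date.now();
    return {
      content: this.answers[prompt] ?? 'I do not know.',
      latencyMs: Date.now() - start,
    };
  }

  async embed(texts: string | string[]): Promise<number[][]> {
    // Deterministic toy embedding: one vector per input text
    const list = Array.isArray(texts) ? texts : [texts];
    return list.map((t) => [t.length, 0, 0]);
  }
}

const llm: LLMAdapter = new CannedAdapter({
  'What is TypeScript?': 'A typed superset of JavaScript.',
});
```

Because the application only depends on the interface, swapping `CannedAdapter` for a real adapter is a one-line change.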
```typescript
import { createOrka } from '@orka-js/core';
import { OpenAIAdapter } from '@orka-js/openai';
import { OllamaAdapter } from '@orka-js/ollama';
import { MemoryVectorAdapter } from '@orka-js/memory';

// Development: free, local
const devLLM = new OllamaAdapter({ model: 'llama3.2' });

// Production: powerful, cloud
const prodLLM = new OpenAIAdapter({
  apiKey: process.env.OPENAI_API_KEY!,
  model: 'gpt-4o',
});

// Same application code works with both
const orka = createOrka({
  llm: process.env.NODE_ENV === 'production' ? prodLLM : devLLM,
  vectorDB: new MemoryVectorAdapter(),
});

// This works identically with any adapter
const answer = await orka.ask({
  question: 'What is TypeScript?',
  knowledge: 'docs',
});
```

# Model Comparison
| Provider | Core Strength | Embeddings |
|---|---|---|
| OpenAI | Ecosystem & General Purpose | Native Support |
| Anthropic | Safety & Complex Reasoning | External Required |
| Mistral AI | Efficiency & Multilingual | Native Support |
| Ollama | Privacy & Local Dev | Native Support |
| Google AI | Multimodality (Gemini) | Native Support |
| Cohere | Enterprise RAG Mastery | Native Support |
| Replicate | Open Source Flexibility | External Required |
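When wiring adapters programmatically, the Embeddings column can be encoded as a lookup. A small sketch of that idea; the map and the `requiresHybridSetup` helper are illustrative, derived from the table rather than part of Orka JS.

```typescript
// Which providers need an external embeddings adapter (from the table)
const needsExternalEmbeddings: Record<string, boolean> = {
  openai: false,
  anthropic: true,
  mistral: false,
  ollama: false,
  google: false,
  cohere: false,
  replicate: true,
};

// Unknown providers are conservatively assumed to need one
function requiresHybridSetup(provider: string): boolean {
  return needsExternalEmbeddings[provider] ?? true;
}
```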
## Anthropic Hybrid Strategy
Anthropic does not offer a native embeddings API. To use Claude with Knowledge Bases, you must implement a dual-adapter architecture: one for reasoning and another for semantic search.
- Reasoning and generation: `AnthropicAdapter`
- Embeddings and semantic search: `OpenAIAdapter`

## Tree-shaking Imports
```typescript
// ✅ Import only the adapters you need
import { OpenAIAdapter } from '@orka-js/openai';
import { AnthropicAdapter } from '@orka-js/anthropic';
import { MistralAdapter } from '@orka-js/mistral';
import { OllamaAdapter } from '@orka-js/ollama';
import { GoogleAdapter } from '@orka-js/google';
import { CohereAdapter } from '@orka-js/cohere';
import { ReplicateAdapter } from '@orka-js/replicate';
```