OrkaJS

Ask & Generate

Use orka.ask() for RAG-powered Q&A and orka.generate() for direct LLM generation.


Import Methods

Standard Import

import { createOrka, OpenAIAdapter, MemoryVectorAdapter } from 'orkajs';

Optimized Import (Recommended)

import { createOrka } from '@orka-js/core';
import { OpenAIAdapter } from '@orka-js/openai';
import { MemoryVectorAdapter } from '@orka-js/memory';

Enables tree-shaking for minimal bundle size in production.

# orka.ask()

The primary method for question-answering. When a knowledge base is specified, OrkaJS automatically performs RAG (Retrieval-Augmented Generation). This is the most powerful method for building Q&A systems.

// Basic usage with RAG
const result = await orka.ask({
  question: 'How do I reset my password?',
  knowledge: 'support-docs', // Name of your knowledge base
});

console.log(result.answer); // To reset your password, go to Settings > Security...

// Advanced usage with all options
const detailed = await orka.ask({
  question: 'How do I reset my password?',
  knowledge: 'support-docs',
  systemPrompt: 'You are a helpful support agent. Be concise and friendly.',
  topK: 5, // Number of context chunks to retrieve
  minScore: 0.7, // Minimum similarity score for chunks
  temperature: 0.7, // LLM creativity (0-1)
  maxTokens: 1024, // Maximum response length
  includeContext: true, // Return retrieved chunks in result
});

console.log(detailed.answer); // The generated answer
console.log(detailed.context); // Array of retrieved chunks
console.log(detailed.latencyMs); // Total execution time
console.log(detailed.usage.totalTokens); // Tokens consumed

Parameters

question
string

The primary query. OrkaJS will use this to find relevant context and generate the answer.

required
knowledge
string

Target Knowledge Base. If omitted, the LLM responds using only its internal training data.

optional
topK
number

Controls how much retrieved context is used: the number of semantic chunks injected into the prompt.

default: 5
minScore
number

Similarity threshold. Chunks with a score lower than this value will be ignored to prevent hallucinations.

0.0 to 1.0
systemPrompt
string

Override the default identity of the agent. Useful for enforcing specific personas or output formats.

optional
temperature
number

Higher values make the output more random, while lower values make it more focused and deterministic.

default: 0.7
includeContext
boolean

Debugging flag. When enabled, the raw source chunks used for the answer are returned in the response object.

default: false

Return Value

interface AskResult {
  answer: string;          // The generated answer
  context?: ChunkResult[]; // Retrieved chunks (if includeContext: true)
  latencyMs: number;       // Total execution time in milliseconds
  usage: {
    promptTokens: number;     // Tokens in the prompt
    completionTokens: number; // Tokens in the response
    totalTokens: number;      // Total tokens consumed
  };
}
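
To make the retrieval parameters concrete, here is a hypothetical sketch of the selection logic that topK and minScore imply. This is not OrkaJS internals — the ChunkResult shape and selectContext helper are illustrative assumptions.

```typescript
// Illustrative only: the chunk-selection behaviour topK and minScore describe.
interface ChunkResult {
  content: string;
  score: number; // similarity score, 0.0 to 1.0
}

function selectContext(chunks: ChunkResult[], topK = 5, minScore = 0): ChunkResult[] {
  return chunks
    .filter((c) => c.score >= minScore)  // drop weak matches (minScore)
    .sort((a, b) => b.score - a.score)   // rank by similarity
    .slice(0, topK);                     // keep at most topK chunks
}

const chunks: ChunkResult[] = [
  { content: 'Reset via Settings > Security', score: 0.91 },
  { content: 'Billing FAQ', score: 0.42 },
  { content: 'Password policy overview', score: 0.78 },
];

console.log(selectContext(chunks, 2, 0.7).map((c) => c.score)); // [0.91, 0.78]
```

Raising minScore trims marginally related chunks before they can mislead the model; raising topK trades prompt size for broader context.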

# orka.generate()

Direct LLM generation without RAG. Use for creative tasks, text transformations, summarization, or any task that doesn't require external knowledge.

// Simple generation
const response = await orka.generate('Write a haiku about TypeScript');
console.log(response); // "Types flow like water..." (string)

// With options
const summary = await orka.generate('Summarize this article: [article text]', {
  temperature: 0.3, // Lower for more deterministic output
  maxTokens: 200, // Limit response length
  systemPrompt: 'You are a professional summarizer. Be concise.',
});

// Creative writing
const story = await orka.generate('Write a short story about a robot', {
  temperature: 0.9, // Higher for more creativity
  maxTokens: 1000,
  systemPrompt: 'You are a creative fiction writer.',
});

// Code generation
const code = await orka.generate('Write a TypeScript function to sort an array', {
  temperature: 0.2, // Low for precise code
  systemPrompt: 'You are an expert TypeScript developer. Return only code.',
});

Parameters

prompt
string

The primary instruction. This is the main text the LLM will process and respond to.

required
options.temperature
number

Controls sampling randomness. Use 0 for strict logic (code, data) and 0.8+ for creative tasks.

default: 0.7
options.maxTokens
number

The strict limit for the response length. Prevents runaway generation and controls costs.

optional
options.systemPrompt
string

Initial instructions that set the persona, tone, and constraints of the assistant.

optional
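
To see why low temperatures give deterministic output and high temperatures give varied output, here is the standard softmax-with-temperature transform that LLM samplers apply to token logits. This is a general illustration of the mechanism, not OrkaJS code.

```typescript
// Temperature rescales logits before sampling: dividing by a small temperature
// sharpens the distribution; a large temperature flattens it.
function softmax(logits: number[], temperature: number): number[] {
  const scaled = logits.map((l) => l / temperature);
  const max = Math.max(...scaled); // subtract max for numerical stability
  const exps = scaled.map((l) => Math.exp(l - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

const logits = [2.0, 1.0, 0.1];
console.log(softmax(logits, 0.2)); // low temperature: probability mass piles onto the top token
console.log(softmax(logits, 1.5)); // high temperature: flatter distribution, more randomness
```

At temperature 0.2 the top token dominates (near-deterministic completions); at 1.5 the alternatives stay live, which is what makes creative output more varied.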

# orka.embed()

Generate vector embeddings for text. Embeddings are high-dimensional vectors that capture semantic meaning. Use for custom similarity comparisons, clustering, or building your own retrieval systems.

// Single text embedding
const [embedding] = await orka.embed('Hello world');
console.log(embedding.length); // e.g., 1536 (depends on model)
console.log(embedding[0]); // e.g., 0.0234 (first dimension)

// Multiple texts (batched for efficiency)
const embeddings = await orka.embed([
  'How do I reset my password?',
  'I forgot my login credentials',
  'What is the weather today?',
]);

// Compare similarity (cosine similarity)
function cosineSimilarity(a: number[], b: number[]): number {
  const dot = a.reduce((sum, val, i) => sum + val * b[i], 0);
  const magA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0));
  const magB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0));
  return dot / (magA * magB);
}

const sim1 = cosineSimilarity(embeddings[0], embeddings[1]); // ~0.85 (similar)
const sim2 = cosineSimilarity(embeddings[0], embeddings[2]); // ~0.45 (different)
When to use orka.embed()

Use orka.embed() when you need raw embeddings for custom logic. For standard RAG, use orka.knowledge.create() and orka.ask() instead — they handle embeddings automatically.
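
Raw embeddings make it easy to build your own retrieval loop. The sketch below ranks documents by cosine similarity to a query vector; the tiny 3-dimensional mock vectors stand in for real orka.embed() output, and the topK helper is an illustrative assumption.

```typescript
// Minimal nearest-neighbour search over precomputed embeddings.
function cosineSimilarity(a: number[], b: number[]): number {
  const dot = a.reduce((sum, v, i) => sum + v * b[i], 0);
  const magA = Math.sqrt(a.reduce((s, v) => s + v * v, 0));
  const magB = Math.sqrt(b.reduce((s, v) => s + v * v, 0));
  return dot / (magA * magB);
}

function topK(query: number[], docs: { text: string; vec: number[] }[], k: number) {
  return docs
    .map((d) => ({ text: d.text, score: cosineSimilarity(query, d.vec) }))
    .sort((a, b) => b.score - a.score) // highest similarity first
    .slice(0, k);
}

// Mock embeddings; in practice these come from orka.embed().
const docs = [
  { text: 'reset password', vec: [0.9, 0.1, 0.0] },
  { text: 'weather today', vec: [0.0, 0.2, 0.9] },
  { text: 'forgot login', vec: [0.7, 0.3, 0.1] },
];

const query = [0.85, 0.2, 0.05]; // mock embedding for "password help"
console.log(topK(query, docs, 2).map((r) => r.text)); // ['reset password', 'forgot login']
```

This is essentially what a knowledge base does internally at query time; reach for it directly only when you need custom ranking, clustering, or deduplication logic.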

Complete Example

ask-generate-example.ts
import { createOrka } from '@orka-js/core';
import { OpenAIAdapter } from '@orka-js/openai';
import { MemoryVectorAdapter } from '@orka-js/memory';

const orka = createOrka({
  llm: new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY! }),
  vectorDB: new MemoryVectorAdapter(),
});

// 1. Create a knowledge base
await orka.knowledge.create({
  name: 'product-docs',
  source: [
    'Our product supports dark mode. Go to Settings > Appearance to enable it.',
    'To reset your password, click "Forgot Password" on the login page.',
    'Premium users get unlimited API calls and priority support.',
  ],
});

// 2. Ask questions with RAG
const result = await orka.ask({
  question: 'How do I enable dark mode?',
  knowledge: 'product-docs',
  topK: 3,
  includeContext: true,
});

console.log('Answer:', result.answer);
// "To enable dark mode, go to Settings > Appearance."

console.log('Sources:', result.context?.map(c => c.content.slice(0, 50)));
// ["Our product supports dark mode. Go to Settings > ..."]

// 3. Direct generation (no RAG)
const summary = await orka.generate(
  'Summarize: ' + result.answer,
  { temperature: 0.3, maxTokens: 50 }
);

// 4. Get embeddings for custom logic
const [queryEmbed] = await orka.embed('dark mode settings');
console.log('Embedding dimensions:', queryEmbed.length);

Comparison Table

| Method | Primary Use Case | Returns |
| --- | --- | --- |
| orka.ask() | Knowledge-grounded Q&A (RAG) | AskResult |
| orka.generate() | Text completion & transformation | string |
| orka.embed() | Semantic vectors & similarity | number[][] |

Tree-shaking Imports

// Import from main package
import { createOrka } from '@orka-js/core';
 
// Import adapters separately
import { OpenAIAdapter } from '@orka-js/openai';
import { MemoryVectorAdapter } from '@orka-js/memory';