Ask & Generate
Use `orka.ask()` for RAG-powered Q&A and `orka.generate()` for direct LLM generation.
Import Methods
Standard Import
```typescript
import { createOrka, OpenAIAdapter, MemoryVectorAdapter } from 'orkajs';
```

Optimized Import (Recommended)

```typescript
import { createOrka } from '@orka-js/core';
import { OpenAIAdapter } from '@orka-js/openai';
import { MemoryVectorAdapter } from '@orka-js/memory';
```

Enables tree-shaking for minimal bundle size in production.
# orka.ask()
The primary method for question-answering. When a knowledge base is specified, Orka AI automatically performs RAG (Retrieval-Augmented Generation). This is the most powerful method for building Q&A systems.
```typescript
// Basic usage with RAG
const result = await orka.ask({
  question: 'How do I reset my password?',
  knowledge: 'support-docs', // Name of your knowledge base
});

console.log(result.answer);
// To reset your password, go to Settings > Security...
```

```typescript
// Advanced usage with all options
const result = await orka.ask({
  question: 'How do I reset my password?',
  knowledge: 'support-docs',
  systemPrompt: 'You are a helpful support agent. Be concise and friendly.',
  topK: 5,              // Number of context chunks to retrieve
  minScore: 0.7,        // Minimum similarity score for chunks
  temperature: 0.7,     // LLM creativity (0-1)
  maxTokens: 1024,      // Maximum response length
  includeContext: true, // Return retrieved chunks in result
});

console.log(result.answer);            // The generated answer
console.log(result.context);           // Array of retrieved chunks
console.log(result.latencyMs);         // Total execution time
console.log(result.usage.totalTokens); // Tokens consumed
```

Parameters
- `question`: The primary query. OrkaJS uses it both to retrieve relevant context and to generate the answer.
- `knowledge`: Target knowledge base. If omitted, the LLM responds using only its internal training data.
- `topK`: Controls retrieval density: how many semantic chunks are injected into the prompt.
- `minScore`: Similarity threshold. Chunks scoring below this value are ignored, reducing the risk of hallucinations.
- `systemPrompt`: Overrides the agent's default identity. Useful for enforcing specific personas or output formats.
- `temperature`: Higher values make the output more random; lower values make it more focused and deterministic.
- `includeContext`: Debugging flag. When enabled, the raw source chunks used for the answer are returned in the response object.
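The interplay of `topK` and `minScore` can be pictured as a two-stage filter over scored chunks. The sketch below is illustrative only, not Orka's internal retrieval code; the `ScoredChunk` type and `selectChunks` helper are hypothetical names.

```typescript
// Hypothetical chunk shape; real retrieval results come from your vector DB adapter.
interface ScoredChunk {
  content: string;
  score: number; // similarity score, higher = more relevant
}

// Illustrative sketch: keep at most topK chunks whose score is >= minScore.
function selectChunks(chunks: ScoredChunk[], topK: number, minScore: number): ScoredChunk[] {
  return chunks
    .filter(c => c.score >= minScore)  // minScore: drop weak matches
    .sort((a, b) => b.score - a.score) // rank by similarity
    .slice(0, topK);                   // topK: cap prompt density
}

const candidates: ScoredChunk[] = [
  { content: 'Password reset steps', score: 0.91 },
  { content: 'Billing FAQ', score: 0.55 },
  { content: 'Security settings', score: 0.78 },
];

console.log(selectChunks(candidates, 2, 0.7).map(c => c.content));
// ['Password reset steps', 'Security settings']
```

Raising `minScore` trades recall for precision; raising `topK` does the opposite while also growing the prompt.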
Return Value
```typescript
interface AskResult {
  answer: string;          // The generated answer
  context?: ChunkResult[]; // Retrieved chunks (if includeContext: true)
  latencyMs: number;       // Total execution time in milliseconds
  usage: {
    promptTokens: number;     // Tokens in the prompt
    completionTokens: number; // Tokens in the response
    totalTokens: number;      // Total tokens consumed
  };
}
```

# orka.generate()
Direct LLM generation without RAG. Use for creative tasks, text transformations, summarization, or any task that doesn't require external knowledge.
```typescript
// Simple generation
const response = await orka.generate('Write a haiku about TypeScript');
console.log(response); // "Types flow like water..." (string)

// With options
const summary = await orka.generate('Summarize this article: [article text]', {
  temperature: 0.3, // Lower for more deterministic output
  maxTokens: 200,   // Limit response length
  systemPrompt: 'You are a professional summarizer. Be concise.',
});

// Creative writing
const story = await orka.generate('Write a short story about a robot', {
  temperature: 0.9, // Higher for more creativity
  maxTokens: 1000,
  systemPrompt: 'You are a creative fiction writer.',
});

// Code generation
const code = await orka.generate('Write a TypeScript function to sort an array', {
  temperature: 0.2, // Low for precise code
  systemPrompt: 'You are an expert TypeScript developer. Return only code.',
});
```

Parameters
- `prompt`: The primary instruction. This is the main text the LLM will process and respond to.
- `temperature`: Controls sampling randomness. Use 0 for strict logic (code, data) and 0.8+ for creative tasks.
- `maxTokens`: A hard limit on response length. Prevents runaway generation and controls costs.
- `systemPrompt`: Initial instructions that set the persona, tone, and constraints of the assistant.
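Since these options are plain values, it can be convenient to keep task-specific presets and merge in per-call overrides. A minimal sketch; the `GenerateOptions` shape mirrors the parameters above, but the preset names and values are suggestions, not part of the OrkaJS API.

```typescript
// Option shape mirroring the parameters documented above (assumed, for illustration).
interface GenerateOptions {
  temperature?: number;
  maxTokens?: number;
  systemPrompt?: string;
}

// Hypothetical presets for common tasks.
const presets: Record<string, GenerateOptions> = {
  code:     { temperature: 0.2, systemPrompt: 'Return only code.' },
  summary:  { temperature: 0.3, maxTokens: 200 },
  creative: { temperature: 0.9, maxTokens: 1000 },
};

// Merge a preset with per-call overrides (overrides win).
function withPreset(name: keyof typeof presets, overrides: GenerateOptions = {}): GenerateOptions {
  return { ...presets[name], ...overrides };
}

console.log(withPreset('code', { maxTokens: 300 }));
// { temperature: 0.2, systemPrompt: 'Return only code.', maxTokens: 300 }
```

Usage would then look like `await orka.generate(prompt, withPreset('summary'))`, keeping temperature choices in one place.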
# orka.embed()
Generate vector embeddings for text. Embeddings are high-dimensional vectors that capture semantic meaning. Use for custom similarity comparisons, clustering, or building your own retrieval systems.
```typescript
// Single text embedding
const [embedding] = await orka.embed('Hello world');
console.log(embedding.length); // e.g., 1536 (depends on model)
console.log(embedding[0]);     // e.g., 0.0234 (first dimension)

// Multiple texts (batched for efficiency)
const embeddings = await orka.embed([
  'How do I reset my password?',
  'I forgot my login credentials',
  'What is the weather today?',
]);

// Compare similarity (cosine similarity)
function cosineSimilarity(a: number[], b: number[]): number {
  const dot = a.reduce((sum, val, i) => sum + val * b[i], 0);
  const magA = Math.sqrt(a.reduce((sum, val) => sum + val * val, 0));
  const magB = Math.sqrt(b.reduce((sum, val) => sum + val * val, 0));
  return dot / (magA * magB);
}

const sim1 = cosineSimilarity(embeddings[0], embeddings[1]); // ~0.85 (similar)
const sim2 = cosineSimilarity(embeddings[0], embeddings[2]); // ~0.45 (different)
```

Use `orka.embed()` when you need raw embeddings for custom logic. For standard RAG, use `orka.knowledge.create()` and `orka.ask()` instead; they handle embeddings automatically.
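Building on cosine similarity, a toy nearest-neighbor lookup over precomputed vectors might look like the sketch below. The 3-dimensional vectors are hand-made stand-ins for real `orka.embed()` output so the example runs without an API call; real embeddings have hundreds or thousands of dimensions.

```typescript
// Rank documents by cosine similarity to a query vector.
function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((s, v, i) => s + v * b[i], 0);
  const mag = (x: number[]) => Math.sqrt(x.reduce((s, v) => s + v * v, 0));
  return dot / (mag(a) * mag(b));
}

// Toy stand-ins; in practice these vectors would come from orka.embed().
const docs = [
  { text: 'reset password', vector: [0.9, 0.1, 0.0] },
  { text: 'weather report', vector: [0.0, 0.2, 0.9] },
];
const queryVector = [0.8, 0.2, 0.1]; // stand-in for embedding 'forgot my password'

const ranked = [...docs].sort(
  (a, b) => cosine(b.vector, queryVector) - cosine(a.vector, queryVector)
);
console.log(ranked[0].text); // 'reset password'
```

This is essentially what a vector DB adapter does at scale, with indexing to avoid comparing the query against every document.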
Complete Example
```typescript
import { createOrka } from '@orka-js/core';
import { OpenAIAdapter } from '@orka-js/openai';
import { MemoryVectorAdapter } from '@orka-js/memory';

const orka = createOrka({
  llm: new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY! }),
  vectorDB: new MemoryVectorAdapter(),
});

// 1. Create a knowledge base
await orka.knowledge.create({
  name: 'product-docs',
  source: [
    'Our product supports dark mode. Go to Settings > Appearance to enable it.',
    'To reset your password, click "Forgot Password" on the login page.',
    'Premium users get unlimited API calls and priority support.',
  ],
});

// 2. Ask questions with RAG
const result = await orka.ask({
  question: 'How do I enable dark mode?',
  knowledge: 'product-docs',
  topK: 3,
  includeContext: true,
});

console.log('Answer:', result.answer);
// "To enable dark mode, go to Settings > Appearance."

console.log('Sources:', result.context?.map(c => c.content.slice(0, 50)));
// ["Our product supports dark mode. Go to Settings > ..."]

// 3. Direct generation (no RAG)
const summary = await orka.generate(
  'Summarize: ' + result.answer,
  { temperature: 0.3, maxTokens: 50 }
);

// 4. Get embeddings for custom logic
const [queryEmbed] = await orka.embed('dark mode settings');
console.log('Embedding dimensions:', queryEmbed.length);
```

Comparison Table
| Method | Primary Use Case | Returns |
|---|---|---|
| `orka.ask()` | Knowledge-grounded Q&A (RAG) | `{ answer, usage }` |
| `orka.generate()` | Text completion & transformation | `string` |
| `orka.embed()` | Semantic vectors & similarity | `number[][]` |
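One way to read the table: which method applies follows directly from what the caller supplies. The routing helper below is a hypothetical sketch, not part of OrkaJS; it only demonstrates the decision rule.

```typescript
// Hypothetical task shape for illustration.
type Task = { question?: string; knowledge?: string; texts?: string[] };

// Decision rule implied by the comparison table above.
function pickMethod(task: Task): 'ask' | 'generate' | 'embed' {
  if (task.texts) return 'embed';   // raw vectors requested
  if (task.knowledge) return 'ask'; // grounded Q&A over a knowledge base
  return 'generate';                // plain completion
}

console.log(pickMethod({ question: 'Reset password?', knowledge: 'support-docs' })); // 'ask'
console.log(pickMethod({ question: 'Write a haiku' }));                              // 'generate'
console.log(pickMethod({ texts: ['Hello world'] }));                                 // 'embed'
```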
Tree-shaking Imports
```typescript
// Import from main package
import { createOrka } from '@orka-js/core';

// Import adapters separately
import { OpenAIAdapter } from '@orka-js/openai';
import { MemoryVectorAdapter } from '@orka-js/memory';
```