Knowledge & RAG
Index documents, perform semantic search, and build RAG pipelines with Orka JS.
Why Knowledge?
The orka.knowledge API (the Knowledge API) follows Orka's intent-based design philosophy. Instead of dealing with low-level concepts like vectors, embeddings, and collections, you work with a single high-level abstraction called "knowledge".
Abstraction
Hides complexity: chunking, embedding generation, vector storage, and similarity search are handled automatically.
Fluent API
Natural method names like create() and search(). Your code reads like plain English.
Configurable
Swap VectorDBs (Memory, Pinecone, Qdrant) instantly without changing a single line of application logic.
Consistent
Follows the universal Orka pattern across all modules: ask(), workflow(), agent(), and knowledge().
What Happens Under the Hood
When you call orka.knowledge.create(), Orka performs a sophisticated pipeline of operations:
Source Normalization
Your source (string, file, URL) is converted into a uniform Document[] format with unique IDs and metadata.
Smart Chunking
Documents are recursively split into overlapping chunks (default overlap: 200 chars) to preserve context at chunk boundaries.
Embedding Generation
Each chunk is transformed into a high-dimensional vector that captures the semantic meaning of the text.
Vector Storage
A collection is created and vectors are upserted in batches of 100 into your chosen VectorDB.
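The chunking step above can be sketched in a few lines of TypeScript. This is an illustrative simplification (the `chunkText` name is hypothetical), not Orka's actual implementation:

```typescript
// Simplified sketch of overlap chunking (illustrative, not Orka's internals).
// Each chunk starts (chunkSize - chunkOverlap) characters after the previous
// one, so the last chunkOverlap characters of a chunk reappear at the start
// of the next chunk, preserving context across boundaries.
function chunkText(text: string, chunkSize = 1000, chunkOverlap = 200): string[] {
  if (chunkOverlap >= chunkSize) throw new Error('overlap must be smaller than chunk size');
  const chunks: string[] = [];
  const step = chunkSize - chunkOverlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last chunk reached the end
  }
  return chunks;
}

const chunks = chunkText('a'.repeat(2500), 1000, 200);
console.log(chunks.length);    // → 3
console.log(chunks[0].length); // → 1000
console.log(chunks[2].length); // → 900 (final, shorter chunk)
```

With a 2,500-character input, chunks start at offsets 0, 800, and 1,600, so each chunk shares its last 200 characters with the next one.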
API Reference
create(options)
Creates a new knowledge base from your source documents. This is the main entry point for indexing content.
```javascript
const result = await orka.knowledge.create({
  name: 'my-docs',   // Required: unique name for this knowledge base
  source: [...],     // Required: content to index (see formats below)
  chunkSize: 1000,   // Optional: max characters per chunk (default: 1000)
  chunkOverlap: 200, // Optional: overlap between chunks (default: 200)
  metadata: {        // Optional: metadata attached to all documents
    category: 'support',
    version: '1.0',
  },
});

// Returns:
// {
//   name: 'my-docs',
//   documentCount: 5, // Number of source documents processed
//   chunkCount: 23,   // Total chunks created
// }
```

chunkSize determines how much text each vector represents. Smaller chunks give more precise retrieval but less context; larger chunks give more context but may include irrelevant information. chunkOverlap ensures important information at chunk boundaries isn't lost.
Source Formats
Orka accepts multiple source formats, automatically detecting and processing each type:
Single Document
Automatic chunking for long-form content. Perfect for articles or manual entries.
```javascript
await orka.knowledge.create({
  name: 'article',
  source: 'This is a very long article about AI...',
});
```

Bulk Strings
Treats each array element as an independent entry before sub-chunking.
```javascript
await orka.knowledge.create({
  name: 'faq',
  source: [
    'How do I reset my password? ...',
    'How do I contact support? ...',
  ],
});
```

Structured Knowledge
Preserves per-document metadata for advanced RAG filtering and search results.
```javascript
await orka.knowledge.create({
  name: 'products',
  source: [
    { text: 'iPhone 15...', metadata: { brand: 'Apple' } },
  ],
});
```

File System
Recursive loading of .txt, .md, .json, etc. Perfect for local dev and docs folders.
```javascript
// Directory or single file
await orka.knowledge.create({
  name: 'docs',
  source: { path: './documentation/' },
});
```

Remote URL
Secure fetching with protocol validation, 30s timeout, and 50MB size limit.
```javascript
await orka.knowledge.create({
  name: 'web-source',
  source: { url: 'https://example.com/api' },
});
```
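The guarded fetch described above might look roughly like the following sketch. The helper names (`validateSourceUrl`, `fetchSource`) are hypothetical, not Orka's actual internals:

```typescript
// Illustrative sketch of a guarded remote fetch: protocol validation,
// a request timeout, and a response size cap. Not Orka's real implementation.
const MAX_BYTES = 50 * 1024 * 1024; // 50MB size limit
const TIMEOUT_MS = 30_000;          // 30s timeout

function validateSourceUrl(url: string): URL {
  const parsed = new URL(url); // throws on malformed URLs
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    throw new Error(`Unsupported protocol: ${parsed.protocol}`);
  }
  return parsed;
}

async function fetchSource(url: string): Promise<string> {
  const target = validateSourceUrl(url);
  // AbortSignal.timeout() aborts the request if it takes longer than 30s
  const res = await fetch(target, { signal: AbortSignal.timeout(TIMEOUT_MS) });
  if (!res.ok) throw new Error(`Fetch failed with status ${res.status}`);
  const body = await res.text();
  if (body.length > MAX_BYTES) throw new Error('Response exceeds 50MB limit');
  return body;
}
```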
add(name, source, options?)
Adds more documents to an existing knowledge base. Uses the same source formats as create().
```javascript
// Add new documents to an existing knowledge base
const { addedChunks } = await orka.knowledge.add('my-docs', [
  'New document content...',
  'Another new document...',
], {
  chunkSize: 1000,
  chunkOverlap: 200,
  metadata: { addedAt: '2024-01-15' },
});

console.log('Added ' + addedChunks + ' new chunks');
```

search(name, query, options?)
Performs semantic search on a knowledge base. Returns the most relevant chunks based on vector similarity.
```javascript
const results = await orka.knowledge.search('my-docs', 'how to reset password', {
  topK: 10,      // Number of results to return (default: 5)
  minScore: 0.7, // Minimum similarity score 0-1 (optional)
});

// Each result contains:
for (const result of results) {
  console.log(result.id);       // Unique chunk ID
  console.log(result.content);  // The chunk text
  console.log(result.score);    // Similarity score (0-1, higher = more relevant)
  console.log(result.metadata); // { documentId, chunkIndex, ...your metadata }
}
```

Your query is converted to an embedding vector, then compared against all stored vectors using cosine similarity. This finds semantically similar content even when the exact words don't match: "reset password" will match "change my credentials" because they have similar meaning.
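For intuition, cosine similarity itself is simple to compute. Here is a minimal sketch over plain number arrays (Orka applies this internally to embedding vectors; `cosineSimilarity` is an illustrative helper, not part of the public API):

```typescript
// Cosine similarity: dot product of two vectors divided by the product of
// their magnitudes. 1 means same direction (very similar), 0 means
// orthogonal (unrelated).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

console.log(cosineSimilarity([1, 0], [1, 0])); // → 1 (identical)
console.log(cosineSimilarity([1, 0], [0, 1])); // → 0 (unrelated)
```

Because the score depends only on direction, not magnitude, two embeddings of texts with similar meaning score close to 1 regardless of text length.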
delete(name)
Deletes an entire knowledge base and all its vectors from the VectorDB.
```javascript
await orka.knowledge.delete('my-docs');
```

How RAG Works with orka.ask()
RAG (Retrieval-Augmented Generation) combines your knowledge base with LLM generation. When you call orka.ask() with a knowledge parameter, here's the complete flow:
```javascript
// Basic RAG query
const result = await orka.ask({
  question: 'How do I reset my password?',
  knowledge: 'my-docs', // Name of your knowledge base
  topK: 5,              // Number of chunks to retrieve
});
console.log(result.answer);

// With context inspection (for debugging)
const { answer, context } = await orka.ask({
  question: 'How do I reset my password?',
  knowledge: 'my-docs',
  topK: 5,
  includeContext: true, // Returns the chunks used
});

console.log('Answer:', answer);
console.log('Sources used:');
for (const chunk of context) {
  console.log('- [' + chunk.score.toFixed(2) + '] ' + chunk.content.slice(0, 100) + '...');
}
```

Complete Example
Here's a full example showing how to build a documentation Q&A system:
```javascript
import { Orka } from '@orka-js/core';
import { OpenAIAdapter } from '@orka-js/openai';
import { MemoryVectorAdapter } from '@orka-js/memory';

// 1. Initialize Orka with adapters
const orka = new Orka({
  llm: new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY }),
  vectorDB: new MemoryVectorAdapter(),
  defaults: {
    chunkSize: 800,    // Smaller chunks for precise retrieval
    chunkOverlap: 100, // Overlap between chunks for context continuity
    topK: 3,           // Number of chunks to retrieve
  },
});

// 2. Create a knowledge base from your docs
const { chunkCount } = await orka.knowledge.create({
  name: 'product-docs',
  source: { path: './docs/' },
  metadata: { source: 'official-docs' },
});
console.log('Indexed ' + chunkCount + ' chunks');

// 3. Add more content later
await orka.knowledge.add('product-docs', [
  { text: 'New feature: Dark mode is now available...', metadata: { type: 'release-note' } },
]);

// 4. Query with RAG
const { answer, context } = await orka.ask({
  question: 'How do I enable dark mode?',
  knowledge: 'product-docs',
  includeContext: true,
});

console.log('Answer:', answer);
console.log('Based on', context.length, 'sources');

// 5. Direct search (without LLM)
const results = await orka.knowledge.search('product-docs', 'dark mode', { topK: 5 });
for (const r of results) {
  console.log('[' + r.score.toFixed(2) + '] ' + r.content.slice(0, 80) + '...');
}
```

Pro Tips
Tune chunk size: for Q&A, 500-1000 chars; for summarization, 1500-2000 chars; for code, 300-500 chars. Experiment to find what works best for your content.
Use rich metadata: add category, date, author, or type metadata. This enables filtered searches and helps trace which documents contributed to answers.
Inspect your context: always use includeContext: true during development. It shows exactly which chunks the LLM used, which is invaluable for tuning your RAG pipeline.
Filter by score: set minScore (e.g., 0.7) to filter out low-relevance results. This prevents the LLM from being confused by irrelevant context.
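The last two tips can also be applied after the fact, by post-filtering search results on the client side. A small sketch, where `SearchResult` mirrors the result shape returned by search() and `filterResults` is a hypothetical helper, not part of the Orka API:

```typescript
// Hypothetical client-side helper: keep only results above a similarity
// threshold, optionally restricted to a metadata category.
interface SearchResult {
  id: string;
  content: string;
  score: number;
  metadata: Record<string, unknown>;
}

function filterResults(
  results: SearchResult[],
  opts: { minScore: number; category?: string },
): SearchResult[] {
  return results.filter(
    (r) =>
      r.score >= opts.minScore &&
      (opts.category === undefined || r.metadata.category === opts.category),
  );
}

const raw: SearchResult[] = [
  { id: 'a', content: 'Reset your password via Settings.', score: 0.91, metadata: { category: 'support' } },
  { id: 'b', content: 'Company history...', score: 0.42, metadata: { category: 'about' } },
];
console.log(filterResults(raw, { minScore: 0.7, category: 'support' }).length); // → 1
```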
Supported VectorDB Adapters
orka.knowledge works with any VectorDB adapter. Swap adapters without changing your application code:
| Adapter | Speed | Scalability | Setup |
|---|---|---|---|
| Memory | Ultra-Fast | Low | Zero-Config |
| Pinecone | Fast | Infinite | API Key |
| Qdrant | Fast | High | Docker / Cloud |
| Chroma | Moderate | Medium | Easy |
MemoryVectorAdapter: in-memory, perfect for development and testing.
PineconeAdapter: managed cloud vector database, scales to billions of vectors.
QdrantAdapter: high-performance, self-hosted or cloud-native.
ChromaAdapter: open-source, embedded or client-server architecture.
```javascript
// Development: in-memory
import { MemoryVectorAdapter } from '@orka-js/memory';
const vectorDB = new MemoryVectorAdapter();

// Production: Pinecone
import { PineconeAdapter } from '@orka-js/pinecone';
const vectorDB = new PineconeAdapter({
  apiKey: process.env.PINECONE_API_KEY,
  environment: 'us-east-1',
  indexName: 'my-index',
});

// Same orka.knowledge API works with both!
```