OrkaJS

Chunking

Understand how Orka JS splits documents into optimally sized chunks for embedding and retrieval.

How Chunking Works

Orka JS uses a recursive text splitter that cuts text at natural boundaries. It tries the coarsest separator first and falls back to the next, finer one only when a piece is still too long:

Step 1: Paragraph Breaks (\n\n)
Preserves natural structure first.

Step 2: Line Breaks (\n)
Split by lines if paragraphs are too long.

Step 3: Sentences (.)
Targets logical thought boundaries.

Step 4: Words (space)
Last resort to avoid cutting a word in half.

Step 5: Characters (char)
Strict limit enforcement (emergency).
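The five steps above can be sketched as a small recursive function. This is an illustration of the general technique only, not Orka JS's actual implementation: the names `SEPARATORS` and `splitRecursive` are hypothetical, the sentence separator is assumed to be a period followed by a space, and overlap handling is left out to keep the sketch short.

```typescript
// Hypothetical sketch of a recursive text splitter (not Orka JS source code).
// Separators are tried from coarsest to finest; "" means a hard character cut.
const SEPARATORS: string[] = ["\n\n", "\n", ". ", " ", ""];

function splitRecursive(text: string, chunkSize: number, level = 0): string[] {
  if (chunkSize <= 0) {
    throw new RangeError("chunkSize must be positive");
  }
  // Small enough already: emit as a single chunk.
  if (text.length <= chunkSize) {
    return text.length > 0 ? [text] : [];
  }

  const separator = SEPARATORS[Math.min(level, SEPARATORS.length - 1)] ?? "";

  // Step 5: no separator left, so enforce the limit by cutting characters.
  if (separator === "") {
    const pieces: string[] = [];
    for (let i = 0; i < text.length; i += chunkSize) {
      pieces.push(text.slice(i, i + chunkSize));
    }
    return pieces;
  }

  // Steps 1-4: split on the current separator and greedily merge neighbours
  // back together while the merged text still fits in chunkSize.
  const chunks: string[] = [];
  let current = "";
  for (const part of text.split(separator)) {
    const candidate = current === "" ? part : current + separator + part;
    if (candidate.length <= chunkSize) {
      current = candidate;
      continue;
    }
    if (current !== "") {
      chunks.push(current);
    }
    if (part.length > chunkSize) {
      // This piece alone is too long: retry it with the next, finer separator.
      chunks.push(...splitRecursive(part, chunkSize, level + 1));
      current = "";
    } else {
      current = part;
    }
  }
  if (current !== "") {
    chunks.push(current);
  }
  return chunks;
}

const example = splitRecursive("aaaa bbbb\n\ncccc dddd eeee", 10);
console.log(example); // [ 'aaaa bbbb', 'cccc dddd', 'eeee' ]
```

In the example, the first paragraph fits within 10 characters and is kept whole, while the second paragraph is too long and falls through to the word level. A separator that ends up between two emitted chunks is dropped in this sketch.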

Configuration

await orka.knowledge.create({
  name: 'docs',
  source: myContent,
  chunkSize: 1000,   // Max characters per chunk
  chunkOverlap: 200, // Overlap between consecutive chunks
});

Recommended Sizes

| Content Profile   | Size (characters) | Overlap (characters) | Strategic Goal  |
|-------------------|-------------------|----------------------|-----------------|
| FAQ / Q&A         | 300               | 50                   | Atomic precision |
| Technical Docs    | 1000              | 200                  | Block integrity |
| Long Articles     | 1200              | 250                  | Narrative flow  |
| Legal / Contracts | 600               | 200                  | Clause context  |
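One way to keep these recommendations in a single place is a small lookup table. The sketch below is only a convenience pattern: `ContentProfile`, `RECOMMENDED_CHUNKING`, and `chunkSettingsFor` are hypothetical names, not part of Orka JS. The numbers are the recommended sizes listed above, and the option names `chunkSize` and `chunkOverlap` come from the Configuration example.

```typescript
// Hypothetical helper (not an Orka JS API): recommended sizes as a lookup table.
type ContentProfile = "faq" | "technicalDocs" | "longArticles" | "legal";

interface ChunkSettings {
  chunkSize: number;    // max characters per chunk
  chunkOverlap: number; // characters shared by consecutive chunks
}

const RECOMMENDED_CHUNKING: Record<ContentProfile, ChunkSettings> = {
  faq: { chunkSize: 300, chunkOverlap: 50 },
  technicalDocs: { chunkSize: 1000, chunkOverlap: 200 },
  longArticles: { chunkSize: 1200, chunkOverlap: 250 },
  legal: { chunkSize: 600, chunkOverlap: 200 },
};

// Returns a copy so callers can adjust the values without changing the table.
function chunkSettingsFor(profile: ContentProfile): ChunkSettings {
  return { ...RECOMMENDED_CHUNKING[profile] };
}

console.log(chunkSettingsFor("legal")); // { chunkSize: 600, chunkOverlap: 200 }

// Possible usage, assuming an `orka` instance and `myContent` already exist:
//   await orka.knowledge.create({
//     name: 'contracts',
//     source: myContent,
//     ...chunkSettingsFor("legal"),
//   });
```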

💡 Why Overlap?

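Overlap ensures information at chunk boundaries isn't lost. When a chunk ends mid-sentence, the next chunk starts chunkOverlap characters before that end point (200 in the configuration above), so the text around the boundary appears in both chunks with its surrounding context.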
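The arithmetic behind overlap is easiest to see with a plain fixed-size sliding window. The sketch below assumes exactly that simplified model and ignores the separator logic, so real chunk boundaries will differ. `windowOffsets` is a hypothetical helper, not an Orka JS function.

```typescript
// Hypothetical illustration: [start, end) character offsets of fixed-size
// chunks, where each chunk begins chunkOverlap characters before the
// previous chunk's end.
function windowOffsets(
  textLength: number,
  chunkSize: number,
  chunkOverlap: number,
): Array<[number, number]> {
  if (chunkSize <= 0 || chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new RangeError("require chunkSize > 0 and 0 <= chunkOverlap < chunkSize");
  }
  // Each new chunk advances by the part of the window that is not shared.
  const stride = chunkSize - chunkOverlap;
  const windows: Array<[number, number]> = [];
  for (let start = 0; start < textLength; start += stride) {
    const end = Math.min(start + chunkSize, textLength);
    windows.push([start, end]);
    if (end === textLength) {
      break; // the final window already reaches the end of the text
    }
  }
  return windows;
}

console.log(windowOffsets(2600, 1000, 200));
// [ [ 0, 1000 ], [ 800, 1800 ], [ 1600, 2600 ] ]
```

With a size of 1000 and an overlap of 200, each chunk advances 800 characters, so characters 800 to 999 and 1600 to 1799 are stored twice. A larger overlap gives more context at the boundaries at the cost of more chunks to embed.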