Output Parsers
Parse and validate LLM outputs into structured data with JSON, Zod schemas, lists, and auto-fixing.
Why Output Parsers?
LLMs return unstructured text. Output parsers extract structured data, validate formats, and handle errors, making LLM outputs reliable for downstream processing.
# JSONParser
Extract and parse JSON from LLM responses, even when wrapped in markdown code blocks or mixed with text.
    import { JSONParser } from '@orka-js/tools';

    const parser = new JSONParser({ strict: false });

    // LLM response with JSON wrapped in a markdown code block
    const llmOutput = `Here's the data you requested:

    \`\`\`json
    {
      "name": "Alice",
      "age": 30,
      "skills": ["TypeScript", "Python"]
    }
    \`\`\`
    `;

    const data = parser.parse(llmOutput);
    console.log(data);
    // { name: 'Alice', age: 30, skills: ['TypeScript', 'Python'] }

    // Get format instructions for the LLM
    const instructions = parser.getFormatInstructions();
    console.log(instructions);

Smart Extraction
JSONParser automatically extracts JSON from markdown code blocks, handles both objects and arrays, and provides clear error messages when parsing fails.
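
To make the extraction step concrete, here is a minimal sketch of how JSON might be pulled out of a fenced block or surrounding prose. The `extractJson` helper is hypothetical and written for illustration only; it is not the `@orka-js/tools` implementation, which may behave differently.

```typescript
// Hypothetical helper for illustration; not the @orka-js/tools implementation.
function extractJson(text: string): unknown {
  // Build the three-backtick fence marker without writing it literally.
  const fence = "`".repeat(3);
  // Prefer the contents of a fenced block, with or without a "json" tag.
  const fenced = new RegExp(fence + "(?:json)?\\s*([\\s\\S]*?)" + fence, "i").exec(text);
  const candidate = fenced ? fenced[1] : text;

  // Keep the span from the first opening bracket to the last closing one.
  const start = candidate.search(/[{[]/);
  const end = Math.max(candidate.lastIndexOf("}"), candidate.lastIndexOf("]"));
  if (start === -1 || end <= start) {
    throw new Error("No JSON object or array found in LLM output");
  }
  return JSON.parse(candidate.slice(start, end + 1));
}

const reply = 'Sure, here it is: {"name": "Alice", "age": 30} Let me know if you need more.';
console.log(extractJson(reply)); // { name: 'Alice', age: 30 }
```

The sketch trades robustness for brevity: it assumes at most one JSON value per response and would be confused by stray brackets in the surrounding text.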
# StructuredOutputParser
Parse and validate LLM outputs against a Zod schema for type-safe structured data.
📦 Installation Required
StructuredOutputParser requires Zod for schema validation:
    npm install zod

    import { StructuredOutputParser } from '@orka-js/tools';
    import { z } from 'zod';

    // Define schema
    const schema = z.object({
      name: z.string().describe('Person name'),
      age: z.number().describe('Age in years'),
      email: z.string().email().describe('Email address'),
      skills: z.array(z.string()).describe('List of skills')
    });

    // Create parser
    const parser = StructuredOutputParser.fromZodSchema(schema);

    // Get format instructions to send to the LLM
    const instructions = parser.getFormatInstructions();
    const prompt = `Extract person info from this text: "Alice is 30 years old..."

    ${instructions}`;

    // `llm` is assumed to be your configured LLM instance
    const llmResponse = await llm.generate(prompt);

    // Parse and validate
    try {
      const data = parser.parse(llmResponse.content);
      console.log(data);
      // { name: 'Alice', age: 30, email: 'alice@example.com', skills: [...] }
      // ✅ Type-safe and validated
    } catch (error) {
      // `error` is `unknown` in strict TypeScript, so narrow it before reading `.message`
      console.error('Validation failed:', error instanceof Error ? error.message : error);
    }

Naive Parsing
Unchecked JSON
    JSON.parse(rawResponse);

- High Hallucination Risk
- Silent Runtime Failures
- Manual Type Casting
Structured Extraction
Schema Enforcement
    parser.parse(llmOutput);

- Type-Safe with Zod
- Automatic Error Catching
- Production Ready
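
The difference between the two approaches can be illustrated without any library. The sketch below uses a hand-written check in place of Zod, purely to show what a "silent runtime failure" looks like; `Person`, `parseNaive`, and `parseChecked` are hypothetical names, and StructuredOutputParser delegates this work to the Zod schema instead.

```typescript
interface Person {
  name: string;
  age: number;
}

// Naive: the cast is unchecked, so a wrong type passes silently.
function parseNaive(raw: string): Person {
  return JSON.parse(raw) as Person;
}

// Checked: validate each field before trusting the value.
function parseChecked(raw: string): Person {
  const value: unknown = JSON.parse(raw);
  if (typeof value !== "object" || value === null) {
    throw new Error("Expected a JSON object");
  }
  const record = value as Record<string, unknown>;
  if (typeof record.name !== "string") throw new Error("name: expected string");
  if (typeof record.age !== "number") throw new Error("age: expected number");
  return { name: record.name, age: record.age };
}

const raw = '{"name": "Alice", "age": "thirty"}';

// The naive result claims `age` is a number, but at runtime it is a string.
console.log(typeof (parseNaive(raw).age as unknown)); // prints: string

try {
  parseChecked(raw);
} catch (error) {
  console.log(error instanceof Error ? error.message : error); // prints: age: expected number
}
```

With the naive version the bad value surfaces later, far from the parsing code; with the checked version it fails at the boundary with a message that names the field.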
# ListParser
Parse lists from LLM outputs, automatically handling bullet points, numbers, and custom separators.
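
As a rough illustration of marker handling, the sketch below strips bullets and numbering from each line. `parseList` is a hypothetical stand-in, not ListParser itself, and it assumes that lines without a list marker (such as an introductory sentence) should be dropped.

```typescript
// Hypothetical stand-in for illustration; not the ListParser implementation.
function parseList(text: string): string[] {
  // Matches "-", "*", "•", "1." or "1)" followed by whitespace at the start of a line.
  const marker = /^(?:[-*•]|\d+[.)])\s+/;
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => marker.test(line)) // assumption: unmarked lines are not items
    .map((line) => line.replace(marker, ""));
}

console.log(parseList("Top picks:\n- TypeScript\n- Python\n1. Go\n2) Rust"));
// [ 'TypeScript', 'Python', 'Go', 'Rust' ]
```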
    import { ListParser } from '@orka-js/tools';

    const parser = new ListParser({
      separator: '\n', // Split by newline (default)
      trim: true       // Remove whitespace
    });

    const llmOutput = `Here are the top programming languages:

    - TypeScript
    - Python
    - Go
    - Rust`;

    const items = parser.parse(llmOutput);
    console.log(items);
    // ['TypeScript', 'Python', 'Go', 'Rust']

    // Works with numbered lists too
    const numbered = `1. First item
    2. Second item
    3. Third item`;

    const items2 = parser.parse(numbered);
    // ['First item', 'Second item', 'Third item']

    // Custom separator
    const csvParser = new ListParser({ separator: ',' });
    const csv = 'apple, banana, orange';
    console.log(csvParser.parse(csv));
    // ['apple', 'banana', 'orange']

# AutoFixParser
Wraps any parser and automatically retries with LLM correction when parsing fails.
    import { AutoFixParser, StructuredOutputParser } from '@orka-js/tools';
    import { z } from 'zod';

    const schema = z.object({
      name: z.string(),
      age: z.number()
    });

    const baseParser = StructuredOutputParser.fromZodSchema(schema);

    // `orka` is assumed to be an instance created with createOrka()
    const autoFixParser = new AutoFixParser({
      parser: baseParser,
      maxRetries: 3,
      llm: orka.getLLM()
    });

    // Malformed LLM output
    const badOutput = `{
      "name": "Alice",
      "age": "thirty" // ❌ Should be number
    }`;

    // Try to parse with auto-fix
    try {
      const data = await autoFixParser.parseWithRetry(badOutput);
      console.log(data);
      // { name: 'Alice', age: 30 } ✅ Fixed automatically
    } catch (error) {
      console.error('Failed after retries:', error);
    }

Base Parsing Attempt
The engine applies the initial schema (JSON/Zod) to the LLM's raw response.
Step 1: Parse (Validation Phase)

Error Capture
If a syntax or schema error occurs, the error trace and the original output are preserved.
Step 2: Detect (Failure Handling)

LLM Reflection
A dedicated sub-prompt instructs the LLM to fix its own output based on the captured error.
Step 3: Correct (Self-Healing)

Final Validation
The corrected output is re-parsed. This loop continues until parsing succeeds or maxRetries is reached.
Step 4: Retry (Cycle Repeat)

# XMLParser
Parse XML-tagged outputs from LLMs. Useful when you need multiple named fields without JSON formatting, which some LLMs handle more naturally with XML tags.
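
For intuition, tag-based extraction can be sketched with a single regular expression. `extractTags` below is a hypothetical helper, not the XMLParser implementation; it assumes flat, non-nested tags without attributes and returns every value as a string, so numeric fields still need converting by the caller.

```typescript
// Hypothetical helper for illustration; not the XMLParser implementation.
function extractTags(text: string): Record<string, string> {
  const result: Record<string, string> = {};
  // <name>value</name>, where the closing tag must match the opening one.
  const pattern = /<([A-Za-z_][\w-]*)>([\s\S]*?)<\/\1>/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(text)) !== null) {
    result[match[1]] = match[2].trim();
  }
  return result;
}

const fields = extractTags("Analysis:\n<sentiment>positive</sentiment>\n<confidence>0.92</confidence>");
console.log(fields); // { sentiment: 'positive', confidence: '0.92' }
console.log(Number(fields.confidence) > 0.9); // true
```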
    import { XMLParser } from '@orka-js/tools';

    // Basic usage: extract all XML tags
    const parser = new XMLParser();

    const llmOutput = `Here is my analysis:

    <summary>The product has strong market potential</summary>
    <sentiment>positive</sentiment>
    <confidence>0.92</confidence>
    <reasoning>Based on market trends and competitor analysis, the product fills a clear gap.</reasoning>`;

    const data = parser.parse(llmOutput);
    console.log(data);
    // {
    //   summary: 'The product has strong market potential',
    //   sentiment: 'positive',
    //   confidence: '0.92',
    //   reasoning: 'Based on market trends and competitor analysis...'
    // }

    // Strict mode: require specific tags
    const strictParser = new XMLParser({
      tags: ['summary', 'sentiment', 'confidence'],
      strict: true // Throws if any required tag is missing
    });

    const result = strictParser.parse(llmOutput);
    // ✅ Validates that all required tags are present

    // Get format instructions for the LLM
    console.log(strictParser.getFormatInstructions());
    // "Your response must use the following XML tags:
    // <summary>value</summary>
    // <sentiment>value</sentiment>
    // <confidence>value</confidence>"

# CSVParser
Parse CSV-formatted outputs into arrays of objects. Handles quoted fields, custom separators, and optional predefined headers. Ideal for tabular data extraction from LLMs.
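
Quoted fields are the part of CSV parsing that a plain `split(',')` gets wrong. The sketch below shows one way to handle them with a small state machine; `splitCsvLine` and `parseCsv` are hypothetical helpers rather than the CSVParser implementation, and they assume that a quoted field never spans multiple lines.

```typescript
// Hypothetical helpers for illustration; not the CSVParser implementation.
function splitCsvLine(line: string, separator = ","): string[] {
  const fields: string[] = [];
  let field = "";
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        field += '"'; // a doubled quote inside quotes is a literal quote
        i++;
      } else if (ch === '"') {
        inQuotes = false; // closing quote
      } else {
        field += ch; // separators inside quotes are ordinary characters
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === separator) {
      fields.push(field);
      field = "";
    } else {
      field += ch;
    }
  }
  fields.push(field);
  return fields;
}

// Uses the first non-empty line as headers and maps each later line to an object.
function parseCsv(text: string, separator = ","): Record<string, string>[] {
  const lines = text.split("\n").map((l) => l.trim()).filter((l) => l.length > 0);
  if (lines.length === 0) return [];
  const headers = splitCsvLine(lines[0], separator);
  return lines.slice(1).map((line) => {
    const cells = splitCsvLine(line, separator);
    const row: Record<string, string> = {};
    headers.forEach((header, i) => {
      row[header] = cells[i] ?? ""; // missing cells become empty strings
    });
    return row;
  });
}

console.log(parseCsv('name,description\n"Smith, John","Senior engineer, 10+ years"'));
// [ { name: 'Smith, John', description: 'Senior engineer, 10+ years' } ]
```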
    import { CSVParser } from '@orka-js/tools';

    // Auto-detect headers from first row
    const parser = new CSVParser();

    const llmOutput = `name,role,experience
    Alice,Engineer,5 years
    Bob,Designer,3 years
    Charlie,Manager,8 years`;

    const data = parser.parse(llmOutput);
    console.log(data);
    // [
    //   { name: 'Alice', role: 'Engineer', experience: '5 years' },
    //   { name: 'Bob', role: 'Designer', experience: '3 years' },
    //   { name: 'Charlie', role: 'Manager', experience: '8 years' }
    // ]

    // Predefined headers (no header row in data)
    const noHeaderParser = new CSVParser({
      headers: ['product', 'price', 'stock'],
      separator: ';', // Custom separator
      strict: true    // Enforce column count
    });

    const tabData = `iPhone;999;true
    MacBook;1999;false`;

    console.log(noHeaderParser.parse(tabData));
    // [
    //   { product: 'iPhone', price: '999', stock: 'true' },
    //   { product: 'MacBook', price: '1999', stock: 'false' }
    // ]

    // Handles quoted fields with commas
    const quotedCSV = `name,description
    "Smith, John","Senior engineer, 10+ years"`;
    console.log(parser.parse(quotedCSV));
    // [{ name: 'Smith, John', description: 'Senior engineer, 10+ years' }]

# CommaSeparatedListParser
A specialized parser for comma-separated lists. Simpler than CSVParser when you just need a flat list of values. Supports deduplication and automatic trimming.
    import { CommaSeparatedListParser } from '@orka-js/tools';

    const parser = new CommaSeparatedListParser({
      trim: true,              // Remove whitespace
      removeDuplicates: false  // Keep duplicates
    });

    const llmOutput = 'TypeScript, Python, Go, Rust, JavaScript';
    const items = parser.parse(llmOutput);
    console.log(items);
    // ['TypeScript', 'Python', 'Go', 'Rust', 'JavaScript']

    // With deduplication
    const deduper = new CommaSeparatedListParser({ removeDuplicates: true });
    const dupes = 'apple, banana, apple, orange, banana';
    console.log(deduper.parse(dupes));
    // ['apple', 'banana', 'orange']

    // Format instructions for the LLM
    console.log(parser.getFormatInstructions());
    // "Your response must be a comma-separated list of values.
    // Example: item1, item2, item3"

Complete Example
    import { createOrka } from '@orka-js/core';
    import { OpenAIAdapter } from '@orka-js/openai';
    import { StructuredOutputParser, AutoFixParser } from '@orka-js/tools';
    import { z } from 'zod';

    const orka = createOrka({
      llm: new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY! }),
      vectorDB: /* ... */
    });

    // Define schema
    const productSchema = z.object({
      name: z.string(),
      price: z.number(),
      category: z.enum(['electronics', 'clothing', 'food']),
      inStock: z.boolean(),
      tags: z.array(z.string())
    });

    // Create parser with auto-fix
    const baseParser = StructuredOutputParser.fromZodSchema(productSchema);
    const parser = new AutoFixParser({
      parser: baseParser,
      maxRetries: 2,
      llm: orka.getLLM()
    });

    // Generate structured output
    const prompt = `Extract product information from this description:
    The iPhone 15 Pro costs $999 and is currently available. It's an electronics item with tags: smartphone, apple, 5g

    ${baseParser.getFormatInstructions()}`;

    const response = await orka.generate(prompt);

    // Parse with validation and auto-fix
    const product = await parser.parseWithRetry(response);

    console.log(product);
    // {
    //   name: 'iPhone 15 Pro',
    //   price: 999,
    //   category: 'electronics',
    //   inStock: true,
    //   tags: ['smartphone', 'apple', '5g']
    // }
    // ✅ Type-safe, validated, and auto-corrected if needed

Comparison
| Output Parser | Extraction Logic | Validation Strategy |
|---|---|---|
| StructuredOutputParser (Enterprise) | Complex type-safe objects; Zod-driven mapping | Active Check |
| AutoFixParser (Resilient) | Self-healing data; recursive LLM repair | Active Check |
| JSONParser (Classic) | Standard object data; strict JSON parsing | Active Check |
| XMLParser (Robust) | Tag-based multi-fields; structural tag isolation | Active Check |
| ListParser (Clean) | Array collections; regex-based splitting | Active Check |
| CSVParser (Legacy) | Tabular data streams; columnar extraction | Active Check |
| CommaSeparatedListParser (Light) | Quick tags and enums; comma-delimited | Active Check |
Best Practices
1. Include Format Instructions
Always add parser.getFormatInstructions() to your prompts to guide the LLM.
2. Use Zod for Complex Schemas
StructuredOutputParser with Zod provides type safety, validation, and clear error messages.
3. Use AutoFix Sparingly
AutoFixParser makes extra LLM calls. Use it for critical data or when LLM outputs are unreliable.
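
To see where the extra calls come from, here is a minimal sketch of a parse-and-repair loop. `parseWithFix`, `Parse`, and `Fix` are hypothetical names and the fixer is a stand-in for an LLM call; this is not the AutoFixParser implementation, only the general shape of a retry loop bounded by `maxRetries`.

```typescript
type Parse<T> = (text: string) => T;
// Stand-in for an LLM call that receives the bad output and the error message.
type Fix = (badOutput: string, errorMessage: string) => Promise<string>;

async function parseWithFix<T>(
  text: string,
  parse: Parse<T>,
  fix: Fix,
  maxRetries: number
): Promise<{ value: T; llmCalls: number }> {
  let current = text;
  for (let llmCalls = 0; ; llmCalls++) {
    try {
      return { value: parse(current), llmCalls };
    } catch (error) {
      // Give up once the retry budget is spent.
      if (llmCalls >= maxRetries) throw error;
      const message = error instanceof Error ? error.message : String(error);
      // Each repair attempt costs one additional LLM call.
      current = await fix(current, message);
    }
  }
}

// Example parser: requires a numeric "age" field.
const parseAge: Parse<number> = (text) => {
  const value = JSON.parse(text);
  if (typeof value.age !== "number") throw new Error("age: expected number");
  return value.age;
};

// Fake fixer that "repairs" the known mistake without calling a model.
const fakeFix: Fix = async (bad) => bad.replace('"thirty"', "30");

parseWithFix('{"age": "thirty"}', parseAge, fakeFix, 3).then((result) => {
  console.log(result); // { value: 30, llmCalls: 1 }
});
```

In the worst case the loop makes `maxRetries` repair calls on top of the original generation, which is why a small retry budget is usually enough for well-prompted models.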
Tree-shaking Imports
    // ✅ Import only what you need
    import { StructuredOutputParser } from '@orka-js/tools';
    import { AutoFixParser } from '@orka-js/tools';
    import { XMLParser } from '@orka-js/tools';
    import { CSVParser } from '@orka-js/tools';
    import { CommaSeparatedListParser } from '@orka-js/tools';

    // ✅ Or import from index
    import { JSONParser, ListParser, XMLParser, CSVParser } from '@orka-js/tools';