OrkaJS

PII Guard

Detect and redact sensitive personal information before sending data to LLM APIs. A GDPR/RGPD-compliant data protection layer.

Regulatory Compliance

Ensures strict adherence to GDPR by anonymizing personal data before it hits cloud-based LLM APIs.

RGPD_GDPR · Legal Safety

Data Leak Protection (DLP)

Sensitive identifiers like CC numbers, SSNs, and emails are scrubbed locally within your infrastructure.

Leak_Prevention · Data Integrity

Enterprise Readiness

Comprehensive logging and handling of sensitive data, meeting the highest security audit standards.

Audit_Ready · Audit Control

Customer Trust

Demonstrates responsible AI practices, building long-term trust through transparent data handling.

Trust_Engine · Ethical AI

Quick Start

import { PIIGuard, redactPII } from '@orka-js/core';

// Simple one-liner
const safeText = redactPII("Contact john@example.com or call 06 12 34 56 78");
// → "Contact [EMAIL] or call [PHONE]"

// Full control with a PIIGuard instance
const guard = new PIIGuard({
  detectTypes: ['email', 'phone', 'credit_card', 'iban'],
  redactBeforeLLM: true,
});

const result = guard.detect("My card is 4111 1111 1111 1111");
console.log(result.redactedText); // "My card is [CREDIT_CARD]"
console.log(result.matches); // [{ type: 'credit_card', value: '4111...', ... }]

Detected PII Types

Electronic Mail (email): john@example.com, user.name@company.org → [EMAIL]

Contact Numbers (phone): (555) 123-4567, 06 12 34 56 78, +33 6 12 34 56 78 → [PHONE]

Financial Assets (credit_card): 4111111111111111, 5500 0000 0000 0004 → [CREDIT_CARD]

Government ID (ssn): 123-45-6789 → [SSN]

Banking Details (iban): FR7630006000011234567890189 → [IBAN]

Network Identifiers (ip_address): 192.168.1.100, 2001:0db8:85a3::8a2e:0370:7334 → [IP_ADDRESS]

Personal Dates (date_of_birth): 15/03/1990, 1990-03-15 → [DOB]
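Detection of these types is pattern-driven. As an illustration only (not the library's actual internals), a minimal typed-placeholder redactor for two of the types above could be sketched like this; the regexes here are simplified assumptions, far looser than what a production detector would use:

```typescript
// Illustrative sketch: map a PII type to a pattern and a typed placeholder.
const patterns: { type: string; regex: RegExp; placeholder: string }[] = [
  { type: 'email', regex: /[\w.+-]+@[\w-]+\.[\w.]+/g, placeholder: '[EMAIL]' },
  { type: 'ssn', regex: /\b\d{3}-\d{2}-\d{4}\b/g, placeholder: '[SSN]' },
];

// Apply each detector in turn, replacing matches with the typed placeholder.
function redactTyped(text: string): string {
  return patterns.reduce((acc, p) => acc.replace(p.regex, p.placeholder), text);
}

console.log(redactTyped('Mail john@example.com, SSN 123-45-6789'));
// → "Mail [EMAIL], SSN [SSN]"
```

The typed placeholders keep the redacted text readable for the LLM, which is why the library defaults to them over a generic [REDACTED].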

Configuration

const guard = new PIIGuard({
  // Enable/disable detection (default: true)
  enabled: true,

  // Types to detect (default: all common types)
  detectTypes: ['email', 'phone', 'credit_card', 'ssn', 'iban', 'ip_address'],

  // Auto-redact before LLM calls (default: true)
  redactBeforeLLM: true,

  // Use type-specific placeholders like [EMAIL] (default: true)
  useTypedPlaceholders: true,

  // Custom placeholder when useTypedPlaceholders is false
  redactionPlaceholder: '[REDACTED]',

  // Minimum confidence threshold, 0-1 (default: 0.8)
  minConfidence: 0.8,

  // Throw an error instead of redacting (strict mode)
  throwOnPII: false,

  // Callback invoked when PII is detected
  onPIIDetected: (matches) => {
    console.warn('PII detected:', matches.map(m => m.type));
  },
});

Integration with Orka

Use PIIGuard to scrub user input before it is sent to an LLM API:

import { PIIGuard } from '@orka-js/core';
import { OpenAIAdapter } from '@orka-js/openai';

const guard = new PIIGuard({
  detectTypes: ['email', 'phone', 'credit_card', 'iban'],
  onPIIDetected: (matches) => {
    // Log for audit trail
    console.log(`[AUDIT] Redacted ${matches.length} PII items`);
  },
});

const llm = new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY! });

// Protect user input before the LLM call
async function safeAsk(userQuestion: string) {
  const safeQuestion = guard.processForLLM(userQuestion);

  const result = await llm.generate(safeQuestion, {
    systemPrompt: "You are a helpful assistant.",
  });

  return result.content;
}

// Example usage
const answer = await safeAsk(
  "My email is john@company.com and my card is 4111111111111111"
);
// LLM receives: "My email is [EMAIL] and my card is [CREDIT_CARD]"

Custom Patterns

Add organization-specific patterns for internal IDs, codes, or other sensitive data:

const guard = new PIIGuard({
  customPatterns: [
    {
      name: 'employee_id',
      pattern: /EMP-\d{6}/g,
      redactWith: '[EMPLOYEE_ID]',
      confidence: 0.95,
    },
    {
      name: 'internal_project',
      pattern: /PRJ-[A-Z]{2}-\d{4}/g,
      redactWith: '[PROJECT_CODE]',
    },
    {
      name: 'french_secu',
      pattern: /[12]\s?\d{2}\s?\d{2}\s?\d{2}\s?\d{3}\s?\d{3}\s?\d{2}/g,
      redactWith: '[SECU_NUMBER]',
      confidence: 0.9,
    },
  ],
});

const result = guard.detect("Employee EMP-123456 on project PRJ-FR-2024");
// → "Employee [EMPLOYEE_ID] on project [PROJECT_CODE]"
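It is worth sanity-checking a custom regex in isolation before registering it, since an over-broad pattern will silently mangle clean text. For example, the french_secu pattern above can be exercised standalone (the sample number is fabricated for illustration):

```typescript
// Sanity-check a custom pattern before handing it to PIIGuard.
// The regex is copied verbatim from the french_secu example above.
const frenchSecu = /[12]\s?\d{2}\s?\d{2}\s?\d{2}\s?\d{3}\s?\d{3}\s?\d{2}/g;

const sample = 'Numéro: 1 84 12 76 451 089 46';
console.log(sample.replace(frenchSecu, '[SECU_NUMBER]'));
// → "Numéro: [SECU_NUMBER]"

// Gotcha: a RegExp with the g flag keeps lastIndex state between test()
// calls, so reset it (or use a fresh literal) when reusing the object.
frenchSecu.lastIndex = 0;
console.log(frenchSecu.test(sample)); // true
```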

Allow List (Exceptions)

Exclude specific patterns from redaction (e.g., company emails, public IPs):

const guard = new PIIGuard({
  allowList: [
    /support@company\.com/, // Company support email
    /noreply@company\.com/, // No-reply email
    /192\.168\.\d+\.\d+/,   // Internal network IPs
  ],
});

const result = guard.detect("Contact support@company.com or personal@gmail.com");
// → "Contact support@company.com or [EMAIL]"
// Company email preserved, personal email redacted

Strict Mode (Throw on PII)

For maximum security, throw an error instead of silently redacting:

import { PIIGuard, OrkaError, OrkaErrorCode } from '@orka-js/core';

const strictGuard = new PIIGuard({
  throwOnPII: true,
});

try {
  strictGuard.detect("My email is john@example.com");
} catch (error) {
  if (OrkaError.isOrkaError(error)) {
    console.error(`PII detected: ${error.message}`);
    // Handle: ask the user to remove sensitive data
  }
}

API Reference

Full Analysis

Comprehensive audit of the string. Returns an object with all PII matches, their positions, and the redacted version.

detect(text) · Deep Audit

Quick Masking

The standard filter. Returns only the sanitized text where all sensitive entities are replaced by placeholders.

redact(text) · Data Scrubbing

Boolean Guard

Lightweight check. Returns true if any PII is detected. Perfect for fast-fail pre-processing logic.

containsPII(text) · Validation
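Conceptually, such a boolean guard only needs a single early-exit scan over the configured detectors. A self-contained sketch of the idea (the function name and regexes here are illustrative, not the library's implementation):

```typescript
// Sketch of a fast-fail boolean check: stop at the first matching detector.
// These patterns are simplified stand-ins for the library's own detectors.
const detectors: RegExp[] = [
  /[\w.+-]+@[\w-]+\.[\w.]+/, // email
  /\b\d{3}-\d{2}-\d{4}\b/,   // SSN shape
];

function looksLikePII(text: string): boolean {
  // Array.prototype.some short-circuits on the first match.
  return detectors.some((re) => re.test(text));
}

console.log(looksLikePII('hello world'));           // false
console.log(looksLikePII('write to jane@corp.io')); // true
```

This is why the boolean form is cheaper than a full detect(): it can bail out on the first hit instead of collecting every match and position.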

Adaptive Gateway

Intelligent wrapper. Redacts the input when the redactBeforeLLM flag is enabled; otherwise returns it unchanged.

processForLLM(text) · Context Aware

Custom Extension

Inject new identification logic on the fly to detect domain-specific identifiers (e.g., internal IDs).

addPattern(ptn) · Dynamic Rule

Exception Handling

Define safe patterns that should never be redacted, preventing false positives on non-sensitive data.

allowList(regex) · Precision

Convenience Functions

import { redactPII, detectPII, createPIIGuard } from '@orka-js/core';
 
// Quick redaction
const safe = redactPII("Email: test@example.com");
 
// Quick detection
const result = detectPII("Phone: 06 12 34 56 78");
 
// Create configured instance
const guard = createPIIGuard({ detectTypes: ['email'] });

Best Practices

  • Always use PIIGuard on user-provided input before LLM calls
  • Log redacted PII counts for audit compliance
  • Use allow lists sparingly; reserve them for truly public data
  • Consider throwOnPII for high-security contexts
  • Add custom patterns for organization-specific sensitive data
  • Test with real-world data samples to tune confidence thresholds
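On the last point, tuning minConfidence amounts to deciding which candidate matches survive the threshold. A hedged sketch, assuming matches carry a confidence score as suggested by the Quick Start output (the Match shape and sample values here are illustrative):

```typescript
// Illustrative tuning helper: see which candidate matches survive a
// given confidence threshold.
interface Match { type: string; value: string; confidence: number }

const candidates: Match[] = [
  { type: 'email', value: 'a@b.com', confidence: 0.99 },
  { type: 'phone', value: '12 34', confidence: 0.55 }, // likely false positive
];

function survivors(matches: Match[], minConfidence: number): Match[] {
  return matches.filter((m) => m.confidence >= minConfidence);
}

console.log(survivors(candidates, 0.8).map((m) => m.type)); // ['email']
```

Running this kind of check over representative samples of your real traffic shows whether a threshold is dropping genuine PII (raise recall by lowering it) or flagging harmless text (lower noise by raising it).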