# PII Guard

Detect and redact sensitive personal information before sending data to LLM APIs. A GDPR/RGPD-compliant data protection layer.
- **Regulatory Compliance**: Ensures strict adherence to GDPR by anonymizing personal data before it reaches cloud-based LLM APIs.
- **Data Leak Protection (DLP)**: Sensitive identifiers like credit card numbers, SSNs, and emails are scrubbed locally, within your infrastructure.
- **Enterprise Readiness**: Comprehensive logging and handling of sensitive data, meeting the highest security audit standards.
- **Customer Trust**: Demonstrates responsible AI practices, building long-term trust through transparent data handling.
## Quick Start

```typescript
import { PIIGuard, redactPII } from '@orka-js/core';

// Simple one-liner
const safeText = redactPII("Contact john@example.com or call 06 12 34 56 78");
// → "Contact [EMAIL] or call [PHONE]"

// Full control with a PIIGuard instance
const guard = new PIIGuard({
  detectTypes: ['email', 'phone', 'credit_card', 'iban'],
  redactBeforeLLM: true,
});

const result = guard.detect("My card is 4111 1111 1111 1111");
console.log(result.redactedText); // "My card is [CREDIT_CARD]"
console.log(result.matches);      // [{ type: 'credit_card', value: '4111...', ... }]
```
## Detected PII Types

| Type | Key | Examples | Placeholder |
|------|-----|----------|-------------|
| Electronic Mail | `email` | john@example.com, user.name@company.org | `[EMAIL]` |
| Contact Numbers | `phone` | (555) 123-4567, 06 12 34 56 78, +33 6 12 34 56 78 | `[PHONE]` |
| Financial Assets | `credit_card` | 4111111111111111, 5500 0000 0000 0004 | `[CREDIT_CARD]` |
| Government ID | `ssn` | 123-45-6789 | `[SSN]` |
| Banking Details | `iban` | FR7630006000011234567890189 | `[IBAN]` |
| Network Identifiers | `ip_address` | 192.168.1.100, 2001:0db8:85a3::8a2e:0370:7334 | `[IP_ADDRESS]` |
| Personal Dates | `date_of_birth` | 15/03/1990, 1990-03-15 | `[DOB]` |
## Configuration

```typescript
const guard = new PIIGuard({
  // Enable/disable detection (default: true)
  enabled: true,

  // Types to detect (default: all common types)
  detectTypes: ['email', 'phone', 'credit_card', 'ssn', 'iban', 'ip_address'],

  // Auto-redact before LLM calls (default: true)
  redactBeforeLLM: true,

  // Use type-specific placeholders like [EMAIL] (default: true)
  useTypedPlaceholders: true,

  // Custom placeholder when useTypedPlaceholders is false
  redactionPlaceholder: '[REDACTED]',

  // Minimum confidence threshold, 0-1 (default: 0.8)
  minConfidence: 0.8,

  // Throw an error instead of redacting (strict mode)
  throwOnPII: false,

  // Callback invoked when PII is detected
  onPIIDetected: (matches) => {
    console.warn('PII detected:', matches.map(m => m.type));
  },
});
```
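To make `minConfidence` concrete: each candidate match carries a confidence score, and anything below the threshold is ignored rather than redacted. A self-contained sketch of that filtering step (the `Match` shape here is illustrative, not the library's exact type):

```typescript
// Illustrative match shape; field names mirror the docs but are assumptions.
interface Match {
  type: string;
  value: string;
  confidence: number;
}

// Matches below minConfidence are dropped before redaction.
function filterByConfidence(matches: Match[], minConfidence: number): Match[] {
  return matches.filter((m) => m.confidence >= minConfidence);
}

const candidates: Match[] = [
  { type: 'email', value: 'a@b.com', confidence: 0.99 },
  { type: 'date_of_birth', value: '15/03/1990', confidence: 0.6 }, // ambiguous date
];

console.log(filterByConfidence(candidates, 0.8).map((m) => m.type));
// → [ 'email' ]
```

Raising the threshold trades recall for precision: fewer false positives, but ambiguous values (like dates) may slip through.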
## Integration with Orka

Use PIIGuard to protect data before sending it to LLM APIs:
```typescript
import { PIIGuard } from '@orka-js/core';
import { OpenAIAdapter } from '@orka-js/openai';

const guard = new PIIGuard({
  detectTypes: ['email', 'phone', 'credit_card', 'iban'],
  onPIIDetected: (matches) => {
    // Log for the audit trail
    console.log(`[AUDIT] Redacted ${matches.length} PII items`);
  },
});

const llm = new OpenAIAdapter({ apiKey: process.env.OPENAI_API_KEY! });

// Protect user input before the LLM call
async function safeAsk(userQuestion: string) {
  const safeQuestion = guard.processForLLM(userQuestion);
  const result = await llm.generate(safeQuestion, {
    systemPrompt: "You are a helpful assistant.",
  });
  return result.content;
}

// Example usage
const answer = await safeAsk(
  "My email is john@company.com and my card is 4111111111111111"
);
// LLM receives: "My email is [EMAIL] and my card is [CREDIT_CARD]"
```
## Custom Patterns

Add organization-specific patterns for internal IDs, codes, or other sensitive data:
```typescript
const guard = new PIIGuard({
  customPatterns: [
    {
      name: 'employee_id',
      pattern: /EMP-\d{6}/g,
      redactWith: '[EMPLOYEE_ID]',
      confidence: 0.95,
    },
    {
      name: 'internal_project',
      pattern: /PRJ-[A-Z]{2}-\d{4}/g,
      redactWith: '[PROJECT_CODE]',
    },
    {
      name: 'french_secu',
      pattern: /[12]\s?\d{2}\s?\d{2}\s?\d{2}\s?\d{3}\s?\d{3}\s?\d{2}/g,
      redactWith: '[SECU_NUMBER]',
      confidence: 0.9,
    },
  ],
});

const result = guard.detect("Employee EMP-123456 on project PRJ-FR-2024");
// → "Employee [EMPLOYEE_ID] on project [PROJECT_CODE]"
```
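Custom regexes are easy to get subtly wrong, so it pays to exercise them in isolation before wiring them into the guard. For example, checking the `french_secu` pattern above directly (the sample number is made up for illustration):

```typescript
// The french_secu pattern from the config above, tested on its own.
const frenchSecu = /[12]\s?\d{2}\s?\d{2}\s?\d{2}\s?\d{3}\s?\d{3}\s?\d{2}/;

// 15-digit NIR-style numbers, spaced or not, should match.
console.log(frenchSecu.test("1 85 03 75 123 456 78")); // → true
console.log(frenchSecu.test("185037512345678"));       // → true

// Unrelated text should not.
console.log(frenchSecu.test("order ref 42"));          // → false
```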
## Allow List (Exceptions)

Exclude specific patterns from redaction (e.g., company emails, internal network IPs):
```typescript
const guard = new PIIGuard({
  allowList: [
    /support@company\.com/,   // Company support email
    /noreply@company\.com/,   // No-reply email
    /192\.168\.\d+\.\d+/,     // Internal network IPs
  ],
});

const result = guard.detect("Contact support@company.com or personal@gmail.com");
// → "Contact support@company.com or [EMAIL]"
// Company email preserved, personal email redacted
```
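The allow-list semantics can be sketched as a check that runs before each replacement: if a candidate match also matches an allow-list entry, it is left untouched. A minimal self-contained illustration of that precedence (patterns are illustrative, not the library's internals):

```typescript
const emailPattern = /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/g;
const allowList = [/support@company\.com/, /noreply@company\.com/];

// Allow-listed matches are preserved; everything else gets a placeholder.
function redactWithAllowList(text: string): string {
  return text.replace(emailPattern, (match) =>
    allowList.some((allowed) => allowed.test(match)) ? match : '[EMAIL]'
  );
}

console.log(redactWithAllowList("Contact support@company.com or personal@gmail.com"));
// → "Contact support@company.com or [EMAIL]"
```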
## Strict Mode (Throw on PII)

For maximum security, throw an error instead of silently redacting:
```typescript
import { PIIGuard, OrkaError, OrkaErrorCode } from '@orka-js/core';

const strictGuard = new PIIGuard({
  throwOnPII: true,
});

try {
  strictGuard.detect("My email is john@example.com");
} catch (error) {
  if (OrkaError.isOrkaError(error)) {
    console.error(`PII detected: ${error.message}`);
    // Handle: ask the user to remove sensitive data
  }
}
```
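The strict-mode control flow boils down to: check before anything leaves the process, and force callers to handle the failure. A self-contained sketch of that flow (the error class here is illustrative; the library throws an `OrkaError`):

```typescript
class PIIDetectedError extends Error {}

const emailPattern = /[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/;

// Throw instead of redacting, so callers must deal with sensitive input.
function assertNoPII(text: string): string {
  if (emailPattern.test(text)) {
    throw new PIIDetectedError('PII detected: remove sensitive data and retry');
  }
  return text;
}

try {
  assertNoPII("My email is john@example.com");
} catch (error) {
  if (error instanceof PIIDetectedError) {
    console.error(error.message);
  }
}
```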
## API Reference

| Method | Purpose | Description |
|--------|---------|-------------|
| `detect(text)` | Full Analysis | Comprehensive audit of the string. Returns an object with all PII matches, their positions, and the redacted version. |
| `redact(text)` | Quick Masking | The standard filter. Returns only the sanitized text, with all sensitive entities replaced by placeholders. |
| `containsPII(text)` | Boolean Guard | Lightweight check. Returns `true` if any PII is detected; well suited to fast-fail pre-processing logic. |
| `processForLLM(text)` | Adaptive Gateway | Intelligent wrapper. Redacts content based on the `redactBeforeLLM` flag, or returns the raw input. |
| `addPattern(ptn)` | Custom Extension | Injects new detection logic on the fly for domain-specific identifiers (e.g., internal IDs). |
| `allowList(regex)` | Exception Handling | Defines safe patterns that should never be redacted, preventing false positives on non-sensitive data. |
## Convenience Functions

```typescript
import { redactPII, detectPII, createPIIGuard } from '@orka-js/core';

// Quick redaction
const safe = redactPII("Email: test@example.com");

// Quick detection
const result = detectPII("Phone: 06 12 34 56 78");

// Create a configured instance
const guard = createPIIGuard({ detectTypes: ['email'] });
```
## Best Practices

- ✅ Always use PIIGuard on user-provided input before LLM calls
- ✅ Log redacted PII counts for audit compliance
- ✅ Use allow lists sparingly - only for truly public data
- ✅ Consider throwOnPII for high-security contexts
- ✅ Add custom patterns for organization-specific sensitive data
- ✅ Test with real-world data samples to tune confidence thresholds
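On the last point: when tuning confidence for `credit_card` candidates, a checksum pass is a common way to separate issued card numbers from random digit runs. A self-contained Luhn-check sketch that could back such a heuristic (our own illustration, not the library's implementation):

```typescript
// Luhn checksum: issued card numbers pass; most random digit strings don't.
function luhnValid(candidate: string): boolean {
  const digits = candidate.replace(/[\s-]/g, '');
  if (!/^\d{12,19}$/.test(digits)) return false;
  let sum = 0;
  for (let i = 0; i < digits.length; i++) {
    // Walk from the rightmost digit; double every second one.
    let d = Number(digits[digits.length - 1 - i]);
    if (i % 2 === 1) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
  }
  return sum % 10 === 0;
}

console.log(luhnValid("4111 1111 1111 1111")); // → true
console.log(luhnValid("4111 1111 1111 1112")); // → false
```

A regex hit that also passes the checksum deserves a higher confidence score than one that does not.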