---
name: pii-detection
description: Detect personally identifiable information (PII) in code, data, and agent outputs before processing or storage. Multi-language, GDPR-aware.
version: "1.0.0"
last-updated: "2026-04-17"
model_tested: "claude-sonnet-4-6"
category: privacy
platforms: [claude-code, codex, gemini-cli, cursor, copilot, windsurf, cline]
language: en
geo_relevance: [global, eu, fr]
priority: high
dependencies:
  mcp: []
  skills: []
  apis: []
  data: []
update_sources:
  - url: "https://www.cnil.fr/fr/reglement-europeen-protection-donnees"
    check_frequency: "yearly"
    last_checked: "2026-04-21"
license: MIT
---
PII Detection
DISCLAIMER: This skill provides detection patterns only and does not guarantee complete PII identification. Always consult a Data Protection Officer (DPO) before making GDPR compliance decisions.
When to Use
- Before sending data to an LLM API
- Before logging or storing agent outputs
- During data pipeline design
- When reviewing code that handles user data
- Before sharing datasets or exports
PII Categories
High Sensitivity (must always detect)
| Category | Patterns | Examples |
|---|---|---|
| Email | \b[\w.-]+@[\w.-]+\.\w+\b | user@example.com |
| Phone | \+?\d[\d\s\-().]{7,}\d | +33 6 12 34 56 78 |
| SSN (FR) | \b[12]\s?\d{2}\s?\d{2}\s?\d{2}\s?\d{3}\s?\d{3}\s?\d{2}\b | 1 85 05 75 123 456 78 |
| SSN (US) | \b\d{3}-\d{2}-\d{4}\b | 123-45-6789 |
| Credit Card | \b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b | 4111-1111-1111-1111 |
| IBAN | \b[A-Z]{2}\d{2}[\s]?[\dA-Z]{4}[\s]?[\dA-Z]{4}[\s]?[\dA-Z]{4} | FR76 1234 5678 9012 |
| Passport | Context-dependent | 12AB34567 |
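
The regex rows of the table above can be compiled and run over a text as follows. This is a minimal sketch, not a complete detector: the names `HIGH_PATTERNS`, `luhn_valid`, and `scan_high` are illustrative, and the Luhn checksum filter for card numbers is an added assumption that the table does not prescribe.

```python
import re

# Patterns copied from the high-sensitivity table; dictionary name is illustrative.
HIGH_PATTERNS = {
    "Email": re.compile(r"\b[\w.-]+@[\w.-]+\.\w+\b"),
    "Phone": re.compile(r"\+?\d[\d\s\-().]{7,}\d"),
    "SSN (FR)": re.compile(r"\b[12]\s?\d{2}\s?\d{2}\s?\d{2}\s?\d{3}\s?\d{3}\s?\d{2}\b"),
    "SSN (US)": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "Credit Card": re.compile(r"\b\d{4}[\s-]?\d{4}[\s-]?\d{4}[\s-]?\d{4}\b"),
    "IBAN": re.compile(r"\b[A-Z]{2}\d{2}[\s]?[\dA-Z]{4}[\s]?[\dA-Z]{4}[\s]?[\dA-Z]{4}"),
}


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def scan_high(text: str) -> list[tuple[str, str, int]]:
    """Return (category, matched text, start offset) for every pattern hit."""
    hits = []
    for category, pattern in HIGH_PATTERNS.items():
        for m in pattern.finditer(text):
            # Assumption: drop 16-digit runs that fail Luhn to cut false positives.
            if category == "Credit Card" and not luhn_valid(m.group()):
                continue
            hits.append((category, m.group(), m.start()))
    return hits
```

The patterns overlap: the phone pattern is broad enough to also match card numbers and SSNs, so a caller would likely need to deduplicate hits by offset. The IBAN pattern as written matches only the first 16 characters of an IBAN, which is enough to detect one but not to redact it in full.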
Medium Sensitivity (detect in context)
| Category | Patterns | Notes |
|---|---|---|
| Full name | Named entity recognition | Combine first + last name |
| Address | Street + city + postal code | Multi-format per country |
| Date of birth | Date near age/birth context | Not all dates are PII |
| IP Address | \b\d{1,3}(\.\d{1,3}){3}\b | IPv4; check if user-linked |
| Vehicle plate | Country-specific format | FR: AA-123-BB |
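
The IPv4 pattern above also accepts impossible addresses such as 999.1.1.1. One way to tighten it, sketched here with Python's standard `ipaddress` module, is to validate each regex candidate and record whether it falls in a private range. The function name `find_ipv4` is illustrative, and whether an address is actually user-linked remains a contextual decision that this code does not make.

```python
import ipaddress
import re

# Candidate pattern copied from the medium-sensitivity table.
IPV4_CANDIDATE = re.compile(r"\b\d{1,3}(\.\d{1,3}){3}\b")


def find_ipv4(text: str) -> list[tuple[str, bool]]:
    """Return (address, is_private) for each syntactically valid IPv4 address."""
    results = []
    for m in IPV4_CANDIDATE.finditer(text):
        try:
            addr = ipaddress.IPv4Address(m.group())
        except ipaddress.AddressValueError:
            # Regex matched, but an octet is out of range (e.g. 999.1.1.1).
            continue
        results.append((str(addr), addr.is_private))
    return results
```

Version strings with four components (for example 1.2.3.4) still pass this check, so a context check on the surrounding text is still needed.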
Low Sensitivity (flag only)
| Category | Notes |
|---|---|
| Age | PII only when combined with other attributes |
| Gender | PII only when combined with other attributes |
| Location (city) | PII only when precise |
| Job title | PII only with employer |
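
Since low-sensitivity attributes matter mainly in combination, a scanner could count how many of them appear together in one record. The sketch below assumes records are dictionaries with hypothetical field names (`age`, `gender`, `city`, `job_title`) and uses a threshold of two attributes, which is an illustrative choice rather than a rule from this skill.

```python
# Hypothetical field names for the low-sensitivity categories above.
QUASI_IDENTIFIERS = ("age", "gender", "city", "job_title")


def combination_risk(record: dict, threshold: int = 2) -> tuple[bool, list[str]]:
    """Return (is_risky, present_fields) for one record.

    A record is treated as risky when at least `threshold` low-sensitivity
    fields are populated. The threshold is an assumption, not a legal test.
    """
    present = [k for k in QUASI_IDENTIFIERS if record.get(k) not in (None, "")]
    return len(present) >= threshold, present
```

For example, `combination_risk({"city": "Paris"})` reports no combination risk, while a record with both age and gender populated is flagged.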
Detection Process
- Scan the input text for high-sensitivity patterns
- Flag medium-sensitivity items only when the surrounding context confirms them
- Report findings with location and category
- Recommend action: redact, anonymize, or approve
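
The four steps above can be sketched as a small line-by-line pipeline. Everything named here is hypothetical (`Finding`, `scan`, `recommend`, the date format, the birth-context keywords), only two high-sensitivity patterns are included for brevity, and the mapping from severity to action is an assumption made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Finding:
    level: str      # "HIGH" or "MED"
    category: str
    value: str
    line: int       # 1-based line number in the scanned text


# Step 1 patterns: a subset of the high-sensitivity table.
HIGH = {
    "Email": re.compile(r"\b[\w.-]+@[\w.-]+\.\w+\b"),
    "SSN (US)": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}
# Step 2 context check: a date counts only near birth-related words (assumed list).
DOB_CONTEXT = re.compile(r"\b(born|birth|dob|age)\b", re.IGNORECASE)
DATE = re.compile(r"\b\d{1,2}/\d{1,2}/\d{4}\b")


def scan(text: str) -> list[Finding]:
    """Steps 1 to 3: collect findings with category and line location."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for category, pattern in HIGH.items():
            for m in pattern.finditer(line):
                findings.append(Finding("HIGH", category, m.group(), lineno))
        if DOB_CONTEXT.search(line):
            for m in DATE.finditer(line):
                findings.append(Finding("MED", "Date of birth", m.group(), lineno))
    return findings


def recommend(findings: list[Finding]) -> str:
    """Step 4: map the worst severity to an action (assumed mapping)."""
    levels = {f.level for f in findings}
    if "HIGH" in levels:
        return "redact"
    if "MED" in levels:
        return "anonymize"
    return "approve"
```

Scanning line by line keeps the location report simple, at the cost of missing values that span a line break.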
Remediation Actions
| Action | When | Method |
|---|---|---|
| Redact | Before logging | Replace with [REDACTED] |
| Pseudonymize | Before analysis | Replace with a consistent surrogate value (e.g., a keyed hash) |
| Anonymize | Before sharing | Remove irreversibly (k-anonymity, differential privacy) |
| Encrypt | Before storage | AES-256 at rest, TLS in transit |
| Approve | Legitimate purpose | Document legal basis (GDPR Art. 6) |
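
The first two rows of the table can be illustrated for email addresses. This is a sketch under stated assumptions: the function names, the `user_` prefix, and the 12-character truncation are hypothetical, and HMAC-SHA256 is one possible way to get a consistent surrogate, not a method this skill mandates.

```python
import hashlib
import hmac
import re

EMAIL = re.compile(r"\b[\w.-]+@[\w.-]+\.\w+\b")


def redact(text: str) -> str:
    """Replace every email address with a fixed placeholder."""
    return EMAIL.sub("[REDACTED]", text)


def pseudonymize(text: str, key: bytes) -> str:
    """Replace every email address with a keyed, consistent surrogate.

    The same address and key always give the same surrogate, so records can
    still be joined. Without the key the mapping cannot be recomputed.
    """
    def surrogate(match: re.Match) -> str:
        digest = hmac.new(
            key, match.group().lower().encode("utf-8"), hashlib.sha256
        ).hexdigest()
        return f"user_{digest[:12]}"  # truncation length is an arbitrary choice

    return EMAIL.sub(surrogate, text)
```

A keyed hash is used instead of a plain hash because unkeyed hashes of emails can be reversed by hashing a list of known addresses. Pseudonymized data is still personal data under GDPR, so the key needs the same protection as the original values.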
GDPR Special Categories (Article 9)
Processing these categories is prohibited unless an Article 9(2) exception applies, such as explicit consent:
- Racial or ethnic origin
- Political opinions
- Religious or philosophical beliefs
- Trade union membership
- Genetic data
- Biometric data (for identification)
- Health data
- Sex life or sexual orientation
Detection: use keyword lists combined with context analysis. Never store this data without a documented legal basis.
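
A keyword list with a context check might look like the sketch below. The keywords, the person-context terms, and the sentence-level window are all illustrative assumptions. A production list would be far larger, language-specific, and reviewed by someone with legal input.

```python
import re

# Illustrative keyword lists for two of the categories above.
SPECIAL_KEYWORDS = {
    "Health data": ["diagnosis", "diabetes", "prescription"],
    "Trade union membership": ["trade union", "union member"],
}
# Assumed context rule: the sentence must also refer to a person.
PERSON_CONTEXT = re.compile(
    r"\b(he|she|they|patient|employee|user|customer)\b", re.IGNORECASE
)


def scan_special(text: str) -> list[tuple[str, str]]:
    """Return (category, keyword) pairs found in sentences that mention a person."""
    hits = []
    # Naive sentence split on ., ! or ? followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        if not PERSON_CONTEXT.search(sentence):
            continue
        lowered = sentence.lower()
        for category, keywords in SPECIAL_KEYWORDS.items():
            for kw in keywords:
                if re.search(rf"\b{re.escape(kw)}\b", lowered):
                    hits.append((category, kw))
    return hits
```

The context check keeps general statements ("Diabetes research is growing.") from being flagged while still catching sentences that tie a keyword to an individual.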
Output Format
PII SCAN REPORT
Source: {file/variable/output}
Items found: {count}
[HIGH] Email: user@company.com (line 42)
[HIGH] Phone: +33 6 12 34 56 78 (line 67)
[MED] Full name: "Jean Dupont" (line 12, with address context)
[LOW] City: "Paris" (line 15, no combination risk)
Recommendation: Redact HIGH items before processing.
GDPR basis required: YES (personal data detected)
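
A report in the format above could be produced by a small formatter. In this sketch findings are assumed to be `(level, category, value, location)` tuples. The function name, the fallback recommendation text, and the wording used when nothing is found are hypothetical.

```python
LEVEL_ORDER = {"HIGH": 0, "MED": 1, "LOW": 2}


def render_report(source: str, findings: list[tuple[str, str, str, str]]) -> str:
    """Format findings as a PII SCAN REPORT, most sensitive items first."""
    lines = [
        "PII SCAN REPORT",
        f"Source: {source}",
        f"Items found: {len(findings)}",
    ]
    # sorted() is stable, so items of equal level keep their input order.
    for level, category, value, where in sorted(
        findings, key=lambda f: LEVEL_ORDER[f[0]]
    ):
        lines.append(f"[{level}] {category}: {value} ({where})")
    if any(f[0] == "HIGH" for f in findings):
        lines.append("Recommendation: Redact HIGH items before processing.")
    else:
        # Assumed wording for reports without HIGH items.
        lines.append("Recommendation: Review flagged items in context.")
    if findings:
        lines.append("GDPR basis required: YES (personal data detected)")
    else:
        # Assumed wording: an empty scan does not prove the absence of PII.
        lines.append("GDPR basis required: NOT DETERMINED (nothing detected by this scan)")
    return "\n".join(lines)
```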
What This Skill Does NOT Do
- Does not encrypt or store data (patterns only)
- Does not determine GDPR legal basis (legal decision)
- Does not handle consent management (UI/UX concern)
- Does not replace a DPO assessment