Data Protection

Overview

Data protection spans both compliance requirements and technical controls. Organizations must protect data not only to meet regulatory obligations but also to maintain customer trust and limit business risk. Privacy by design, the principle of embedding privacy considerations into every stage of system development, is required under GDPR (Article 25, "data protection by design and by default") and is increasingly expected under other frameworks. Effective data protection requires understanding what data you have, where it lives, how it flows, who can access it, and how long it must be retained.

Data Classification

Data classification assigns sensitivity levels to information assets, enabling organizations to apply proportionate security controls. Every piece of data should be classified, and the classification should drive technical and procedural safeguards.

Level	Description	Examples	Controls
Public	Information intended for public disclosure; no adverse impact if exposed	Marketing materials, public documentation, open-source code, press releases	No special controls required; integrity protections recommended
Internal	Information for internal use only; minor impact if disclosed	Internal policies, meeting notes, project plans, org charts	Access restricted to employees; basic access controls; no public sharing
Confidential	Sensitive business information; significant impact if disclosed	Customer lists, financial reports, contracts, source code, employee records	Encryption at rest and in transit; role-based access control; audit logging; NDA required for third-party access
Restricted	Highly sensitive data; severe regulatory, legal, or business impact if disclosed	PII, PHI, payment card data (PCI), trade secrets, cryptographic keys	Strong encryption (AES-256); strict need-to-know access; MFA required; full audit trail; data loss prevention (DLP); breach notification obligations
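
As a rough illustration of letting classification drive safeguards, the sketch below maps each level to a set of required controls and reports what an asset is still missing. The control names and the `missing_controls` helper are hypothetical and much coarser than the table above.

```python
from enum import IntEnum


class Classification(IntEnum):
    """Sensitivity levels from the table above, ordered least to most sensitive."""
    PUBLIC = 0
    INTERNAL = 1
    CONFIDENTIAL = 2
    RESTRICTED = 3


# Hypothetical control flags; a real policy would carry far more detail.
REQUIRED_CONTROLS = {
    Classification.PUBLIC: set(),
    Classification.INTERNAL: {"access_control"},
    Classification.CONFIDENTIAL: {"access_control", "encryption", "audit_logging"},
    Classification.RESTRICTED: {"access_control", "encryption", "audit_logging", "mfa", "dlp"},
}


def missing_controls(level: Classification, applied: set[str]) -> set[str]:
    """Return the controls the level requires that the asset does not yet have."""
    return REQUIRED_CONTROLS[level] - applied
```

A check like this could run in CI or an asset inventory job, flagging any store whose declared classification is not matched by its configured controls.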

Encryption Requirements

At Rest

Data at rest — stored in databases, file systems, object storage, or backups — must be encrypted to protect against unauthorized access in the event of physical theft, unauthorized backup access, or misconfigured storage permissions.

Requirement	Details
Algorithm	AES-256 (AES-256-GCM preferred for authenticated encryption)
Transparent encryption	Use platform-native encryption (AWS KMS + S3 SSE, Azure Storage Service Encryption, GCP CMEK) for minimal application changes
Key storage	Keys must be stored separately from encrypted data; use dedicated key management services (AWS KMS, Azure Key Vault, HashiCorp Vault, GCP Cloud KMS)
Key rotation	Rotate encryption keys at least annually; automate rotation where possible; retain old keys for decrypting historical data
Database encryption	Enable Transparent Data Encryption (TDE) for relational databases; use field-level encryption for highly sensitive columns
Backup encryption	All backups must be encrypted with the same rigor as primary data; test restoration from encrypted backups regularly
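
The key rotation row can be sketched as a small version registry: new data always uses the newest key version, while older versions stay available for decrypting historical data. This is a minimal sketch that tracks metadata only; the `KeyRing` class is hypothetical, and the key material is assumed to live in a key management service.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

MAX_KEY_AGE = timedelta(days=365)  # "at least annually" from the table


@dataclass
class KeyRing:
    """Tracks key versions by creation time; key material itself stays in the KMS."""
    versions: dict[int, datetime] = field(default_factory=dict)

    def rotate(self, now: datetime) -> int:
        """Register a new key version and return its id; older versions are kept."""
        version = max(self.versions, default=0) + 1
        self.versions[version] = now
        return version

    def current(self) -> int:
        """Newest version: the only one used for new encryption."""
        return max(self.versions)

    def needs_rotation(self, now: datetime) -> bool:
        """True when the newest key is older than the allowed maximum age."""
        return now - self.versions[self.current()] > MAX_KEY_AGE
```

Storing the key version alongside each ciphertext is what makes it possible to pick the right retained key at decryption time.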

In Transit

Data in transit — moving between clients and servers, between services, or between data centers — must be protected from eavesdropping and tampering.

Requirement	Details
Protocol	TLS 1.2 as minimum; TLS 1.3 preferred; disable TLS 1.0/1.1 and SSL entirely
HTTPS everywhere	All API endpoints, web interfaces, and webhooks must use HTTPS; redirect HTTP to HTTPS; set HSTS headers
Certificate validation	Validate server certificates in all clients; do not disable certificate verification in production; pin certificates for mobile apps and critical service-to-service calls where appropriate
Internal traffic	Encrypt service-to-service communication within the network using mTLS or a service mesh (Istio, Linkerd); do not assume internal networks are trusted
Certificate management	Use automated certificate management (Let's Encrypt, AWS ACM) to prevent expiration-related outages; monitor certificate expiry
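
For a Python client, the protocol and certificate validation rows might translate into something like the following, using only the standard-library `ssl` module. The function name `strict_client_context` is an illustrative choice.

```python
import ssl


def strict_client_context() -> ssl.SSLContext:
    """Client-side TLS context: verified certificates, TLS 1.2 or newer."""
    # create_default_context() enables certificate and hostname verification.
    ctx = ssl.create_default_context()
    # Refuse TLS 1.0/1.1 and SSL; TLS 1.3 is negotiated when both sides support it.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx
```

The returned context can be passed to standard-library clients, for example `urllib.request.urlopen(url, context=strict_client_context())`.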

Privacy Regulations

The following table compares the major privacy regulations that most organizations must consider. This is a summary — consult legal counsel for jurisdiction-specific compliance requirements.

Regulation	Jurisdiction	Key Requirements	Penalties
GDPR	European Union (and EEA)	Lawful basis for processing (one of six, including consent, contract, and legitimate interests); data subject rights (access, rectification, erasure, portability); Data Protection Officer (DPO) required for public authorities and for large-scale monitoring or large-scale processing of special-category data; breach notification to supervisory authority within 72 hours of becoming aware of the breach; Data Protection Impact Assessments (DPIA) for high-risk processing; privacy by design and by default	Up to 4% of global annual turnover or 20 million EUR, whichever is higher
CCPA/CPRA	California, USA	Right to know what data is collected; right to delete personal information; right to opt-out of sale/sharing of personal information; right to non-discrimination for exercising rights; data minimization and purpose limitation (CPRA); risk assessments for high-risk processing (CPRA)	Up to $7,500 per intentional violation; $2,500 per unintentional violation; private right of action for data breaches ($100-$750 per consumer per incident)
HIPAA	United States (healthcare)	Administrative, physical, and technical safeguards for Protected Health Information (PHI); Business Associate Agreements (BAA) required for third-party processors; access logs and audit controls; minimum necessary standard for PHI access; breach notification to individuals and HHS	Civil penalties up to $1.5 million per year per violation category; criminal penalties for knowingly obtaining or disclosing PHI (up to $250,000 and imprisonment)
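
The "whichever is higher" rule in the GDPR row is easy to misread, so here is the arithmetic as a one-line sketch. It computes only the statutory ceiling for the upper fine tier; actual fines are set case by case by supervisory authorities, and the function name is illustrative.

```python
def gdpr_max_fine_eur(global_annual_turnover_eur: float) -> float:
    """Ceiling from the table: the higher of 4% of annual turnover or EUR 20 million."""
    return max(0.04 * global_annual_turnover_eur, 20_000_000.0)
```

For a company with EUR 100 million in turnover, 4% is EUR 4 million, so the EUR 20 million floor applies; at EUR 1 billion, the 4% figure (EUR 40 million) becomes the ceiling.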

PII Handling

Personally Identifiable Information (PII) requires specific technical and procedural controls throughout its lifecycle.

Data Minimization

Collect only the personal data that is strictly necessary for the stated purpose. Review data collection forms, API request schemas, and database schemas to identify and eliminate unnecessary fields. Ask: "Do we need this data to provide the service?" If not, do not collect it.
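
One way to enforce minimization at an API boundary is an explicit allow-list, so that fields nobody decided to collect are dropped before they reach storage. The field names and the `minimize` helper below are hypothetical.

```python
# Hypothetical allow-list for a sign-up request; anything else is dropped.
ALLOWED_SIGNUP_FIELDS = {"email", "display_name"}


def minimize(payload: dict, allowed: set[str] = ALLOWED_SIGNUP_FIELDS) -> dict:
    """Keep only the fields the stated purpose needs."""
    return {key: value for key, value in payload.items() if key in allowed}
```

An allow-list fails safe: a new field added by a client is discarded until someone documents why it is needed.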

Purpose Limitation

Use personal data only for the purpose for which it was collected. If a new use case arises, assess whether it is compatible with the original purpose or whether new consent is required. Document the purpose for each data element in a data inventory.

Tokenization and Pseudonymization

Tokenization: Replace sensitive data with non-sensitive tokens that map back to the original data via a secure token vault. Tokens have no mathematical relationship to the original data.
Pseudonymization: Replace identifying fields with pseudonyms; the data can be re-identified with a separate key. GDPR recognizes pseudonymization as a risk-reduction measure but still treats pseudonymized data as personal data.
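
The difference between the two techniques can be sketched with the standard library. The `TokenVault` class is an in-memory stand-in for a real vault, and keyed HMAC is only one possible pseudonymization scheme: it is one-way, so re-identification here means recomputing the pseudonym for a known identifier using the separately stored key, not decrypting it.

```python
import hashlib
import hmac
import secrets


class TokenVault:
    """In-memory stand-in for a secure token vault (illustration only)."""

    def __init__(self) -> None:
        self._by_token: dict[str, str] = {}

    def tokenize(self, value: str) -> str:
        # Random token: no mathematical relationship to the original value.
        token = "tok_" + secrets.token_hex(16)
        self._by_token[token] = value
        return token

    def detokenize(self, token: str) -> str:
        # Only callers with vault access can recover the original value.
        return self._by_token[token]


def pseudonymize(value: str, key: bytes) -> str:
    """Keyed HMAC-SHA256 pseudonym: stable for the same key and input."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
```

Because the pseudonym is stable, records for the same person can still be joined in analytics without exposing the identifier, which is also why the output remains personal data.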

Right to Erasure (Right to be Forgotten)

Implement a process to delete or anonymize all personal data associated with a data subject upon verified request. This must cover primary databases, backups, caches, logs, analytics systems, and third-party processors. Key considerations:

Define what "erasure" means for each data store (hard delete vs. cryptographic erasure vs. anonymization).
Propagate deletion requests to all downstream systems and third-party processors.
Maintain a record of the deletion request and its execution (without retaining the deleted data itself).
Handle exceptions for legal holds and regulatory retention requirements.
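
The considerations above can be sketched as a single routine that fans a request out to every store, respects legal holds, and returns an execution record. The dictionaries stand in for real systems, and every name here (`erase_subject`, `request_id`, the record fields) is hypothetical.

```python
from datetime import datetime, timezone


def erase_subject(request_id: str, subject_id: str,
                  stores: dict[str, dict], legal_holds: set[str]) -> dict:
    """Delete a subject from every store unless a legal hold applies.

    `stores` maps a store name to a dict keyed by subject id.
    """
    if subject_id in legal_holds:
        return {"request_id": request_id, "status": "blocked_by_legal_hold", "erased_from": []}
    erased_from = []
    for name, data in stores.items():
        if subject_id in data:
            del data[subject_id]
            erased_from.append(name)
    # The record notes which stores were touched and when, never the deleted data.
    return {
        "request_id": request_id,
        "status": "erased",
        "erased_from": erased_from,
        "completed_at": datetime.now(timezone.utc).isoformat(),
    }
```

In a real deployment each store would need its own erasure strategy (hard delete, cryptographic erasure, or anonymization), and third-party processors would be notified through their own APIs.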

Data Subject Access Requests (DSAR)

Build automated or semi-automated workflows to respond to DSARs within regulatory timelines (one month under GDPR, extendable by up to two further months for complex or numerous requests). The system must be able to locate, extract, and export all personal data associated with a given identity across all data stores.
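
A minimal sketch of the locate, extract, and export step, assuming each data store can be queried by subject id. The `stores` structure and the `export_subject_data` name are illustrative only.

```python
import json


def export_subject_data(subject_id: str, stores: dict[str, dict]) -> str:
    """Collect everything held about one subject and return it as JSON."""
    found = {name: data[subject_id] for name, data in stores.items() if subject_id in data}
    return json.dumps({"subject_id": subject_id, "data": found}, indent=2, sort_keys=True)
```

Keeping one registry of stores for both access and erasure requests helps ensure that a newly added data store is covered by both workflows.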

Data Masking and Tokenization

Technique	When to Use	Details
Tokenization	When the original data must be retrievable by authorized systems (e.g., payment processing)	Non-reversible without access to the token vault; tokens can preserve format (e.g., same length as a credit card number); ideal for PCI DSS scope reduction
Format-Preserving Encryption (FPE)	When encrypted data must retain the same format as the original (e.g., SSN, phone number)	Encrypts data while preserving length and character set; useful for legacy systems that validate field formats; uses FF1 or FF3-1 algorithms
Dynamic Masking	When data must be masked in real-time based on the requesting user's role or context	Applied at query time; the underlying data is not modified; useful for analytics and support access where full data is not needed
Static Masking	When creating non-production copies of data (e.g., for testing or development environments)	Applied once to create a masked copy; original data structure is preserved; irreversible; ensures test environments contain no real PII

Key decisions:

Use tokenization when you need to reverse the process for legitimate business operations.
Use static masking for non-production environments — never use real PII in development or testing.
Use dynamic masking for production access control scenarios where different users need different levels of data visibility.
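
Dynamic masking, for instance, can be as simple as a role check applied when data is read. In this sketch the role name `payments_admin` and the last-four-digits rule are assumptions; the stored value is never modified.

```python
def mask_card_number(card_number: str, role: str) -> str:
    """Dynamic masking: what is returned depends on the caller's role."""
    # "payments_admin" is a hypothetical role allowed to see the full value.
    if role == "payments_admin":
        return card_number
    # Everyone else sees only the last four digits.
    return "*" * (len(card_number) - 4) + card_number[-4:]
```

A support agent calling `mask_card_number("4111111111111111", "support")` would see `************1111`, while the underlying record is unchanged.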

Data Retention

Data retention policies define how long data is kept and what happens when the retention period expires. Retaining data longer than necessary increases risk, storage costs, and regulatory exposure.

Define Retention Policies

Establish retention periods based on regulatory requirements, business needs, and contractual obligations.
Document retention policies in a data retention schedule that covers every data category.
Ensure retention periods are specific (e.g., "3 years after account closure") rather than vague (e.g., "as long as needed").

Automated Deletion

Implement automated processes to delete or anonymize data when the retention period expires.
Use database-level TTL (Time-to-Live) features, scheduled jobs, or lifecycle policies (e.g., S3 lifecycle rules) to enforce retention.
Verify automated deletion is working through regular audits.
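
A scheduled job enforcing retention might look roughly like the following. The retention schedule, the record shape (`category`, `created_at`), and the `purge_expired` name are all assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical retention schedule: data category -> retention period.
RETENTION = {
    "support_ticket": timedelta(days=365),
    "access_log": timedelta(days=90),
}


def purge_expired(records: list[dict], now: datetime) -> list[dict]:
    """Return only the records still inside their retention period."""
    return [r for r in records if now - r["created_at"] <= RETENTION[r["category"]]]
```

Passing `now` in explicitly keeps the job testable, which makes the audit step easier: the same function can be run against a snapshot to confirm what would have been deleted.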

Legal Hold

Implement a legal hold process that can suspend automated deletion for data relevant to litigation, regulatory investigations, or audits.
Legal holds must be granular — hold only the specific data relevant to the matter, not all data for a given user or category.
Track all active legal holds and release them promptly when they are no longer needed.
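
Granular holds can be sketched as a registry keyed by matter, where each matter lists only the record ids it covers. The `LegalHoldRegistry` class and its method names are hypothetical; a deletion job would consult `is_held` before removing a record.

```python
class LegalHoldRegistry:
    """Tracks holds per matter so each hold covers only the relevant records."""

    def __init__(self) -> None:
        self._holds: dict[str, set[str]] = {}

    def place(self, matter: str, record_ids: set[str]) -> None:
        """Add records to a matter's hold, creating the matter if needed."""
        self._holds.setdefault(matter, set()).update(record_ids)

    def release(self, matter: str) -> None:
        """Release every record held for a matter once it is closed."""
        self._holds.pop(matter, None)

    def is_held(self, record_id: str) -> bool:
        """True if any active matter covers this record."""
        return any(record_id in ids for ids in self._holds.values())

    def active_matters(self) -> list[str]:
        """List active matters, for periodic review."""
        return sorted(self._holds)
```

Because a record can be covered by more than one matter, releasing one hold does not free the record until every matter that references it has been released.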

Backup Considerations

Retention policies must account for backup cycles. Data deleted from primary storage may still exist in backups.
Define how long backups are retained and how deletion requests are handled for data in backups (e.g., delete upon next backup rotation vs. cryptographic erasure of the backup encryption key).
Regularly test the ability to restore from backups and verify that retained data is still usable.

Best Practices

Maintain a comprehensive data inventory — you cannot protect data you do not know about; catalog all personal and sensitive data, where it is stored, how it flows, and who has access.
Apply privacy by design from the start — embed data protection into architecture decisions, not as a retrofit; conduct Data Protection Impact Assessments (DPIAs) for new systems that process personal data.
Encrypt by default — enable encryption at rest and in transit for all data stores and communication channels; treat encryption as baseline, not an add-on for sensitive data only.
Implement least-privilege access to sensitive data — restrict access to personal and confidential data based on role and business need; review access permissions regularly and revoke unnecessary grants.
Automate compliance processes — manual compliance processes do not scale; automate DSAR fulfillment, retention enforcement, consent management, and breach detection to reduce human error and response times.
Test your data deletion capabilities — regularly verify that erasure requests result in actual deletion across all systems, including backups, caches, logs, and third-party processors.
Monitor for data exfiltration — deploy Data Loss Prevention (DLP) tools to detect unauthorized data transfers via email, cloud storage, USB, and API calls; alert on anomalous data access patterns.
Train all employees on data handling — security and privacy training should cover data classification, acceptable use, incident reporting, and regulatory obligations; tailor training to roles (developers, support, executives).
