What is data tokenization?
Data tokenization is a process applied to datasets to protect sensitive information, most commonly data at rest. The technique replaces sensitive data with non-sensitive stand-in values known as token data. Token data typically preserves the format of the original data; for example, a tokenized credit card number still appears as sixteen digits.
With data tokenization, the non-sensitive token data remains in the dataset, while the mapping between each token and the original sensitive data is stored securely outside of the system, typically in a token server. When the original sensitive data is needed again, that mapping is looked up on the token server; this is called detokenization.
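To make the round trip concrete, here is a minimal sketch in Python. The `TokenVault` class and its in-memory dictionaries are illustrative stand-ins for a real, hardened token server, and the card number shown is a standard test value.

```python
import secrets

class TokenVault:
    """Illustrative stand-in for a secure token server.
    A real vault is a separate, hardened service, not an in-memory dict."""

    def __init__(self):
        self._token_to_value = {}  # token -> original sensitive value
        self._value_to_token = {}  # so repeated values map to one token

    def tokenize(self, value: str) -> str:
        """Replace a sensitive value with a same-format random token."""
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = self._random_digits(len(value))
        while token in self._token_to_value:  # avoid the rare collision
            token = self._random_digits(len(value))
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        """Look the token up in the vault to recover the original value."""
        return self._token_to_value[token]

    @staticmethod
    def _random_digits(n: int) -> str:
        return "".join(secrets.choice("0123456789") for _ in range(n))

vault = TokenVault()
token = vault.tokenize("4111111111111111")
print(token)                    # sixteen random digits with no intrinsic meaning
print(vault.detokenize(token))  # 4111111111111111
```

Note that the token itself has no mathematical relationship to the original number; only the lookup in the vault connects the two.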
Types of data tokenization
Companies have two options for data tokenization, which differ primarily in how quickly data can be detokenized.
- Vault tokenization: Vault tokenization stores the mapping between the original sensitive data and its corresponding token in a secure, separate token server vault, which is queried whenever the original data is needed. Each detokenization requires a vault lookup, which adds latency, so companies that need to detokenize at scale can consider vaultless tokenization.
- Vaultless tokenization: Vaultless tokenization does not use a token server vault; instead, tokens are generated and reversed with cryptographic devices or algorithms, so the original data is recovered by computation rather than lookup. Companies might choose this method to avoid the latency of vault lookups (see the sketch after this list).
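For contrast with the vault sketch above, a vaultless scheme derives the token from the original value using a key, so detokenization needs no lookup. Production systems use vetted format-preserving encryption (for example, NIST's FF1 mode); the digit-wise keyed keystream below is only a toy illustration of the reversibility property, not a secure algorithm, and the key shown is hypothetical.

```python
import hashlib
import hmac

KEY = b"demo-key"  # illustrative only; real keys live in an HSM or key manager

def _keystream(key: bytes, length: int) -> list:
    """Derive a repeatable digit keystream from the key."""
    digits, counter = [], 0
    while len(digits) < length:
        block = hmac.new(key, counter.to_bytes(4, "big"), hashlib.sha256).digest()
        digits.extend(b % 10 for b in block)
        counter += 1
    return digits[:length]

def tokenize(value: str, key: bytes = KEY) -> str:
    """Shift each digit by the keystream -- same sixteen-digit format out."""
    ks = _keystream(key, len(value))
    return "".join(str((int(d) + k) % 10) for d, k in zip(value, ks))

def detokenize(token: str, key: bytes = KEY) -> str:
    """Reverse the shift with the same key; no vault lookup required."""
    ks = _keystream(key, len(token))
    return "".join(str((int(d) - k) % 10) for d, k in zip(token, ks))

card = "4111111111111111"
token = tokenize(card)
assert detokenize(token) == card
```

Because detokenization here is pure computation, it avoids the per-lookup latency of a vault, which is the trade-off the two approaches turn on.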
Benefits of using data tokenization
Data tokenization techniques are most commonly used in the payment processing industry. The Payment Card Industry Data Security Standard (PCI-DSS) requires that sensitive data such as credit card numbers be protected, and tokenization is recognized as a method of achieving this. However, data tokenization can be used to protect any kind of sensitive data.
One common use case for data tokenization methods is protecting patients' health information.
Companies use data tokenization to:
- Meet industry security standards: Companies use data tokenization to meet industry security standard requirements, such as protecting sensitive payment data to meet PCI-DSS compliance requirements.
- Reduce data misuse: Tokenization removes a risk factor for misused or abused data. Without access to the token vault or the cryptographic devices needed to detokenize it, tokenized data is useless outside of its system; a bad actor cannot, for example, make purchases using tokenized credit card information.
- Improve customer confidence: Customers want to know that important information such as their payment information is securely stored. Data tokenization can help customers feel confident that the companies they do business with protect their data.
Impacts of using data tokenization
The most common impact of using data tokenization techniques is stronger protection for sensitive information.
- Reduce threat vectors: Tokenizing data reduces the ability of bad actors to misuse sensitive data.
- Reduce the need for advanced security controls: Using tokenized data can reduce the amount of sensitive data that requires more advanced security controls.
Data tokenization vs. data masking
Data tokenization is more commonly used to protect data at rest. The technique can introduce latency when the relationship between the token data and the original sensitive data must be looked up in the token server vault.
Data masking is more commonly used to protect data in use, most commonly for testing with realistic but non-sensitive copies of production data, or in applications used by employees with limited access to actual data. Unlike tokenization, masking is typically one-way: the original value is not meant to be recoverable from the masked output.
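A minimal masking sketch, for contrast with the tokenization examples above. The function below shows one common pattern (keep the last four digits, obscure the rest); unlike a token, its output cannot be turned back into the original.

```python
def mask_card_number(card: str, visible: int = 4) -> str:
    """One-way masking: keep the last few digits, obscure the rest."""
    return "*" * (len(card) - visible) + card[-visible:]

print(mask_card_number("4111111111111111"))  # ************1111
```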