Data Tokenization

July 22, 2022

What is data tokenization?

Data tokenization is a process applied to datasets to protect sensitive information, most commonly data at rest. The technique replaces sensitive values with non-sensitive stand-ins known as tokens, which preserve the format of the original data. For example, a tokenized credit card number still appears as sixteen digits.

With data tokenization, non-sensitive token data remains in the dataset, while the token’s reference to the original sensitive data is often stored securely outside of the system in a token server. When the original sensitive data is needed again, the token data’s relationship with the original sensitive data can be looked up on the token server; this is called a detokenization process.
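
To make the flow concrete, below is a minimal sketch of the tokenize/detokenize cycle in Python. The in-memory dictionary merely stands in for a secure token server, and every name here is an illustrative assumption rather than any particular product's API.

```python
import secrets

# Stand-in for the secure token server vault (illustrative only):
# maps each token back to the original sensitive value.
token_vault = {}

def tokenize(card_number: str) -> str:
    """Replace a 16-digit card number with a random token in the same format."""
    token = "".join(str(secrets.randbelow(10)) for _ in range(16))
    token_vault[token] = card_number  # the mapping lives only in the vault
    return token  # (a real system would also handle token collisions)

def detokenize(token: str) -> str:
    """Look up the token's relationship to the original data: detokenization."""
    return token_vault[token]

token = tokenize("4111111111111111")
print(token)               # e.g. 8302617745390128, still sixteen digits
print(detokenize(token))   # 4111111111111111
```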

Types of data tokenization

Companies have two options for data tokenization, which differ mainly in where the token mappings live and how quickly data can be detokenized.

  • Vault tokenization: Vault tokenization stores the relationship between the original sensitive data and its corresponding token in a secure, separate token server vault, which is referenced whenever the original data is needed. Because every detokenization requires a vault lookup, this approach adds latency, so companies that need to detokenize at scale can consider vaultless tokenization.
  • Vaultless tokenization: Vaultless tokenization does not use a token server vault. Instead, tokens are generated from the original data using cryptographic devices, so no lookup is needed to detokenize. Companies might choose this method to avoid the processing time of vault lookups; a sketch of the idea follows this list.
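
Because vaultless tokens are derived cryptographically from the original value, a toy example can illustrate the idea. The sketch below uses a simple Feistel construction over digit strings so the token keeps the original sixteen-digit format; real deployments rely on standardized format-preserving encryption such as NIST FF1 running inside hardened cryptographic devices, and the key, round count, and function names here are illustrative assumptions.

```python
import hashlib
import hmac

KEY = b"demo-secret-key"  # illustrative; a real key lives in an HSM, not in code
ROUNDS = 8

def _round_value(half: str, round_no: int, width: int) -> int:
    """Keyed round function: an HMAC of the round number and one half."""
    digest = hmac.new(KEY, f"{round_no}:{half}".encode(), hashlib.sha256)
    return int(digest.hexdigest(), 16) % (10 ** width)

def tokenize(digits: str) -> str:
    """Encrypt an even-length digit string into a token of the same format."""
    half = len(digits) // 2
    left, right = digits[:half], digits[half:]
    for r in range(ROUNDS):
        mixed = (int(left) + _round_value(right, r, half)) % (10 ** half)
        left, right = right, f"{mixed:0{half}d}"
    return left + right

def detokenize(token: str) -> str:
    """Reverse the Feistel rounds; no vault lookup required."""
    half = len(token) // 2
    left, right = token[:half], token[half:]
    for r in reversed(range(ROUNDS)):
        unmixed = (int(right) - _round_value(left, r, half)) % (10 ** half)
        left, right = f"{unmixed:0{half}d}", left
    return left + right

token = tokenize("4111111111111111")
assert detokenize(token) == "4111111111111111"
print(token)  # a sixteen-digit token, recoverable from the key alone
```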

Benefits of using data tokenization

Data tokenization techniques are most commonly used in the payment processing industry. The Payment Card Industry Data Security Standard (PCI DSS) requires that sensitive data such as credit card numbers be protected, and it recognizes tokenization as one method of achieving this. However, data tokenization can be used to protect any kind of sensitive data.

One common use case for data tokenization methods is protecting patients' health information.

Impacts of using data tokenization

The most significant impact of using data tokenization techniques is securing sensitive information. In particular, tokenization can:

  • Reduce threat vectors: Tokens carry no exploitable value on their own, so tokenized data that is stolen or leaked is of little use to bad actors.
  • Reduce the need for advanced security controls: Systems that store only tokens hold less sensitive data, shrinking the portion of the environment that requires the most advanced security controls.

Data tokenization vs. data masking

Data tokenization is more commonly used to protect data at rest. With vault tokenization, it can introduce wait times, since the relationship between a token and the original sensitive data must be looked up in the token server vault.

Data masking is more commonly used to protect data in use, most commonly in test environments populated with production-like data or in applications used by employees with limited access to actual data.
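
For contrast, here is a minimal sketch of static data masking, assuming a typical mask that keeps only the last four digits. Unlike a token, a masked value is one-way: the original can never be recovered from it.

```python
def mask_card_number(card_number: str) -> str:
    """Static mask: keep the last four digits; the rest is irrecoverable."""
    return "X" * (len(card_number) - 4) + card_number[-4:]

print(mask_card_number("4111111111111111"))  # XXXXXXXXXXXX1111
```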

