In this blog, we’d like to give you an overview of the steps required to tokenize sensitive data, migrate the data to the cloud, and then de-tokenize the data once in the cloud.
We’ll be using our BDM platform, with AWS as the cloud provider.
Step 1. The pipeline
Let’s start with the pipeline. In our example, we’re getting data from an Oracle data warehouse which includes sensitive employee data, such as emails, credit card details, and banking information, among other data.
We can select a column and define a tokenization routine for the data therein. Once the tokenization routine has been applied, the email format will still be preserved, but the content will be masked (for more on this topic, see here).
Each column in the table is tokenized individually through a no-code, user-friendly UI. This is advantageous for organisations, as developers are not required to perform the tokenization.
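To make the idea of format-preserving tokenization concrete, here is a minimal illustrative sketch in Python. It is not BDM’s actual routine (which is configured in the UI rather than hand-coded), and the key handling is simplified for the example: the local part of an email is replaced with a deterministic token while the overall email format survives.

```python
import hmac
import hashlib

# Assumption for this sketch: a static key. In practice the key would
# come from a managed secret store, not source code.
SECRET_KEY = b"example-tokenization-key"

def tokenize_email(email: str) -> str:
    """Replace the local part of an email with a deterministic token.

    The result still looks like an email address, so downstream systems
    that expect the email format keep working, but the original value
    is masked.
    """
    local, _, domain = email.partition("@")
    token = hmac.new(SECRET_KEY, local.encode(), hashlib.sha256).hexdigest()[:12]
    return f"{token}@{domain}"

print(tokenize_email("jane.doe@example.com"))
```

Because the token is deterministic, the same email always maps to the same token, which preserves joins and group-bys on the tokenized column.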
In our example below, we are saving our anonymized data as a CSV output to an S3 bucket, and then we run the job.
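As a rough sketch of what the output step produces, the snippet below serialises tokenized rows to CSV in memory using Python’s standard library (the row values and field names are made up for illustration). In the real pipeline, BDM writes this output directly to the S3 bucket; outside BDM you could upload the same string with a tool such as boto3.

```python
import csv
import io

def write_tokenized_csv(rows, fieldnames):
    """Serialise tokenized rows to a CSV string in memory."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

# Hypothetical tokenized output: the email column is masked, names are not.
csv_text = write_tokenized_csv(
    [{"name": "Jane Doe", "email": "a1b2c3d4e5f6@example.com"}],
    ["name", "email"],
)
# This string could then be uploaded to S3, e.g. with boto3:
#   s3.put_object(Bucket=..., Key=..., Body=csv_text)
```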
Step 2. The integrity of a dataset
BDM allows you to preserve the integrity of a dataset while moving data to the cloud. This is ideal for departments, such as HR, that are concerned about data migration.
Typically, an HR department will hold a lot of sensitive data, covering BICs, IBANs, emails, and social security numbers, along with contractual details. This, in turn, can make Data Protection Officers (DPOs) wary of moving data to the cloud and lead them to insist that sensitive data be removed before the migration.
However, with BDM, you can migrate all your data safely and securely to the cloud via in-memory tokenization. And once that data is in the cloud, any organisation can take advantage of, for example, the analytics which cloud platforms offer.
Also, the data can be shared with third parties such as marketing companies in an entirely safe and trustworthy manner.
Another feature of BDM is ‘opt-out data’. Similar to the opt-out clause people can avail of for marketing communications, clients can choose which data is, or is not, migrated to the cloud.
Step 3. Migration – the view from AWS
Once the data has been migrated, we can check the results in AWS, which is the destination for our example.
The screenshot below shows the view from AWS and details the successful migration. Note how the email column is tokenized while other columns, such as people’s names, are not.
Furthermore, BDM allows you to de-tokenize your data in memory and write it back to your cloud destination, which is all captured as part of your data lineage.
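To illustrate the de-tokenization step, here is a simplified in-memory sketch. It assumes a token vault mapping tokens back to their original values; a plain dictionary stands in for the vault here, and the record values are invented for the example. BDM performs the equivalent lookup in memory before writing the restored data back to the cloud destination.

```python
def detokenize(records, vault):
    """Replace any value that appears in the token vault with its
    original value; values without a vault entry pass through unchanged."""
    return [
        {key: vault.get(value, value) for key, value in record.items()}
        for record in records
    ]

# Hypothetical vault entry: token -> original email.
vault = {"a1b2c3d4e5f6@example.com": "jane.doe@example.com"}
restored = detokenize(
    [{"name": "Jane Doe", "email": "a1b2c3d4e5f6@example.com"}],
    vault,
)
```

Because the lookup happens entirely in memory, the clear-text values never need to be staged on disk during the migration itself.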
Data lineage is the process of recording and visualising the flow of data as it moves through the pipeline’s various stages. As tokenization is part of this process, it too is captured and represented within the lineage.
For over a decade, Bluemetrix has worked with some of the largest health and financial organisations in the world and has worked on over 400 data lake projects. Connect with us here today and learn how to automate your data lake operation and management with trusted, governed, and cleaned data.
Our experts at this online event, ‘Secure Your Cloud Migration By Leveraging In-Memory Tokenization’ will share their insights and best practices for securely moving your data into the cloud. You can watch it on-demand here.