Accelerate Nested JSON Data Extraction into Snowflake with Bluemetrix and IDERA
Bluemetrix & IDERA
Extracting Your Data from JSON Files Has Never Been Easier
You have complex nested JSON files containing valuable data, and you want to make this data quickly available for analysis while ensuring governance policies are adhered to. Loading raw JSON into columns in your Snowflake databases makes it hard to use standard BI tools and provides little governance.
Bluemetrix, IDERA and Collibra automate the extraction of data from nested JSON files into a tabular format, creating foreign key relationships between the resulting tables. With seamless integration, BI tool users can effortlessly run standard queries against the normalised relational data and reach it in just a few clicks.
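To make the idea of normalising nested JSON concrete, here is a minimal Python sketch of splitting one nested document into a parent table and a child table linked by a foreign key. It is an illustration only, not how BDM or ER/Studio performs the conversion; the document shape and the names `customers`, `orders` and `normalise_orders` are assumptions made for the example.

```python
import json
from itertools import count


def normalise_orders(document):
    """Split one nested JSON document into parent rows and child rows."""
    customers, orders = [], []
    order_ids = count(1)  # surrogate primary keys for the child table
    for customer in document["customers"]:
        customers.append({"customer_id": customer["id"], "name": customer["name"]})
        for order in customer.get("orders", []):
            orders.append({
                "order_id": next(order_ids),
                "customer_id": customer["id"],  # foreign key to the parent row
                "item": order["item"],
                "amount": order["amount"],
            })
    return customers, orders


# Hypothetical nested input: each customer carries an embedded list of orders.
raw = """
{"customers": [
  {"id": 1, "name": "Ada",
   "orders": [{"item": "book", "amount": 12.5}, {"item": "pen", "amount": 1.2}]},
  {"id": 2, "name": "Grace", "orders": []}
]}
"""

customers, orders = normalise_orders(json.loads(raw))
```

Once the data is in this shape, a BI tool can join `orders` to `customers` on `customer_id` with an ordinary relational query instead of parsing JSON at query time.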
Bluemetrix + IDERA + Collibra
Apply Governance Policies to JSON Data
Transform your hierarchical JSON data into structured tables while automatically applying governance policies as data is exported. BDM will automatically read the data policies in Collibra that are associated with each JSON source file and apply these policies as the data is extracted and transformed.
If the source JSON file contains PII and the data policy specifies that all PII must be pseudonymised before it is used, BDM tokenizes the PII after reading it from the source JSON file and before writing it to the target table.
BDM can also capture GDPR compliance data and generate business metadata as the data is extracted from JSON, helping you record your compliance with GDPR rules and making the data more usable for your final BI audience.
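As a rough illustration of tokenizing PII between read and write, the Python sketch below replaces selected fields with deterministic tokens. It uses a keyed hash (HMAC-SHA256) as a stand-in technique; the source does not say which tokenization method BDM uses, and the key, the field list and the function names are hypothetical.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-key-not-for-production"  # hypothetical key, for illustration only
PII_FIELDS = {"email", "phone"}              # hypothetical: fields a policy marks as PII


def tokenize(value, key=SECRET_KEY):
    """Return a deterministic token for a value using a keyed hash."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "tok_" + digest[:16]


def pseudonymise(record, pii_fields=PII_FIELDS):
    """Copy a record, replacing PII fields with tokens before it is written."""
    return {
        field: tokenize(str(value)) if field in pii_fields else value
        for field, value in record.items()
    }


source_row = {"customer_id": 1, "email": "ada@example.com", "country": "IE"}
target_row = pseudonymise(source_row)  # this is the row that would reach the target table
```

Because the same input always yields the same token, tokenized columns can still be used for joins and counts without exposing the original values.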
How it works
Using ER/Studio, we take a series of nested JSON files and convert their hierarchical structure into tables in Snowflake. We enhance this model by adding primary keys and normalising it to third normal form (3NF), while also mapping our business glossary for PII, Sensitive Data, etc. with the Collibra Data Governance Catalogue.
For all JSON data in the pipeline, BDM checks for any Tags and Standards that govern the source.
When a Tag or Standard is found, the system determines whether a corresponding rule exists in BDM, e.g., all JSON data that has a PII Tag must be tokenized on Write.
Before the pipeline executes a Write, all Tag- and Standard-based rules are automatically applied to the pipeline, e.g., for all JSON data tagged PII, BDM automatically applies a Tokenization function to the data before it is written to its destination.
In this way, BDM automatically enforces and executes the Collibra Data Policies associated with each JSON data element in the pipeline.
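The steps above can be sketched in Python as a lookup from tags to rules that runs before every write. This is a conceptual sketch under stated assumptions, not BDM's implementation or the Collibra API: the tag names, the `RULES` table, and the `tokenize`, `mask`, `apply_rules` and `write_row` helpers are all hypothetical, and a plain SHA-256 hash stands in for a real tokenization service.

```python
import hashlib


def tokenize(value):
    """Placeholder tokenization: a truncated SHA-256 hash of the value."""
    return "tok_" + hashlib.sha256(str(value).encode("utf-8")).hexdigest()[:12]


def mask(value):
    """Hide all but the last four characters of a value."""
    text = str(value)
    return "*" * max(len(text) - 4, 0) + text[-4:]


# Hypothetical rule table: which function a tag triggers on write.
RULES = {"PII": tokenize, "Sensitive": mask}


def apply_rules(row, column_tags, rules=RULES):
    """Apply every rule whose tag governs a column; untagged columns pass through."""
    governed = dict(row)
    for column, tags in column_tags.items():
        if column not in governed:
            continue
        for tag in sorted(tags):
            rule = rules.get(tag)  # a tag with no matching rule is skipped
            if rule is not None:
                governed[column] = rule(governed[column])
    return governed


def write_row(row, column_tags, sink):
    """Enforce the rules first, then write the governed row to the destination."""
    sink.append(apply_rules(row, column_tags))


# Hypothetical tags, as they might be read from a governance catalogue.
column_tags = {"email": {"PII"}, "card_number": {"Sensitive"}, "country": set()}
target_table = []
write_row(
    {"email": "ada@example.com", "card_number": "4111111111111111", "country": "IE"},
    column_tags,
    target_table,
)
```

Keeping the rules in a table keyed by tag means that changing a policy only changes the lookup, not the pipeline code that performs the write.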
From JSON to Actionable Insights
Automatically apply governance policy in BDM
Data policies defined in Collibra are automatically implemented and enforced on the JSON data.
No-Code Engineering Built for Snowflake
Manage end-to-end ETL pipelines and automate the integrity of your Snowflake dataset at every stage of the data lifecycle – without writing a single line of code
Streamline Data Access and Sharing
Bluemetrix makes it easy for data owners to change policies in Collibra, safe in the knowledge that they will be implemented in BDM.
Advanced security for PII data
Secure data tokenization and masking capabilities enable scalable and efficient protection for sensitive, regulated data.
ETL, modelling and governance tools are integrated in a single solution, fostering collaboration between different data stakeholders.