Client
A global healthcare company
Industry
Healthcare
Services
Security Services
Tech
Databases, Salesforce, Informatica Cloud, Informatica TDM, Python

Challenge

In the wake of new data privacy laws, data loss prevention was a major concern for a global healthcare company with complex IT infrastructure needs. The company was looking for a top provider of data masking services to obfuscate huge volumes of personally identifiable information (PII) stored in multiple formats across both production and non-production environments. It needed a complete healthcare solution to comply with its internal controls, the GDPR and other data privacy legislation.

Specifically, our team took on the following challenges:
Provide Data Masking as a Service to protect sensitive data in the client’s complex ecosystem consisting of relational databases, cloud apps and unstructured sources
Create an ML-driven solution to find and mask PII for the staging/development environments

Solution

A turnkey process performing data masking operations on demand to meet development and testing needs across the enterprise
A custom Python app that uses an ML model to process data sources on request, returns a list of sensitivity attributes, such as the level of sensitivity and the PII type, and suggests an appropriate masking technique based on the identified attributes
A data masking framework based on Informatica TDM that performs anonymization procedures
A Python bootstrap application for masking Salesforce data by automatically generating jobs in Informatica Cloud
Rules for sensitive data identification
Data dictionaries that can be used for data substitution
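
To make the identification rules and data dictionaries listed above more concrete, here is a minimal Python sketch of how such pieces could fit together. It is an illustration under stated assumptions, not the project's actual implementation: the case study's app relies on an ML model, while this sketch swaps in plain regular-expression rules as a stand-in. The names RULES, FIRST_NAMES, Finding, classify_column, substitute and mask_email, as well as the 0.8 match threshold and the salt value, are all hypothetical.

```python
import hashlib
import re
from dataclasses import dataclass

# Hypothetical identification rules (assumption, not from the case study):
# PII type -> (pattern, sensitivity level, suggested masking technique).
RULES = {
    "email": (re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"), "high", "substitution"),
    "phone": (re.compile(r"^\+?\d[\d\s().-]{6,}\d$"), "medium", "redaction"),
}

# Hypothetical data dictionary used for substitution.
FIRST_NAMES = ["Alex", "Jordan", "Morgan", "Riley", "Sam"]


@dataclass
class Finding:
    column: str
    pii_type: str
    sensitivity: str
    technique: str


def classify_column(column, values, threshold=0.8):
    """Return a Finding if at least `threshold` of the non-empty values match a rule."""
    non_empty = [v for v in values if v]
    if not non_empty:
        return None
    for pii_type, (pattern, sensitivity, technique) in RULES.items():
        hits = sum(1 for v in non_empty if pattern.match(v))
        if hits / len(non_empty) >= threshold:
            return Finding(column, pii_type, sensitivity, technique)
    return None


def substitute(value, dictionary, salt="demo-salt"):
    """Deterministically map a value to a dictionary entry via a salted hash."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(dictionary)
    return dictionary[index]


def mask_email(value, salt="demo-salt"):
    """Replace an email with a dictionary-based address on a reserved example domain."""
    name = substitute(value, FIRST_NAMES, salt).lower()
    return f"{name}@example.com"


if __name__ == "__main__":
    finding = classify_column("contact_email", ["a@b.com", "c@d.org"])
    print(finding)
    print(mask_email("a@b.com"))
```

Hashing the input with a salt keeps the substitution deterministic, so the same source value maps to the same replacement across tables and referential links survive masking. A real deployment would keep the salt secret and use far larger dictionaries.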

Impact

With sensitive data shielded from unauthorized access, the company has quickly achieved compliance with the GDPR and other data privacy laws
The automated provisioning of masked data to developers and testers has dramatically cut the time needed to create non-production environments
Millions of rows of data, including structured, semi-structured and unstructured data, are processed swiftly, driving process efficiency
The cost of transferring Salesforce data from production to sandbox has decreased
