Huw Price, Vice President, Continuous Delivery, CA Technologies
Copies of live production data are still used in most testing environments, creating a significant security blind spot that places both companies and their customers at risk. Data masking is one way of overcoming this, suggests Huw Price, Vice President, Continuous Delivery, CA Technologies, who is also the co-founder of Grid-Tools, a provider of enterprise test data management. CIO Asia finds out from him how data masking can help improve overall data security.
CIO Asia: Why is it important to protect testing data?
CA Technologies' Price: Today, a lot of money is spent on security systems and on firewalls. Massive infrastructures are put in place, involving auditors and security officers. Then, as if by magic, the data is copied into the development environment and all of it goes out of the door. It is literally like opening the gate and putting it into an entirely insecure world.
There are several problems associated with this. Firstly, the majority of data breaches come from within an organisation. That's where the focus should be. One is also dealing with a lot of outsourcing partners and vendors. Once the data is in the development world, there are a lot of places where the information can leak out. This is a massive business problem. For instance, outsourcing is a very established model, which means one is going to share production data, and the partners have to conform to the regulatory standards of different locations. Disparities between regulatory standards, and failures to meet them, can create business problems.
The financial sector is very patchy due to the mishmash of live data, which may be homegrown, masked or synthetic. This randomness not only affects the regulations associated with data loss and fines, but also makes development difficult.
Developers have access to various sensitive information -- whether it is live production data or payrolls. Some other industries that face risks include travel, healthcare, and retail, where one can have access to the true costs of operations and may talk to competitors.
One needs to understand that it's not just personal data; it is business data.
What is data masking, and how should businesses approach it?
Data masking is essentially changing the names of people, as well as the information about them such as their address and date of birth. However, a person's data may also consist of their account numbers and transactions -- which are all personal information. The biggest problem here is thus to define the boundaries of personal information.
The data masking that is happening today is not good enough. The key to approaching data masking is for businesses to know how much masking is needed. From a statistics point of view, as well as from a competitive point of view, it is interesting.
A situation where someone gets access to data and puts it on WikiLeaks is a very real possibility. They can also profit from such acts if the data wasn't masked.
Masking is a very dangerous and difficult job to do. The tricky part is that if you mask one name on a system, it has to match on another system. Therefore, there is the endless problem of deciding which system needs to be masked first. For example, if one masks the customer management system, then the data in Salesforce may no longer match.
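One common way to keep a masked name consistent across systems is deterministic masking, where the same real value always maps to the same replacement. The sketch below is only an illustration and is not a method described in the interview: the key, the `mask_name` function, the `CUST-` token format and the sample rows are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical secret shared by every masking job. In practice it would be
# stored securely and never committed to source control.
MASKING_KEY = b"example-key-not-for-real-use"


def mask_name(real_name: str) -> str:
    """Return the same replacement token for the same real name."""
    # Normalise so "Jane Smith" and "jane smith " are treated as one person.
    normalised = real_name.strip().lower().encode("utf-8")
    # A keyed hash is deterministic, but hard to reverse without the key.
    digest = hmac.new(MASKING_KEY, normalised, hashlib.sha256).hexdigest()
    return "CUST-" + digest[:12]


# Two hypothetical systems holding the same customer in slightly different forms.
crm_rows = [{"customer": "Jane Smith", "tier": "gold"}]
billing_rows = [{"customer": "jane smith ", "balance": 120.50}]

# Each system can be masked independently, in any order.
masked_crm = [{**row, "customer": mask_name(row["customer"])} for row in crm_rows]
masked_billing = [{**row, "customer": mask_name(row["customer"])} for row in billing_rows]
```

Because the replacement depends only on the real value and the shared key, the order in which systems are masked no longer matters for this field. A real project would still have to decide which fields count as identifying and how to handle names that are spelled differently in each system.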
But today, there are tools and technologies which can help with this. One needs to be organised and structured and also have good management support.
In the real world, one has to live with masking. You'll end up with a hybrid approach where you have masking, but you also do generation. Thus, you are synthetically adding additional data into the system to supplement data to make it richer and to seriously improve your development and testing.
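The hybrid approach described above can be sketched as masked rows combined with generated rows. This is a minimal illustration under assumptions: the field names, the `SYN-` identifiers, the value ranges and the `generate_synthetic_accounts` helper are hypothetical and not taken from any particular tool.

```python
import random


def generate_synthetic_accounts(count: int, seed: int = 0) -> list:
    """Generate made-up account rows that contain no real customer data."""
    rng = random.Random(seed)  # seeded so test runs are repeatable
    rows = []
    for i in range(count):
        rows.append({
            "customer": f"SYN-{i:06d}",
            "balance": round(rng.uniform(-500.0, 10000.0), 2),
            "source": "synthetic",
        })
    return rows


# Rows that came from production and have already been masked (hypothetical).
masked_rows = [{"customer": "CUST-3f9a1c", "balance": 120.50, "source": "masked"}]

# Hand-written edge cases that production data may never contain.
edge_cases = [
    {"customer": "SYN-EDGE-ZERO", "balance": 0.0, "source": "synthetic"},
    {"customer": "SYN-EDGE-NEG", "balance": -0.01, "source": "synthetic"},
]

# The test data set mixes masked production rows with generated rows.
test_data = masked_rows + edge_cases + generate_synthetic_accounts(3)
```

Tagging each row with its source makes it easy to see later how much of a test set still derives from production data.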
People are already doing this, and this can help one find problems and the cost of those problems.
This is a very important reason to move to a more synthetic data entry approach. In fact, there is a bank in India which has adopted a complete synthetic approach. There are also various government departments which are fully synthetic.