Notes on the Oracle webinar about Secure Development and Test Environments with Oracle Data Masking.
I am interested because I have been at several sites where data masking was originally implemented but had fallen into disuse for the following reasons:
- The data masking had become broken and no one was able to repair it.
- Developers argued that they could not write correct code without seeing production data.
- The data masking was too complicated or clumsy to use properly.
The agenda was:
- IOUG Data Security Survey: Top Risk
- Database Defense in Depth
- Data Discovery and Modeling
- Oracle Data Masking Features
- Data Subsetting
IOUG Data Security Survey: Top Risk
According to the IOUG survey, 59% of respondents consider a data breach likely or are unsure, and 33% have no idea whether a data breach will occur in the next 12 months. The identified top risk was production data in non-production environments. Production data is used there because developers and testers argue that it covers all use cases.
Database Security Defense in Depth
The Database Security Defense in Depth model from Oracle covers the following areas:
- Mitigate Database Bypass
- Prevent Application Bypass
- Consolidate Auditing and Compliance Reporting
- Monitor Database Traffic and Block Threats
- Protect All Database Environments
- Sensitive data discovery for production
- Mask data for nonproduction development & test
- Secure database lifecycle management, configuration scanning, patch automation
FAST—Find, Assess, Secure, and Test
Data Discovery and Modeling
Data discovery and modeling is needed to find out where the sensitive information is stored. The application data model is just the start. Sensitive information can be stored in non-obvious places such as comment fields. The discovery tool:
- Scans application schemas to model relationships between tables and columns
- Extracts data relationships from Oracle Applications meta-data
  - Oracle e-Business Suite
  - Oracle Fusion Applications
- Stores referential relationships in the repository
- Enables test data operations such as data subsetting and masking
Sensitive data is identified through patterns. If it looks like a Social Security Number (SSN), then it is probably an SSN. Two (2) problems I see are false positives (numbers that look like SSNs but are not) and false negatives (strings that do not look like e-mail addresses but are, such as "user at oracle dot com").
There seems to be a lot of reliance on the application vendor providing meta-data and templates in order to discover sensitive data. This could be a problem for sites that use other vendors such as SAP for their applications.
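To make the false positive and false negative concern concrete, here is a minimal Python sketch. The two regular expressions are my own simplified assumptions, not the patterns shipped with the Oracle tool, and the sample values are invented.

```python
import re

# Hypothetical, deliberately simple patterns (not Oracle's shipped rules).
SSN_PATTERN = re.compile(r"\d{3}-\d{2}-\d{4}")
EMAIL_PATTERN = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def looks_like_ssn(value: str) -> bool:
    """True when the whole value has the NNN-NN-NNNN shape."""
    return SSN_PATTERN.fullmatch(value) is not None


def looks_like_email(value: str) -> bool:
    """True when the whole value has a name@host.tld shape."""
    return EMAIL_PATTERN.fullmatch(value) is not None


# Invented sample values.
part_number = "555-12-3456"               # not an SSN, but has the same shape
obfuscated = "user at oracle dot com"     # an e-mail address written out in words

false_positive = looks_like_ssn(part_number)      # flagged although it is not an SSN
false_negative = not looks_like_email(obfuscated)  # missed although it is an address
```

Both flags come out true here: the part number is tagged as sensitive, and the written-out address slips through. Pattern-based discovery therefore still needs a manual review step.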
There are three (3) ways of discovering sensitive data:
- Discover the sensitive columns using patterns
  - Create Sensitive Column Types using regular expressions to define the pattern for column name, comment, or data
  - Run a discovery job to scan the database and discover the sensitive columns
- Import from masking templates
  - Import the masking template for Oracle Applications such as Fusion Applications
  - Sensitive columns will be automatically tagged
- Set columns as sensitive manually
  - Add columns directly in the sensitive column list
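The pattern-based and manual routes can be sketched roughly as follows. This is only an illustration of the idea: the class names, the in-memory column catalogue, and the `discover` function are all hypothetical and have nothing to do with the actual OEM interfaces.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional, Pattern, Tuple


@dataclass
class SensitiveColumnType:
    """Hypothetical stand-in for a sensitive column type definition."""
    label: str
    name_re: Optional[Pattern[str]] = None
    comment_re: Optional[Pattern[str]] = None
    data_re: Optional[Pattern[str]] = None


@dataclass
class Column:
    """Hypothetical catalogue entry: column metadata plus sampled values."""
    table: str
    name: str
    comment: str = ""
    samples: List[str] = field(default_factory=list)


def matches(col: Column, ctype: SensitiveColumnType) -> bool:
    """A column is a hit if its name, comment, or sampled data fits the type."""
    if ctype.name_re and ctype.name_re.search(col.name):
        return True
    if ctype.comment_re and ctype.comment_re.search(col.comment):
        return True
    if ctype.data_re and col.samples:
        return all(ctype.data_re.fullmatch(s) for s in col.samples)
    return False


def discover(columns: List[Column],
             ctypes: List[SensitiveColumnType]) -> List[Tuple[str, str, str]]:
    """The 'discovery job': check every column against every type."""
    return [(c.table, c.name, t.label)
            for c in columns for t in ctypes if matches(c, t)]


ssn_type = SensitiveColumnType(
    label="SSN",
    name_re=re.compile(r"ssn|social", re.IGNORECASE),
    comment_re=re.compile(r"social security", re.IGNORECASE),
    data_re=re.compile(r"\d{3}-\d{2}-\d{4}"),
)

columns = [
    Column("EMP", "SSN", "", ["123-45-6789"]),             # found by column name
    Column("EMP", "TAX_REF", "", ["987-65-4321"]),         # found by data only
    Column("EMP", "NOTES", "free text", ["called about invoice"]),
]

sensitive = discover(columns, [ssn_type])
# Manual route: add a column that no pattern would catch.
sensitive.append(("EMP", "NOTES", "MANUAL"))
```

The free-text NOTES column is only covered because it was added by hand, which is the comment-field problem mentioned earlier.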
Oracle Data Masking Features
This data about sensitive columns is maintained within the OEM repository. This means it is independent of the test and development environments. And it allows for site-wide rules about sensitive data.
The data masking is really a set of data replacement algorithms. Referential integrity (RI) is automatically preserved. However, in my experience, not all the RI rules are stored in the database. They are usually buried inside the application and in the developers’ minds. This is where the data replacement will most likely fail.
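A small Python sketch of why undeclared relationships are the weak spot. It assumes a deterministic replacement function (an HMAC of the key value, my choice for illustration and not necessarily what Oracle does) applied to the columns that the database declares as related. The tables, the key, and the free-text note are invented.

```python
import hashlib
import hmac

SECRET = b"demo-only-key"  # hypothetical key, for illustration only


def mask_id(value: str) -> str:
    """Deterministic replacement: the same input always yields the same output."""
    digest = hmac.new(SECRET, value.encode(), hashlib.sha256).hexdigest()
    return "C" + digest[:8]


customers = [{"customer_id": "1001", "name": "Alice"}]
orders = [{"order_id": "A1", "customer_id": "1001",
           "note": "refund agreed with customer 1001"}]

# Declared relationship: orders.customer_id -> customers.customer_id.
# Both sides go through the same function, so the join still works.
masked_customers = [{**c, "customer_id": mask_id(c["customer_id"]), "name": "XXXX"}
                    for c in customers]
masked_orders = [{**o, "customer_id": mask_id(o["customer_id"])} for o in orders]

join_still_works = (masked_orders[0]["customer_id"]
                    == masked_customers[0]["customer_id"])

# Undeclared relationship: the note embeds the customer id as free text.
# Nothing tells the masking job about it, so the original value survives.
original_id_leaks = "1001" in masked_orders[0]["note"]
```

The declared foreign key stays consistent after masking, while the id hidden in the note column is left behind. That is the kind of rule that lives only in the application.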
Oracle Data Masking is integrated with Real Application Testing and Test Data Management (slide #12).
The mask routines are extensible to enable customisation of business rules. The example given is the generation of a check digit for credit card numbers.
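The webinar did not show the routine itself. As a rough idea of what such a custom rule involves, here is a Python sketch using the Luhn algorithm, which is the standard check-digit scheme for payment card numbers. The function names and the choice to keep the first six digits are my own assumptions.

```python
import random


def luhn_check_digit(partial: str) -> str:
    """Compute the Luhn check digit for a number that lacks its final digit."""
    total = 0
    # Walk right to left; the rightmost digit of the partial number is doubled.
    for i, ch in enumerate(reversed(partial)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return str((10 - total % 10) % 10)


def luhn_valid(number: str) -> bool:
    """Validate a full number: the check digit itself is not doubled."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def mask_card_number(original: str, rng: random.Random) -> str:
    """Keep the first six digits, randomise the rest, recompute the check digit."""
    prefix = original[:6]
    body_len = len(original) - len(prefix) - 1
    body = "".join(rng.choice("0123456789") for _ in range(body_len))
    partial = prefix + body
    return partial + luhn_check_digit(partial)


masked = mask_card_number("4111111111111111", random.Random(7))
```

The masked value has the same length as the original and still passes `luhn_valid`, so application code that validates card numbers keeps working against masked data.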
Some of the sophisticated masking techniques:
- Compound Mask
  - Sets of related columns masked together e.g. Address, City, State, Zip, Phone
- Condition-based Masking
  - Specify separate mask format for each condition, e.g. driver’s license format for each state
- SQL-expression based masking
  - Use SQL functions…
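The first two techniques can be sketched in a few lines of Python. Everything here is an assumption for illustration: the replacement pool, the row layout, and in particular the licence formats, which are invented and are not the real state formats.

```python
import random
import string

# Compound mask: related columns are replaced together from consistent tuples,
# so the masked city, state, and zip still belong to one another.
ADDRESS_POOL = [
    ("Springfield", "IL", "62701"),
    ("Austin", "TX", "73301"),
]

# Condition-based mask: a separate format per state.
# 'A' means a random letter, '#' a random digit. These formats are invented.
LICENSE_FORMATS = {"IL": "A###########", "TX": "########"}
DEFAULT_FORMAT = "#########"


def mask_address(row: dict, rng: random.Random) -> dict:
    """Replace city, state, and zip as one unit."""
    city, state, zip_code = rng.choice(ADDRESS_POOL)
    return {**row, "city": city, "state": state, "zip": zip_code}


def mask_license(state: str, rng: random.Random) -> str:
    """Generate a licence number in the format chosen by the state condition."""
    fmt = LICENSE_FORMATS.get(state, DEFAULT_FORMAT)
    return "".join(
        rng.choice(string.ascii_uppercase) if ch == "A" else rng.choice(string.digits)
        for ch in fmt
    )


rng = random.Random(42)
row = {"name": "Jane", "city": "Boston", "state": "MA",
       "zip": "02101", "license": "S12345678"}

masked = mask_address(row, rng)
# The condition is evaluated on the masked state so the row stays consistent.
masked["license"] = mask_license(masked["state"], rng)
```

Masking city, state, and zip independently would produce rows such as Austin, IL, which is exactly what the compound mask avoids.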
There is an interesting feature called Key-based Reversible Masking which is new in OEM 12c. This means that the masking is deterministic, unique, and key-based. The unmasking recovers the masked data back to its original value with the same key (slide #15). The example given involves off-shore billing processing in which customer data is hidden.
Real Application Testing is integrated with Data Masking by masking bind data, AWR data, and Workload Capture Files (slide #16).
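The webinar did not explain how this is implemented. To show that the three properties (deterministic, unique, key-based) can coexist with reversibility, here is a toy Python sketch of a keyed Feistel network over digit strings. It is my own illustration, not Oracle's algorithm, and it is not a vetted format-preserving encryption scheme, so it should not be used on real data.

```python
import hashlib
import hmac

ROUNDS = 8  # must be even so both halves end up back in their original positions


def _round_value(key: bytes, round_no: int, data: str) -> int:
    """Keyed pseudo-random integer derived from the round number and one half."""
    msg = f"{round_no}:{data}".encode()
    return int.from_bytes(hmac.new(key, msg, hashlib.sha256).digest(), "big")


def _check(digits: str) -> None:
    if len(digits) < 2 or not digits.isdigit():
        raise ValueError("expected a string of at least two digits")


def mask(digits: str, key: bytes) -> str:
    """Map a digit string to another digit string of the same length."""
    _check(digits)
    u = len(digits) // 2
    v = len(digits) - u
    a, b = digits[:u], digits[u:]
    for i in range(ROUNDS):
        m = u if i % 2 == 0 else v          # width of the half being rewritten
        c = (int(a) + _round_value(key, i, b)) % (10 ** m)
        a, b = b, str(c).zfill(m)
    return a + b


def unmask(digits: str, key: bytes) -> str:
    """Undo mask() by running the rounds backwards with the same key."""
    _check(digits)
    u = len(digits) // 2
    v = len(digits) - u
    a, b = digits[:u], digits[u:]
    for i in reversed(range(ROUNDS)):
        m = u if i % 2 == 0 else v
        prev_b = a
        prev_a = str((int(b) - _round_value(key, i, prev_b)) % (10 ** m)).zfill(m)
        a, b = prev_a, prev_b
    return a + b


key = b"billing-partner-key"   # hypothetical key
masked = mask("123456789", key)
restored = unmask(masked, key)
```

Because each round is invertible, two different inputs can never collide under the same key, which is where the uniqueness comes from. Anyone holding the key can recover the original value, so the key has to be protected as carefully as the data.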
The data masking is done in a staging database before cloning to test or development. Role separation is built into Oracle Data Masking: Security Admin and DBA.
Oracle Data Masking can work with non-Oracle databases by using an Oracle database as the staging database (slide #19).
The Data Subsetting feature of Test Data Management:
- Provides a relationally intact and yet fractional representation of production data for test and development purposes
- Reduces the storage overhead created by production data copies in various application environments
- Allows developers to perform real-world application development by using production-class data
Model-based data subsetting is very dependent on explicit data relationships. These relationships cannot be hidden away.
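A minimal Python sketch of the idea, assuming a one-level parent/child model. The tables, the relationship map, and the `subset` function are invented for illustration and are not the product's interface.

```python
from typing import Callable, Dict, List, Tuple

Row = Dict[str, int]
Tables = Dict[str, List[Row]]

tables: Tables = {
    "customers": [{"id": i} for i in range(1, 11)],
    "orders": [{"id": 100 + i, "customer_id": (i % 10) + 1} for i in range(30)],
    # Related to customers in the application only; no declared relationship.
    "audit_log": [{"id": i, "cust": (i % 10) + 1} for i in range(20)],
}

# Declared relationships: child table -> (foreign key column, parent table).
RELATIONSHIPS: Dict[str, Tuple[str, str]] = {
    "orders": ("customer_id", "customers"),
}


def subset(tables: Tables,
           relationships: Dict[str, Tuple[str, str]],
           root: str,
           keep: Callable[[Row], bool]) -> Tables:
    """Keep a fraction of the root table, then only the child rows that point at it."""
    result: Tables = {root: [r for r in tables[root] if keep(r)]}
    kept_ids = {r["id"] for r in result[root]}
    for child, (fk, parent) in relationships.items():
        if parent == root:
            result[child] = [r for r in tables[child] if r[fk] in kept_ids]
    return result


# Keep 20% of the customers and whatever the model says depends on them.
small = subset(tables, RELATIONSHIPS, "customers", lambda r: r["id"] <= 2)
```

The orders follow their customers because the relationship is declared. The audit_log table is not carried along at all, because nothing in the model says it depends on customers.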
Performance is cited as extracting 200GB from 1TB in about one (1) hour.