Delphix Cloud Economics for On-Premise



Cloud Economics

Many customers are examining how they can leverage public clouds for reduced costs and increased agility.

Public Clouds save costs through automation, higher server utilisation, lower storage costs and consumption based charging.

Public clouds are seen by many as a great way to secure expanded capacity to handle demand spikes for seasonal or overflow activities, allowing you to keep your on-premise IT resources trimmed to handle only Business as Usual (BAU) activity.

What about running BAU in the cloud? What is the cost per database in the cloud versus on-premise, factoring in database server, network, storage and license costs? Some customers have done this calculation, and the answer may surprise you.

Delphix Economics

Let’s examine how Delphix enables the same core cost saving measures as Cloud solutions.

1. Automation

Delphix automates the entire end-to-end process from data synchronisation to provisioning. These policy-driven, automated workflows can also include data masking and replication. End users can manage their own full-copy data sets but are limited to pre-allocated server resources to avoid consumption creep.

2. Server Utilisation

The two primary methods to increase server utilisation are density (increased concurrent instances) and hoteling (drop-in environments).

Delphix enables increased density by allowing you to run more database instances on the same database server. Customers have reported 30% increases in concurrent instances per server. Often a server is maxed out on memory, but Delphix allows you to reduce the memory allocation per instance without compromising performance, thanks to the caching effect of the Delphix shared cache.

Delphix enables hoteling by allowing an environment to be provisioned and de-provisioned quickly and automatically. This allows customers to pool existing siloed and implicitly underutilised database servers and share these servers between teams and across time zones. Test servers can be kept busier by supporting tests from different projects and sprints, simply by lifting the constraint on data provisioning.

Take a look at your current test server utilisation and imagine increasing it by 30% or more.
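As a rough sketch of the density arithmetic, here is the effect of trimming per-instance memory when a shared cache absorbs some of the read workload. All numbers here are illustrative assumptions, not measured figures:

```python
# Illustrative only: hypothetical server and SGA sizes showing how a
# shared cache lets you trim per-instance memory and raise density.

server_ram_gb = 256
sga_before_gb = 16     # dedicated buffer cache per instance, no sharing
sga_after_gb = 12      # reduced allocation, relying on the shared cache

instances_before = server_ram_gb // sga_before_gb   # 16 instances
instances_after = server_ram_gb // sga_after_gb     # 21 instances
uplift = instances_after / instances_before - 1

print(f"{uplift:.0%} more instances per server")    # ~31%
```

With these (hypothetical) numbers, a modest 25% reduction in per-instance memory yields roughly the 30% density uplift customers report.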

3. Storage

The best way to save storage costs is to avoid using storage altogether.   Delphix reduces non-production storage consumption by at least 70%.

Here's the math, assuming a compressed Delphix source copy at 50% of production size, shared by four non-production copies:

Compressed source copy = 50% of production size
Non-production copies = 4
Physical storage for 4 full copies = 400%

Delphix storage consumption = 50% / 400% = 1/8 = 12.5%
Storage saving = 100% - 12.5% = 87.5%
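The same arithmetic as a tiny sketch in code, with the 50% compression ratio and four copies taken as assumptions from the example above:

```python
def delphix_storage_saving(compressed_copy_ratio: float, num_copies: int) -> float:
    """Fractional storage saving versus keeping full physical copies.

    compressed_copy_ratio: size of the shared compressed copy as a
    fraction of production size (assumed 0.5 in the example above).
    num_copies: number of non-production copies required.
    """
    physical = 1.0 * num_copies           # e.g. 4 full copies = 400%
    delphix = compressed_copy_ratio       # one shared compressed copy
    return 1.0 - delphix / physical

saving = delphix_storage_saving(0.5, 4)
print(f"{saving:.1%}")  # 87.5%
```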

As with most of these calculations, your mileage will vary. In practice the savings are generally so substantial that you can use the highest-performing storage available, because you don't have to buy much of it. You then get the additional benefit of increased test throughput, allowing more tests for higher quality and shorter release cycles.

The Nirvana

The best solution for most customers will be an on-premise private cloud for BAU activity, linked to a public cloud, or two, for burst activity.

Delphix will make your private cloud as economical as a public cloud, with the added advantage of supporting true agile development for database-backed applications, i.e. the really important ones.

For public cloud, Delphix supports AWS and Azure. Delphix will seamlessly replicate data between on-premise and cloud, cloud to cloud, and cloud to on-premise, allowing you to switch workloads between clouds without worrying about moving data.


Delphix Secure Data Subset Replication


VLDB Data Subsets

Many Delphix customers are virtualising enormous Very Large Databases (VLDBs), with some tables running into tens of billions of rows. Occasionally these databases grow because there is no archiving and purging process in place, but increasingly the active working set itself is growing, driven by rising transaction levels and the consolidation of databases of record. The transactional databases of today are now substantially larger than the data warehouses of yesterday.

In general, with Delphix there is no requirement to provision data subsets, since Delphix virtual databases can be provisioned in a few minutes regardless of size and with minimal storage overhead. However, for some customers there is a requirement to provision only the latest few days, weeks or months of data for faster functional testing.

Delphix Selective Data Replication

Delphix Replication is used to provide resilience, scalability, data migration and data segregation. With Delphix Replication you can choose to replicate any set of virtual databases, and this is used to ensure that only secure, masked data is replicated to a second 'replica' Delphix Engine.

This same capability allows us to replicate a virtual database which has been trimmed to reduce it to a more manageable size, a subset if you will. For some databases, such as Oracle, dropping tables and deleting data does not release space at the datafile level, so you usually have to drop tablespaces and their corresponding datafiles, which is of course much easier if you have partitioned or transportable tablespaces.

What’s the Recipe?

The steps are relatively straightforward.

  1. Create a VDB on the primary Delphix Engine, using a configure clone hook to run a script that trims the VDB.
  2. Configure a masking job to execute after the trim process.
  3. Bounce the VDB once provisioned.
  4. Create a replication job with Selective Data Distribution enabled.
  5. Choose the trimmed and masked VDB to be replicated.
  6. Run the replication job.
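The trim script in step 1 depends entirely on your schema. As a sketch, assuming an Oracle VDB with date-partitioned history tables, the hook could generate and execute statements like those below. The table name, partition names and retention window are all hypothetical:

```python
from datetime import date

def trim_statements(table: str, partitions: list, cutoff: date) -> list:
    """Generate DROP PARTITION statements for partitions whose high value
    falls before the retention cutoff (hypothetical schema)."""
    return [
        f"ALTER TABLE {table} DROP PARTITION {name} UPDATE INDEXES"
        for name, high_value in partitions
        if high_value < cutoff
    ]

# Hypothetical partition map: (partition_name, high_value_date)
parts = [("P2016M12", date(2017, 1, 1)),
         ("P2017M01", date(2017, 2, 1)),
         ("P2017M02", date(2017, 3, 1))]

for stmt in trim_statements("TXN_HISTORY", parts, cutoff=date(2017, 2, 1)):
    print(stmt + ";")
```

In a real hook the generated statements would be run via sqlplus against the freshly provisioned VDB before the masking job executes.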

Additional advantages of this approach are that you require even less storage on the replica Delphix Engine, and you only have to mask the trimmed database, which makes the masking process proportionately faster in line with the reduction in size.

To Summarize

If you have substantial databases you can create full virtual copies of these in production for production support, reporting and ETL.  You can replicate these full virtual copies to a replica engine for full volume functionality and performance testing.

With Delphix Selective Data Distribution Replication you also have the option to replicate secure data subsets for faster functionality testing.

Since a Delphix Replication Engine can be located on-premise or on-cloud you have the choice of what to replicate, how much to replicate and where to replicate.


Delphix and Data Subsets



Who needs Data Subsets?

With Delphix there is generally no need for data subsets. If the reason for creating data subsets is to save storage and improve provisioning times, then Delphix Virtual Databases (VDBs) already address both. Occasionally customers want to use data subsets to improve the performance of tests involving table scans and hence want smaller tables; the most common examples are batch jobs and reports. For these tests one option is to use more restrictive WHERE clauses or partitioned tables, changes that can be hidden from the application using database views.

However, if you still want to proceed with data subsets, Delphix can help with that too.

Agile Data Subsets

Almost all data subsetting involves creating a standalone, relationally intact set of data, which means you need to understand the relationships between all the tables needed to support your tests. For many this is a huge stumbling block, and it is why the cost of creating subsets is high and the frequency of generation correspondingly low.

The good news is that it is much easier to create a data subset in a VDB, since you do not need to subset all the tables: all the tables are already present and none of them incur a storage overhead. You only need to subset the larger tables, usually the transaction and history tables involved in full table or slow scans.

For partitioned tables you can simply drop the unwanted partitions. For non-partitioned tables you can either delete rows or, more efficiently, create a replacement table containing only the desired rows.
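The replacement-table approach for the non-partitioned case can be sketched as generated SQL. Table and column names below are hypothetical; the CTAS-then-swap pattern is the point:

```python
def subset_table_sql(table: str, predicate: str) -> list:
    """Create a trimmed replacement table, then swap it in for the
    original (hypothetical Oracle-style statements)."""
    tmp = f"{table}_SUBSET"
    return [
        f"CREATE TABLE {tmp} AS SELECT * FROM {table} WHERE {predicate}",
        f"DROP TABLE {table} PURGE",
        f"RENAME {tmp} TO {table}",
    ]

for stmt in subset_table_sql("ORDERS", "ORDER_DATE >= DATE '2017-01-01'"):
    print(stmt + ";")
```

Because the VDB is a private read-write copy, indexes, grants and constraints on the replacement table can be recreated as needed without affecting anyone else.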

The only occasion when you need to factor in table relationships is when you are subsetting two tables where one has a dependency on the other. That is easiest done using foreign keys; if they do not already exist, you can create them for the purposes of subsetting. Since this is a VDB, a personal full read-write copy, you can manipulate it however you require. You do not have to factor in any other relationships.
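Where one subsetted table depends on another, the child table can be trimmed to match the parent subset with a single delete. A sketch, with hypothetical parent/child table and column names:

```python
def trim_child_sql(child: str, fk_col: str, parent: str, pk_col: str) -> str:
    """Delete child rows whose parent row no longer exists after
    subsetting (hypothetical table and column names)."""
    return (f"DELETE FROM {child} c WHERE NOT EXISTS "
            f"(SELECT 1 FROM {parent} p WHERE p.{pk_col} = c.{fk_col})")

print(trim_child_sql("ORDER_LINES", "ORDER_ID", "ORDERS", "ORDER_ID") + ";")
```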

Secure Data Subsets

If you started your data subsetting process using a pre-masked VDB, then your data subset is already secured. If not, you can run the same Delphix Masking job against your data subset VDB.

To cap it all

Having streamlined your data subsetting process, you can now create as many copies on demand as required, using your data subset VDB as a parent from which child VDBs can be created.

Once you have done all your tests against data subsets you can then run a full volume test against a copy of the original VDB.

Delphix for Performance Testing



There will be Change

Many customers ask whether it is possible to use Delphix for performance tuning and testing, and the answer is a definitive yes. It is a sensible question, because database files presented via NFS over Ethernet from a compressed, de-duplicated shared copy are quite different from a dedicated physical copy presented directly from a SAN over a Fibre Channel network.

Inevitably the performance will be different. Performance almost always differs between production and non-production environments, even for pre-production environments that are configured using the same deployment architecture and components as production.

Consistency is the key to Quality

For all performance testing you must first run a baseline test on the test infrastructure, and then ensure you can reproduce those results consistently when no changes are applied. The infrastructure must have a guaranteed quality of service, and the testing tool must be able to present the workload consistently. Consistent workloads are usually achieved using sequential regression test suites, batch jobs or automated testing tools that can present an identical workload repeatedly.

Tuning requires many Iterations

So we have established that you cannot measure the effects of a change unless you have a consistent environment on which to test that change in isolation. In addition, performance tuning is a highly iterative process: you should test only one change at a time, and each change needs to be validated not just for the intended improvement but also for any undesirable side effects. That's a lot of test runs, generally requiring automation of the environment set-up and automated restore of the data to a known point in time.

Delphix for Performance Tuning

Many customers use Delphix for performance tuning, but almost all are initially a little sceptical about baseline performance, assuming it will be worse than their current non-production physical storage. This is not the case.

On numerous occasions we have proven that if you deploy Delphix with the recommended storage requirements (IOPS and latencies) and network requirements (throughput and latencies), the performance of Delphix virtual databases will be the same as, and often better than, that of your physical storage.

So now the baseline performance of a virtual database will be similar to that of a physical database.  That’s a good start, let the iterations begin.

Oracle customers generally use AWR comparison reports to analyse the results of any tuning changes comparing the baseline AWR with the tuned run AWR.  Once satisfied they then run a performance regression test to ensure the candidate change does not regress performance for other code paths.

With Delphix, restoring the data to a known state and bookmarking data changes to pinpoint schema evolution are trivial and near instantaneous. Automating the entire environment and data provisioning for testing also enables higher-frequency testing, performed earlier in the release cycle, and higher test server utilisation.

To cap it all

Finally, once your tuning changes appear good to go, you will likely want to run a final live-like test in your pre-production environment. Delphix can provision a physical copy of a database directly from the candidate virtual copy to your physical storage. This is particularly useful if you have made substantial changes to the test database, such as data masking or an SAP refresh, and do not want to run that entire process again.

Delphix and Machine Intelligence


There are several examples of semi-automated workflows which lend themselves to using machine intelligence techniques to improve the precision of the automated components and reduce the final stage manual effort common in these workflows.

For example, the process of detecting money laundering requires anomaly detection: looking for peculiar or unusual activity that goes against the patterns of normal behaviour. Anti-Money Laundering software scans recent transactions to look for candidate anomalies, based on rules that are built and maintained by specialists. Any candidate anomalies identified then have to be investigated in person, and each of these investigations incurs a cost.

The goal of the workflow is to identify all suspect anomalies whilst minimising false positives, which incur unnecessary costs. The good news is that these systems have been in play for some time and have built a valuable library of data, including the original set of transactions for analysis, the subset of transactions previously identified as candidate anomalies, and the results of the manual investigations that identified actual money-laundering activity.

This library of data is ideal for use by machine learning algorithms which can be trained based on input and output data.   In this case we have two input and output stages.

Automatically provisioning each set of data from the library is a non-trivial task, in particular for the original set of transactions, which may number several million. Delphix is ideally suited to this task, as it can automatically collect, locate and provision sets of data on demand in minutes. For machine intelligence this means that any newly refined algorithm can be validated against the full history of past analyses, to ensure that the candidate algorithm performs as well as or better than before on all known sets of data.
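That validation step can be sketched in plain Python: replay a candidate detector over a historical, labelled transaction set and score it against the investigators' confirmed outcomes. The detector, data and threshold below are entirely hypothetical:

```python
def evaluate(detector, transactions, confirmed_ids):
    """Score a candidate detection rule against investigator-confirmed
    outcomes, returning (precision, recall)."""
    flagged = {t["id"] for t in transactions if detector(t)}
    true_pos = flagged & confirmed_ids
    precision = len(true_pos) / len(flagged) if flagged else 1.0
    recall = len(true_pos) / len(confirmed_ids) if confirmed_ids else 1.0
    return precision, recall

# Hypothetical historical set: two investigator-confirmed cases (ids 1, 3).
history = [{"id": 1, "amount": 9900}, {"id": 2, "amount": 120},
           {"id": 3, "amount": 9800}, {"id": 4, "amount": 50}]
confirmed = {1, 3}

candidate = lambda t: t["amount"] > 9000   # hypothetical structuring-style rule
precision, recall = evaluate(candidate, history, confirmed)
print(precision, recall)  # 1.0 1.0 on this toy set
```

In practice each historical set would be a full Delphix-provisioned copy of the transactions analysed at the time, so the candidate rule is validated against exactly the data the original investigation saw.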