Data Warehouse test concept

Testing a data warehouse system begins with the creation of a conscientious testing concept. What your test concept should include, which test methods and test tools are available to you, we show you below.

Create a meaningful test concept

To test your Data Warehouse system, there's no universal recipes. Every system and every project is individual and must therefore be tested individually. Sometimes there are technical conditions and sometimes it is just the budget that shows the limits. But in every case you should create a conscientious test concept from the beginning, using the following methods:

Unit tests

During development, individual pieces of code are tested for their correct functioning without regard to their dependencies. Thus can check, e.g. single data transformation tasks or even MDX formulas.

Integration tests

Do the individual components of the system work together properly? This can only be verified by providing them in a test environment and reviewing them across the board.

Data plausibility check

This process checks if the data is accurate and credible. By involving the business user you increase the value of this test method.

Acceptance testing

This type of testing has two objectives with respect to the user. On the one hand, it ensures that the data provided to the user meet the expectations. On the other hand, this ensures that the tools provided to the user meet the expectations.

Regression testing

Testing a function over and over again increases the certainty that no errors will creep in during development that affect an already tested part of the system. Adhering to short release cycles in agile projects is not possible without regression testing.

In addition, regression testing in the productive operation of the system can be very useful to regularly monitor the quality of the data, thereby providing an early warning system for any problems that occur.

This is how the life insurer tests Swiss Life

Swiss Life AG

Performance testing

This testing process is an important aspect of user acceptance and therefore can not simply be ignored. Does loading of data lead to a slowdown of the systems? Does the parallel queriy of data lead to a slowdown of the systems?

Specific tests for your data

Compare number of records

Compare the number of data records in the source systems with those in the data warehouse for a first overview.

Check business-keys

Which records are missing or too much respectively duplicated you only learn when you review business keys.

Validate source data

Do not allow erroneous data to be loaded into your data warehouse by checking that the data provided is error-free, even during staging.

Check scopes of validity

Would you load a customer record into your data warehouse that has a date of birth around 1850? Validate information that is subject to restrictions early.

Timeline tests

Be sure to check, if your implementation of historicized data in Slowly Changing Dimensions (SCD 2) is correct.
Are there any gaps? Are there any overlaps or date rotators?

Distribution tests

Classifiable data should be checked for their distribution. Were around 40% of your sales made on product group A and only 10% on product group B?

Check aggregates and KPI's

You can mathematically check business logic in the data warehouse or in analysis systems such as OLAP cubes and, if necessary, compare them with values from your source systems.

Plausibilize data

If you involve business users to your tests, you'll benefit of their knowledge. Check, for example, whether the November and December sales differ in a credible manner.

Performance testing

Define reference queries or record real queries from your users to constantly monitor their performance.

Sören Schubert talks about the test concept of Ifolor AG

Use Case DWH und ETL testen

BiG EVAL takes over the manual testing of a ETL-process and data warehouse components. This happens through purposeful checking and comparing of data from source systems, the data warehouse, staging areas and all other components. Just as you would do it manually - but just automated.