We recently worked on a project where the governance board demanded "You must have Unit Tests", and so we tried our best. What worked for us was to have each ETL solution start and end with a QA/Test package.

Anything unexpected discovered by these packages was logged into an audit table, and a Fail Package event was then raised to stop the entire job. We figured it was better to run with yesterday's good data than to risk reporting against possibly bad "today" data.

The starting package would do DB schema and data sanity checks. Data sanity involved checking for duplicate or missing data caused by a lack of referential integrity in the source systems. Schema checks ensured that any schema changes that did not get applied during continuous integration were detected.

The end package would check the results of any transformations:

- Comparing record counts between source and destination.
- Checking specific transforms (e.g. all date values changed to the appropriate SK value, all string values RTrimmed).
- Ensuring all SK fields were populated (-1 instead of nulls).

Most of these tests were SQL statements that used the built-in schema objects of our database, so they were not too onerous to create. In addition, as part of our development process we would create views that had the end result of any transformations we were doing. We would make use of these views to validate our package transformations.

Each of these checks created a record in our special audit table. That way we could provide a comprehensive list of all the tests and checks we had done on each run of the process to satisfy the governance people.

(We also had a separate set of packages that would unit test each QA test by creating dummy tables, populating them, running the test, then confirming the appropriate audit record was written. As Nick stated, this was a lot of work and of little real value.)

More precisely, testing isn't the problem; the problem is how to get reasonable test data. ETL is typically tested on production data. Aside from the security issue, the problem with production data is that it does not cover the functionality of the ETL sufficiently (typically about 40% of business rules aren't covered by a production data sample) and it takes too much time to process.
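The end-of-job QA package described here can be sketched roughly as follows. This is only an illustration under stated assumptions, not the packages we actually ran: it uses Python's built-in `sqlite3` instead of our ETL tool, and the table names (`src_orders`, `fact_orders`, `etl_audit`), the `date_sk` column, and the `QaCheckFailed` exception (standing in for the Fail Package event) are all made up for the example.

```python
import sqlite3


class QaCheckFailed(Exception):
    """Hypothetical stand-in for the 'Fail Package' event that stops the job."""


def build_demo_db():
    """Create dummy source, destination and audit tables (all names are assumptions)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE etl_audit (check_name TEXT, passed INTEGER, detail TEXT);
        CREATE TABLE src_orders (id INTEGER, customer TEXT, order_date TEXT);
        CREATE TABLE fact_orders (id INTEGER, customer TEXT, date_sk INTEGER);
        INSERT INTO src_orders VALUES (1, 'acme  ', '2020-01-01'), (2, 'bolt', NULL);
        INSERT INTO fact_orders VALUES (1, 'acme', 20200101), (2, 'bolt', -1);
    """)
    return conn


def scalar(conn, sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


def run_end_checks(conn):
    """Run each check, write one audit record per check, fail the job on any miss."""
    src = scalar(conn, "SELECT COUNT(*) FROM src_orders")
    dst = scalar(conn, "SELECT COUNT(*) FROM fact_orders")
    null_sk = scalar(conn, "SELECT COUNT(*) FROM fact_orders WHERE date_sk IS NULL")
    # Length comparison avoids relying on how the database compares padded strings.
    untrimmed = scalar(
        conn,
        "SELECT COUNT(*) FROM fact_orders "
        "WHERE LENGTH(customer) <> LENGTH(RTRIM(customer))",
    )

    results = [
        ("row_count_matches", src == dst, f"source={src}, destination={dst}"),
        ("date_sk_populated", null_sk == 0, f"null date_sk rows={null_sk}"),
        ("customer_rtrimmed", untrimmed == 0, f"untrimmed rows={untrimmed}"),
    ]

    # Every check is audited, pass or fail, so the full list can be reported later.
    conn.executemany(
        "INSERT INTO etl_audit (check_name, passed, detail) VALUES (?, ?, ?)",
        [(name, int(passed), detail) for name, passed, detail in results],
    )
    conn.commit()

    failed = [name for name, passed, _ in results if not passed]
    if failed:
        raise QaCheckFailed(", ".join(failed))
    return results
```

Calling `run_end_checks(build_demo_db())` writes three audit rows and returns normally. Adding a destination row with a null `date_sk` makes the next run raise `QaCheckFailed` after the audit rows are written. Populating dummy tables, running the checks, and then inspecting `etl_audit` is also roughly what the "unit test each QA test" packages amounted to.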