by Harry M. Sneed, ANECON GmbH, Vienna, Harry.Sneed@T-Online.de

Abstract: This contribution is an experience report on system testing, in particular on the testing of a datawarehouse system. Datawarehouses are large databases used solely for querying and reporting purposes. The datawarehouse in question here was dedicated to fulfilling the reporting requirements of the BASEL-II agreement on the provision of auditing data by banks, the European equivalent of the Sarbanes-Oxley Act. The purpose of the testing project was to prove that the contents of the datawarehouse are correct in accordance with the rules specified to fill them. In the end, the only way to achieve this was to rewrite the rule specifications in a machine-readable form and to transform them into post assertions, which could be interpreted by a data verification tool for comparison of the actual data contents with the expected data contents. The testing project was never fully completed, since not all of the rules could be properly transformed.

Keywords: System Testing, Datawarehouse Testing, Data Transformation Rules, Post-Condition Assertions, Formal Verification.

expect. They might even scan selected database contents to see how they are affected by the test transactions. If any reports are generated, they will trigger the jobs to produce them and check their contents. If there are any doubts about the results, they will consult with the analysts or with the end users. The checking of the output is done on a spot-check basis. Through intuition, or by means of expert domain knowledge, the tester is able to interpret what is correct and what is not. At the center of conventional system testing is the concept of a use case. A system is considered to be functionally tested when all of its use cases, with all of their usage variants, have been tested. The use case is also the basis for defining test cases [1].
One use case may have several test cases, one for each alternate path through the use case. Seldom are all of the possible results checked; to do so would require too much time and effort. This conventional system testing approach has been well defined in the pertinent testing literature [2]. Sometimes it is recommended to automate the filling and checking of the user interface with some kind of capture/replay tool to expedite the test [3], and sometimes it is recommended not to, since automation only clouds the issue and diverts the tester from his responsibility for the accuracy of the test [4]. ANECON had always used the conventional test approach before, and it was believed that a similar approach would work for a datawarehouse project; the problem was only one of finding enough testers to run the jobs and check the results. This belief turned out to be false.
Proceedings of the Testing: Academic & Industrial Conference Practice And Research Techniques (TAIC PART'06) 0-7695-2672-1/06 $20.00 2006
IEEE
instance of the entity being described. Each instance of an entity, i.e. each line of a table, must be uniquely identifiable and distinguishable from all other lines of that table. For that purpose, one or more columns serve as a unique identifier. In the datawarehouse in question there were 266 tables with more than 11,000 attributes, giving an average of some 40 attributes per table. Each attribute was to have a rule describing how that attribute is derived. Attributes can be taken from operational data, they can be computed, or they can be set to a constant value. How this is to be accomplished is described in the attribute rule. For this datawarehouse, the input data was delivered from the various bank subsidiaries scattered throughout Eastern Europe. Since the local bank analysts were the only ones familiar with their data, it was up to them to provide a data model of their operational data together with an English-language description of the individual data elements. These models were merged by the central bank analysts to map the operational data onto the attributes of the datawarehouse. This led to the so-called mapping rules, of which there were, in the end, some 5,317. The goal of datawarehouse testing is to demonstrate that the contents of the datawarehouse are correct. To achieve this means checking the attributes against their rules, i.e. comparing actual values with expected values based on the mapping rules. This could be done with a random sample of all attributes, with a subset of critical attributes, or with the complete set of attributes [5]. As was discovered, the manual effort of checking even a small subset of attributes is so great that even statistical testing becomes very expensive. On the other hand, if the rule verification is done automatically, then it might as well be done for all the attributes, since there is no additional price to pay.
they were. The remainder had to be reformulated in a semi-formal syntax. The syntax was as follows:

assign <Filename>.<Attribute>
   for a 1:1 assignment of an input value
assign <constant>
   for a 1:1 assignment of a constant value
assign <constant>!<constant>
   for a set of alternate values
assign Table.Attribute | <constant> | Table.Attribute_n
   for a concatenation assignment
assign join Table_A.Attribute_A1 | Table_A.Attribute_A2 with Table_B.Attribute_B1 | Table_B.Attribute_B2
   for concatenating values from different data sources
assign Table_A.Attribute_A1 + Table_B.Attribute_B1 * 2;
   for arithmetic expressions. There was no nesting of clauses and there were no precedence rules, so the arithmetic expression was resolved in a simple left-to-right sequence
assign Function (Param_1, Param_2, Param_3)
   whereby the parameters could be attributes in source input files, e.g. Table_C.Attribute_C1, or constant values, e.g. 10.5

With these assignment expressions, enhanced by an if expression of the form

if (Table_A.Attribute_A1 <oper> <operand>)

whereby <oper> := =, <, >, <=, >=, <> etc. and <operand> := <Table>.<Attribute> or <constant>, more than 4,700 rules could be resolved automatically and converted into post-condition assertions, which could then be tested. Of these rules, some 750 were manually adjusted, requiring more than 150 hours of effort. That not more mapping rules were adjusted was due not only to the limited time available for the test, but also to the informal nature of the rules. Some rules were formulated in a manner so confusing that it defied reformulation; there was simply no way to formalize them. It was a fault of the project that the rules were not properly defined to begin with. Had they been formulated at least in some semi-formal form, it would have been possible to process them automatically from the start, without having to spend valuable tester time in rewriting them. (see sample 2)
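To illustrate how such a semi-formal syntax can be parsed mechanically, the following sketch handles only the three simplest rule forms above: a 1:1 attribute assignment, a set of alternate constants, and a constant value. This is a hypothetical illustration in Python, not the project's actual converter, and all names in it are invented.

```python
import re

# Illustrative sketch only: a parser for the three simplest rule forms.
# The real converter also handled concatenation, joins, arithmetic,
# functions and if clauses.

def parse_rule(rule: str):
    body = rule.strip().removeprefix("assign").strip()
    # assign <Table>.<Attribute>  -> 1:1 assignment of an input value
    m = re.fullmatch(r"([A-Za-z]\w*)\.(\w+)", body)
    if m:
        return ("copy", m.group(1), m.group(2))
    # assign <constant>!<constant>!...  -> set of alternate values
    if "!" in body:
        return ("one_of", [v.strip().strip("'") for v in body.split("!")])
    # assign <constant>  -> 1:1 assignment of a constant value
    return ("constant", body.strip("'"))
```

A full converter would add clauses for the concatenation, join, arithmetic and function forms, each mapped to the corresponding post-condition assertion.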
output data. It has been used in previous projects for selective regression testing [7]. Basically, it compares the values of attributes in a new database or new outputs with the values which existed for the same attributes in a previous database or output. For every entity, i.e. database table, file or report, a set of assertions is written, associating the new attributes with the old ones or with manipulations on the old ones. The assertions are of the form

assert new.Attr_1 = old.Attr_2;
assert new.Attr_2 > old.Attr_3;
assert new.Attr_4 < old.Attr_4;

Instead of comparing a new value with an old value, it was also possible to compare a new value with a constant, a set of constants, or a computed value, as depicted below:

assert new.Attr_2 = 100.50;
assert new.Attr_2 = A ! B ! C ! D;
assert new.Attr_2 = old.Attr_2 + 100 / 2;

For this project, the assert statement was extended to include concatenations:

assert new.Attr_3 = old.Attr_3|-|old.Attr_4;

Any assertion could become conditional by qualifying it with an if clause of the form

assert new.Attr_4 = old.Attr_5 if (old.Attr_5 > 100 & old.Attr_5 < 200);

There could be any number of and conditions. Logical or conditions were not allowed; they are expressed in another form, namely by assigning different assertions to the same attribute:

assert new.Attr_5 = old.Attr_6;
assert new.Attr_5 < 100;
assert new.Attr_5 > 200;

If the value of Attr_5 fulfills any of these assertions, it is considered to be correct. The assertions are grouped together into assertion procedures, one per entity, and qualified by a key condition. The key condition matches the keys of the new or output entity with those of the old or input entity:

if (new.key_1 = old.key_1 & new.key_2 = old.key_2 ...);
   assert new.Attr_1 = old.Attr_2;
   assert new.Attr_2 = 0;
   assert new.Attr_3 = old.Attr_3 + 5;
end;

For XML and WSDL, there can be several entity types included in any one output report or response.
Therefore, there must be a separate set of assertions for every entity type. These assertions are qualified by the object name:

if (object = This_object & new.key_1 = old.key_1)
   assert new.Tab.Attr = old.Tab.Attr;
end_object;
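The "or by repetition" semantics described above, where several assertions on the same attribute are alternatives and a value counts as correct if any one of them holds, can be sketched as follows. This is an illustration of the checking logic in Python, not the actual tool; the predicate representation is an assumption.

```python
# Each assertion on an attribute becomes a predicate over the new value
# and the matching old record; the value is correct if ANY predicate holds.

def check_attr(new_val, old_row, predicates) -> bool:
    return any(p(new_val, old_row) for p in predicates)

# assert new.Attr_5 = old.Attr_6;
# assert new.Attr_5 < 100;
# assert new.Attr_5 > 200;
attr_5_asserts = [
    lambda v, old: v == old["Attr_6"],
    lambda v, old: v < 100,
    lambda v, old: v > 200,
]
```

With this representation, a value of 150 passes only if the old record's Attr_6 is also 150; values below 100 or above 200 pass unconditionally.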
A comparison job is started after every test, which compiles the assertions into internal symbol tables and then processes the database tables or output files one by one, comparing the content of each asserted attribute against the result of its assertion. Attributes which do not comply with their assertions are reported as incorrect. In this way, the results of a test can be validated automatically, without having to scan through and check them manually. This method is both faster and more reliable. A typical assertion procedure is depicted among the samples at the end. (see sample 3)
a key condition table with the names and types of the keys to be matched;
a comparison table assigning which new attributes are to be compared with which old attributes and/or constants;
a constant table containing an entry for each constant value used as an operand in the assertions;
a condition table containing all of the conditions to be fulfilled for the assertions to be executed. A pointer links the conditions to the assertions to which they apply.
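These four tables can be pictured roughly as the following data structures. This is a hypothetical Python reconstruction; only the roles of the tables come from the text, while all field names are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical reconstruction of the compiler's internal symbol tables.

@dataclass
class KeyCondition:           # which new key is matched against which old key
    new_key: str
    old_key: str
    key_type: str             # e.g. "numeric" or "alpha"

@dataclass
class Condition:              # a single if-clause condition
    operand: str              # e.g. "old.Attr_5"
    operator: str             # one of =, <, >, <=, >=, <>
    value: str                # attribute name or constant

@dataclass
class Comparison:             # one assertion: new attribute vs expected value
    new_attr: str
    expected: str             # old attribute, constant or expression
    condition_ids: List[int] = field(default_factory=list)  # pointer into the condition table
```

The constant table would simply map constant identifiers to their values; it is omitted here.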
DataVal is the final tool in the set. After reading the symbol tables, it first processes the old CSV file, i.e. the inputs, and stores the values in a temporary database with their keys as an index. It takes the attribute tags from the first line of the CSV file and subsequently counts the columns to associate the values with the tags. It then processes the new CSV file, i.e. the outputs, and matches each new record by key to an existing old record. If a match is found, the contents of the new record are compared with the values of the old record, or with the constant in the symbol table, or with computed values, or with concatenated values, or with a set of alternate constant values, or with the lower and upper bounds of a range, depending on the conditions. Thus, there are many ways to verify the correctness of an output value. If no match is found, the old record is considered to be missing. After processing all new records, a second search is made of all the old records in the temporary database to see whether they were compared or not. If not, they are considered to be missing in the new file. A protocol lists all of the incorrect data values, i.e. those that do not comply with their assertions, as well as all missing records. A set of test statistics summarizes the percentages of missing records and incorrect attributes. (see sample 7)
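The key-matching core of this process can be sketched in a few lines. The following is a simplified Python illustration, not the actual DataVal tool; the function names and the shape of the assertions parameter are invented, and only plain equality assertions against values derived from the old record are covered.

```python
import csv

def load_keyed(path, key_cols):
    """Read a CSV file and index its rows by the given key columns."""
    with open(path, newline="") as f:
        return {tuple(row[k] for k in key_cols): row
                for row in csv.DictReader(f)}

def verify(old_path, new_path, key_cols, assertions):
    """assertions maps a new attribute name to a function that computes
    the expected value from the matching old record."""
    old = load_keyed(old_path, key_cols)
    errors, missing, matched = [], [], set()
    with open(new_path, newline="") as f:
        for new_row in csv.DictReader(f):
            key = tuple(new_row[k] for k in key_cols)
            old_row = old.get(key)
            if old_row is None:
                missing.append(("old", key))   # no input record for this output
                continue
            matched.add(key)
            for attr, expected in assertions.items():
                if new_row[attr] != expected(old_row):
                    errors.append((key, attr, new_row[attr]))
    # old records never matched are missing in the new file
    missing += [("new", k) for k in old if k not in matched]
    return errors, missing
```

A protocol and the test statistics described above would then be produced by iterating over the errors and missing lists and computing the percentages.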
Step 3: The SQL procedures were automatically generated from the assertions by the GenSQL tool.
Step 4: The input attributes for each target datawarehouse table were selected from the input tables, joined and downloaded into a CSV file.
Step 5: The output attributes for each target datawarehouse table were downloaded into a CSV file.
Step 6: The assertion procedures were compiled by the tool AsrtComp.
Step 7: The input and output CSV files were matched and the contents of the output file verified against the post assertions by the tool DataVal.
Step 8: The testers examined the data validation results and reported any errors.
contents have to be verified against the specifications. The challenge here is to provide a specification language which will accommodate both goals. Once the mapping rules have been specified, it is the job of the tools to run the tests and verify the database contents. The role of the tester can be compared to that of an engineer on a robot assembly line, monitoring the work of the robots and only intervening when something goes wrong. For this, he should understand the function of the robots without having to do the work himself. Such is the case in datawarehouse testing. In the datawarehouse described here, only 88% of the attributes were actually tested, because 12% of the rules were not verifiable. Nevertheless, those rules that could be verified were verified, and more than 200 incorrect values were identified. This project is a good example of improvising to make the best of a bad situation. It is always difficult to assess the success of a test project. The only objective way of doing it is to compare the errors found in testing with the errors which come up later in production. Since this datawarehouse system has yet to go into production, it is impossible to know how many errors might come up there. If they do, it will be because of incomplete and inconsistent rules. The specification language problem remains the source of most software system errors, and in the case of datawarehouse systems particularly so. It is here where academia could make a significant contribution.
[Figure: Data validation process. Source data and target data are compared by an assertion procedure of the form: if (new.key = old.key) assert new.Attribut = old.Attribut if (<condition>); assert new.Attribut = old.Attribut + wert*wert; The tester reports the errors into error reports.]
References:
[01] Bach, J.: "Reframing Requirements Analysis", IEEE Computer, Vol. 32, No. 6, 2000, p. 113
[02] Hutcheson, M.: Software Testing Fundamentals, John Wiley & Sons, Indianapolis, 2003, p. 12
[03] Fewster, M. / Graham, D.: Software Test Automation, Addison-Wesley, New York, 1999, p. 248
[04] Kaner, C. / Bach, J. / Pettichord, B.: Lessons Learned in Software Testing, John Wiley & Sons, New York, 2002, p. 111
[05] Dyer, M.: "Statistical Testing", in The Cleanroom Approach to Quality Software Development, John Wiley & Sons, New York, 1992, p. 123
[06] Taylor, R.: "Assertions in Programming Languages", in Proc. of NCC, Chicago, 1978, p. 105
[07] Sneed, H.: "Selective Regression Testing of a Host to DotNet Migration", submitted to ICSM 2006, IEEE Computer Society, Philadelphia, Sept. 2006
[08] Koomen, T. / Pol, M.: Improving the Test Process, John Wiley & Sons, London, 1999, p. 7
In the end, a report came out with the data errors, which were then fed into the error tracking system by the testers. Once the rules had been reformulated, the whole testing process could be repeated within a day. Normally such a test cycle would require at least 10 days, so the automation led in this case to a significant improvement in test productivity.
Samples:
Sample 1: An extract from the rule table before the rule has been converted
DR_INTEREST_ID;"Link to TB0_ACCOUNT_INTEREST. Debit interest conditions applicable to the account.";
If REIACD in field DICT (debit) has a value other than 0, the account is linked to an interest group. The following then applies:
REINTD / KEY (Position 3-4) (Interest Type) 2 Alpha
REINTD / KEY (Position 5-9) (Interest Subtype) 5 Alpha
REINTD / KEY (Position 10-12) (Currency) 3 Alpha
The above key fields are concatenated in ID.
If in REIACD the DICT values are zeroes, the account interest condition has to be extracted from the ACCNTAB:
ACCNTAB / DRIB (Debit Base Rate Code)
ACCNTAB / DRIS (Debit Spread Rate)
If both of <> 0 value, extract ACCOUNT_ID
If ACCNTAB / DRIS is available (<> 0), extract the same as for ACCOUNT_ID
If only DRIB of <> value, extract DRIB
Sample 2: An extract from the rule table after the rule has been converted
DR_INTEREST_ID;"Link to TB0_ACCOUNT_INTEREST. Debit interest conditions applicable to the account.";
" ? assign REIACD/DICT | REIACD/DCST | ACCNTAB/CCY | 'D' if REIACD/DICT (debit) <> '0',
assign ACCNTAB/CONCAT if ACCNTAB/DRIS <> '0',
assign ACCNTAB/CCY|ACCNTAB/DRIB if ACCNTAB/DRIB <> '0',
assign '*nomap*' if REIACD/DICT = '00' and ACCNTAB/DRIS = '0' and ACCNTAB/DRIB = '00',
assign ACCNTAB/CNUM|CCY|ACNO|BRCA if other.
(18-digit account Id made up of CNUM length 6, leading zeros length 4, leading zeros + ACSQ length 2, leading zeros + BRCA concatenated).";