
Topology is used to define a complete representation of the information system: the data sources and schemas (source and target) and their connection information, such as user name and password. It contains the Physical Architecture (technologies and agents), Contexts, the Logical Architecture, Languages, Repositories and Generic Actions.

- Agents are used to carry out the integration tasks at run time.
- Languages specify the keywords that exist for each technology.
- Actions are used to generate data definition language (DDL) scripts.
- ODI connects to a data server by using JDBC or JNDI (example connection settings are sketched below, after these notes).
- A logical schema is a single alias for different physical schemas that have similar data structures and are based on the same technology, but live in different contexts.
- A context is an ODI object that represents a "situation" in which a similar group of resources appears; a context maps logical resources onto their implementations as physical resources.
- The agent orchestrates the entire integration process. It carries out data transformations by sending generated code to the relevant technologies: SQL statements to a data server, shell commands to the operating system, or even Simple Mail Transfer Protocol commands to an email server.

Typical setup steps:

1. Create the Data Server.
2. Create the Physical Schema.
3. Create the Contexts.
4. Create the Logical Architecture (logical schemas).
5. Create the Project. A project contains folders (with packages, interfaces and procedures), variables, sequences, user functions, KMs and markers.
6. Create a Folder and import the KMs.
7. Create a Model and import tables using reverse-engineering. A model is a group of several datastores.

Knowledge Modules (KMs)

A KM is a code template containing the sequence of commands necessary to implement a data integration task. Types of KMs:

- RKM (Reverse-Engineering): retrieves the structure of a data model from a database. It is needed only for customized reverse-engineering.
- LKM (Loading): assembles data from source datastores into the staging area.
- CKM (Check): checks data in a datastore for errors, either statically or during an integration process.
- IKM (Integration): uses a given strategy to populate the target datastore from the staging area.
- JKM (Journalizing): sets up a system for changed data capture to reduce the amount of data that needs to be processed.
- SKM (Data Services): deploys data services that provide access to data in datastores.
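Referring back to the JDBC note above, this is the kind of connection information typically entered when defining an Oracle data server in the Physical Architecture. The driver class and URL format are the standard Oracle thin-driver values; the host, port and service name below are placeholders, not values from these notes:

JDBC driver: oracle.jdbc.OracleDriver
JDBC URL:    jdbc:oracle:thin:@//dbhost:1521/ORCL   (hypothetical host, port and service)
User / password: the database account that owns, or can reach, the source or target schema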
For flat files, the required KMs are LKM File to SQL (file as a source) and IKM SQL to File Append (file as a target).

KMs typically imported for an Oracle project:

- CKM Oracle
- IKM Oracle Incremental Update
- IKM SQL Control Append
- LKM File to SQL
- LKM SQL to Oracle (an LKM runs before integration and moves data from the source to the staging area; only one LKM is used to move data from a given source to the staging area)

IKM Oracle Incremental Update

DESCRIPTION:
- Integrates data into an Oracle target table in incremental update mode.
- Inexistent rows are inserted; already existing rows are updated.
- Data can be controlled. Invalid data is isolated in the Error Table and can be recycled.
- When using this module with a journalized source table, it is possible to synchronize deletions.

REQUIREMENTS:
- The Update Key defined in the interface is mandatory.

RESTRICTIONS:
- When working with journalized data, if "Synchronize deletions from journal" is executed, the deleted rows on the target are committed.
- The TRUNCATE option cannot work if the target table is referenced by another table (foreign key).
- The FLOW_CONTROL and STATIC_CONTROL options call the Check Knowledge Module to isolate invalid data (if no CKM is set, an error occurs). Both options must be set to NO when an integration interface populates a TEMPORARY target datastore.
- The FLOW_TABLE_OPTION option is set by default to NOLOGGING. Set it to whitespace if the interface runs on an Oracle 7 database.
- Deletes are committed regardless of the COMMIT option.
- The ANALYZE_TARGET option will evaluate correct statistics only if COMMIT is set to Yes. Otherwise, the IKM will gather statistics based on old data.
- The default UPDATE option is TRUE, which means it is assumed by default that there is at least one non-key column specified in the target datastore.

IKM SQL Control Append

DESCRIPTION:
- Integration Knowledge Module.
- Integrates data into any ISO-92 compliant database target table in truncate/insert (append) mode.
- Data can be controlled. Invalid data is isolated in the Error Table and can be recycled.

RESTRICTIONS:
- When working with journalized data, if "Synchronize deletions from journal" is executed, the deleted rows on the target are committed regardless of the COMMIT option.
- The TRUNCATE option is not implemented on all RDBMSs.
- The TRUNCATE option cannot work if the target table is referenced by another table (foreign key).
- When using the RECYCLE_ERRORS option, you have to set an Update Key for your interface.
- When using this module with a journalized source table, data are automatically filtered to exclude source deletions.
- The FLOW_CONTROL and STATIC_CONTROL options call the Check Knowledge Module to isolate invalid data (if no CKM is set, an error occurs). Both options must be set to NO when an integration interface populates a TEMPORARY target datastore.
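To make the incremental update strategy concrete, here is a minimal, hand-written sketch of the kind of statements such an IKM produces against the flow (I$) table. The table and column names (ODI_WORK.I$_CUSTOMER, TARGET.CUSTOMER, CUST_ID as the update key) are hypothetical; the real statements are generated from the KM template and also handle flagging, flow control and commit options.

-- update rows that already exist in the target (matched on the update key)
update TARGET.CUSTOMER t
set    (CUST_NAME, CUST_CITY) =
       (select i.CUST_NAME, i.CUST_CITY
        from   ODI_WORK.I$_CUSTOMER i
        where  i.CUST_ID = t.CUST_ID)
where  exists (select 1 from ODI_WORK.I$_CUSTOMER i where i.CUST_ID = t.CUST_ID);

-- insert rows from the flow table that do not yet exist in the target
insert into TARGET.CUSTOMER (CUST_ID, CUST_NAME, CUST_CITY)
select i.CUST_ID, i.CUST_NAME, i.CUST_CITY
from   ODI_WORK.I$_CUSTOMER i
where  not exists (select 1 from TARGET.CUSTOMER t where t.CUST_ID = i.CUST_ID);

commit;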

LKM SQL to Oracle

DESCRIPTION:
- Loading Knowledge Module.
- Loads data from any ISO-92 database to an Oracle target database.
- This module uses the ODI agent to read the selected data from the source database and write the result into the Oracle target temporary table, which is created dynamically.
- When using this module on a journalized source table, the journalizing table is first updated to flag the records consumed, and is then cleaned of these records at the end of the interface.

RESTRICTIONS:
- The WORK_TABLE_OPTIONS option should be cleared for an Oracle 7 database.

CKM Oracle

DESCRIPTION:
- Check Knowledge Module for Oracle.
- This module controls the validity of the constraints of a datastore and rejects the invalid records into an error table. It can be used for static controls as well as flow controls.
- This module creates a non-unique index on the I$ table before checking AKs and PKs, and an index on the E$ table before removing erroneous records from the I$ table.

RESTRICTIONS:
- Data cleansing can be performed only if an update key is defined on the controlled table.
- This Knowledge Module uses the Oracle ROWID column for data cleansing.

OPTIONS (refer to the option descriptions for more information on each option):
- DROP_ERROR_TABLE: when this option is set to YES, the error table is dropped each time a control is performed on the target table. This means that any rejected records identified and stored during previous control operations are lost.

Other notes:

- Global KMs can be used by all projects instead of only one project.
- An interface is an ODI object that loads one target datastore with data from one or more data sources.
- A data set represents the data flow coming from a group of datastores. Several data sets can be merged into the interface target datastore by using set-based operators such as UNION and INTERSECT.
- The staging area is a separate, dedicated area in an RDBMS where ODI creates its temporary objects and executes some of the transformation rules.
- Execute location: how a rule is implemented depends on the technology of its execution location.

Procedure to create the sequence:

CREATE SEQUENCE "SRCODI"."SEQ_FAMILY_ID"
  MINVALUE 1
  MAXVALUE 9999999999999999999999999
  INCREMENT BY 1
  START WITH 1
  NOCACHE ORDER NOCYCLE;

Common KM options:

- Insert/Update: should data be inserted/updated in the target?
- Commit: should the interface commit the insert/update?
- Flow Control: should data in the flow be checked?
- Static Control: should data in the target be checked after the execution of the interface?
- Truncate
- Delete All
- Delete Temporary Objects

Integration strategies: Incremental Update (insert and update), Append (insert only).

CDC (Changed Data Capture)
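Once the sequence exists it can be referenced in the mapping expression of the target key column, typically as SRCODI.SEQ_FAMILY_ID.NEXTVAL executed on the target. A minimal sketch of the resulting load, assuming hypothetical SRC_FAMILY and TRG_FAMILY tables and a FAMILY_NAME column (only the sequence name comes from the notes above):

insert into TRG_FAMILY (FAMILY_ID, FAMILY_NAME)
select SRCODI.SEQ_FAMILY_ID.NEXTVAL, FAMILY_NAME
from   SRC_FAMILY;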

The JKM Oracle Simple knowledge module is required.

1. Right-click the table --> CDC --> Add to CDC.
2. Right-click the table --> CDC --> Start Journal.
3. Change the data in the table.
4. Right-click the table --> CDC --> Journal Data to view the captured changes.

Substitution Methods

<%=odiRef.method_name(parameters)%>
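As an illustration of this substitution-method syntax, here is a simplified insert step written in the style of an IKM template. getTable and getColList are standard odiRef methods, but the exact pattern below is only a sketch of how a KM typically assembles the target table name and column lists, not code copied from a shipped KM:

insert into <%=odiRef.getTable("L", "TARG_NAME", "A")%>
(
  <%=odiRef.getColList("", "[COL_NAME]", ",\n  ", "", "INS")%>
)
select
  <%=odiRef.getColList("", "[EXPRESSION]", ",\n  ", "", "INS")%>
from <%=odiRef.getTable("L", "INT_NAME", "A")%>

At generation time the agent resolves each <%= ... %> tag against the interface metadata, producing the concrete SQL that is sent to the target technology.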

Flow Control and Static Control


Flow Control

- If enabled, this option uses the selected CKM and applies it before loading into the target, thus preventing bad data from being loaded.

What actually happens in this flow is the following. After the data is loaded into the I$ table:

1. A check table (SNP_CHECK_TAB) is created, and the previous error table and previous errors are deleted, as ODI generally does.
2. A new error table is created, and the flow is checked for primary key and unique constraints, for other constraints and conditions defined at the database or model level (ODI conditions), and for a NOT NULL check on each column marked as not null.
3. Records that violate these constraints and conditions are added to the E$ table, and an entry is added to SNP_CHECK_TAB with information about the schema, the error message, the error count, and so on.
4. Finally, the remaining records are inserted and updated according to the KM and its logic.
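As a rough illustration of one of those checks, this is the general shape of the statement a CKM might generate to isolate primary-key duplicates from the flow table into the error table. All object and column names here (ODI_WORK.I$_CUSTOMER, E$_CUSTOMER, CUST_ID, the error columns) are hypothetical; the real statement is generated from the CKM template:

insert into ODI_WORK.E$_CUSTOMER (ERR_TYPE, ERR_MESS, ROW_ID, CUST_ID, CUST_NAME)
select 'F', 'Primary key PK_CUSTOMER violation', i.rowid, i.CUST_ID, i.CUST_NAME
from   ODI_WORK.I$_CUSTOMER i
where  i.CUST_ID in (select CUST_ID
                     from   ODI_WORK.I$_CUSTOMER
                     group by CUST_ID
                     having count(*) > 1);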

E$ Table

In the E$ table, the full set of columns of the rejected record is recorded, together with the error message (including the error count and the constraint name) and the associated ROW_ID.

SNP_CHECK_TAB

As described above, when errors are detected, SNP_CHECK_TAB receives an entry with information about the schema, the error message, the error count, and so on.

Static Control

- If enabled, this option uses the selected CKM and applies it after loading into the target.

The static and flow control options are both used to check the integrity of the data. The STATIC option checks the integrity of the records that already exist in a datastore and moves the error records into the error table. For example, if you have a datastore A containing 1,000 records and want to audit those existing records, you can select the STATIC option. FLOW control, on the other hand, checks the integrity of the new records: you use it to check the data before it is loaded into the target table. In this way you can ensure that every record newly inserted into the target table has been validated as good, while bad records are moved into the error table. You can define any number of constraints on your target table, and those constraints are validated at execution time.
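To make the difference tangible, here is a minimal sketch of the same NOT NULL rule applied in both modes; the table names (TARGET.CUSTOMER, ODI_WORK.I$_CUSTOMER) are hypothetical:

-- static control: the rule is checked against rows already sitting in the target table
select count(*) from TARGET.CUSTOMER where CUST_NAME is null;

-- flow control: the same rule is checked against the incoming flow (I$) table,
-- before anything reaches the target
select count(*) from ODI_WORK.I$_CUSTOMER where CUST_NAME is null;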

Introduction to Data Integrity Control


Data integrity control is essential in ensuring the overall consistency of the data in your information system's applications. Application data is not always valid for the constraints and declarative rules imposed by the information system. You may, for instance, find orders with no customer, or order lines with no product, and so forth. Oracle Data Integrator provides a working environment to detect these constraint violations and to store them for recycling or reporting purposes. There are two different types of controls: Static Control and Flow Control. We will examine the differences between the two.

Static Control

Static Control implies the existence of rules that are used to verify the integrity of your application data. Some of these rules (referred to as constraints) may already be implemented in your data servers (using primary keys, reference constraints, etc.). With Oracle Data Integrator, you can enhance the quality of your data by defining and checking additional constraints, without declaring them directly in your servers. This procedure is called Static Control since it allows you to perform checks directly on existing - or static - data.

Flow Control

The information systems targeted by transformation and integration processes often implement their own declarative rules. The Flow Control function is used to verify an application's incoming data according to these constraints before loading the data into these targets. The flow control procedure is detailed in the "Interfaces" chapter.

Benefits

The main advantages of performing data integrity checks are the following:

- Increased productivity by using the target database for its entire life cycle. Business rule violations in the data slow down application programming throughout the target database's life cycle. Cleaning the transferred data can therefore reduce application programming time.
- Validation of the target database's model. The rule violations detected do not always imply insufficient source data integrity. They may reveal a degree of incompleteness in the target model. Migrating the data before an application is rewritten makes it possible to validate a new data model while providing a test database in line with reality.
- Improved quality of service for the end users.

Ensuring data integrity is not always a simple task. Indeed, it requires that any data violating declarative rules must be isolated and recycled. This implies the development of complex programming, in particular when the target database incorporates a mechanism for verifying integrity constraints. In terms of operational constraints, it is most efficient to implement a method for correcting erroneous data (on the source, target, or recycled flows) and then to reuse this method throughout the enterprise.
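On the recycling point: conceptually, rows that were isolated in the error table and later corrected are fed back into the flow on a subsequent run, which is what the RECYCLE_ERRORS option mentioned earlier automates. A rough, hand-written sketch of that idea, with hypothetical table and column names:

-- feed previously rejected rows from the error table back into the flow table,
-- so the next execution re-checks them and loads the ones that are now valid
insert into ODI_WORK.I$_CUSTOMER (CUST_ID, CUST_NAME, CUST_CITY)
select CUST_ID, CUST_NAME, CUST_CITY
from   ODI_WORK.E$_CUSTOMER;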
