EMT – EIM Migration Tool ©

Data Migration Solution Overview - Version 1.2

External Adapter Integration : Cassandra Adapter

Product names, designations, logos, and symbols may be trademarks or registered trademarks of their respective owners.
EIM Migration Tool © (EMT)
“EMT is a Software Solution designed to reduce costs and complexity of large scale data migration projects”

• EMT is the result of many years of experience in complex, performance-demanding data migration projects. It provides a
Migration Solution that is decoupled from, and independent of, the migration strategy.

• EMT helps organizations deliver a rapid and successful project implementation by combining a
migration software layer with a proven Process Development Methodology:

• Data Extraction, Data Validation and Data Transformation processes can be developed using EMT procedures and
EMT PL-SQL templates.
• The EMT Process Development Methodology is designed to achieve the best performance during parallel execution.
• Upload processes are orchestrated by the EMT engine and do not require development activities.

• EMT uses migration interface tables called “EIM” to execute and distribute CRUD operations into multiple target systems.

• EIM processes are integrated with:

• Siebel Enterprise Server
• Oracle Database Server
• Apache Cassandra (*)

(*) Integrated via “EMT External Adapter Interface”: Additional Adapters with a common interface can be used to upload to different target systems.
EMT - Architecture
EMT can be integrated into many Migration Architectures since it is fully configurable.

The EMT engine is developed in PL-SQL and runs on Oracle © Database Server (Version 10.0.6.2 or above).

A single EMT instance can upload to a maximum of 5 Target Systems using the EIM upload component.

The use of a dedicated EMT Oracle Database Server (Staging Area DB) is highly recommended for an Operational Architecture:

• A staging database allows data extraction, data validation and data transformation processes to run
without transactional impact on the target databases.

Multiple EMT instances can be installed on the same Oracle Database Server (Non Operational Architecture).
EMT – Main Features
• Automatic migration process (Start, Stop, Restart Migration flow).

• High performance and scalability:

• Data Distribution
• Pipeline Process Execution
• Parallel Processing
• Deadlock Prevention
• Low Data Contention

Thanks to EMT parallel processing, migration performance is limited only by the migration process with the lowest performance rate.

• High quality process log.

• Processing of hierarchy structures (e.g. Large Account Corporation, Corporate Discount, ...).

• Error Handling and Tracking of migration execution results.

• Automatic Recovery and Rollback.

• Migration Execution Statistics, Performance and Execution Reports.

• Configuration Checks.

• Configuration Versioning and Deploy.


EMT Cassandra Adapter
Java Class implemented using DataStax © Driver 3.6.0

• It is executed by the EMT engine as an EIM process, enabling Parallel Execution mode and multiple EIM tasks.

• It uses EIM tables as the interface to Cassandra tables and executes the following operations:

• Insert
• Update
• Upsert (Insert or Update)
• Delete (Rollback)
• Reconcile

• It updates the execution result for each record.

EIM Task Execution

Process: UPL_EIM_CASS_1
Operation: Insert
Parallel Degree: 3
Target: Table_Cass1

• Task 1 – EIM BATCH_NUM 1..10 – EMT Cassandra Adapter OP: Insert
• Task 2 – EIM BATCH_NUM 11..20 – EMT Cassandra Adapter OP: Insert
• Task 3 – EIM BATCH_NUM 21..30 – EMT Cassandra Adapter OP: Insert
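The task layout in this example (30 batches, Parallel Degree 3, ten batches per task) can be reproduced with a small range split. The sketch below is only an illustration in plain Java: the class and method names are hypothetical, and the document does not say how the EMT engine actually assigns BATCH_NUM ranges to tasks, so the even contiguous split is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of EMT: splits a BATCH_NUM interval across EIM tasks.
final class BatchRangeSketch {

    /** Inclusive BATCH_NUM range handled by one EIM task. */
    static final class Range {
        final int first;
        final int last;

        Range(int first, int last) {
            this.first = first;
            this.last = last;
        }

        @Override
        public String toString() {
            return first + ".." + last;
        }
    }

    /** Splits firstBatch..lastBatch into at most parallelDegree contiguous ranges of near-equal size. */
    static List<Range> split(int firstBatch, int lastBatch, int parallelDegree) {
        if (parallelDegree < 1 || lastBatch < firstBatch) {
            throw new IllegalArgumentException("invalid batch interval or parallel degree");
        }
        int total = lastBatch - firstBatch + 1;
        int base = total / parallelDegree;   // minimum number of batches per task
        int extra = total % parallelDegree;  // the first 'extra' tasks take one more batch
        List<Range> ranges = new ArrayList<>();
        int start = firstBatch;
        for (int task = 0; task < parallelDegree && start <= lastBatch; task++) {
            int size = base + (task < extra ? 1 : 0);
            ranges.add(new Range(start, start + size - 1));
            start += size;
        }
        return ranges;
    }

    public static void main(String[] args) {
        List<Range> ranges = split(1, 30, 3);
        for (int i = 0; i < ranges.size(); i++) {
            System.out.println("Task " + (i + 1) + " - EIM Batch " + ranges.get(i) + " - OP: Insert");
        }
    }
}
```

With the values from the example, the three ranges are 1..10, 11..20 and 21..30. Because the ranges do not overlap, each task works on its own set of EIM records.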
EMT Cassandra Adapter – Architecture & Flow
1. EMT Cassandra Adapter is executed by EIM tasks based on the EIM Process
configuration.

2. Retrieve the Default Target Configuration or specific Process Configuration
(Override Default Configuration) to build Cassandra Cluster interface:

• Connection Configuration Policy
• Authentication
• Target Keyspace
• Load Balance Policies
• Pooling Options
• Session Options
• Consistency Level
• Synchronous/Asynchronous execution
• Skip Null Values (Evict Tombstones)
• Time To Live
• Protocol and Socket
• SSL
• Compression
• Timeouts

3. Build Cassandra Cluster Interface and establish connection.

4. Read Source EIM records.

5. Prepare statement and execute Cassandra operation in Synchronous or
Asynchronous mode (Insert, Update, Upsert, Delete, Reconcile).

6. Update Execution Results on EIM Table.
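Steps 2 and 5 of the flow above can be sketched without the driver. The first method layers a process-specific configuration over the default target configuration. The second builds the text of a prepared INSERT and, when the "Skip Null Values" option is on, leaves out null columns, since writing an explicit null to a Cassandra column creates a tombstone. All names here (class, methods, configuration keys) are hypothetical, and the real adapter's configuration format is not described in this document.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of steps 2 and 5; not the EMT adapter's actual code.
final class AdapterFlowSketch {

    /** Step 2: process-level settings override the default target configuration. */
    static Map<String, String> resolveConfig(Map<String, String> defaults,
                                             Map<String, String> processOverrides) {
        Map<String, String> effective = new LinkedHashMap<>(defaults);
        effective.putAll(processOverrides);
        return effective;
    }

    /**
     * Step 5: builds INSERT text with one bind marker per column.
     * With skipNulls, columns whose value is null are omitted from the statement.
     */
    static String buildInsert(String keyspace, String table,
                              Map<String, Object> row, boolean skipNulls) {
        List<String> columns = new ArrayList<>();
        for (Map.Entry<String, Object> e : row.entrySet()) {
            if (skipNulls && e.getValue() == null) {
                continue; // omit the column instead of writing a null
            }
            columns.add(e.getKey());
        }
        if (columns.isEmpty()) {
            throw new IllegalArgumentException("no columns to insert");
        }
        StringBuilder markers = new StringBuilder();
        for (int i = 0; i < columns.size(); i++) {
            markers.append(i == 0 ? "?" : ", ?");
        }
        return "INSERT INTO " + keyspace + "." + table
                + " (" + String.join(", ", columns) + ") VALUES (" + markers + ")";
    }

    public static void main(String[] args) {
        Map<String, String> defaults = new LinkedHashMap<>();
        defaults.put("keyspace", "ks_default");            // assumed key names
        defaults.put("consistency_level", "LOCAL_QUORUM");
        defaults.put("skip_null_values", "true");

        Map<String, String> overrides = new LinkedHashMap<>();
        overrides.put("consistency_level", "ONE");

        Map<String, String> cfg = resolveConfig(defaults, overrides);

        Map<String, Object> row = new LinkedHashMap<>();
        row.put("id", 1);
        row.put("name", "Acme");
        row.put("note", null);

        boolean skipNulls = Boolean.parseBoolean(cfg.get("skip_null_values"));
        System.out.println(buildInsert(cfg.get("keyspace"), "table_cass1", row, skipNulls));
    }
}
```

In this run the `note` column is dropped from the statement, so only `id` and `name` are bound. In the real adapter the resulting text would be prepared and executed through the DataStax driver session, which is outside the scope of this sketch.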


EMT Cassandra Adapter – Data Mapping

Data Mapping is implemented by translating Cassandra Tables into ORACLE tables.

• Translation consists of defining an ORACLE Base Table with a 1:1 Cassandra Column Mapping and converting Cassandra Datatypes
into ORACLE Datatypes. The Primary Key must be the same for both tables.

• EMT Cassandra Adapter uses a pre-built implicit datatype Cast that resolves (whenever possible) the conversion between ORACLE
Datatypes and Cassandra Datatypes.

• In most cases, the implicit EMT datatype Cast does not require any additional mapping configuration.

• Using EMT procedures, one or more EIM tables are generated to define the mapping between EMT and Cassandra.

• EIM mapping Tables are created in a dedicated Schema. Base tables are used only for EIM table creation.
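The translation step described above can be pictured as a lookup from Cassandra datatype to ORACLE datatype, applied column by column to produce the base table. The following is a minimal sketch under stated assumptions: the type pairs are illustrative guesses and not the actual EMT cast table, which this document does not list, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration; the real EMT implicit cast rules are not documented here.
final class ImplicitCastSketch {

    // Assumed Cassandra -> ORACLE type pairs, for illustration only.
    private static final Map<String, String> CQL_TO_ORACLE = new LinkedHashMap<>();
    static {
        CQL_TO_ORACLE.put("text", "VARCHAR2(4000)");
        CQL_TO_ORACLE.put("int", "NUMBER(10)");
        CQL_TO_ORACLE.put("bigint", "NUMBER(19)");
        CQL_TO_ORACLE.put("timestamp", "TIMESTAMP");
        CQL_TO_ORACLE.put("uuid", "VARCHAR2(36)");
    }

    /** Returns the assumed ORACLE type, or empty when no implicit cast is known. */
    static Optional<String> oracleTypeFor(String cqlType) {
        return Optional.ofNullable(CQL_TO_ORACLE.get(cqlType.trim().toLowerCase(Locale.ROOT)));
    }

    /** Builds base-table DDL with a 1:1 column mapping and the same primary key columns. */
    static String baseTableDdl(String table, Map<String, String> cqlColumns, List<String> primaryKey) {
        List<String> defs = new ArrayList<>();
        for (Map.Entry<String, String> col : cqlColumns.entrySet()) {
            String oracleType = oracleTypeFor(col.getValue()).orElseThrow(
                    () -> new IllegalArgumentException(
                            "no implicit cast for " + col.getKey() + " (" + col.getValue() + ")"));
            defs.add(col.getKey() + " " + oracleType);
        }
        defs.add("PRIMARY KEY (" + String.join(", ", primaryKey) + ")");
        return "CREATE TABLE " + table + " (" + String.join(", ", defs) + ")";
    }

    public static void main(String[] args) {
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("customer_id", "bigint");
        columns.put("name", "text");
        System.out.println(baseTableDdl("TABLE_CASS1", columns,
                Collections.singletonList("customer_id")));
    }
}
```

A type with no entry in the lookup, such as a collection, makes the sketch fail fast. That corresponds to the cases where the implicit cast is not enough and additional mapping configuration is needed.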
EMT Cassandra Adapter – Data Mapping
EMT Cassandra Adapter uses additional Data Mapping Configuration to manage the following scenarios:

• Cassandra Long Names and Oracle Keywords

• A Cassandra target table name or column identifier can exceed 35 characters or conflict with an Oracle keyword (e.g. Column Name “ID”).
In those cases an additional configuration must be used:

• Table Long Name Mapping (Oracle Table Name → Cassandra Table Name)
• Column Long Name Mapping (Oracle Column Name → Cassandra Column Name)

• Cassandra Functions

• Target Columns updated with Cassandra functions are configured using the Cassandra column name and the Cassandra function.
Those columns are not mapped into EIM table columns.
• E.g. 1: CassandraTable.timestamp_column → toUnixTimestamp(now())
• E.g. 2: CassandraTable.counter_likes → counter_likes + 1

• EMT Cast Functions

• EMT Datatype Cast functions define the datatype mapping for Columns, overriding the implicit mapping:
• E.g. 1: EIMTable1.duration_column → EMT_DUR_SEC() - Cast Duration from nanoseconds (Default) to seconds.

• EMT_CQL()

• The EMT_CQL() function enables CQL parsing to process target columns having the following data types:
• Collection (List, Map, Set)
• User Defined Types

E.g. 1: EIMTable1.column_map → EMT_CQL()

EIM_TABLE1.COLUMN_MAP - Type VARCHAR2 or (N)CLOB:
{ 1000: 'Element Map N.1', 1001: 'Element Map N.2' }

CASSANDRA_TABLE1.COLUMN_MAP - Type map<int, text>:
{ 1000: 'Element Map N.1', 1001: 'Element Map N.2' }
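To make the EMT_CQL() example concrete, the sketch below parses the text held in the EIM column into a typed map, which is roughly what has to happen before the value can be bound to a map<int, text> column. It is an illustration only: the class and method names are hypothetical, it handles just the int-to-text case, and how EMT_CQL() works internally is not described in this document.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical parser for a CQL map<int, text> literal; not the EMT_CQL() implementation.
final class CqlMapLiteralSketch {

    /** Parses text such as { 1: 'a', 2: 'b' }; a quote inside a value is written as ''. */
    static Map<Integer, String> parseIntTextMap(String literal) {
        String s = literal.trim();
        if (s.length() < 2 || s.charAt(0) != '{' || s.charAt(s.length() - 1) != '}') {
            throw new IllegalArgumentException("not a map literal: " + literal);
        }
        Map<Integer, String> result = new LinkedHashMap<>();
        int end = s.length() - 1; // index of the closing brace
        int i = 1;
        while (true) {
            i = skipSpaces(s, i, end);
            if (i >= end) {
                break; // empty map, or nothing left after the last entry
            }
            int colon = s.indexOf(':', i);
            if (colon < 0 || colon >= end) {
                throw new IllegalArgumentException("missing ':' after key at index " + i);
            }
            int key = Integer.parseInt(s.substring(i, colon).trim());
            i = skipSpaces(s, colon + 1, end);
            if (i >= end || s.charAt(i) != '\'') {
                throw new IllegalArgumentException("expected quoted text at index " + i);
            }
            i++; // step over the opening quote
            StringBuilder value = new StringBuilder();
            while (true) {
                if (i >= end) {
                    throw new IllegalArgumentException("unterminated text value");
                }
                char c = s.charAt(i);
                if (c == '\'') {
                    if (i + 1 < end && s.charAt(i + 1) == '\'') {
                        value.append('\''); // doubled quote is an escaped quote
                        i += 2;
                        continue;
                    }
                    i++; // step over the closing quote
                    break;
                }
                value.append(c);
                i++;
            }
            result.put(key, value.toString());
            i = skipSpaces(s, i, end);
            if (i < end) {
                if (s.charAt(i) != ',') {
                    throw new IllegalArgumentException("expected ',' at index " + i);
                }
                i++;
            }
        }
        return result;
    }

    private static int skipSpaces(String s, int i, int end) {
        while (i < end && Character.isWhitespace(s.charAt(i))) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        String eimValue = "{ 1000: 'Element Map N.1', 1001: 'Element Map N.2' }";
        System.out.println(parseIntTextMap(eimValue));
    }
}
```

Scanning character by character, rather than splitting on commas and colons, keeps values that contain those characters intact.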
EMT – COPYRIGHT NOTICE
Information in this document is subject to change without notice and may contain inaccuracies or typographical errors.

No part of this document may be reproduced, stored in, or introduced into a retrieval system, or transmitted in any form or
by any means (electronic, mechanical, photocopying, recording, or otherwise), or for any purpose, without the express
written permission of Gecob It Consulting.

Except as expressly provided in any written license agreement from Gecob It Consulting, the furnishing of this document
does not give you any license to trademarks, copyrights, or other intellectual property.

For further information please contact us at the following e-mail address: emt@gecobitconsulting.com

EMT ©2010 Gecob It Consulting Ltd.
