
Integrating IBM® U2 and DB2® UDB

September 2005
Version 2.0

Note: All the information contained in this document is based on publicly available information and is subject
to change. IBM disclaims all warranties as to the accuracy, completeness, or adequacy of such information.
IBM shall have no liability for errors, omissions or inadequacies in the information contained herein or for
interpretations thereof.

© Copyright IBM Corp. 2005 Page 1 of 11

Table of Contents
INTRODUCTION
BUSINESS BENEFITS
U2 ARCHITECTURE
INTEGRATION CHALLENGES
    DATA MODEL DIFFERENCES
    SCHEMA MAPPING
    I/O STRATEGIES
    INTEGRATED ADE
    VIRTUAL ATTRIBUTES
EXTERNAL DATABASE ACCESS
MAPPING AND PERFORMANCE CONSIDERATIONS
CONCLUSION
APPENDIX A: IBM WEB SITES


Introduction

The U2 business unit joined IBM as part of the July 2001 acquisition of Informix®. In a letter sent to
Informix customers shortly after the acquisition, Janet Perna stated “I want to make it clear that while you
may wish to consider DB2 for major new applications, there will be no forced migration to DB2. In
addition, IBM software, including DB2, runs on major operating systems and hardware platforms from HP,
Sun, Compaq, IBM, Microsoft and many others. You can rest assured that we will continue to deliver and
support IBM Information Management products on all major hardware platforms.”
This stance by IBM to continue to enhance and support Informix, U2, and DB2 on all major hardware
platforms continues today. Since July 2001, the IBM U2 group has released eight (8) new product
versions, signed several new partners and is pursuing a progressive roadmap for product enhancements.
Please see the IBM U2 web site at:
http://www.ibm.com/software/data/u2
Download the updated portfolio white paper at:
ftp://ftp.software.ibm.com/software/data/informix/pubs/whitepapers/informix-roadmap080204.pdf

As part of a larger organization like IBM, U2 benefits from a wide array of resources and technologies. U2
leverages products within the IBM Software Group portfolio, such as WebSphere MQ, WebSphere Studio
Application Developer, and DB2.
For those customers who wish to explore new application development, application integration, or even
full migration to DB2 UDB, this white paper explores the enabling technologies and programs, future
directions and best practices in achieving that goal.

Business Benefits

By providing External Database Access for U2, customers have the flexibility to deploy some or all of their
data in DB2. This can yield business benefits from both a marketing and technical perspective.
Existing applications can run unchanged, storing the data in U2 files or, with minimal changes, can store
and access data in an external database. New modules can be developed which leverage specific
functionality of DB2 while allowing data to be easily exchanged, leveraging existing applications while
integrating easily with data from the new modules. Over time, existing modules can be re-engineered
incrementally to a new architecture directly accessing DB2.
IBM values its U2 partners and is committed to providing the technology they need to remain competitive.
IBM wants U2 partners to remain in the IBM family and to continue to benefit from their current
technology selection as well as be able to leverage other technology available from the IBM portfolio.


U2 Architecture

The U2 database architecture has some unique characteristics that create challenges when integrating
with other Relational Database Management Systems (RDBMS). U2 databases combine an integrated
development environment with data storage that uses a nested relational data model.
The U2 application environment supports single process server-centric applications, client/server
applications and N-tier applications including web deployments. U2 Business Partners currently deploy in
any combination of these modes. IBM U2 supports ANSI-SQL as well as industry standard interfaces
such as JDBC, OLEDB and ODBC, SSL (Secure Socket Layer), XML, and a SOAP Client API.
Figure 1 below depicts a high level picture of a U2 application environment showing the U2 application
along with the user interfaces and possible connection mediums.

[Figure 1: layered diagram of a U2 application environment. Labels recoverable from the figure:
  Presentation and business logic (client): VB, VB.NET, ASP.NET, C#, J#, Delphi, Java, J2EE, etc.
  Connection mediums: UO.NET, UOJ, UO, Telnet, wIntegrate / SBClient, SQL (ODBC, OLEDB, JDBC)
  Business logic (n-tier): Application Server
  Server components: SOAP Server, U2 RedBack, U2 Web, SB+ Server, Builder, U2 Basic, U2 Query, U2 SQL
  Engine and storage: U2 VM Engine, U2 Data Storage]

Figure 1
The application may have been developed as pure BASIC, a combination of BASIC and SB+ or pure
SB+. These applications may use every facility that the U2 environment provides, including the nested
relational data model, direct I/O, loosely coupled schema, native query facilities, centralized device
control, Recoverable File System (RFS), etc. Applications rely on the performance that the U2 databases
provide when serving up data due to both the nested data model and direct I/O access. Applications may
include the optional U2 transaction-processing semantics.

Integration Challenges

In this section we will examine the key challenges to integrating with or migrating to DB2 from U2. While
theoretically possible to simply rewrite an existing application to target a new database, the time to
market and time to return on investment (ROI) make this approach a very costly one. A strategy that
allows integration and incremental development of new modules or migration of existing modules over
time provides significant advantages in leveraging existing investment and avoiding an interruption in
business momentum.


Data Model Differences


One of the most obvious differences between U2 and DB2 is the data model, but this turns out to be
less of an issue than one might imagine. U2 uses a nested relational data model, which allows
multi-valued and multi-subvalued data to be stored in a single column. In essence, U2 supports multiple nested
tables within a given table, effectively providing materialized joins, which can yield large performance
benefits for queries and online transaction processing. While many RDBMSs provide some type of
nested table support, they may limit the number of nested tables, or the implementation may
not match U2 functionality exactly. Our experience has shown that if a U2 database
is mapped to simple, fully normalized 1NF tables, the number of tables can increase dramatically.
However, it is possible to denormalize the database or to use RDBMS indexing functionality to mitigate the
impact to a certain extent.
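
To make the nested model concrete, the following Python sketch parses one U2-style record into nested lists. It assumes the delimiter characters conventionally used in U2 dynamic arrays (attribute mark CHAR(254), value mark CHAR(253), subvalue mark CHAR(252)). The function name parse_record and the sample record are hypothetical and serve only as an illustration.

```python
# Conventional U2 dynamic-array delimiters (assumed here for illustration).
AM, VM, SVM = chr(254), chr(253), chr(252)


def parse_record(raw):
    """Split a dynamic-array string into attributes -> values -> subvalues."""
    return [
        [value.split(SVM) for value in attribute.split(VM)]
        for attribute in raw.split(AM)
    ]


# Hypothetical order record: a date, a client number, two products
# (multivalued), and two colors per product (multisubvalued).
raw = AM.join([
    "10/25/2003",
    "10045",
    VM.join(["10060", "10070"]),
    VM.join([SVM.join(["gray", "black"]), SVM.join(["silver", "black"])]),
])
record = parse_record(raw)
```

Every attribute comes back as a list of values, and every value as a list of subvalues, so a single-valued field such as the date is simply a one-by-one nesting. This is the structure that has to be spread across several 1NF tables.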

Schema Mapping
In order to map a U2 database into an RDBMS database, it is necessary to have a good understanding of
the U2 schema both at the physical and the logical level. U2 has a very loosely coupled schema, which
makes it very flexible, but also presents a challenge when mapping is required.
U2 tables consist of a data component and an associated dictionary. The dictionary provides formatting
information for the query tools but is optional and it is not used by U2 to enforce either data typing or data
integrity. Applications may store any type of data in any column; each row is stored as a dynamic string
array. There may be no dictionary or there may be multiple dictionaries that are defined for a given table.
Within a dictionary, there may be zero, one or multiple definitions for a given column. The information
stored to aid in formatting reports may not contain enough information to accurately determine the column
definition in a first normal form (1NF) RDBMS. Current schema mapping techniques use the formatting
properties defined in the dictionary to derive the corresponding relational column data types. In addition,
while there is some capacity for declaring referential integrity, this functionality is rarely used in a U2
database.
In a 1NF RDBMS, such as DB2, each table is associated with one and only one schema. Each column is
strongly typed and referential integrity can be declared and enforced.
Some U2 business partners augment the U2 data dictionary and exercise strict programming discipline to
maintain very good schema information that can be used to more easily map to a 1NF RDBMS schema.
Tools can be created to assist in the schema discovery and cleansing process to provide and maintain an
accurate mapping of U2 to RDBMS schema.
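
As a rough illustration of deriving column types from dictionary formatting properties, the sketch below maps a conversion code and a display format to a relational type. The mapping rules, the fallback width, and the function name derive_column_type are assumptions made for this example; they are not the rules applied by any IBM tool.

```python
import re


def derive_column_type(conversion, fmt):
    """Guess a relational type from a dictionary conversion code and format.

    conversion: a conversion code such as "D4/", "MD2" or "MTH" (may be empty).
    fmt: a display format such as "10R" or "25L" (width plus justification).
    """
    width_match = re.match(r"(\d+)", fmt)
    width = int(width_match.group(1)) if width_match else 30  # assumed fallback
    if conversion.startswith("MD"):
        # Masked decimal: the digit after MD is taken as the scale.
        scale_match = re.match(r"MD(\d)", conversion)
        scale = int(scale_match.group(1)) if scale_match else 0
        return f"DECIMAL({max(width, scale + 1)},{scale})"
    if conversion.startswith("MT"):
        return "TIME"
    if conversion.startswith("D"):
        return "DATE"
    # No usable conversion code: fall back to a string of the display width.
    return f"VARCHAR({width})"
```

Because the display width does not constrain what is actually stored, a real mapping tool would also need to scan the data before trusting a derived width.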

I/O Strategies
U2 allows both direct I/O and record set-based access to the data in its tables, in contrast with a 1NF
RDBMS, which allows only set-based access. It is possible to modularize database I/O routines and
translate direct I/O to set-based access depending on the target data store: U2 or RDBMS.
The U2 Query languages allow some very complex constructs that may not be possible to translate to a
single SQL Query in the target RDBMS environment. Either these would have to be translated into
multiple SQL queries or the work would have to be performed in the U2 run engine depending on the
results of optimization research.
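
The idea of a modularized I/O routine can be sketched as two interchangeable stores behind the same read and write calls. This is a minimal illustration only: the class names are hypothetical, an in-memory dictionary stands in for a U2 file, and SQLite (from the Python standard library) stands in for the target RDBMS.

```python
import sqlite3


class NativeStore:
    """Direct I/O stand-in: one keyed lookup returns the whole record."""

    def __init__(self):
        self._records = {}

    def write(self, key, fields):
        self._records[key] = dict(fields)

    def read(self, key):
        return self._records.get(key)


class SqlStore:
    """Set-based stand-in: the same keyed read becomes a SELECT by key."""

    def __init__(self, conn, table, columns):
        self._conn, self._table, self._columns = conn, table, list(columns)
        cols = ", ".join(f"{c} TEXT" for c in self._columns)
        conn.execute(f"CREATE TABLE {table} (id TEXT PRIMARY KEY, {cols})")

    def write(self, key, fields):
        marks = ", ".join("?" for _ in self._columns)
        # INSERT OR REPLACE is SQLite syntax; another RDBMS would differ.
        self._conn.execute(
            f"INSERT OR REPLACE INTO {self._table} VALUES (?, {marks})",
            [key] + [fields[c] for c in self._columns],
        )

    def read(self, key):
        row = self._conn.execute(
            f"SELECT {', '.join(self._columns)} FROM {self._table} WHERE id = ?",
            (key,),
        ).fetchone()
        return dict(zip(self._columns, row)) if row else None


def client_for_order(store, order_id):
    """Application logic that does not care which store it is given."""
    record = store.read(order_id)
    return record["CLIENT_NO"] if record else None


native = NativeStore()
sql = SqlStore(sqlite3.connect(":memory:"), "ORDERS", ["ORD_DATE", "CLIENT_NO"])
for store in (native, sql):
    store.write("816", {"ORD_DATE": "10/25/2003", "CLIENT_NO": "10045"})
```

The application routine client_for_order is unchanged whichever store is passed in, which is the property that makes an incremental move of data possible.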

Integrated ADE
U2 provides not just a data storage mechanism using a nested data model and loosely coupled schema,
but also a tightly integrated development environment that consists of a command environment and
virtual machine in which to execute compiled BASIC pcode (similar to the functionality of byte code in
Java). U2 BASIC is optimized to efficiently access and process data stored in U2 tables. U2 BASIC
natively understands the nested data model and is optimized for string processing to handle the dynamic
arrays that represent the data records. U2 BASIC can be used to develop server centric applications; it
can also be used as a stored procedure language for user defined functions and remote procedures. The
advantage to this tightly coupled development language is seen in the performance and scalability
achieved by U2 applications. However, its uniqueness presents a challenge to integration and migration
efforts.
There is no simple way to convert U2 BASIC into another language, so moving business logic out of U2
into a client or application server language would require rewriting this logic in a new language. Partners
can rewrite their business application in another language in a client/server model or, following the industry
trend, at the application server level. This could be Java running in a J2EE application server
such as IBM WebSphere or Apache. Or it could be any one of a number of Microsoft languages, such as
VB.NET, C#, or J#, that target the Common Language Runtime (CLR) and run in Microsoft’s
Internet Information Services (IIS).
As an alternative or interim step, it is possible to develop database drivers that, at the I/O level, can
redirect the I/O to an external or alternate data store such as DB2 using either XML or SQL interfaces to
store, access or update the data. This allows applications to run their existing application business logic
without a major rewrite, giving time for an incremental rewrite into another language if that is desired.

Virtual Attributes
U2 supports the concept of “virtual attributes,” which are roughly equivalent to “user defined functions” in a
1NF RDBMS. Virtual attributes can perform simple string or mathematical functions or specialized nested
data model functions, or can call U2 BASIC subroutines (also known as stored procedures). These U2 BASIC stored
procedures can take full advantage of all of the optimized functionality that U2 offers. A U2 stored
procedure can do anything a U2 BASIC program can do, including but not limited to updating other tables.
While most RDBMSs do support user defined functions and stored procedures, they would not match the
exact functionality provided by U2: they have no nested data model functions, and the stored
procedure language might have limitations on table updates that are not found in U2. In addition, each
RDBMS has its own stored procedure language and, as noted above, there is no simple way to
translate U2 BASIC, or any other language, into it.
Another unique feature of U2 is the ability to create indexes on virtual attributes. This allows for the
results of these “user defined functions” to be calculated and stored in an index for fast access.
The approach here can be to optimize where the function is performed. It may be possible to translate
very simple virtual attributes that use standard string or math functions and let the target database
environment execute that functionality. It may be that some functions or stored procedures cannot be
translated and so the data should be returned to the U2 environment before the function is executed in its
native environment. Tools can be developed to do very simple language-to-language translations if no
U2-specific functionality is required. Over time, a given application could be re-engineered to depend less
on user-defined functions to allow maximum deployment flexibility.
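
The choice of where a virtual attribute is evaluated can be sketched as follows. This is a minimal illustration under stated assumptions: the attribute names and definitions are hypothetical, SQLite stands in for the target database, and a Python function stands in for logic that would remain in the U2 environment.

```python
import sqlite3

# Hypothetical virtual attributes. "sql" holds a translated expression when a
# simple translation exists; "local" is a stand-in for logic that cannot be
# translated and must run after the data is returned.
VIRTUAL_ATTRIBUTES = {
    "EXT_PRICE": {"sql": "QTY * PRICE", "local": None},
    "COLOR_CODE": {
        "sql": None,
        "local": lambda row: row["COLOR"][:2].upper() + str(row["QTY"]),
    },
}


def fetch_with_virtuals(conn, table, base_columns, wanted):
    """Push translatable attributes into the SELECT; compute the rest locally."""
    pushed = [n for n in wanted if VIRTUAL_ATTRIBUTES[n]["sql"]]
    local = [n for n in wanted if not VIRTUAL_ATTRIBUTES[n]["sql"]]
    select_list = list(base_columns) + [
        f"{VIRTUAL_ATTRIBUTES[n]['sql']} AS {n}" for n in pushed
    ]
    cursor = conn.execute(f"SELECT {', '.join(select_list)} FROM {table}")
    names = [d[0] for d in cursor.description]
    rows = []
    for values in cursor:
        row = dict(zip(names, values))
        for n in local:
            row[n] = VIRTUAL_ATTRIBUTES[n]["local"](row)
        rows.append(row)
    return rows


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ITEMS (COLOR TEXT, QTY INTEGER, PRICE REAL)")
conn.execute("INSERT INTO ITEMS VALUES ('gray', 126, 39.97)")
rows = fetch_with_virtuals(
    conn, "ITEMS", ["COLOR", "QTY", "PRICE"], ["EXT_PRICE", "COLOR_CODE"]
)
```

Here EXT_PRICE is computed by the database because its definition is plain arithmetic, while COLOR_CODE is computed after the row comes back.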


External Database Access

In response to customer interest in integrating with, developing new modules on and/or incrementally
migrating to DB2, the IBM U2 group developed a feature known as External Database Access (EDA).
This allows existing applications to have a choice of storing some or all of their data in U2 tables or DB2
tables. The functionality will be rolled out in phases, beginning with UniData 7.1, and optimized as need
be with each subsequent release.
In addition, we have developed the foundation for an open driver framework to provide database drivers
to store, access and update external databases from an existing U2 application. UniData 7.1 provides an
SQL-based database driver for DB2. Subsequent projects include performance improvements and
additional options for handling runtime mapping anomalies. The next phase will be to develop an
XML-based driver for DB2. This will provide the basis for a common API that can be used by third parties to
develop additional drivers.

To address the integration challenges noted above, we are providing the following key functionality in
External Database Access:
Common Modularized I/O Interface – The database must use a consistent, modularized I/O layer to
access the data stores. This would be strengthened at the database engine level. By doing so, a driver
framework can be tied in to re-direct I/O depending on the target database.
Driver Framework and APIs – The driver framework provides a consistent mechanism to build database
drivers with published APIs so that others can develop their own drivers.
Schema Mapping and Data Migration Tools – Define the schema in DB2 and map the U2 tables into
DB2 tables. In addition to cleansing any undefined, partially defined or multiply defined columns, the EDA
Schema Manager will provide several options for virtual attribute mapping.
Once the schema is defined and mapped, the data needs to be moved from U2 to DB2. This can be done
selectively on a table-by-table basis, as customers’ needs dictate.
DB2 Database Driver – An SQL-based driver that allows data to be read, stored, and updated within the
IBM DB2 database.
Optimized Run-Time Mapping and Execution – Insofar as possible, a U2 application should run as
normal, with the U2 engine mapping the data into one or more RDBMS tables at run time and deciding
which functionality is best executed in the U2 environment and which in the RDBMS environment.
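
The relationship between a common I/O interface and pluggable drivers can be sketched as a router that sends each file's I/O to whichever driver that file is mapped to. All names below (Driver, InMemoryDriver, IoRouter) are hypothetical and do not describe the published EDA APIs; the in-memory driver merely stands in for a real U2 or DB2 backend.

```python
from abc import ABC, abstractmethod


class Driver(ABC):
    """Minimal contract a database driver would have to satisfy."""

    @abstractmethod
    def read(self, file_name, key):
        ...

    @abstractmethod
    def write(self, file_name, key, record):
        ...


class InMemoryDriver(Driver):
    """Stand-in backend; a real driver would talk to U2 files or to DB2."""

    def __init__(self, label):
        self.label = label
        self.data = {}

    def read(self, file_name, key):
        return self.data.get((file_name, key))

    def write(self, file_name, key, record):
        self.data[(file_name, key)] = record


class IoRouter:
    """Redirects each file's I/O to the driver that file is mapped to."""

    def __init__(self, default_driver):
        self._default = default_driver
        self._routes = {}

    def map_file(self, file_name, driver):
        self._routes[file_name] = driver

    def driver_for(self, file_name):
        return self._routes.get(file_name, self._default)

    def read(self, file_name, key):
        return self.driver_for(file_name).read(file_name, key)

    def write(self, file_name, key, record):
        self.driver_for(file_name).write(file_name, key, record)


u2 = InMemoryDriver("u2")
db2 = InMemoryDriver("db2")
router = IoRouter(default_driver=u2)
router.map_file("ORDERS", db2)  # only ORDERS is redirected
router.write("ORDERS", "816", {"CLIENT_NO": "10045"})
router.write("CLIENTS", "10045", {"NAME": "Example"})
```

Only the ORDERS file is redirected in this example, which mirrors the option of storing some data externally while the rest stays in U2.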

Mapping and Performance Considerations

Before using the tools provided to configure a database for EDA, there are some items to consider and
‘best practices’ to follow. These will help ensure the best possible performance in DB2.
Mapping Considerations
Because U2 data can contain complex structures, mapping them to DB2 should be an iterative process. It
is advised that the database be examined first for anomalies that cannot properly translate to 1NF tables.
Mapping of very complex data should be avoided in order to achieve acceptable performance within DB2.
Administrators should choose to convert only those files that must reside in DB2, and the mapping should
include only the fields required to be visible from the external database.


If there is more than one dictionary item that references a field to be mapped to DB2, choose only one or
create one that is most appropriate. This is because the same column should only occur once in DB2.
U2 supports a variety of conversion methods that can be used to format data for display. Ensure that all
fields to be mapped for DB2 contain any conversion codes that EDA needs to create enforceable data
types in DB2.
Column widths in U2 are for display purposes and do not enforce storage widths. Ensure the proper data
width is used in the mapped field or truncation of data may occur.
U2 dictionaries reference fields that may contain one or more values, which in turn may contain one or
more subvalues. When fields contain such multivalued and/or multisubvalued data, it is highly
recommended that along with associations, dictionaries should also contain the proper definition for the
type of field. In the case of UniData, this is S for single value, MV for multivalued, or MS for
multisubvalued. This aids in the creation of properly defined tables within the external database.
Mapping multivalued or multisubvalued fields will create several tables within the external database,
so care should be exercised to reduce the frequency of Cartesian joins. When such mapping is
necessary, it is recommended that the multivalued and multisubvalued fields first be properly associated
with appropriate dictionary phrases defined within U2. This permits the creation of sub tables that have
properly defined relationships with their primary table in DB2.
When converting U2 data files to DB2 tables, the EDA conversion process must create the new table
using the mapped fields as columns with pre-defined widths. For this reason, one must take note of the
total column size, not including CLOB, to ensure it does not exceed the maximum bytes allowed by the
default buffer pool page size of 4K within DB2. Following this practice will avoid a failure with the
CREATE TABLE statement. A future release of the EDA Schema Manager will be enhanced to allow the
option of using a different buffer pool and tablespace.
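
A simple pre-check of the total column size can be sketched as below. The row-length limits per page size are assumed values that should be confirmed against the documentation for the DB2 release in use, and the sketch ignores per-column overhead such as length and null indicators. The function names and sample columns are hypothetical.

```python
# Assumed maximum row lengths in bytes, keyed by page size in bytes.
# Confirm these against the documentation for the DB2 release in use.
MAX_ROW_BYTES = {4096: 4005, 8192: 8101, 16384: 16293, 32768: 32677}


def row_width(columns):
    """Sum the declared widths, leaving out CLOB columns."""
    return sum(width for _, col_type, width in columns if col_type != "CLOB")


def smallest_fitting_page(columns):
    """Return the smallest page size whose row limit fits, or None."""
    total = row_width(columns)
    for page_size in sorted(MAX_ROW_BYTES):
        if total <= MAX_ROW_BYTES[page_size]:
            return page_size
    return None


# Hypothetical mapped columns: (name, type, declared width in bytes).
wide_table = [
    ("ID", "VARCHAR", 20),
    ("NOTES", "VARCHAR", 4000),
    ("DOCUMENT", "CLOB", 1000000),
]
narrow_table = [("ID", "VARCHAR", 20), ("CLIENT_NO", "VARCHAR", 10)]
```

In this example the wide table adds up to 4020 bytes without its CLOB column, so under the assumed limits it would not fit a 4K page and the mapped widths would have to be reduced.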

Figure 2 below shows one possible mapping structure between a U2 file and a DB2 table:
ORDERS File (U2 File)

ID   ORD_DATE    ORD_TIME  CLIENT_NO  PROD_NO (MV)  COLOR (MS)  QTY (MS)  PRICE (MS)
816  10/25/2003  11:30AM   10045      10060         gray        126       $39.97
                                                    black       203       $39.97
                                      10070         silver      144       $34.97
                                                    black       144       $34.97
969  10/24/2003  10:00AM   9988       56080         black       50        $3.99
                                                    red         50        $3.99
                                                    blue        50        $3.99

DB2 Tables

Single-valued fields:
ID   ORD_DATE    ORD_TIME  CLIENT_NO
816  10/25/2003  11:30AM   10045
969  10/24/2003  10:00AM   9988

Line_items (multivalued field):
ID   MV_POS  PROD_NO
816  1       10060
816  2       10070
969  1       56080

Multisubvalued fields:
ID   MV_POS  MS_POS  COLOR   QTY  PRICE
816  1       1       gray    126  $39.97
816  1       2       black   203  $39.97
816  2       1       silver  144  $34.97
816  2       2       black   144  $34.97
969  1       1       black   50   $3.99
969  1       2       red     50   $3.99

MV = Multivalue      PK = Primary Key
MS = Multisubvalue   FK = Foreign Key

Figure 2
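
The mapping in Figure 2 can be sketched in a few lines of Python. The function name flatten_order and the in-memory row tuples are hypothetical; the sketch only shows how the MV_POS and MS_POS position columns are generated and is not a description of how the EDA conversion process is implemented.

```python
def flatten_order(order_id, header, line_items):
    """Return rows for the single-valued, multivalued and multisubvalued tables.

    header: the single-valued fields (ORD_DATE, ORD_TIME, CLIENT_NO).
    line_items: list of (PROD_NO, [(COLOR, QTY, PRICE), ...]) in stored order.
    """
    orders = [(order_id, *header)]
    products, details = [], []
    for mv_pos, (prod_no, subvalues) in enumerate(line_items, start=1):
        # One row per multivalue, keyed by the order ID and its position.
        products.append((order_id, mv_pos, prod_no))
        for ms_pos, (color, qty, price) in enumerate(subvalues, start=1):
            # One row per multisubvalue, keyed by ID and both positions.
            details.append((order_id, mv_pos, ms_pos, color, qty, price))
    return orders, products, details


orders, products, details = flatten_order(
    816,
    ("10/25/2003", "11:30AM", 10045),
    [
        (10060, [("gray", 126, "39.97"), ("black", 203, "39.97")]),
        (10070, [("silver", 144, "34.97"), ("black", 144, "34.97")]),
    ],
)
```

Reassembling the original record requires joining the three tables on ID and the position columns, which is why mapping only the fields that are really needed keeps the join count down.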
Performance Considerations
EDA leverages the DB2 Client to establish connections between U2 and DB2. Thus, the architecture follows
a client-server paradigm, where the U2 database is the client and DB2 is the server. Whenever
a client and a server are involved, some performance degradation is to be expected when compared
to directly accessing U2. An example of this is an application that uses UniObjects for Java to
connect to U2. The minimum degradation to expect in this case is a factor of five (5X) compared with a
server-centric application.
Mapping of complex U2 data structures should be kept to a minimum to keep performance at an
acceptable level. Mapping only those attributes that are required on the external database will
help maintain application performance by keeping the number of joins required to a minimum.
As stated previously, moving data to DB2 should be an iterative process. It is very important to test
application performance with each iteration. Out of the box, U2 performs very well. DB2 will likely need
to be tuned for performance as more and more of the data resides within it. It is highly recommended that
a DB2 performance expert be consulted to resolve such issues.

Conclusion

IBM is committed to continued enhancement and support of the IBM U2 databases so that U2 customers
can continue to be competitive in their business. We are confident that customers can continue to
leverage their investment in their U2-based solution for years to come and successfully compete in the
marketplace.
A U2 partner may wish to develop new solutions or new modules for their existing solutions using DB2 or
to incrementally migrate their existing solution to DB2. The addition of external database access
functionality to U2 provides a technical solution along with the expertise to address any challenges that
arise. IBM is also committed to making the decision to deliver your solution on an IBM database a
financially viable one.


© Copyright IBM Corporation 2005. All rights reserved.


U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP
Schedule Contract with IBM Corp.
The following are trademarks or registered trademarks of IBM Corporation in the United States, other
countries, or both: IBM, Informix, UniData, UniVerse, and WebSphere.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the
United States, other countries, or both.
Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other
countries, or both.
UNIX is a registered trademark of The Open Group in the United States and other countries.
Linux is a trademark of Linus Torvalds in the United States, other countries, or both.
Other company, product, or service names may be trademarks or service marks of others.


Appendix A: IBM Web Sites

IBM U2
http://www.ibm.com/software/data/u2
DB2 UDB for Linux®, UNIX®, and Windows®
http://www.ibm.com/software/data/db2/udb/
Downloads for DB2
http://www14.software.ibm.com/webapp/download/category.jsp?s=c&cat=data
IBM Learning Services Home page
http://www.ibm.com/services/learning/
IBM Technical Conferences
http://www.ibm.com/services/learning/conf/
IBM Certifications
http://www.ibm.com/education/certify/
IBM Redbooks
http://www.redbooks.ibm.com
