
What Does Ab Initio Mean?

Ab Initio is a Latin phrase that translates to "from first
principles" or "from the beginning."
Ab Initio works from first principles to find the best
solutions to enterprise computing problems. This is the
only way to design software that simultaneously meets the
challenges of robustness, scalability, and performance.
Ab Initio is flexible and easy to use, so it can add value to
your organization almost immediately, that is, from the
beginning.

10/15/08

1
Information Technology

Components of Ab Initio
The Co>Operating System
Unlimited scalability
Data parallelism results in speedups proportional to the hardware
resources provided: double the number of CPUs and execution time
is halved.
Flexibility
The Co>Operating System provides a powerful and efficient data
transformation engine and an open component model for
extending and customizing Ab Initio's functionality.
Portability
The Co>Operating System runs heterogeneously across a huge
variety of operating systems and hardware platforms, from OS/390
on mainframes to 10 different implementations of UNIX to
Windows NT and Windows 2000.

The Graphical Development Environment


The GDE lets you create applications by dragging and
dropping components onto a canvas, configuring them with
familiar, intuitive point-and-click operations, and connecting
them into executable flowcharts. The Co>Operating System
executes these flowcharts directly.
Ab Initio can also integrate with systems such as SAS, Trillium,
Oracle Financials, and many others.


The component library


Powerful
The sorting components are as powerful and as efficient as
dedicated sorting packages, and the data transformation
components deliver best-of-breed transformation capability.
Extensible
A new component can be created from virtually any program,
permitting reuse and integration of legacy code and
third-party products.
Metadata driven
The components adapt at run time to the record formats and
business rules controlling their behavior.

Topics for discussion

Using subgraphs and macros
How and when to use limit/ramp?
How to handle multiple filter conditions?
Difference between Rollup and Aggregator?
Questions


About Subgraphs and Macros


Ab Initio provides three facilities for developing your graphs
beyond simply inserting pre-built components:
Custom components
Subgraphs
Macros


Custom Components

If the solution to the task is a single executable, you need to use
a custom component. You can write a new program or script
with a specific purpose in mind.

You can build custom components from existing programs or
shell scripts and integrate them into the Ab Initio development
environment.

A custom component consists of two elements:
1. Your program or shell script
2. A program specification file


Custom Components

Program specification files provide the Co>Operating System
with the information it needs to run your program or shell
script. Write program specification files as files with the .mpc
extension.

A program specification file describes your program's
command-line arguments, ports, parameters, and other
attributes.


Subgraph

If you can construct the solution to the task from Ab Initio
pre-built components, and you can keep the number and
arrangement of components static from one run of the graph to
another, you can use a subgraph. Of the three facilities, a
subgraph is the easiest to use.

When you use a subgraph you can define the component
parameters at runtime. You can change the value of the
parameters from one run of the graph to another, but the
number and arrangement of the components themselves
remain constant.

You can save a subgraph in the Component Organizer of the
GDE so that it can be used in other graphs. Save it in ../ab
initio/ab initio gde/components/my components


Macro

You need a macro if the solution to the task requires that you
change, from one run of the graph to another, any of the
following:
The number of components
Which components you use
The order in which you connect the components

In a macro, the components, the flows that connect them, and
their parameters become runtime parameters of the graph. You
can change some or all of them from one run of the graph to
another.


Macro
For example, you can use a macro to perform tasks similar to the
following:

You need to divide data records by month into separate files.

At one run of the graph you need four output files: one for each
of the last three months, and one for all earlier records.

At a later run of the same graph, you need six output files: one
for each of the last five months, and one for all earlier records.

In other words, the number of output files you need from one
run of the graph to another varies.
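The varying-output requirement above can be pictured with a minimal Python sketch. It is not Ab Initio code and does not use the mp commands; the function name split_by_month, the (year, month, payload) record layout, and the sample data are all hypothetical. The only point is that the number of outputs is a run-time input rather than a fixed part of the layout.

```python
from collections import defaultdict

def split_by_month(records, recent_months):
    """Route records into one bucket per recent month plus one 'earlier' bucket.

    records: iterable of (year, month, payload) tuples (hypothetical layout).
    recent_months: list of (year, month) pairs chosen at run time.
    """
    buckets = defaultdict(list)
    recent = set(recent_months)
    for year, month, payload in records:
        key = (year, month) if (year, month) in recent else "earlier"
        buckets[key].append(payload)
    # Return every requested bucket, even if it received no records.
    return {key: buckets[key] for key in list(recent_months) + ["earlier"]}

records = [(2008, 10, "a"), (2008, 9, "b"), (2008, 8, "c"), (2008, 3, "d")]

# One run: three recent months plus "earlier" gives four outputs.
run1 = split_by_month(records, [(2008, 10), (2008, 9), (2008, 8)])

# A later run: five recent months plus "earlier" gives six outputs.
run2 = split_by_month(
    records, [(2008, 10), (2008, 9), (2008, 8), (2008, 7), (2008, 6)]
)

print(len(run1), len(run2))
```

In a graph, a change like this alters the number of output components and flows, which is why the slides call for a macro rather than a subgraph.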

10/15/08

11
Information Technology

Macro
A macro consists of two parts:
1. A program specification file, also known as a .mpc file
2. A .ksh script that builds a graph fragment, using the mp
commands of the Shell Development Environment

In the program specification file for a macro, you must use
the mpname line. It must begin with @, which tells the
Co>Operating System that this is a macro, and then it must
specify the complete path to the .ksh script that the macro
uses.


Ramp and Limit


When the reject-threshold parameter of a component is set to
"Use limit/ramp", the ramp and limit parameters become available.
The component then uses these parameters together in a
formula to control the component's tolerance for reject events:

The ramp parameter contains a float that represents a rate of
reject events in the number of records processed.
The limit parameter contains an integer that represents a
number of reject events.
The component stops the execution of the graph if the number
of reject events exceeds the result of the following formula:
limit + (ramp * number_of_records_processed_so_far)
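The formula can be tried out with a small Python sketch. This is only an illustration of the arithmetic, not how the Co>Operating System implements the check; the function name exceeds_reject_threshold and the sample numbers are hypothetical.

```python
def exceeds_reject_threshold(rejects, records_processed, limit=0, ramp=0.0):
    """Return True when rejects exceed limit + ramp * records processed so far."""
    return rejects > limit + ramp * records_processed

# With ramp left at 0, only the absolute limit matters, whatever the volume.
print(exceeds_reject_threshold(10, 1_000_000, limit=10))   # 10 > 10 is False
print(exceeds_reject_threshold(11, 1_000_000, limit=10))   # 11 > 10 is True

# With a ramp, the tolerance grows with the number of records processed:
# here the threshold is 100 + 0.01 * 10,000 = 200 reject events.
print(exceeds_reject_threshold(150, 10_000, limit=100, ramp=0.01))
print(exceeds_reject_threshold(250, 10_000, limit=100, ramp=0.01))
```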


Ramp and Limit

The default values are 0.0 for ramp and 0 for limit.

To specify an absolute limit on the number of reject events, set
limit to that absolute number and set ramp to 0.

Specify a modest limit (such as 1000) to avoid unnecessarily
stopping the execution of the graph due to a run of bad records
near the start of the data set. For example, with a ramp of 0.01
and a limit of 0, any reject event in the first hundred records
causes the component to stop the execution of the graph.
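The effect of an early bad record can be seen in a short simulation. This is a sketch under stated assumptions: the threshold is checked after every record, the current record counts as processed, and the helper name first_stop and the sample data (one bad record at position 5 out of 10,000) are hypothetical.

```python
def first_stop(reject_flags, limit, ramp):
    """Return the 1-based record number at which the run would stop, or None."""
    rejects = 0
    for processed, is_reject in enumerate(reject_flags, start=1):
        if is_reject:
            rejects += 1
        # Same formula as on the slide: limit + ramp * records processed so far.
        if rejects > limit + ramp * processed:
            return processed
    return None

# Hypothetical input: 10,000 records, only record 5 is rejected.
flags = [i == 5 for i in range(1, 10_001)]

# Limit 0: at record 5 the threshold is only 0.05, so one reject stops the run.
print(first_stop(flags, limit=0, ramp=0.01))

# Limit 1000: the same early reject is absorbed and the run completes.
print(first_stop(flags, limit=1000, ramp=0.01))
```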


Aggregator Vs Rollup

In Aggregator, to calculate the sum, average, and other aggregate
functions, we have to write our own logic in the aggregator transform.

For every group-by calculation it creates an aggregate record and
generates summary aggregate records at each level.

In Rollup, the easiest way to write a rollup transformation is to use the
aggregate functions (count, sum, min, max, avg, product, first, last).
These functions can be specified only in Grid Mode. You cannot
use the aggregation functions in the Transform Editor's Text Mode.

Any rollup aggregate function can be made conditional by supplying
a second argument specifying the condition. The condition is
expressed as an assertion: if it evaluates true, the function is
applied to the current record; if it evaluates false, the function is
not applied (but the rollup is not affected in any other way).
The condition argument is optional.


Aggregator Vs Rollup
Example: sum(in.transaction_amt, in.customer_id==16)
sums the transaction_amt values of only those records whose
customer_id value is 16.
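The behavior of the optional condition argument can be mimicked in Python. This is an analogy only, not DML; the function name conditional_sum and the sample transactions are hypothetical.

```python
def conditional_sum(records, field, condition=None):
    """Sum `field` over the records for which `condition` holds (all if None)."""
    total = 0
    for rec in records:
        # A record failing the condition is skipped for this aggregate only.
        if condition is None or condition(rec):
            total += rec[field]
    return total

transactions = [
    {"customer_id": 16, "transaction_amt": 100},
    {"customer_id": 7, "transaction_amt": 40},
    {"customer_id": 16, "transaction_amt": 25},
]

# Without a condition every record contributes.
print(conditional_sum(transactions, "transaction_amt"))

# With a condition only customer 16 contributes.
print(conditional_sum(transactions, "transaction_amt",
                      lambda r: r["customer_id"] == 16))
```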

Rollup does not produce summary records.

Both components have a sorted-input parameter, which has two
settings:
1. Input must be sorted or grouped (default)
Accepts only sorted input. With this setting, Rollup sends the
aggregate record to the out port after processing all the
records in a group.
2. In memory: Input need not be sorted
Accepts ungrouped input. With this setting, all the input
records are processed and then the aggregate records are sent to
the out port.
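The difference between the two settings resembles the difference between streaming and in-memory grouping. The Python sketch below illustrates that idea only; it does not show how the components are implemented, and the function names and sample records are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

def rollup_sorted(records, key, field):
    """Emit (key, sum) as soon as each group ends; input must already be grouped."""
    for k, group in groupby(records, key=itemgetter(key)):
        yield k, sum(rec[field] for rec in group)

def rollup_in_memory(records, key, field):
    """Read all records first, then emit the totals; input order does not matter."""
    totals = {}
    for rec in records:
        totals[rec[key]] = totals.get(rec[key], 0) + rec[field]
    return list(totals.items())

unsorted = [{"id": "B", "amt": 1}, {"id": "A", "amt": 2}, {"id": "B", "amt": 3}]
grouped = sorted(unsorted, key=itemgetter("id"))

print(list(rollup_sorted(grouped, "id", "amt")))    # one total per key
print(rollup_in_memory(unsorted, "id", "amt"))      # one total per key
print(list(rollup_sorted(unsorted, "id", "amt")))   # group "B" is split in two
```

The streaming version needs memory for only one group at a time but gives wrong totals on ungrouped input in this sketch; the in-memory version tolerates any order at the cost of holding every group until the input ends.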


How to handle Multiple Filter Conditions


It can be done in multiple ways:
1. Using Partition By Expression
2. Using Reformat

Using Partition By Expression

In option 1 you can connect multiple output files directly
to the out port of the Partition By Expression component.
In the transform function of the component you can write logic
similar to this:
if (ID=="AA") 0
else if (ID=="BB") 1
else if (ID=="CC") 2
else if (ID=="DD") 3
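The routing idea behind this expression can be sketched in Python: the expression returns an index, and the index selects the output. This is an analogy, not Ab Initio code. The function names and sample records are hypothetical, and the separate list for records matching none of the four IDs is an assumption, since the slide's expression has no final else branch.

```python
def partition_index(record):
    """Mirror the slide's if/else chain: map an ID to an output number."""
    mapping = {"AA": 0, "BB": 1, "CC": 2, "DD": 3}
    return mapping.get(record["ID"])  # None when no condition matches

def partition_by_expression(records, n_outputs):
    """Send each record to exactly one output, chosen by partition_index."""
    outputs = [[] for _ in range(n_outputs)]
    unmatched = []  # assumption: the slide does not say what happens here
    for rec in records:
        idx = partition_index(rec)
        if idx is None:
            unmatched.append(rec)
        else:
            outputs[idx].append(rec)
    return outputs, unmatched

records = [{"ID": "BB"}, {"ID": "AA"}, {"ID": "ZZ"}, {"ID": "DD"}, {"ID": "AA"}]
outputs, unmatched = partition_by_expression(records, n_outputs=4)

print([len(out) for out in outputs])
print(unmatched)
```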

Using Reformat
1. Open the properties page of the Reformat component.
2. Go to the Parameters tab.
3. Click on the count parameter and change it to a
number equal to your number of filter conditions.
4. The Ab Initio GDE creates that many out
ports on the Reformat component. Now you can
carefully connect each of the ports to an output
file.
5. It also creates that many transform
functions on the properties tab, so that you can
specify your filter conditions there.
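The result of these steps, one filter per out port, can be pictured with a Python sketch. It is an analogy rather than Ab Initio code. It assumes that every record is tested against each port's filter independently, so a record may reach several ports or none; the function name and sample data are hypothetical.

```python
def reformat_with_filters(records, filters):
    """Apply one filter per output port; len(filters) plays the role of count."""
    outputs = [[] for _ in filters]
    for rec in records:
        # Assumption: each port's filter is evaluated independently.
        for port, keep in enumerate(filters):
            if keep(rec):
                outputs[port].append(rec)
    return outputs

records = [
    {"ID": "AA", "amt": 5},
    {"ID": "BB", "amt": 50},
    {"ID": "AA", "amt": 500},
]

filters = [
    lambda r: r["ID"] == "AA",   # port 0
    lambda r: r["ID"] == "BB",   # port 1
    lambda r: r["amt"] >= 50,    # port 2 overlaps with the other two
]

out0, out1, out2 = reformat_with_filters(records, filters)
print(len(out0), len(out1), len(out2))
```

Under that assumption, the main contrast with the Partition By Expression approach is that the conditions need not be mutually exclusive.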


Q&A
