
ETL Tools

ETL tools extract, transform and load data into a data warehouse for decision making. Before ETL tools evolved, this process was done manually with SQL code written by programmers. The task was often tedious and cumbersome, since it involved many resources, complex coding and long work hours. On top of that, maintaining the code was a great challenge for the programmers. ETL tools address these difficulties: compared with the old method, they offer advantages in every stage of the ETL process, from extraction, data cleansing, data profiling and transformation to debugging and loading into the data warehouse.

A number of ETL tools are available on the market to process data according to business and technical requirements. Some of them are:

Pentaho Kettle
Informatica PowerCenter
Inaplex Inaport
Talend

Parameters

The following parameters are used to compare the ETL tools.

Total Cost of Ownership

Total Cost of Ownership means the overall cost of a product. This can include initial ordering, licensing, servicing, support, training, consulting and any other payments that need to be made before the product is in full use. Commercial open source products are typically free to use, but companies pay for the support, training and consulting.

Risk

Projects always carry risk, especially big projects. A project can fail by:

Going over budget
Going over schedule
Not meeting the requirements or expectations of the customers

Open source products carry much lower risk than commercial ones, since they do not restrict the use of their products through pricey licenses.
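The Total Cost of Ownership described above is, at its simplest, a sum over cost categories. The sketch below is only an illustration of that arithmetic: the `CostItems` class is a hypothetical helper, and every figure in it is invented for the example rather than taken from any vendor.

```python
from dataclasses import dataclass


@dataclass
class CostItems:
    """Hypothetical cost categories, named after the list in the text."""
    licensing: float = 0.0
    servicing: float = 0.0
    support: float = 0.0
    training: float = 0.0
    consulting: float = 0.0
    other: float = 0.0

    def total(self) -> float:
        # Total Cost of Ownership: everything paid before full use.
        return (self.licensing + self.servicing + self.support
                + self.training + self.consulting + self.other)


# Illustrative figures only; these are NOT real prices.
commercial = CostItems(licensing=50_000, support=10_000,
                       training=8_000, consulting=12_000)
# Assumption: an open source tool with no license fee but the same
# support, training and consulting spend.
open_source = CostItems(licensing=0, support=10_000,
                        training=8_000, consulting=12_000)

print("commercial TCO: ", commercial.total())
print("open source TCO:", open_source.total())
```

Under these assumed numbers the two totals differ only by the license fee, which is the point the text makes about commercial open source products.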

Comparison of ETL Tools


Tools compared: Pentaho Kettle, Informatica PowerCenter, Inaplex Inaport, Talend.

Ease of use
  Pentaho Kettle: Easiest GUI; requires little training.
  Informatica PowerCenter: Easy GUI; requires appropriate training.
  Inaplex Inaport: Lacks a well-developed GUI.
  Talend: Has a GUI, as an add-on.

Data quality
  Pentaho Kettle: Has DQ features in its GUI; also has some additional modules after subscribing.
  Informatica PowerCenter: Its product Informatica Data Quality has many DQ features.
  Inaplex Inaport: Does have DQ features.
  Talend: Has DQ features in its GUI.

Speed
  Pentaho Kettle: The Java connector slows it down. Requires manual tweaking. Can be clustered by placing it on many machines to reduce network traffic.
  Informatica PowerCenter: It is the fastest tool. It has an advanced PushDown option that localizes transformation tasks depending on how busy the machine is.
  Inaplex Inaport: Does not use any special techniques to improve speed.
  Talend: Requires manual tweaking and prior knowledge of the specific data source to reduce network traffic and processing.

Connectivity
  Pentaho Kettle: Can connect to all the current databases, flat files, XML files, Excel files and web services.
  Informatica PowerCenter: Mainframes, flat files, Excel files and web services. It can also export as a web service.
  Inaplex Inaport: Can connect to any (Windows) connection. Usually gets its data from Outlook, ACT and Excel files.
  Talend: Flat files, XML files, Excel files and web services, but is reliant on Java drivers to connect to those data sources.

Support
  Pentaho Kettle: Offers support from the UK and US, with a consultancy partner in Hong Kong.
  Informatica PowerCenter: Worldwide support.
  Inaplex Inaport: Mainly resides in the UK.
  Talend: Offers support, but mainly resides in the US.

Space required
  Pentaho Kettle: One 1 GHz CPU and 512 MB RAM.
  Informatica PowerCenter: Two CPUs with 1 GB RAM for Standard Edition Server.
  Inaplex Inaport: One CPU with 50 MB RAM.
  Talend: One 1 GHz CPU and 512 MB RAM.

Platform required
  Pentaho Kettle: Is a stand-alone Java engine that can run on any machine that can run Java.
  Informatica PowerCenter: Windows, Solaris, HP-UX, IBM-UX, Redhat, SUSE Linux.
  Inaplex Inaport: Can run on any Windows platform that has .NET 2.0 installed.
  Talend: Creates a Java file or Perl file that can be run on any machine with very little resource.

Risk
  Pentaho Kettle: Low risk.
  Informatica PowerCenter: High risk.
  Inaplex Inaport: Medium risk.
  Talend: Low risk.

Cost
  Pentaho Kettle: Medium.
  Informatica PowerCenter: Higher cost than other tools.
  Inaplex Inaport: Medium.
  Talend: Medium.

Type
  Pentaho Kettle: Commercial open-source suite.
  Informatica PowerCenter: Commercial data integration suite.
  Inaplex Inaport: BI.
  Talend: Open-source data integration tool.
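The Speed comparison above mentions a PushDown option that localizes transformation tasks, and the introduction describes ETL hand-coded in SQL. As a rough illustration of the general idea only (not of how Informatica or any other tool implements it), the sketch below runs the same small extract-transform-load job twice with Python's standard `sqlite3` module: once pulling every row to the client, and once pushing the transformation into the database. The table names, columns and sample rows are all hypothetical, and a single in-memory database stands in for both source and warehouse.

```python
import sqlite3


def make_tables(conn):
    # Hypothetical source table and warehouse target table.
    conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "north", 10.0), (2, "north", 5.5),
         (3, "south", 7.0), (4, "south", None)],
    )
    conn.execute(
        "CREATE TABLE sales_by_region (region TEXT PRIMARY KEY, total REAL)")


def etl_client_side(conn):
    """Extract all rows, transform in Python, load the result."""
    rows = conn.execute("SELECT region, amount FROM orders").fetchall()
    totals = {}
    for region, amount in rows:
        if amount is None:  # cleansing step: skip missing amounts
            continue
        totals[region] = totals.get(region, 0.0) + amount
    conn.execute("DELETE FROM sales_by_region")
    conn.executemany("INSERT INTO sales_by_region VALUES (?, ?)",
                     sorted(totals.items()))
    return len(rows)  # number of rows moved to the client


def etl_pushdown(conn):
    """Same job, but the transformation runs inside the database."""
    conn.execute("DELETE FROM sales_by_region")
    conn.execute(
        "INSERT INTO sales_by_region "
        "SELECT region, SUM(amount) FROM orders "
        "WHERE amount IS NOT NULL GROUP BY region")
    return 0  # no source rows are transferred to the client


def read_target(conn):
    return conn.execute(
        "SELECT region, total FROM sales_by_region ORDER BY region").fetchall()


conn = sqlite3.connect(":memory:")
make_tables(conn)

rows_moved = etl_client_side(conn)
client_result = read_target(conn)

etl_pushdown(conn)
pushdown_result = read_target(conn)

print("rows moved to client:", rows_moved)
print("same result:", client_result == pushdown_result)
```

Both variants load the same totals; the difference is where the work happens and how many rows cross the connection, which is why network traffic and processing location matter for ETL speed.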

Ease of Use

All of the ETL tools, apart from Inaport, have a GUI to simplify the development process. A good GUI also reduces the time needed to learn and use the tool.

Support

Nowadays all software products come with support, and all of the ETL tool providers offer it.

Speed

The speed of an ETL tool depends largely on the amount of data that needs to be transferred over the network and on the processing power involved in transforming the data.

Data Quality

Data quality is fast becoming the most important feature in any data integration tool.

Connectivity

In most cases, ETL tools transfer data from legacy systems, so their connectivity is very important to their usefulness.

Conclusion

Comparing these ETL tools, Informatica and Pentaho come out better than the other tools and offer a wide variety of products. Informatica has a large variety of products for handling business processes and an established commercial place in the market, but it is more expensive than Pentaho and carries more risk of project failure. Case studies by MySQL and many other companies show that Pentaho can handle small to large scale systems. Pentaho is quickly gaining momentum with businesses that would not have considered using open source products before.
