
DataStage FAQs - 2

What is data modelling?
Data modelling is the design of the data content and structure of a database. The data model documents the structure of, and interrelationships between, the data. It is presented as a combination of simple diagrams and written definitions, and is independent of any DBMS software or hardware considerations.

What is the difference between connecting to an RDBMS using ODBC and using native drivers?
A native driver is written specifically for one database (for example, the Oracle OCI driver), talks to it directly, and is generally faster. ODBC is a generic, database-independent API that sits between the application and a database-specific ODBC driver, so it works with almost any database but adds a layer of overhead.

How did you connect with DB2 in your last project?
Most of the time the data was sent to us in the form of flat files; the data was dumped and delivered to us. In cases where we needed to connect to DB2 for lookups, we used ODBC drivers to connect to DB2 (or DB2 UDB) depending on the situation and availability. DB2 UDB is certainly better in terms of performance, since native drivers are always better than ODBC drivers. 'iSeries Access ODBC Driver 9.00.02.02' is the ODBC driver used to connect to AS400/DB2.

Which command will no longer be supported after Oracle 8.1?
A. CONNECT INTERNAL
B. CONNECT / AS SYSDBA
C. CONNECT / AS SYSTEM
D. CONNECT / AS SYSOPER
Ans: A

You used the password file utility to create a password file as follows:
$ orapwd file=$ORACLE_HOME/dbs/orapwDB01 password=orapass entries=5
You created a user and granted only the SYSDBA privilege to that user as follows:
CREATE USER dba_user IDENTIFIED BY dba_pass;
GRANT sysdba TO dba_user;
The user attempts to connect to the database as follows:
connect dba_user/orapass as sysdba;
Why does the connection fail?
A. The DBA privilege had not been granted to dba_user.
B. The SYSOPER privilege had not been granted to dba_user.
C. The user did not provide the password dba_pass to connect as SYSDBA.
D. The information about dba_user has not been stored in the password file.
Ans: C

What is a Data Dictionary?
The data dictionary of an Oracle database is a set of tables and views that are used as a read-only reference about the database. It stores information about both the logical and physical structure of the database, the valid users of the database, the integrity constraints defined for tables in the database, and the space allocated for each schema object and how much of it is being used.

What is OOPS?
OOP (object-oriented programming) is a programming approach parallel to procedure-oriented programming, introduced in the late 1980s. It models a program on real-world objects; it helps in building robust, user-friendly and efficient software, and provides an efficient way to maintain real-world software. A program is organised around its data (objects), with a set of well-defined interfaces to that data. OOP carries the properties of inheritance, encapsulation and polymorphism; the abbreviation stands for object-oriented programming, which C++ implements fully.
Languages called "pure" OO languages treat everything in them consistently as an object, from primitives such as characters and punctuation all the way up to whole classes, prototypes, blocks and modules. They were designed specifically to facilitate, even enforce, OO methods. Example: Smalltalk.

RE: What is OOPS? OOP is object-oriented programming; in it we can write programs more easily than in C, using polymorphism, inheritance and so on. Another answer: OOP is programming organised around objects rather than actions, and around data rather than logic.

RE: What is OOPS? OOP is an object-oriented programming approach, an extension of procedure-oriented programming. OOP reduces the code of a program because of the extensive use of polymorphism, and has many properties such as data hiding, inheritance, data abstraction and data encapsulation.

Can we use a shared container as a lookup in DataStage server jobs?
Yes, we can use a shared container as a lookup in server jobs. Wherever the same lookup is needed in multiple places, we develop the lookup in a shared container and then reuse that shared container as the lookup.

What validations do you perform after creating jobs in Designer? What are the different types of errors you faced during loading, and how did you solve them?
Check for parameters, check that the input files exist, check that the input tables exist, and likewise check usernames, data source names and passwords.

What is the difference between DataStage and DataStage TX?

Does the DataStage Oracle plug-in perform better than the OCI plug-in coming with DataStage?

If data is partitioned in your job on key 1 and then you aggregate on key 2, what issues could arise?

If you are running 4-way parallel and you have 10 stages on the canvas, how many processes does DataStage create?

How can you do an incremental load in DataStage?
You can create a table where you store the last successful refresh time for each table/dimension. Then, in the source query, taking the delta between the last successful refresh time and SYSDATE gives you the incremental load.

RE: How can you do an incremental load in DataStage? Incremental load means the daily load. Whenever you select data from the source, select the records which were loaded or updated between the timestamp of the last successful load and the start date and time of today's load. For this you have to pass parameters for those two dates: store the last run date and time in a file, read it into a job parameter, and supply the second argument as the current date and time.
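A minimal SQL sketch of the delta query described above, assuming a hypothetical source table SRC_ORDERS with a LAST_UPDATED column and a control table ETL_CONTROL holding the last successful load time per target; the table names and the two date parameters are illustrative, not part of DataStage itself:

  -- Delta extract: rows changed since the last successful load.
  -- :LAST_RUN_TS and :THIS_RUN_TS are the two job parameters the
  -- answer above mentions (last successful load time, current time).
  SELECT o.*
  FROM   src_orders o
  WHERE  o.last_updated >  TO_DATE(:LAST_RUN_TS, 'YYYY-MM-DD HH24:MI:SS')
  AND    o.last_updated <= TO_DATE(:THIS_RUN_TS, 'YYYY-MM-DD HH24:MI:SS');

  -- After a successful load, record the new high-water mark:
  UPDATE etl_control
  SET    last_refresh_ts = TO_DATE(:THIS_RUN_TS, 'YYYY-MM-DD HH24:MI:SS')
  WHERE  target_name = 'ORDERS_DIM';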

How can you implement complex jobs in DataStage?
What do you mean by complex jobs? If you use more than 15 stages in a job, or 10 lookup tables in a job, you can call it a complex job.

How can you implement slowly changing dimensions in DataStage? And can you join a flat file and a database in DataStage? How?
Yes, we can join a flat file and a database in an indirect way. First create a job which populates the data from the database into a sequential file, named, say, Seq_First. Take the flat file you have and use a Merge stage to join these two files. You have various join types in the Merge stage, such as Pure Inner Join, Left Outer Join and Right Outer Join; use whichever suits your requirements.
SCDs are of three types. Type 1: overwrite the change. Type 2: version the modified change. Type 3: historical versioning of the modified change by adding a new column to hold the changed data. A sketch of the first two types follows below.
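A hedged SQL sketch of the Type 1 and Type 2 handling just described, using a hypothetical CUSTOMER_DIM table; the column names, the surrogate key sequence and the CURRENT_FLAG convention are illustrative assumptions, not a fixed DataStage recipe:

  -- Type 1: overwrite the changed attribute in place (no history kept).
  UPDATE customer_dim
  SET    city = :new_city
  WHERE  customer_id = :customer_id;

  -- Type 2: expire the current row and insert a new version.
  UPDATE customer_dim
  SET    current_flag = 'N', end_date = SYSDATE
  WHERE  customer_id = :customer_id
  AND    current_flag = 'Y';

  INSERT INTO customer_dim
         (customer_key, customer_id, city, start_date, end_date, current_flag)
  VALUES (customer_dim_seq.NEXTVAL, :customer_id, :new_city, SYSDATE, NULL, 'Y');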
What is troubleshooting in server jobs, and what are the different kinds of errors encountered while running them?

What is the meaning of the following: 1) "If an input file has an excessive number of rows and can be split up"? And what is meant by "Try to have the constraints in the 'Selection' criteria of the jobs itself; this will eliminate the unnecessary records even getting in before joins are made"?
It means that you can put the selection criteria in the WHERE clause: whatever data you need to filter, filter it out in the SQL itself rather than carrying it forward and then filtering it, as in the sketch below.
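A small illustration of pushing the constraint into the source SQL as the answer suggests; the table and column names are hypothetical:

  -- Instead of extracting everything and filtering in a Transformer:
  --   SELECT * FROM sales;           -- then constraint: region = 'EU'
  -- push the constraint into the source query itself:
  SELECT order_id, customer_id, amount
  FROM   sales
  WHERE  region     = 'EU'            -- filter before any join is made
  AND    order_date >= TO_DATE('2005-01-01', 'YYYY-MM-DD');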

How can you ETL an Excel file to a data mart?
Save the source (Excel) file in .csv format and apply the conditions which satisfy the data mart. Alternatively, create a DSN in the Control Panel using the Microsoft Excel drivers; then you can read the Excel file from an ODBC stage.

What are DataStage multi-byte and single-byte file conversions, and how do we use those conversions in DataStage?

What is the difference between server jobs and parallel jobs?
Server jobs are available if you have installed DataStage Server. They run on the DataStage server, connecting to other data sources as necessary. Parallel jobs are only available if you have installed Enterprise Edition. They run on DataStage servers that are SMP, MPP or cluster systems, and can also run on a separate z/OS (USS) machine if required. Parallel jobs are also available if you have DataStage 6.0 PX or DataStage 7.0 installed. Parallel jobs are especially useful if you have large amounts of data to process.
Another answer: server jobs are compiled and run on the DataStage server. Parallel jobs are available only with Enterprise Edition; they are compiled and run on a DataStage UNIX server, and can be run in parallel on SMP, MPP and cluster systems.

What is merge, and how do you use it?
One answer: merge is nothing but a filter condition used for filtering. A better answer: Merge is a stage that is available in both parallel and server jobs. The Merge stage is used to join two tables (server/parallel) or two tables/datasets (parallel). Merge requires that the master table/dataset and the update table/dataset be sorted. The merge is performed on a key field, and the key field is mandatory in both the master and the update dataset/table.
A further correction: actually, the Merge stage in parallel jobs is mainly used to merge two or more data sets. It takes one master data set and n update data sets; the output is one final data set plus as many reject data sets as there are update files. It is mainly used for joins. In server jobs, the Merge stage is used to merge two flat files.

How do we use NLS in DataStage? What are the advantages of NLS, and where can we use it? Please explain briefly.
As per the manuals and documents, there are different levels of interfaces, such as the Teradata, DB2, Oracle and SAS interface operators. Orchestrate National Language Support (NLS) makes it possible to process data in international languages using Unicode character sets; the International Components for Unicode (ICU) libraries support NLS functionality in Orchestrate. Operators with NLS functionality include the Teradata interface operators, the switch operator, the filter operator, the DB2 interface operators, the Oracle interface operators, the SAS interface operators, the transform operator, the modify operator, the import and export operators, and the generator operator.
By using NLS we can process data in a wide range of languages, use local formats for dates, times and money, and sort data according to the local rules. If NLS is installed, various extra features appear in the product. For server jobs, NLS is implemented in the DataStage Server engine; for parallel jobs, NLS is implemented using the ICU library.

What is APT_CONFIG in DataStage?
Please do read the manuals supplied with DataStage. Anyway, APT_CONFIG_FILE (not just APT_CONFIG) names the configuration file that defines the nodes (the scratch and temporary areas) for a specific project. DataStage understands the architecture of the system through this file: it contains the node names, disk storage information and so on. APT_CONFIG itself is just the environment variable used to identify the *.apt configuration file that holds the node information and the configuration of the SMP/MPP server.

What is OCI, and how is it used by the ETL tool?
One answer confuses OCI with Orabulk bulk loading, but OCI does not mean Orabulk data. It stands for "Oracle Call Interface", which DataStage uses to load the data; it is roughly the lowest level of Oracle being used for loading data.

How can I connect my DB2 database on AS400 to DataStage?
Do I need to use ODBC first to open the database connectivity, and then use an adapter just for connecting between the two? Thanks a lot for any replies.
You need to configure ODBC connectivity for the database (DB2 or AS400) in DataStage.

How can I extract data from DB2 (on IBM iSeries) to the data warehouse via DataStage as the ETL tool? I mean, do I first need to use ODBC to create connectivity and use an adapter for the extraction and transformation of data? Thanks so much if anybody could provide an answer.

From the DB2 stage we can extract the data in ETL. You would need to install ODBC drivers to connect to the DB2 instance (they do not come with the regular drivers we usually install; use the CD provided for the DB2 installation, which has the ODBC drivers to connect to DB2) and then try it out.

How can I connect directly to DB2 from DataStage?

What is merge, and how can it be done? Please explain with a simple example taking two tables.
Merge is used to join two tables. It takes the key columns and sorts them in ascending or descending order. Consider two tables, Emp and Dept. If we want to join these two tables, we have DeptNo as a common key, so we can give that column name as the key, sort DeptNo in ascending order, and join the two tables.

Please list the versions of the DataStage parallel and server editions, and the years in which they were released.
Please do fish for this kind of information on the net.

What happens if the output of a hashed file is connected to a transformer? What error does it throw?
Can you please explain your question in detail? What errors are you facing in connecting the hashed file to the transformer? Please be precise in your question.

What is version control?
Version control stores different versions of DataStage jobs, runs different versions of the same job, reverts to a previous version of a job, and lets you view version histories.

What are the repository tables in DataStage, and what are they?
A data warehouse is a repository (centralized as well as distributed) of data, able to answer any ad hoc, analytical, historical or complex queries. Metadata is data about data: examples of metadata include data element descriptions, data type descriptions, attribute/property descriptions, range/domain descriptions and process/method descriptions. The repository environment encompasses all corporate metadata resources: database catalogs, data dictionaries and navigation services. Metadata includes things like the name, length, valid values and description of a data element; it is stored in a data dictionary and repository, and it insulates the data warehouse from changes in the schema of operational systems. In DataStage I/O and Transfer, under the interface tab (input, output and transfer pages), you will have four tabs, and the last one is Build; under that you can find the table name. The DataStage client components are: Administrator, which administers DataStage projects and conducts housekeeping on the server; Designer, which creates DataStage jobs that are compiled into executable programs; Director, which is used to run and monitor the DataStage jobs; and Manager, which allows you to view and edit the contents of the repository.

What is 'insert for update' in DataStage?

How can we pass parameters to a job using a file?
You can do this by passing parameters from a UNIX file and then calling the execution of the DataStage job; the DataStage job has the parameters defined, and they are passed in by the UNIX script.

How do you pass parameters to a report? Do you have to register them with AOL?
You can define parameters in the Define Concurrent Program form. There is no need to register the parameters with AOL, but you may have to register the value sets for those parameters.
Another answer: first of all you have to register the parameters of your report in AOL. Steps: 1. Switch to the SYSADMIN responsibility. 2. Register the executable; the navigation is Concurrent > Program > Executable. 3. Register the program; the navigation is Concurrent > Program > Define. In this form there is a PARAMETER button; click it and register your report parameters.
A further answer: there is no need to register your parameters. There can be various different reports, and for them you might need to set up thousands of different parameters; how would you register them all?

Anyway, in Report Builder at design time, use user parameters to set up the parameters, which you can either pass from the concurrent program or pick up from profile options. Then, when you define the concurrent program to which the report is attached, define your parameters. Remember to use concurrent_request_id as one of the user parameters; you do not need to use this mandatory parameter in the Define Program screen.

How do you pass parameters from one form to another form?
To pass one or more parameters to a called form, the calling form must perform the following steps in a trigger or user-named routine: execute the CREATE_PARAMETER_LIST built-in function to programmatically create a parameter list; execute the ADD_PARAMETER built-in procedure to add one or more parameters to the list; and execute the CALL_FORM, NEW_FORM or RUN_PRODUCT built-in procedure, including the name or ID of the parameter list to be passed to the called form.

How will you pass parameters in RMI? Why do you serialize?
Parameters are passed in RMI using parameter marshalling. As RMI is used to invoke remote objects, objects and their references are often passed across the network; hence these objects need to be serialized.

Can you pass data parameters to forms?

What are the session parameters?
Session parameters are like mapping parameters: they represent values you might want to change between sessions, such as database connections or source files. The Server Manager also allows you to create user-defined session parameters. The following are user-defined session parameters: database connections; source file name (use this parameter when you want to change the name or location of the session source file between session runs); target file name (use this parameter when you want to change the name or location of the session target file between session runs); reject file name (use this parameter when you want to change the name or location of the session reject files between session runs).

What is a parameter file?
A parameter file defines the values for the parameters and variables used in a session. It is a file created with a text editor such as WordPad or Notepad. You can define the following values in a parameter file: mapping parameters, mapping variables and session parameters.

How do I simulate optional parameters in COM calls?
You must use the Missing class (in System.Reflection) and pass Missing.Value for any values that have optional parameters.

What are the built-ins used for sending parameters to forms?
You can pass parameter values to a form when an application executes CALL_FORM, NEW_FORM, OPEN_FORM or RUN_PRODUCT.

What is true regarding a shared, server-side parameter file for a Real Application Clusters database?
A. It can contain parameters with distinct values for each instance.
B. It can contain only parameters with identical values for each instance.
C. It must contain an IFILE parameter for each instance's individual parameter file.
D. It must be located in the default location for the primary instance's parameter file.

Where does a DataStage UNIX script execute: on the client machine or on the server?
DataStage jobs are executed on the server machines only; nothing is stored on the client machine.

How does the server identify and execute the server-side scripts within HTML code?

Include the RUNAT=SERVER attribute in the <SCRIPT> tag, or use the <% %> server script delimiters.

Suppose you updated the NIS ethers map on the server. If you want to manually push the changes to the NIS slave server, the ypxfr command should be used. On which machine should you run the ypxfr command?
A. On the NIS slave server.
B. On the NIS client system.
C. On the NIS master server, of course!
D. On any other machine.
Answer: A

What does Server.MachineName do? (Skill/Topic: Advanced)
A. Gets the server's machine name
B. Gets the referred web site name on the server
C. Gets the client machine name
D. None
Answer: A, the server's machine name.

You are the administrator for your company network. The network consists of 8 Windows 2000 server computers, 200 Windows Professional client computers and 10 UNIX servers. Windows 2000 is being used as your DNS server. Your DNS zone is configured as an Active Directory integrated zone and is configured to allow dynamic updates. Users report that they can successfully access the Windows 2000 computers by host name, but they cannot access the UNIX servers by host name. How can they correct the problem?
*(A) Manually add the UNIX servers to the Windows 2000 domain.
(B) On the DNS server, manually create the zone file records for the UNIX servers.
(C) Configure a UNIX server to be a DNS server in the secondary zone.

What are the default nodes for the DataStage parallel edition?
The default is always one node. More precisely, the number of nodes depends on the number of processors in your system: if your system has two processors, you get two nodes by default.

Orchestrate vs DataStage Parallel Extender?
Orchestrate itself is an ETL tool with extensive parallel processing capabilities, running on UNIX platforms. DataStage used Orchestrate with DataStage XE (the beta version of 6.0) to incorporate the parallel processing capabilities. Orchestrate was then purchased and integrated with DataStage XE, and a new version, DataStage 6.0, i.e. Parallel Extender, was released.

Does Enterprise Edition only add the parallel processing for better performance? Are any stages/transformations available in the Enterprise Edition only?
DataStage Standard Edition was previously called DataStage and DataStage Server Edition. DataStage Enterprise Edition was originally called Orchestrate, then renamed to Parallel Extender when purchased by Ascential. DataStage Enterprise has server jobs, sequence jobs and parallel jobs; the Enterprise Edition offers parallel processing features for scalable, high-volume solutions. Designed originally for UNIX, it now supports Windows, Linux and UNIX System Services on mainframes. DataStage Enterprise MVS has server jobs, sequence jobs, parallel jobs and MVS jobs. MVS jobs are designed using an alternative set of stages that are generated into COBOL/JCL code and are transferred to a mainframe to be compiled and run; jobs are developed on a UNIX or Windows server and transferred to the mainframe to be compiled and run. The first two versions share the same Designer interface but have a different set of design stages depending on the type of job you are working on. Parallel jobs have parallel stages but also accept some server stages via a container; server jobs only accept server stages; MVS jobs only accept MVS stages. There are some stages that are common to all types (such as aggregation), but they tend to have different fields and options within that stage. Row Merger and Row Splitter are only present in parallel jobs.

You set the PARALLEL_AUTOMATIC_TUNING initialization parameter to FALSE. What happens? (Choose two.)
A. Parallel execution buffers are allocated from the large pool.
B. Parallel execution buffers are allocated from the shared pool.
C. Only tables with a degree of parallelism specified will be scanned in parallel.
D. A load-balancing algorithm will be used to distribute the load evenly across nodes in a multi-instance PQ environment.

What are the types of parallel processing?
Parallel processing is broadly classified into two types: a) SMP, symmetric multiprocessing; b) MPP, massively parallel processing. Then how about pipeline and partition parallelism; are they also two types of parallel processing?

What is the Parallel Server option in Oracle?
A configuration for loosely coupled systems, where multiple instances share a single physical database, is called Parallel Server.

What are the datatypes available in PL/SQL?
There are scalar datatypes such as NUMBER, VARCHAR2, DATE, CHAR, LONG and BOOLEAN, and composite datatypes such as RECORD and TABLE.

How many types of database triggers can be specified on a table? What are they?

                      Insert   Update   Delete
  Before Row           o.k.     o.k.     o.k.
  After Row            o.k.     o.k.     o.k.
  Before Statement     o.k.     o.k.     o.k.
  After Statement      o.k.     o.k.     o.k.

If the FOR EACH ROW clause is specified, the trigger fires for each row affected by the statement. If a WHEN clause is specified, the trigger fires according to the returned Boolean value.
Another answer: there are five types of database triggers: 1. row-level triggers, 2. statement-level triggers, 3. INSTEAD OF triggers, 4. schema triggers, 5. database triggers. Of these five types, only two are used on tables: row-level and statement-level triggers are used for insert, update and/or delete operations on a table.
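A minimal sketch of one of the combinations tallied above, a BEFORE ROW trigger on insert and update; the EMP table's LAST_CHANGED audit column and the WHEN condition are illustrative assumptions:

  -- BEFORE ROW trigger: fires once per affected row, before the change.
  CREATE OR REPLACE TRIGGER emp_before_row
  BEFORE INSERT OR UPDATE ON emp
  FOR EACH ROW
  WHEN (NEW.sal > 0)               -- WHEN clause: fire only if TRUE
  BEGIN
    :NEW.last_changed := SYSDATE;  -- stamp the row being written
  END;
  /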

What is an exception? What are the types of exceptions?
An exception is the error-handling part of a PL/SQL block. The types are predefined and user-defined. Some of the predefined exceptions are: CURSOR_ALREADY_OPEN, DUP_VAL_ON_INDEX, NO_DATA_FOUND, TOO_MANY_ROWS, INVALID_CURSOR, INVALID_NUMBER, LOGON_DENIED, NOT_LOGGED_ON, PROGRAM_ERROR, STORAGE_ERROR, TIMEOUT_ON_RESOURCE, VALUE_ERROR, ZERO_DIVIDE and OTHERS.
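A short PL/SQL sketch using two of the predefined exceptions listed above; the EMP table and the lookup value are illustrative:

  DECLARE
    v_name emp.ename%TYPE;
  BEGIN
    SELECT ename INTO v_name FROM emp WHERE empno = 7900;
  EXCEPTION
    WHEN NO_DATA_FOUND THEN          -- SELECT INTO returned no row
      DBMS_OUTPUT.PUT_LINE('No such employee');
    WHEN TOO_MANY_ROWS THEN          -- SELECT INTO returned > 1 row
      DBMS_OUTPUT.PUT_LINE('Duplicate employee numbers');
    WHEN OTHERS THEN                 -- catch-all: log, then re-raise
      DBMS_OUTPUT.PUT_LINE(SQLERRM);
      RAISE;
  END;
  /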

What happens if RCP is disabled?
Runtime column propagation (RCP): if RCP is enabled for a job, and specifically for those stages whose output connects to a shared container input, then metadata is propagated at run time, so there is no need to map it at design time. If RCP is disabled for the job, OSH has to perform an import and an export every time the job runs, and the processing time of the job increases.

Is it possible to disable a parameter while running the report?
Yes.

On a dataless client with a swap device and low memory, which will improve performance?
A. Lower memory
B. Disable tmpfs
C. Disable swap
D. None of the above
Answer: D

I want to process 3 files sequentially, one by one. How can I do that, so that while processing, the files are fetched automatically?
If the metadata for all the files is the same, create a job with the file name as a parameter, then use the same job in a routine and call the job with the different file names, or create a sequence to reuse the job.

Scenario-based question: suppose four jobs are controlled by a sequencer (job 1, job 2, job 3, job 4). If job 1 has 10,000 rows and, after the run, only 5,000 rows have been loaded into the target table, the remainder are not loaded and the job aborts. How can you sort out the problem?
When a job sequence synchronises or controls four jobs and job 1 has a problem, you should go to the Director and check what type of problem it is showing: a data type problem, a warning message, a job failure or a job abort. A job failure usually means a data type problem or a missing-column action. Then go to the run window (click Tracing > Performance), or in your target stage go to General > Action, where there are two options: (i) On Fail: commit or continue, and (ii) On Skip: commit or continue. First check how much data has already been loaded, then select the On Skip option and continue for the loaded portion, and On Fail and continue for the data that remains unloaded, and run the job again; you should get a success message.

What is the flow of loading data into fact and dimension tables?
A fact table is a table with a collection of foreign keys corresponding to the primary keys in the dimension tables; it consists of fields with numeric values. A dimension table is a table with a unique primary key. Load: data should first be loaded into the dimension tables; based on the primary key values in the dimension tables, the data is then loaded into the fact table.
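A hedged SQL sketch of that load order, assuming hypothetical staging, dimension and fact tables; the names and join keys are illustrative. Loading the dimension first guarantees the fact's foreign keys resolve:

  -- 1) Load the dimension first (surrogate keys assigned here).
  INSERT INTO product_dim (product_key, product_code, product_name)
  SELECT product_dim_seq.NEXTVAL, s.product_code, s.product_name
  FROM   stg_products s;

  -- 2) Then load the fact, resolving each natural key to the
  --    dimension's primary (surrogate) key.
  INSERT INTO sales_fact (product_key, date_key, units_sold, amount)
  SELECT d.product_key, s.date_key, s.units_sold, s.amount
  FROM   stg_sales s
  JOIN   product_dim d ON d.product_code = s.product_code;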
What is a star schema?
A relational database schema organized around a central table (the fact table) joined to a few smaller tables (dimension tables) using foreign key references. The fact table contains raw numeric items that represent relevant business facts (price, discount values, number of units sold, dollar value, etc.). A single fact table with N dimension tables.

What are the differences between star and snowflake schemas?
A star schema is a single fact table with N dimension tables. In a snowflake schema, any dimensions with extended dimensions are known as snowflaked. A star schema uses denormalized dimension tables, whereas a snowflake schema uses normalized dimensions to avoid redundancy.

What is the batch program, and how can you generate it?
The batch program is the program generated at run time and maintained by DataStage itself, but you can easily change it on the basis of your requirements (extraction, transformation, loading). The batch programs generated depend on the nature of your job, whether a simple job or a sequence job; you can see this program in the job control option.

In how many places can you call routines?

Could you please help me with a set of questions on Parallel Extender?

What is the difference between DataStage and Informatica?
There are good articles on these differences which help to get an idea; basically it depends on what you are trying to accomplish. What are the requirements for your ETL tool? Do you have large sequential files (1 million rows, for example) that need to be compared every day against yesterday's? If so, ask how each vendor would do that, and think about what process they propose. Are they requiring you to load yesterday's file into a table and do lookups? If so, run! Are they doing a match/merge routine that knows how to process this in sequential files? Then maybe they are the right one. It all depends on what you need the ETL tool to do; if your data sets are small enough, either would probably be fine.

How do you implement routines in DataStage? Does anyone have material on this?
Write the routine in C or C++, create the object file, and place the object in the lib directory. Then open Designer, go to routines, and configure the path and routine names. There are three kinds of routines in DataStage: 1. server routines, used in server jobs and written in the BASIC language; 2. parallel routines, used in parallel jobs and written in C/C++; 3. mainframe routines, used in mainframe jobs.

What is the difference between the hashed file stage and the sequential file stage in DataStage Server? In server jobs, can we use a sequential file stage for a lookup instead of a hashed file stage? If yes, what is the advantage of a hashed file stage over a sequential file stage?
Search is faster in hashed files, because you can get the address of a record directly via the hash algorithm (records are stored that way); with a sequential file you must compare against all the records.

What are the rank caches?
During the session, the Informatica server compares an input row with the rows in the data cache. If the input row out-ranks a stored row, the Informatica server replaces the stored row with the input row. The Informatica server stores group information in an index cache and row data in a data cache.

How do you fix the error "OCI has fetched truncated data" in DataStage?
Perhaps we can use the Change Capture stage to get the truncated data; members, please confirm.

What is Code Page used for?
A code page is used to identify characters that might be in different languages. If you are importing Japanese data into a mapping, you must select the Japanese code page for the source data.

What is the importance of the surrogate key in data warehousing?
The concept of a surrogate key comes into play when there are slowly changing dimensions in a table. In such conditions there is a need for a key by which we can identify the changes made in the dimensions. These slowly changing dimensions can be of three types, namely SCD1, SCD2 and SCD3. Surrogate keys are system-generated keys, mostly just sequences of numbers, though they can be alphanumeric values as well.
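A minimal Oracle sketch of generating such a system-generated key with a sequence; the dimension table and columns are illustrative assumptions:

  -- Sequence that hands out surrogate key values.
  CREATE SEQUENCE customer_dim_seq START WITH 1 INCREMENT BY 1;

  -- Each new dimension row gets the next surrogate key, independent
  -- of the natural/business key coming from the source system.
  INSERT INTO customer_dim (customer_key, customer_id, city)
  VALUES (customer_dim_seq.NEXTVAL, :customer_id, :city);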
What is the difference between DataStage developers and DataStage designers, and what are the skills required for each?
A DataStage developer is the one who codes the jobs. A DataStage designer designs the jobs: he deals with the blueprints and designs the stages that are required in developing the code.

How do you merge two files in DataStage?
Either use the copy command as a before-job subroutine if the metadata of the two files is the same, or create a job to concatenate the two files into one if the metadata differs.

What is the "Common Language Runtime" (CLR)?

The CLR is the .NET equivalent of the Java Virtual Machine (JVM). It is the runtime that converts MSIL code into the host machine's language code, which is then executed appropriately. The CLR is the execution engine for .NET Framework applications. It provides a number of services, including: code management (loading and execution); application memory isolation; verification of type safety; conversion of IL to native code; access to metadata (enhanced type information); managing memory for managed objects; enforcement of code access security; exception handling, including cross-language exceptions; interoperation between managed code, COM objects and pre-existing DLLs (unmanaged code and data); automation of object layout; and support for developer services (profiling, debugging, and so on). The CLR brings another benefit to Windows developers: high interoperability between components written in any other language ported to work with the CLR and .NET. The CLR promises common safe types, managed execution, and inheritance across languages. A programmer can now write code that inherits implementations of classes or components written in another language; the programmer's application can also use the exception base class to catch and throw exceptions or errors between code modules written in different languages, for more robust error handling.

How do we automate dsjobs?
dsjob runs can be automated by using shell scripts on a UNIX system. We can call a DataStage batch job from the command prompt using 'dsjob', and we can also pass all the parameters from the command prompt; then call this shell script from any of the schedulers available in the market. The second option is to schedule these jobs using DataStage Director.

What is DS Director used for? Did you use it?
DataStage Director is used for monitoring job status, and for running and validating jobs. We can go to DataStage Director from DataStage Designer itself.

What are the types of views in DataStage Director?
There are three types of views in DataStage Director: a) Job view, with the dates jobs were compiled; b) Log view, with the status of the job's last run; c) Status view, with warning messages, event messages and program-generated messages.

What is DS Manager used for? Did you use it?
The Manager is a graphical tool that enables you to view and manage the contents of the DataStage repository. DataStage Manager is used for export and import; the main use of export and import is sharing jobs and projects from one project to another.

What will happen after the COMMIT statement in the following?

  DECLARE
    CURSOR c1 IS SELECT empno, ename FROM emp;
    eno   emp.empno%TYPE;
    ename emp.ename%TYPE;
  BEGIN
    OPEN c1;
    LOOP
      FETCH c1 INTO eno, ename;
      EXIT WHEN c1%NOTFOUND;
      COMMIT;
    END LOOP;
  END;

A cursor whose query is SELECT ... FOR UPDATE is closed after a COMMIT or ROLLBACK. A cursor whose query is a plain SELECT does not get closed, even after a COMMIT or ROLLBACK.

1. What are the types of hashed file?
Hashed files are classified broadly into two types: a) static, subdivided into 17 types based on the primary key pattern; b) dynamic, subdivided into two types: i) generic and ii) specific. The default hashed file is "Dynamic - Type Random 30 D".

2. How do you eliminate duplicate rows?
DataStage provides the Remove Duplicates stage in Enterprise Edition; using that stage we can eliminate the duplicates based on a key column. Alternatively, duplicates can be eliminated by loading the corresponding data into a hashed file, specifying the columns on which you want to eliminate duplicates as the keys of the hash. So removal of duplicates is done in two ways: 1. use the Remove Duplicates stage, or 2. use GROUP BY on all the columns used in the SELECT, and the duplicates will go away.
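A small SQL illustration of the second approach above, grouping on every selected column; the CUSTOMERS table is hypothetical, and SELECT DISTINCT is the equivalent shorthand:

  -- Duplicates collapse because every selected column is grouped.
  SELECT customer_id, name, city
  FROM   customers
  GROUP  BY customer_id, name, city;

  -- Equivalent shorthand:
  SELECT DISTINCT customer_id, name, city
  FROM   customers;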

3. What about system variables?
DataStage provides a set of variables containing useful system information that you can access from a transform or routine. System variables are read-only.
@DATE - the internal date when the program started; see the Date function.
@DAY - the day of the month extracted from the value in @DATE.
@FALSE - the compiler replaces the value with 0.
@FM - a field mark, Char(254).
@IM - an item mark, Char(255).
@INROWNUM - input row counter, for use in constraints and derivations in Transformer stages.
@OUTROWNUM - output row counter (per link), for use in derivations in Transformer stages.
@LOGNAME - the user login name.
@MONTH - the current month extracted from the value in @DATE.
@NULL - the null value.
@NULL.STR - the internal representation of the null value, Char(128).
@PATH - the pathname of the current DataStage project.
@SCHEMA - the schema name of the current DataStage project.
@SM - a subvalue mark (a delimiter used in UniVerse files), Char(252).
@SYSTEM.RETURN.CODE - status codes returned by system processes or commands.
@TIME - the internal time when the program started; see the Time function.
@TM - a text mark (a delimiter used in UniVerse files), Char(251).
@TRUE - the compiler replaces the value with 1.
@USERNO - the user number.
@VM - a value mark (a delimiter used in UniVerse files), Char(253).
@WHO - the name of the current DataStage project directory.
@YEAR - the current year extracted from @DATE.
REJECTED - can be used in the constraint expression of a Transformer stage output link; REJECTED is initially TRUE, but is set to FALSE whenever an output link is successfully written.

4. What is DS Designer used for? Did you use it?
You use the Designer to build jobs by creating a visual design that models the flow and transformation of data from the data source through to the target warehouse. The Designer's graphical interface lets you select stage icons, drop them onto the Designer work area, and add links.

5. What is DS Administrator used for? Did you use it?
The Administrator enables you to set up DataStage users, control the purging of the repository, and, if National Language Support (NLS) is enabled, install and manage maps and locales.

1. How do you eliminate duplicate rows?
Use the Remove Duplicates stage: it takes a single sorted data set as input, removes all duplicate records, and writes the results to an output data set.

2. How do you create batches in DataStage from the command prompt?

3. Dimensional modelling is again subdivided into two types:
a) Star schema: simple and much faster; denormalized form.
b) Snowflake schema: complex, with more granularity; more normalized form.

1. How will you call an external function or subroutine from DataStage?
There is a DataStage option to call external programs: ExecSH.

2. How do you pass a filename as the parameter for a job?
During job development we can create a parameter FILE_NAME, and the value can be passed in while running the job. 1. Go to DataStage Administrator > Projects > Properties > Environment > User Defined. Here you can see a grid where you can enter your parameter name and the corresponding path of the file. 2. Go to the Stage tab of the job, select the NLS tab, click on "Use Job Parameter", and select the parameter name which you gave above. The selected parameter name appears in the text box beside the "Use Job Parameter" button. Copy the parameter name from the text box and use it in your job, keeping the project default in the text box.
