Informatica Power Center
Interview Questions
1. List out DATABASES in present market?
Oracle
Teradata
SQL server
IBM DB2
Snowflake
Mango DB
Postgresql
MYSQL
2. List out Few ETL Tools in Present Market?
A. There are two types of ETL tools in Present Market.
Graphical based ETL tool
Informatica Power center
Datastage
Abinito
Bodi
Talend
Web based ETL tool (Cloud)
Informatica IICS
ADF (Azure Data Factory)
Glue
GCP(Google Cloud Platform)
3. What kind of employees will there in complete project and clearly explains their duties?
BA (Business Analyst):-He is going to discuss with the Client about requirement and translate to
technical team
Data Modeler :- ( Erwin) going to provide relationship between Database tables.
DB Admin: - Admin is going to create the tables as per Date Modeler Suggestion.
Architect: - Design flow of ETL Tools, in this design we can identify how many layers are
involved.
ETL Team Lead: - Assign work to ETL Developers.
ETL Developer: - (we are)
ETL Tester: - He will write and execute the test cases to identify the bugs.
4. What is ETL?
A. E--- Extraction, T – Transformation, L - Loading
E--- Extraction: Extraction is the process of getting or pulling the data from single or different Source systems.
Source system: Source system is nothing but the one of the business system among of all of his business systems.
Ex: client is HDFC Bank... He may contains business like Savings, FD, Loans (PL, HL, CREDIT CARD LOANS)
Types source systems:
Files
Tables (SQL)
Mainframe files (Project spesific)
Web based sourcesystem (.json), .XML
Hive files.
Sales force
T – Transformation: Transformation is the process of changing the data into one format to another
format. The conversion process is called Transformation.
Ex: Business-date: 15/03/2025 from (First file)
Business-date: [Link] (Second file)
Business-date: [Link] [Link] (Third file)
L- Loading: Inserting transformed data into define tables (Target table) and creating the target files.
5. What is Data?
A. Data is nothing but the information which has been organized and stored for future usage. Either in
file system or Database system.
6. Types of Data?
1. Structured data (Organized data) Ex: .csv (, separated values) files, Database tables.
2. Unstructured data (Unorganized data): The information may meaningful but system cannot read the
data it is called unstructured data.
Ex: Word documents (.doc, .docx): Reports, presentations, memos, and other textual content.
Plain text files (.txt): Log files, notes, another text-only content.
PDF files (.pdf): Documents, reports, and other content often used for sharing and archiving.
Emails (.eml, .msg): Messages, attachments, and other email-related content.
Spreadsheets (.xlsx): Data tables, reports and other content.
Multimedia:
• Images (.jpg, .png, .gif, .tiff): Digital photographs, graphics, and other visual content.
• Audio files (.mp3, .wav, .flac): Music, audio recordings, and other sound content.
• Video files (.mp4, .avi, .mov): Movies, recordings, and other video content.
• Surveillance footage:
• Satellite imagery:
3. Semi-structured data: Most of the data will be come in a structured format, some data will be
unstructured format it is called semi-structured data.
Ex: HTML, JSON, XML, emails, and NoSQL database records.
• HTML: A web technology that is commonly used to create web pages
• JSON: A JavaScript-based format that is often used to collect data from web browser, smart
phones.
• XML: An extensible Markup Language that is often used for semi-structured data
• Emails: A common form of semi-structured data that is used by many businesses and employees.
• NoSQL databases: Databases that are designed to handle semi-structured data, such as Mango
DB and Couch base
• PACSs: A database that stores information about images, where the images are unstructured
data
7. What is Delimiter?
A. A Delimiter is the one which separates one column (fields) to another column (fields) in file.
8. Can you please list of few delimiters you have seen in your source files?
A. ~,|, ^, #, ,
9. Difference between ETL and ELT?
A. Refer the link: - [Link]
10. In your opinion which one is best ETL or ELT? Why?
A. We cannot conclude which one is best because it is purely depend on the end users requirement.
11. What are source systems you are working in your project?
A. My Source system in present project is .csv file and oracle tables...
ALM (which Asset liabilities management system) we will receive data as .csv file and from CC (which
Credit card) system we will receive data from Oracle tables.
12. What is the database your organization is maintaining DWH?
A. We are using Teradata as Data ware housing.
13. What is the project Hierarchy? (OR) Project Architecture?
14. What are the layers involved in your project? Or tell me about your present project?
Landing zone
Staging layer
Business layer
Data mart layer
DWH layer
Report Layer
15. Type of tables we may involve in informatica?
Staging tables
Dimension Tables
Fact Tables
Data mart tables
Data ware house tables
16. What is staging table?
A. Staging table located in staging area. In order to maintain 1 day data in to Data ware house.
We are using staging layer, we can keep the data till it passes to downstream.
We load truncate and load fashion, we load data into staging table, Means yesterday data has to
delete before loading to today data...
17. How staging table will load?
A. Truncate and load fashion.
18. What is load type in staging table?
A. Truncate and load method
19. What is mean by Upstream and downstream?
A. Landing zone and staging layer collectively we called it as Upstream and from business layer to DWH
layer we called it as downstream.
20. Why we are required Staging layer? We can directly load the data from landing zone to business
layer..?
1. Minimalizing the number of times to accessing the client server to get source data even through
particular source file data we may using in 100 tables of downstream.
2. We cannot apply business logics on source if it is file format, to derive values for new columns.
3. The format of the source file may not compatible with Business layer table. In order to make
uniqueness of the data.
21. What is landing zone (Landing area)?
A. Landing Zone is a place where source system guys will place files/Data to process in to data
warehousing based on specified interval. We can also tell that common access location between source
team and ETL team.
Ex: /root/src_location/savings/
22. What is data warehouse? (OR) DSS (Decision support system)?
A. DWH is a place to store years of history Data for the purpose of analyzing the business to
Take the wise decisions about future forecasting.
23. What is data ware housing?
A. This is the process of preparing Data warehouses called Data warehousing.
It has two types of approaches - Top down and bottom up approach
For top down approach first data will load in Data mart after that load in DWH
For Bottom up approach opposite way.
24. What is Data mart?
A. A data mart is a subset of a data warehouse focused on a specific business area or department,
Designed to provide users with access to relevant data for targeted analysis and reporting.
Ex: My banking client has 6 Data marts such has
CCDM-- Credit card Data mart
SVNGDM-- This deals with savings account Datamart
HLDM -- Housing loan Data mart
25. What is metadata?
A. Metadata is nothing but info about data, means it tells what are the data type of each column in the
table and what each column size is how the table is having relation with other tables ...These all info can
be denoted by metadata.
26. What is mapping?
A. Mapping is nothing but ETL Jobs, Extract data from source file and applying transformation and load
the data into target tables is called Mapping.
27. What are the types of loading strategies we may have?
1. Initial Load
2. Incremental load
3. Delta load
4. Incremental Aggregation
5. Truncate and load jobs (Staging table)
6. SCD loads: (Slowly changing Dimensions) Load
a. SCD type1
b. SCD type 2
c. SCD type 3
28. What is initial load?
A. A process of loading all the data from source systems into a new or empty data warehouse for the
very first time. Generally it happen starting of production jobs run.
29. What is Incremental load?
A. Updating a data warehouse with only the new or changed data since the last load, rather than
reloading the entire dataset, which is faster and more efficient, especially for large datasets.
30. What is meaning by Delta loading?
A. It involves transferring only the changes (additions, modifications, or deletions) to the data since
the last load, rather than reloading the entire dataset.
31. What are the difference between Delta load and scdtype1 load?
A. Delta load: Identifies and loads only the changes (inserts, updates, or deletes) that have occurred in
the source data since the last load. And efficiently updates the target table with the latest data,
minimizing data transfer and processing time.
SCD type1: Overwrites existing data in the target table with the new values, ensuring that only the most
recent data is present. And Simplifies data management by maintaining a single, up-to-date version of
the data.
32. How we load data in to Increment load? (Advanced)
A. Whenever data is loading it has to be verifying that what is the last time data has loaded in to this
table. Based on that from that date (time) onwards it will proceed for loading, this implementation we
need to relay on variable port, which capture the last loaded timestamp with help of setmax function.
33. How many ways we can implement truncate and load option for a specific table?
A. 1st way: Need to select truncate table option.
2nd way: Need to write Delete query in pre-Sql option.
3rd way: Before starting of 150 stage tables in my project we have big script to delete the data from
all those tables which ever we have received today file. With help of control table and audit table we
can find the list of tables we need to delete in today run.
34. Components in informatica?
Server Components
Client Components
35. What is the Informatica Hierarchy? (OR) Informatica Architecture?
36. What are the server components of informatica?
1. Repository Service – Responsible for maintaining Informatica metadata & providing access of
same to other services.
2. Integration Service(IS) – Responsible for the movement of data from sources to targets
3. Reporting Service – Enables the generation of reports
4. Nodes – Computing platform where the above services are executed
5. Domain: - It is act as mediator between server components.
37. What are the client components of informatica?
1. Informatica Designer – Used for creation of mappings between source and target
2. Workflow Manager – Used to create workflows and other task & their execution
3. Workflow Monitor – Used to monitor the execution of workflows
4. Repository Manager – Used to manage/store objects in repository
5. Informatica Administration console
38. What are the activities we can do by using admin (Administration) console?
1. Creating new user id for newly joined developers.
2. Allocating roles is the developer or admin with role we have to provide.
3. Providing access on Specific folders means what are the folders that new guy has to work those
folders only by using admin consol we can grant the access
4. If the new guy doesn’t want to edit any ETL code, simple we can give read only access so that he
cannot able to edit the code.
39. What is Domain in informatica? What it will do?
A. It is used to maintain the metadata related to the data dictionary of the power center tool, like details
about the columns or attributes data types or size and the details of the tables and other relevant
information’s are stored in the domain folder.
40. What are audit columns? Why they required?
A. Audit columns are those columns which we can use for tracking the source system info.
Below are examples for Audit columns
1. Run_time
2. Run_by/user_name
3. SRC_file_name/ SRC_system_name
4. Batch_id
5. Insert_timestamp
6. Create_dt
7. Update_timestamp
41. Software (project) life cycle? OR How migration will implement here in u r project?
A. Most of the projects follow the below format
DEV --> SIT ---> UAT---> Prod
Rarely can we also see below format
DEV --> QA--> SIT ---> UAT--->PREPOD--> Prod
42. What are activities we do in development phase?
A. Understanding Mapping document, and develop the ETL jobs as per logics mentioned in mapping
document need to develop the mapping by using different transformations as per logics mentioned
mapping document and implementing Unit testing for finding bugs in ETL jobs which we created.
43. What is mapping document?
A. It is simple excel sheet which mentioned about Target table information and source information
along with the logic which we need to implementing.
44. What are the types of test cases we can prepare in UT (Unit testing)?
1. Count of records validation test case.
2. Data validation test cases.
3. Negative test case execution
4. Random records validation testing
5. Checksum validation
6. Business Rules validation testing: Hear all records satisfied business rules or not.
45. What are the activities we do in SIT?
A. Executing end to end without manual intervention. Using of control M or any other scheduling tools.
46. What are the activities we do in UAT?
A. Need to fix the CR (Change request) which raised by UAT team after finding the bugs in ETL code.
48. How to create connection file DATA bases?
A. With help of Data base user id password and server details we will create connection string. With
help of that connection string we can access the tables to get the data.
49. What is Source Qualifier transformation?
A. It is default transformation which can be created default whenever we
50. If my source is Flat file can we able to use user defined Query?
A. No, We cannot use
51. What are the types of transformation?
A. There are 2 types of transformations.
1. Active Transformation
2. Passive Transformation.
52. What is active transformation and what is passive Transformation?
A. Active Transformation:- The transformation which have the capability to change the number of
records from source to target, those transformations we call it as Active transformation. Or atleast it can
be change the order of records then it can be consider as active transformation.
Ex : Filter and router transformation
Passive Transformation:- The transformation which doesn't have capability to change the number of
records from source to target, those transformations we call it as Passive transformation.
Ex: Expression Transformation
53. What is Expression transformation?
A. The Expression transformation is a passive transformation, connected transformation used for row-
wise data manipulation, enabling calculations, data conversions, and string manipulations on individual
records before passing them to the target.
54. How can u define a transformation as Active?
A. The transformation when it can change the record count from source and target actively participate
on source data and either changing the record order at least.
55. How can u define a transformation as Passive transformation?
A. The transformation doesn’t have a capability to change the either record count or order of the record
that transformation considers as passive transformation.
Ex: Expression transformation, sequence generate transformation, Lookup transformation, stored
procedure transformation, normalized transformation, http transformation and java transformation.
56. What are the ports in Expression transformation?
A. There are 3 ports in Express transformation.
1. Input port
2. Output port
3. Variable port
57. What is variable port in Expression transformation?
A. It is intermediate processing port to implement calculations the result we will capture in to output
port.
58. What is Input port in Expression Transformation?
A. The columns which are coming from in source object called it as input port
59. What is output port in Expression Transformation?
A. The columns which are going out from expression to the further transformation or target called it as
output port
60. How many times we can use the value generated in Variable port?
A. N number output ports.
61. How can we see the SQL Query whichever ETL job has created at the time of execution?
A. Go to session log and search with key word 'select'.
62. If i have 2 different Queries in mapping and session.... while executing which query will take into
consideration?
A. Session level Query...Because IS (integration service) gives the high priority to the session level
configuration.
63. What are the functionality of pre and post SQL?
A. Pre-SQL it executes Query before ETL job will execute,
Post-SQL it execute Query after executing of JOB.
Example of Using both:
Presql Query to drop index on target table, and PreSQL is for recreate the index which has dropped
before inserting the records.
64. How you Derive a value into new column?
A. By user defined Query in Source Qualifier transformation or by using expression transformation.
Example for SQL OVER ride Query will be as below:
select empno,ename ,sal,comm, sal+comm as total_sal from emp in above Query TOTAL_Sal is the
column derived column.
Ex for expression:
Sal--- input column
Comm---input column
SAL+comm--- Output column
65. Explain about nvl(column_name,0) function?
A. Whenever we will get null it is going to replace with Zero.
66. How to handle Null value in informatica?
A. Using NVL function to convert to Specified value.
nvl(column_name,0) -- to change null to zero in Source Qualifier over ride Query.
IF (isnull(COMM),'0',COMM))-- in expression transformation.
67. How many ways we can generate a sequence numbers in Informatica?
A. We can generate sequence numbers in 3 ways
1. Using sequence generator transformation
2. Using Expression transformation we can use variable port.
3. Using Sequence generator transformation.
68. What is the values we can expect isnull(column_Name) function will do ?
A. 0 or 1 i.e. 0 means Yes it is null value, 1 means No it is not null values.
69. How many ways we can Generate Values for Extra columns in target?
A. 1. Source Qualifier Override Query
2. Expression transformation.
70. What is surrogate key..?
A. It is just sequence number which can act as temporary primary key whenever we have missed
primary key in our source system. surrogate key values will be like 1,2,3,4,5,6,7,8,9,10 that.
71. What is Composite Primary Key?
A. It is combination of multiple primary keys.
72. What is Primary key?
A. It is a combination of Unique Key constrains value and not null value.
73. What is Sequence generator transformation?
A. The Sequence Generator transformation is a connected transformation that generates unique
numeric values, primarily used generate sequence numbers based on increment values whatever we will
pass. We can generate Sequence numbers. It can also generate cyclical incrementing values such as
1,2,3,1,2,3,1,2,3 like that.
74. What is the cur value and next value in sequence generator transformation?
A. cur value result will 1 for each record. Next value will be sequence number it can generate.
75. What is Filter transformation?
76. Drawback of Filter transformation?
1. Using Filter Transformation, we can load only one Target Table at a time.
2. If we have five different conditions, then we need to create five different Filter Transformations.
3. We cannot capture the False or Unsatisfied records of the condition in a table.
77. What is Router Transformation?
78. What is the purpose of the "Default" group in Router transformation?
A. To capture the rejected records i.e. if a row doesn’t meet any condition in a Router transformation
then Default group can capture that record as result.
79. If an input record satisfied by multiple groups how many groups that particular record pass to
target?
A. Multiple groups
80. If I give only 1 table instead of 3 targets in router transformation how it will work?
A. Yes it will work, but it works as Filter transformation.
81. What are the differences between Router and Filter Transformations?
Filter Router
1. Accepts single condition 1. Router T/R Accepts multiple conditions
2. Filter transformation Cannot capture [Link] transformation Can capture rejected
rejected rows rows by using default group
3. If we use multiple filter transformations to 3. Multiple conditions can be handle in single
handle multiple conditions , performance will router performance will be increases
be degraded Example : source has 10 rows and Example : only 1*10 =10 times only
we use 5 filters then each row will be input to
each filter so it reads 5*10= 50times
82. Router transformation is an active transformation or passive transformation? Why?
A. Router transformation is an Active transformation, because it supports a condition in the group
which can be eliminate records from source to target.
83. How many output groups can have a Router transformation?
A. Multiple groups or N number of groups.
84. Can we have multiple conditions for a single output group in Router transformation?
A. Yes, by using AND, OR operators.
85. How does Router transformation improve performance compared to filter transformation?
A. By supplying the sorted data
86. How can you use Router transformation to implement a conditional split of data?
A. By Creating Number of groups.
87. Can a single record be sent to multiple output groups in Router transformation? How?
A. Yes, if it satisfies multiple Groups, that many outputs it will pass.
88. Which transformation is more performance-efficient: Router or multiple Filter transformations?
Why?
A. Using the single Router transformation is high performance.
89. What are the operators we used in Filter Transformation?
A. We can use AND, OR operator in Filter transformation.
90. What is Aggregator transformation?
A. Aggregator transformation is an Active transformation, it performs aggregate calculations like sum,
average, count etc, and an aggregator transformation performs calculations on group of rows. It works
same as aggregate functions in SQL SUM ( ), AVG (), Min ( ), MAX ( ).
91. What is Cache file?
A. Few Transformations in informatica will create cache files, which is temporary internal file for the
purpose of improving the performance and applying the required clause.
92. What are the types of caches can generate by Aggregator Transformation? What is aggregate
cache?
A. The integration service creates index cache and data cache in memory to process the aggregator
transformation.
1. Stores the data group in index cache.
2. Stores row level data in data cache.
If the integration service requires more space, it stores the overflow values in cache files.
Aggregate cache:
In the Informatica Aggregator transformation, the "aggregate cache" is a memory area where the
Integration Service temporarily stores data during aggregate calculations. It consists of two components:
an index cache and a data cache.
93. How can we improve performance of aggregate transformation?
Use sorted input: Sort the data before passing into aggregator. The integration service uses
memory to process the aggregator transformation and it does not use cache memory.
Filter the unwanted data before aggregating.
Limit the number of input/output or output ports to reduce the amount of data, the aggregator
transformation stores in the data cache. Those, columns which we don’t use either in calculation
or into target column.
94. What are the ways we can arrange sorter data to supply to Aggregator Transformation?
By using Order by clause in Source Qualifier Transformation
By using Sorter transformation. (When we received source as flat file).
95. What are the different types of aggregate functions?
The different types of aggregate functions are listed below:
● AVG
● COUNT
● FIRST
● LAST
● MAX
● MEDIAN
● MIN
● PERCENTILE
● STDDEV
● SUM
● VARIANCE
96. What is Nested Aggregate function?
A. min (sum (sal))
In SQL: select min (sum_sal) from (
Select deptno, sum (sal) as sum_sal from emp group by deptno);
97. Why cannot you use both single level and nested aggregate functions in a single aggregate
transformation?
A. The Nested aggregate function returns only one output row, whereas the single level aggregate
function returns more than one row. Since the number of rows returned is not same, you cannot use
both single level and nested aggregate functions in the same transformation. If you include both the
single level and nested functions in the same aggregator, the designer marks the mapping or mapplet
as invalid. So, you need to create separate aggregator transformations.
98. Multiple calculation like sum (), Min (), Max () will support by single aggregator transformation?
A. YES
99. Up to how many levels, you can nest the aggregate functions?
A. We can nest up to two levels only.
Example: MAX (SUM (SAL))
100. How can we get corresponding columns related to result of nested aggregation value?
A. With the help of Joiner and Aggregator transformation
101. How the NULL values are handled in Aggregator? OR
How the NULL values are handled in Expression Transformation?
A. You can configure the integration service to treat null values in aggregator functions as NULL or zero.
By default the integration service treats null values as NULL in aggregate functions.
IIF (Isnull(column_name),0,column_name)
IF Condition
Satisfy ---0 (1st block)
Not satisfy – (2nd block) column_name
102. If i don’t apply any calculation either grouping key column in aggregator transformation what will
be the result?
A. IS (integration service) will consider all records as one group and will apply default calculation which
is nothing but LAST () m function.
103. What is a sorter transformation?
A. Sorter transformation is used to sort the data. You can sort the data either in ascending or
descending order according to a specified sort key.
104. Why sorter is an active transformation?
A. Whenever we use distinct option available in sorter transformation to remove duplicate records
then that particular sorter transformation consider as active transformation.
105. How to improve the performance of a session using sorter transformation?
A. Sort the data using sorter transformation before passing into aggregator or joiner transformation.
As the data is sorted, the integration service uses the memory to do aggregate and join operations and
does not use cache files to process the data.
106. How can we remove duplicates in Sorter transformation?
A.
107. Can we sort data based on 2 key columns?
A. Yes
108. What is a Joiner Transformation and why it is an Active one?
A Joiner is an Active transformation and connected transformation used to join two source data
streams coming from same or heterogeneous databases or files.
The Joiner transformation joins sources with at least one matching column. The Joiner
transformation uses a condition that matches one or more pairs of columns between the two
sources.
In the Joiner transformation, we must configure the transformation properties namely Join
Condition, Join Type and optionally Sorted Input option to improve Integration Service
performance.
The join condition contains ports from both input sources that must match for the Integration
Service to join two rows.
Depending on the join condition and the type of join selected, the Integration Service either
adds the row to the result set or discards the row.
Because of this reason, the number of rows in Joiner output may not be equal to the number of
rows in Joiner Input. This is why Joiner is considered an Active transformation.
109. State the limitations where we cannot use Joiner in the mapping pipeline?
Joiner transformation cannot be used when either of the input pipelines contains an Update
Strategy transformation.
Joiner transformation cannot be used if we connect a Sequence Generator transformation
directly before the Joiner transformation.
110. Out of the two input pipelines of a joiner, which one will we set as the master pipeline?
A. During a session run, the Integration Service compares each row of the master source against the
detail source. The master and detail sources need to be configured for optimal performance.
When the Integration Service processes an unsorted Joiner transformation, it blocks the detail source
while it caches rows from the master source. Once the Integration Service finishes reading and caching
all master rows, it unblocks the detail source and reads the detail rows. This is why if we have the source
containing fewer input rows in master, the cache size will be smaller, thereby improving the
performance.
For a Sorted Joiner transformation, use the source with fewer duplicate key values as the master source
for optimal performance and disk storage. When the Integration Service processes a sorted Joiner
transfor-mation, it caches rows for one hundred keys at a time. If the master source contains many rows
with the same key value, the Integration Service must cache more rows, and performance can be
slowed.
Blocking logic is possible if master and detail input to the Joiner transformation originate from dif-ferent
sources. Otherwise, it does not use blocking logic. Instead, it stores more rows in the cache.
111. What are the different types of Joins available in Joiner Transformation?
A. In SQL, a join is a relational operator that combines data from multiple tables into a single result
set. The Joiner transformation is similar to an SQL join except that data can originate from different
types of sources. The Joiner transformation supports the following types of joins:
1. Normal
2. Master Outer
3. Detail Outer
4. Full Outer
112. Define the various Join Types of Joiner Transformation.
Normal Join: - The Integration Service discards all rows of data from the master and detail
source that do not match, based on the join condition.
Master Outer Join: - Master outer join keeps all rows of data from the detail source and the
matching rows from the master source. It discards the unmatched rows from the master source.
Detail Outer Join: - Detail outer join keeps all rows of data from the master source and the
matching rows from the detail source. It discards the unmatched rows from the detail source.
Full Outer Join: - Full outer join keeps all rows of data from both the master and detail sources.
113. Describe the impact of number of join conditions and join order in a Joiner.
We can define one or more conditions based on equality between the specified master and
detail sources. Both ports in a condition must have the same data type.
If we need to use two ports in the join condition with non-matching data types we must convert
the data types so that they match. The Designer validates data types in a join condition.
Additional ports in the join condition, increases the time necessary to join two sources.
The order of the ports in the join condition can impact the performance of the Joiner
transformation. If we use multiple ports in the join condition, the Integration Service compares
the ports in the order we specified.
Only equality operator is available in joiner join condition.
114. How does Joiner transformation treat NULL value matching?
A. The Joiner transformation does not match null values.
Ex: if both EMP_ID1 and EMP_ID2 contain a row with a null value, the Integration Service does not
consider them a match and does not join the two rows.
To join rows with null values, replace null input with default values in the Ports tab of the joiner,
and then join on the default values.
If a result set includes fields that do not contain data in either of the sources, the Joiner
transformation populates the empty fields with null values.
If we know that a field will return a NULL and we do not want to insert NULLs in the target, set a
default value on the Ports tab for the corresponding port.
115. When we configure the join condition, what are the guidelines we need to follow to maintain
the sort order?
Suppose we configure Sorter transformations in the master and detail pipelines with the following
sorted ports in order: ITEM_NO, ITEM_NAME and PRICE.
A. If we have sorted both the master and detail pipelines in order of the ports say ITEM_NO,
ITEM_NAME and PRICE we must ensure that:
Use ITEM_NO in the First Join Condition.
If we add a Second Join Condition, we must use ITEM_NAME.
If we want to use PRICE as a Join Condition apart from ITEM_NO, we must also use ITEM_NAME
in the Second Join Condition.
If we skip ITEM_NAME and join on ITEM_NO and PRICE, we will lose the input sort order and the
Integration Service fails the session.
116. What are the transformations that cannot be placed between the sort origin and the Joiner
transformation so that we do not lose the input sort order?
A. The best option is to place the Joiner transformation directly after the sort origin to maintain sorted
data. However do not place any of the following transformations between the sort origin and the Joiner
transformation:
1. Custom
2. Unsorted Aggregator
3. Normalizer
4. Rank
5. Union transformation
6. XML Parser transformation
7. XML Generator transformation
8. Mapplet [if it contains any one of the above mentioned transformations]
117. What is the use of sorted input in joiner transformation?
A. It is recommended to Join sorted data when possible. We can improve session performance by
configuring the Joiner transformation to use sorted input. When we configure the Joiner transformation
to use sorted data, it improves performance by minimizing disk input and output. We see great
performance improvement when we work with large data sets.
For an unsorted Joiner transformation, designate as the master source the source with fewer rows. For
optimal performance and disk storage, designate the master source as the source with the fewer rows.
During a session, the Joiner transformation compares each row of the master source against the de-tail
source. The fewer unique rows in the master, the fewer iterations of the join comparison occur, which
speeds the join process.
118. Can we join two tables based on a join column having different data type?
For example table 1 EMPNO (string) and table 2 EMPNUM (number)
A. Yes possible in this case. If we are using Joiner, we should be able to do this explicit conversion in an
expression transformation before joining the tables.
119. Implementation Scenario1 - Joiner transformation is joining two tables s1 and s2. S1 has 10,000
rows and S2 has 1000 rows. Which table you will set master for better performance of joiner
transformation? Why?
A. Set table S2 as Master table because informatica server has to keep master table in the cache so if it
is 1000 in cache will get performance instead of having 10000 rows in cache.
120. Difference between joiner and Source Qualifier?
A. A joiner transformation can join heterogeneous data sources where as a source qualifier can join only
homogeneous sources. Source qualifier transformation can join data from only relational sources but
cannot join flat files.
121. What is difference between lookup transformations and joiner transformations?
122. What kind of cache files are created by joiner Transformation?
123. What are the types of cache files available in join and Aggregator Transformation?
A. They are two types of caches available
Index cache :
Data cache :
124. What is lookup Transformation?
A. The Lookup transformation is passive transformation. Lookup is a transformation to look up data in a
flat file or a relational table, view, or synonym.
125. What are the types of lookup Transformation?
A. There are two types of Lookups in Power center Designer.
1. Connected Lookup
2. Unconnected Lookup
126. What is connected and unconnected lookup transformation?
A. Connected Lookup: A connected lookup transformation is connected the transformations in the
mapping pipeline. It receives source data, performs a lookup and returns data to the pipeline.
Unconnected Lookup: An unconnected Lookup transformation is separate from the pipeline in the
mapping. We write an expression using the: LKP reference qualifier to call the lookup within another
transformation. Unconnected lookup does not participate in mapping data flow. In fact it is not
connected to the other transformation so Informatica server creates a separate cache for unconnected
and processing takes place in parallel. That’s why performance increases.
The major advantage of unconnected lookup is its reusability. We can call an unconnected lookup
multiple times in the mapping unlike connected lookup.
127. Difference between connected and unconnected lookup Transformation?
Connected Lookup Unconnected Lookup
Connected lookup participates in dataflow and receives Unconnected lookup receives input values from the
input directly from the pipeline result of a LKP: expression in another transformation
Connected lookup can use both dynamic and static cache. Unconnected Lookup cache can NOT be dynamic.
Connected lookup can return more than one column value Unconnected Lookup can return only one column value
(output port). i.e. output port.
Unconnected lookup caches only the lookup output
Connected lookup caches all lookup columns.
ports in the lookup conditions and the return port.
Connected Lookup Unconnected Lookup
Supports user-defined default values (i.e. value to return
Does not support user defined default values.
when lookup conditions are not satisfied).
128. What are the tasks of a lookup transformation?
A. We can use the Lookup transformation to perform following:
Get a related value: EMP has DEPTNO but DNAME is not there. We use Lookup to get DNAME
from DEPT table based on Lookup Condition.
Perform a calculation: We want only those Employees who are SAL > Average (SAL). We will
write Lookup Override query.
For comparison: Update slowly changing dimension tables: Most important use. We can use a
Lookup transformation to determine whether rows already exist in the target.
129. How do you configure a lookup transformation?
A. Configure the lookup transformation to perform the following types of lookups:
Relational or flat file lookup
Pipeline lookup
Connected or unconnected lookup
Cached or uncached lookup
130. What is a pipeline lookup transformation?
A. A pipeline lookup transformation is used to perform lookup on application sources such as JMS,
MSMQ or SAP. A pipeline lookup transformation has a source qualifier as the lookups source.
131. How do you handle multiple matches in lookup transformation? Or what is "Lookup Policy on
Multiple Matches"?
A. "Lookup Policy on Multiple Match" option is used to determine which rows that the lookup
transformation returns when it finds multiple rows that match the lookup condition. You can select
lookup to return first or last row or any matching row or to report an error.
132. What is "Output Old Value on Update"?
A. This option is used when dynamic cache is enabled. When this option is enabled, the integration
service outputs old values out of the lookup/output ports. When the Integration Service updates a row
in the cache, it outputs the value that existed in the lookup cache before it updated the row based on
the input data. When the Integration Service inserts a new row in the cache, it outputs null values.
When you disable this property, the Integration Service outputs the same values out of the
lookup/output and input/output ports.
133. What are the types of lookup cache files available?
Static Cache : By default, the IS(Integration service) creates a static cache. It caches the lookup file or
table and Looks up values in the cache for each row that comes into the transformation. The IS
(Integration service) does not update the cache while it processes the Lookup transformation.
Dynamic Cache: To cache a target table or flat file source and insert new rows or update existing rows in
the cache, use a Lookup transformation with a dynamic cache. The IS dynamically inserts or updates
data in the lookup cache and passes data to the target. Target table is also our lookup table. No good for
performance if table is huge.
Persistent Cache: If the lookup table does not change between sessions, we can configure the Lookup
transformation to use a persistent lookup cache. The IS (Integration service) saves and reuses cache files
from session to session, eliminating the time required to read the lookup table.
Re-cache from Source: If the persistent cache is not synchronized with the lookup table, we can
configure the Lookup transformation to rebuild the lookup cache. If Lookup table has changed, we can
use this to rebuild the lookup cache.
Shared Cache: They are two types of cache.
• Unnamed cache: When Lookup transformations in a mapping have compatible caching structures,
the IS shares the cache by default. You can only share static unnamed caches.
• Named cache: Use a persistent named cache when we want to share a cache file across mappings or
share a dynamic and a static cache. The caching structures must match or be compatible with a named
cache. You can share static and dynamic named caches.
134. What is "Insert Else Update" and "Update Else Insert"?
A. These options are used when dynamic cache is enabled.
Insert Else Update option applies to rows entering the lookup transformation with the row type
of insert. When this option is enabled the integration service inserts new rows in the cache and
updates existing rows when disabled, the Integration Service does not update existing rows.
Update Else Insert option applies to rows entering the lookup transformation with the row type
of update. When this option is enabled, the Integration Service updates existing rows, and
inserts a new row if it is new. When disabled, the Integration Service does not insert new rows.
135. What is a cached lookup transformation and uncached lookup transformation?
Cached lookup transformation: The Integration Service builds a cache in memory when it
processes the first row of data in a cached Lookup transformation. The Integration Service stores
condition values in the index cache and output values in the data cache. The Integration Service
queries the cache for each row that enters the transformation.
Uncached lookup transformation: For each row that enters the lookup transformation, the
Integration Service queries the lookup source and returns a value. The integration service does
not build a cache.
136. How the integration service builds the caches for connected lookup transformation?
A. The Integration Service builds the lookup caches for connected lookup transformation in the
following ways
Sequential cache: The Integration Service builds lookup caches sequentially. The Integration
Service builds the cache in memory when it processes the first row of the data in a cached
lookup transformation.
Concurrent caches: The Integration Service builds lookup caches concurrently. It does not need
to wait for data to reach the Lookup transformation.
137. What are the different values returned by New Lookup Row port?
A. The different values are
0 - Integration Service does not update or insert the row in the cache.
1 - Integration Service inserts the row into the cache.
2 - Integration Service updates the row in the cache.
138. How do you improve the performance of lookup transformation?
OR
Please explain about few performance techniques?
Create an index on the columns used in the lookup condition
Place conditions with equality operator first
Cache small lookup tables.
Join tables in the database: If the source and the lookup table are in the same database, join the
tables in the database rather than using a lookup transformation.
Use persistent cache for static lookups.
Avoid ORDER BY on all columns in the lookup source. Specify explicitly the ORDER By clause on
the required columns.
For flat file lookups, provide Sorted files as lookup source.
444. What is complex scenario?
A.
445. What is incremental aggregation? (Advanced)
A. The integration service performs aggregate calculations and then stores the data in historical cache,
Next time when you run the session, the integration service reads only new data and uses the historical
cache to perform new aggregation calculations incrementally.
446. Why we cannot use sorted input option for incremental aggregation?(Advanced)
A. In incremental aggregation, the aggregate calculations are stored in historical cache on the server.
In this historical cache the data need not to be in sorted order. If you give sorted input, the records
come as presorted for that particular run but in the historical cache the data may not be in the sorted
order. That is why this option is not allowed.
447. A what is batch?
A. Some set of jobs are going to collectively run in certain day. Those groups of jobs we called as batch.
For example in my project batch jobs will start from early morning 5 am and end by 11 am
approximately. From 12 O clock onwards we proceed for adhock Jobs.
448. What is mean by Adhock jobs?
A. Those jobs will not run daily .They execute once we receive specific request, then we executes jobs
manually to provide data to downstream.
449. What are the activities we do in Prod?
1. Daily monitoring the batch jobs either all the jobs are triggered or any of the job has missed we
need to track.
2. The flow is in proper or order is in wrong order.
450. What are the performance tuning techniques we have implemented?