[Update 2023]: This blog post now includes an alternative solution that achieves a similar result in a minimum number of steps.
- Introduction
- Problem Statement / Scenario
- Solution-A: Using “Copy rows to result”
- Solution-B: Using “Get rows from Result”
Introduction
Several situations arise in Pentaho Data Integration where we need to execute the same piece of logic once for every row coming from the input stream, with each row generating a different set of output. To accomplish this, Pentaho provides a step named “Copy rows to result”.
This step allows you to transfer rows of data (in memory) to the next transformation (or job entry) in a job via an internal result row set. It can be used by the Get rows from result step and some job entries that allow to process the internal result row set. [ref: Pentaho | Copy rows to result]
Problem Statement / Scenario
Suppose you have an Excel file that contains rows of employee names along with their details. The sample employee details are shown in the image below.
The requirement is to create a separate Excel file for each employee, containing that employee's details.
Solution-A: Using “Copy rows to result”
This solution uses Pentaho's built-in “Copy rows to result” step.
Step-1: Create a Job with two Transformations
- Transformation 1: Load Employees List into Memory
- Transformation 2: Generate Output for every Employee
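To make the flow of the two transformations easier to picture, here is a minimal Python sketch of the same pattern. It is only an analogy and not how Pentaho implements it: the function names, the `EMPLOYEES` sample rows, and the use of CSV files in place of Excel are all assumptions made for illustration. The in-memory list plays the role of the internal result row set, and the loop assumes the job runs the second transformation once per row.

```python
import csv
import tempfile
from pathlib import Path

# Hypothetical sample data standing in for the employee spreadsheet.
EMPLOYEES = [
    {"name": "Alice", "department": "Finance", "city": "Pune"},
    {"name": "Bob", "department": "Sales", "city": "Delhi"},
]


def load_employees_to_memory(source_rows):
    """Stand-in for Transformation 1: read the input and return the rows
    as an in-memory list (the analogue of the internal result row set)."""
    return [dict(row) for row in source_rows]


def generate_output_for_employee(row, output_dir):
    """Stand-in for Transformation 2: write one file for a single row.
    CSV is used here only to keep the sketch free of third-party libraries."""
    path = Path(output_dir) / f"{row['name']}.csv"
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=list(row))
        writer.writeheader()
        writer.writerow(row)
    return path


def run_job(source_rows, output_dir):
    """Stand-in for the job: run step 1 once, then step 2 once per row."""
    result_rows = load_employees_to_memory(source_rows)
    return [generate_output_for_employee(row, output_dir) for row in result_rows]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for written in run_job(EMPLOYEES, tmp):
            print(written.name)
```

The point of the sketch is the split of responsibilities: the first function only collects rows, and the second knows nothing about where they came from, which is what lets the same piece of logic run separately for every employee.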
Step-2: Read contents to memory