Inserting an XML Node into an XML Data Source Using Pentaho Data Integration

Suppose we have an XML data source like the one below:

<Rootnode>
    <Node></Node>
    <Node></Node>
    <Node></Node>
</Rootnode>

Now suppose we want to insert a new XML node inside each <Node></Node> element, as shown below:

<Rootnode>
    <Node><newField/></Node>
    <Node><newField/></Node>
    <Node><newField/></Node>
</Rootnode>

Here <newField/> is the new XML node that I would like to insert inside each <Node> element.
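Before building the transformation, it can help to have a reference for the expected output. The snippet below is a minimal sketch in Python using the standard library's xml.etree.ElementTree. It is not part of the Pentaho transformation; the names SOURCE and insert_new_field are hypothetical and only illustrate the input-to-output mapping described above.

```python
import xml.etree.ElementTree as ET

# Same sample document as in the post.
SOURCE = """<Rootnode>
    <Node></Node>
    <Node></Node>
    <Node></Node>
</Rootnode>"""


def insert_new_field(xml_text: str, tag: str = "newField") -> str:
    """Append an empty child element named `tag` to every <Node> under the root."""
    root = ET.fromstring(xml_text)
    for node in root.findall("Node"):  # direct <Node> children of <Rootnode>
        ET.SubElement(node, tag)       # adds an empty <newField> element
    return ET.tostring(root, encoding="unicode")


result = insert_new_field(SOURCE)
print(result)
```

ElementTree writes the empty element as <newField /> (with a space), which is equivalent XML to <newField/>. The output of the PDI transformation can be compared against this result as a sanity check.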

Pentaho DI (Kettle) provides several steps and sample examples for dealing with XML data sources. Steps such as Get Data from XML, Add XML, and XML Join are used to achieve the above result. Let's start by showing the entire transformation I built to achieve this:

Capture2

Follow the Steps below:

Step-1: Get Data from XML

Add two “Get Data from XML” steps that read the same source data. In the first step, simply fetch the <Rootnode> structure using the XPath: //*

In the second step, we need to read all the nodes inside the Rootnode. You can do this with a recursive XPath, which simply means using “.” (dot). See the image below:

Capture21

This ensures that all the nodes are read recursively, which is required because we want to insert the new node into each <Node>.
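The idea of two reads over the same source, one for the root structure and one for every <Node>, can be sketched outside PDI as follows. This is only an illustration using Python's xml.etree.ElementTree, which supports a limited XPath subset; it is an assumption that these expressions mirror what the two steps return, and the exact XPath settings in the Get Data from XML dialog may differ.

```python
import xml.etree.ElementTree as ET

# Same sample document as in the post.
SOURCE = """<Rootnode>
    <Node></Node>
    <Node></Node>
    <Node></Node>
</Rootnode>"""

root = ET.fromstring(SOURCE)

# First read: the root structure itself ("." selects the current element).
root_matches = root.findall(".")

# Second read: every <Node> at any depth below the root (".//" is recursive).
node_matches = root.findall(".//Node")

print([el.tag for el in root_matches])
print([el.tag for el in node_matches])
```

The first list holds the single Rootnode element and the second holds one entry per <Node>, which corresponds to the one-row-per-node stream that the later steps work on.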

Step-2: Add a constant

To define the new node, I used the Add Constant step. Just define a field name and set its value to newField, or to the name of whichever node you want to add.

Step-3: Add XML

Capture2222

In the Fields section of this step, add newField to the XML node and set the root XML element to “Node”. This is because we want to add the new node inside the <Node> tag.
