Latest Posts
-

Demystifying the Black Box: A Hands-On Guide to Explainable AI (XAI)
As artificial intelligence scales across industries, machine learning models are taking on increasingly high-stakes responsibilities. From credit scoring and fraud detection to clinical decision support, the predictive power of complex algorithms is undeniable. However, as…
-

Designing AI Infrastructure for the Modern Enterprise
Enterprises are increasingly prioritising AI infrastructure, particularly generative AI, as essential for integrating AI frameworks into business workflows. The demand for specialised infrastructure is driven by the need for scalability, cost efficiency, and reliability, especially…
-

Avoiding Data Quality Failures: Enterprise Challenges and Best Practices
Data quality is crucial for businesses, as poor data quality can lead to significant financial losses and hinder decision-making. It encompasses dimensions such as accuracy, completeness, uniqueness, consistency, timeliness, and validity. Enterprises often struggle with…
Archive Posts
-
This blog demonstrates the use of big data and Hadoop with Pentaho Data Integration (PDI). I will explain the basic Hadoop word-count example using PDI. Prerequisite Steps: PDI provides very intuitive steps for working with HDFS and MapReduce. All the steps under the Big Data group are used to carry out Hadoop activities (if…
-
This blog post explains how to pass parameters to Pentaho Data Integration (Kettle) transformations (.ktr) and jobs (.kjb) from Java code. Parameters allow metadata to be passed dynamically to any Pentaho transformation or job, and their scope extends across multiple transformations inside a job.
-
Inserting a new XML node into a complex XML data source will fail with the approach provided in my previous blog. This is because that approach cannot handle a source structure that has multiple parent-child relationships. Using "." (dot) will not work either, since it will recurse through all the child nodes, missing…
-
Suppose we have an XML data source as below: Now suppose we want to insert a new XML node between the <Node></Node> tags, something like the below: Here <newField/> is the new XML node, which I would like to insert inside <Node>. Pentaho DI (Kettle) provides a few steps and sample…
-
[Update 2023]: This blog applies only to older versions of Pentaho Data Integration. Pentaho versions 8, 9, and above do not support this process. An updated blog for the later versions will be available soon. Sometimes during the development phase, we might need to import some…
