Latest Posts
-

Demystifying the Black Box: A Hands-On Guide to Explainable AI (XAI)
As artificial intelligence scales across industries, machine learning models are taking on increasingly high-stakes responsibilities. From credit scoring and fraud detection to clinical decision support, the predictive power of complex algorithms is undeniable. However, as…
-

Designing AI Infrastructure for the Modern Enterprise
Enterprises are increasingly prioritising AI infrastructure, particularly generative AI, as essential for integrating AI frameworks into business workflows. The demand for specialised infrastructure is driven by the need for scalability, cost efficiency, and reliability, especially…
-

Avoiding Data Quality Failures: Enterprise Challenges and Best Practices
Data quality is crucial for businesses, as poor data quality can lead to significant financial losses and hinder decision-making. It encompasses dimensions such as accuracy, completeness, uniqueness, consistency, timeliness, and validity. Enterprises often struggle with…
Archive Posts
-
This post provides steps for executing transformation files in Java using Pentaho Data Integration. It walks through creating a Maven project, adding dependencies, creating a sample job, and writing the Java code that triggers Kettle.
-
This is a blog post on an upgraded version of the Special Character Remover Pentaho Kettle Plugin. Please read about Version 1.0.0 of this plugin before continuing with this one. What is New? With the new version of the Special Character Remover plugin, I have introduced a feature to either choose or customize the algorithms to clean…
-
Problem Statement: When handling data, especially in a data warehousing environment, developers tend to face serious data quality issues. Though there are multiple data quality issues, dealing with special characters in the data set is one of the most commonly occurring. DWH developers or any person working with data…
-
In Pentaho DI, the data flow direction is denoted by “Hops.” Data movement can occur in either “Copy Data” or “Distribute Data” modes. Copy Data sends all input data to each output, while Distribute Data distributes input data in a round-robin fashion across outputs. The default setting is Distribute Data, which saves time and reduces…
-
[Update 2023]: This post has been updated with information for Pentaho Kettle 8.x.x or higher. For Pentaho Kettle version 8.0.0 or lower: if you want to Mavenize Pentaho Kettle plugin development, you can use the XML code below in your pom.xml file. Maven dependencies: The above dependencies are basically part of the core-kettle…
