Install and Configure Pentaho Data Integration
Introduction
Pentaho is an open source business intelligence suite owned by Hitachi Data Systems. It provides a wide range of products:
- Big Data solutions
- Data Integration (ETL)
- OLAP services
- Reporting tools
- Dashboarding
- Data mining
Pentaho Data Integration is available in two main editions:
- Pentaho Community Edition (CE)
- Pentaho Enterprise Edition
What is E-T-L
This is an abstract view of the E-T-L (Extract, Transform, Load) architecture. If you want to know more about E-T-L, read this Wiki article.
Download
1. Go to http://community.pentaho.com/ to download Pentaho CE and click the download section button.
2. Click the all OS button to start the download from SourceForge.
Install Pentaho CE
1. Once the download finishes, extract the zip file to any location and open the extracted folder. You will not find an installer there, because Pentaho CE is a portable version and needs no installation. (If you buy the Enterprise Edition or download its trial, you get an executable file that starts the installation.) To run Pentaho, find the file below.
spoon.bat
2. Pentaho CE opens.
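The extraction step can also be scripted. The following is a minimal Python sketch, assuming the download is a zip archive. The function name extract_and_find_launcher and the paths in the usage comment are placeholders chosen for illustration, not names from the Pentaho distribution. It unpacks the archive and searches the extracted tree for spoon.bat.

```python
import zipfile
from pathlib import Path


def extract_and_find_launcher(archive, target_dir, launcher="spoon.bat"):
    """Unpack the archive and return the path of the first launcher found, or None."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(target)
    # The launcher may sit in a subfolder, so search the whole extracted tree.
    matches = sorted(target.rglob(launcher))
    return matches[0] if matches else None


# Usage (hypothetical archive name and folder, adjust to your download):
#   path = extract_and_find_launcher("pdi-ce.zip", "pentaho")
#   print(path if path else "spoon.bat not found")
```

The function only locates the launcher; starting it is left to you, as in step 1.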
Configure Pentaho CE
Pentaho has two types of repository:
- Database repository
- File system repository
(If you want to create a database repository, the first step is to create a DB user for the Pentaho repository.)
I will first show how to create a database repository and then how to create a file repository. A database repository gives you extra features: you can find transformation details by querying the repository tables, and you can keep historical log details in separate tables.
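As a rough illustration of the DB user step, the sketch below builds the statements for an Oracle schema owner. Everything specific here is an assumption: the function name oracle_repo_user_sql, the user name pdi_repo, the password, and the idea that the CONNECT and RESOURCE roles are enough. Depending on the Oracle version, the user may also need a tablespace quota, so check the result with your DBA before running it.

```python
import re


def oracle_repo_user_sql(username, password):
    """Build Oracle statements that create a schema owner for the repository.

    Assumption: the CONNECT and RESOURCE roles are sufficient privileges.
    """
    # Allow only plain identifiers so the name cannot smuggle in extra SQL.
    if not re.fullmatch(r"[A-Za-z][A-Za-z0-9_]*", username):
        raise ValueError("invalid user name: %r" % username)
    if '"' in password:
        raise ValueError("password must not contain a double quote")
    return [
        'CREATE USER %s IDENTIFIED BY "%s"' % (username, password),
        "GRANT CONNECT, RESOURCE TO %s" % username,
    ]


# Hypothetical credentials, for illustration only.
for statement in oracle_repo_user_sql("pdi_repo", "change_me"):
    print(statement + ";")
```

For MySQL, MSSQL or DB2 the statements differ, so treat this only as a template for the Oracle case used in this tutorial.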
Create a database (table-based) repository
1. First, create a DB user/schema (or a separate database in MySQL) in your database. (This tutorial uses Oracle, but you can use any relational database system for the repository, such as MySQL, MSSQL or DB2.)
2. Once the user is created, open Pentaho and follow these steps:
- Tools –> Repository –> Connect
3. Click the add new repository button.
4. Select ‘Kettle database repository’ and click ‘OK’.
5. Click the ‘New’ button to add a new database connection.
6. Select your database type, fill in the details and click ‘OK’. (If you want, you can check the connection first with the test button.)
Test result
7. Select the created database connection under 'Select Database Connection', enter a name and description for the repository, and click ‘Create or Upgrade’ to create the repository in the database.
8. Click 'Yes' if you agree.
9. Click 'Yes' if you agree.
10. This is the SQL script that creates the repository tables in the database. Click ‘Execute’ to run it.
11. This is the SQL execution result. Check the end of the result: on success it should read ‘XX SQL statements executed’. If it succeeded, click ‘OK’, close the SQL script window, and click ‘OK’ in the repository information window to finish creating the repository.
12. Select your repository name, enter the login details and click ‘OK’.
Default login details for the admin user:
User Name : admin
Password : admin
13. This is the Pentaho main window.
14. You can explore your repository by following these steps:
Tools –> Repository –> Explore
15. This is the repository explorer window.
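Once the repository exists, its tables can be queried like any others. The sketch below shows the idea with Python's built-in sqlite3 module as a stand-in database, so it runs without a real repository. The table name R_TRANSFORMATION and the columns NAME, MODIFIED_USER and MODIFIED_DATE are assumptions; check them against the SQL script shown in step 10. The two rows are made-up sample data.

```python
import sqlite3


def list_transformations(conn):
    """Return (name, modified_user, modified_date) rows, most recently modified first."""
    cur = conn.execute(
        "SELECT NAME, MODIFIED_USER, MODIFIED_DATE "
        "FROM R_TRANSFORMATION ORDER BY MODIFIED_DATE DESC"
    )
    return cur.fetchall()


# Stand-in database with an assumed table layout and made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE R_TRANSFORMATION (NAME TEXT, MODIFIED_USER TEXT, MODIFIED_DATE TEXT)"
)
conn.executemany(
    "INSERT INTO R_TRANSFORMATION VALUES (?, ?, ?)",
    [
        ("load_customers", "admin", "2017-01-10"),
        ("load_orders", "admin", "2017-02-03"),
    ],
)
for row in list_transformations(conn):
    print(row)
```

Against the real repository you would replace the sqlite3 connection with a connection from your own database driver, logged in as the repository user.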
Create a file-based repository
1. Go to Tools –> Repository –> Connect.
2. This is the repository connection window.
3. Click the add new repository button.
4. Select ‘Kettle file repository’ as your repository type.
5. This is the file repository configuration window. Fill in the required details, such as the file directory and the repository name, choose whether the repository should be read-only, and click ‘OK’.
6. Now select your repository name and click ‘OK’ to open the repository. (A file system repository does not need a user account.)
7. This is the Pentaho interface.
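A file repository is just a directory on disk, where transformations are saved as .ktr files and jobs as .kjb files. The sketch below lists what such a directory contains. The function name list_repository_files is a placeholder for illustration, and the directory you pass in is whatever you entered in step 5.

```python
from pathlib import Path


def list_repository_files(base_dir):
    """Group the files under base_dir into transformations (.ktr) and jobs (.kjb)."""
    base = Path(base_dir)
    found = {"transformations": [], "jobs": []}
    for path in sorted(base.rglob("*")):
        if path.suffix == ".ktr":
            found["transformations"].append(path.relative_to(base).as_posix())
        elif path.suffix == ".kjb":
            found["jobs"].append(path.relative_to(base).as_posix())
    return found
```

Because everything is plain files, the same directory can also be backed up or put under version control with ordinary tools.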
Reviewed by Lilantha Lakmal on 2:40:00 PM