Introducing the Pentaho BI Suite Community Edition

This document is copyright © 2004-2008 Pentaho Corporation. No part may be reprinted without written permission from Pentaho Corporation. All trademarks are the property of their respective owners.

About This Document

If you have questions that are not covered in this guide, or if you find errors in the instructions or language, please contact the Pentaho Technical Publications team at documentation@pentaho.com. The Publications team cannot help you resolve technical issues with products.

Support-related questions should be submitted through the Pentaho Community Forum at http://forums.pentaho.org/. There is also a community documentation effort on the Pentaho Wiki at http://wiki.pentaho.com/ that you may find helpful.

For information about how to purchase enterprise-level support, please contact your sales representative, or send an email to sales@pentaho.com.

For information about instructor-led training on the topics covered in this guide, visit http://www.pentaho.com/training.

Limits of Liability and Disclaimer of Warranty

The author(s) of this document have used their best efforts in preparing the content and the programs contained in it. These efforts include the development, research, and testing of the theories and programs to determine their effectiveness. The author and publisher make no warranty of any kind, express or implied, with regard to these programs or the documentation contained in this book.

The author(s) and Pentaho shall not be liable in the event of incidental or consequential damages in connection with, or arising out of, the furnishing, performance, or use of the programs, associated instructions, and/or claims.

Trademarks

Pentaho (TM) and the Pentaho logo are registered trademarks of Pentaho Corporation. All other trademarks are the property of their respective owners. Trademarked names may appear throughout this document. Rather than list the names and entities that own the trademarks or insert a trademark symbol with each mention of the trademarked name, Pentaho states that it is using the names for editorial purposes only and to the benefit of the trademark owner, with no intention of infringing upon that trademark.

Company Information

Pentaho Corporation

Citadel International, Suite 340

5950 Hazeltine National Drive

Orlando, FL 32822

Phone: +1 407 812-OPEN (6736)

Fax: +1 407 517-4575

http://www.pentaho.com

E-mail: communityconnection@pentaho.com

Sales Inquiries: sales@pentaho.com

Documentation Suggestions: documentation@pentaho.com

Contents

Introduction
  Community Edition or Enterprise Edition?
  Community Edition Support Options
  The Pentaho Client Tools
Installation
  Hardware Requirements
  Software Requirements
  Downloading and Installing the BI Suite
  Starting the BI Platform
  Configuring the BI Platform With the Administration Console
Getting Started
  How to Log Into the Pentaho User Console
  Navigating the Pentaho User Console
Tutorials
  Ad hoc Reporting Tutorial
  Analysis View Tutorial
  Building a simple input-output transformation

Introduction

The Pentaho BI Suite Community Edition is an open source business intelligence package that includes ETL, analysis, metadata, and reporting capabilities. It is entirely open source software, licensed mostly under the GNU General Public License version 2, with parts under the LGPLv2, the Common Public License, and the Mozilla Public License. Pentaho optimizes, platform-tests, and guarantees certified builds of the BI Suite; this enhanced version of the software is packaged with a powerful service management tool called Enterprise Console, user support, IP indemnification, and professional documentation, and sold by Pentaho as Enterprise Edition.

The purpose of this guide is to introduce new users to the Pentaho BI Suite, explain how and where to interact with the Pentaho community, and provide some basic instructions to help you get started using the software.

Community Edition or Enterprise Edition?

The BI Suite Community Edition is ideal for:

• Business intelligence aficionados

• Open source software programmers

• Early adopters

• College students

Pentaho no longer suggests using Community Edition for enterprise evaluations. If you are a business user interested in trying out the BI Suite Enterprise Edition, follow the Enterprise Edition evaluation link on the pentaho.com front page, or contact a Pentaho sales representative.

The enhancements, service, and support packaged with the BI Suite Enterprise Edition are designed to accommodate production environments, especially where downtime and time spent figuring out how to install, configure, and maintain a business intelligence solution are prohibitively expensive. If your business will save money or make more money as a result of a successful business intelligence implementation, then Enterprise Edition is the most appropriate choice.

Community Edition Support Options

As a Pentaho BI Suite Community Edition user, you will have to install, configure, and maintain the software on your own. Your only support options are the community forum (http://forums.pentaho.org) and the community Wiki (http://wiki.pentaho.com). If you do not find an answer right away, please be a good community participant and contribute a Wiki article that explains the solution after you've figured it out.

At any time, you can contact Pentaho sales and upgrade to Enterprise Edition. Enterprise Edition customers get phone support, access to Pentaho software engineers, and a Web-based knowledge base that is updated weekly with new support articles, tips, and comprehensive user guides.

The Pentaho Client Tools

The Pentaho client tools are:

• Report Designer: An advanced report creation tool. If you want to build a complex data-driven report, this is the right tool to use. Report Designer offers far more flexibility and functionality than the ad hoc reporting capabilities of the Pentaho User Console.

• Design Studio: An Eclipse-based tool that enables you to hand-edit a report or analysis view xaction file. Generally, people use Design Studio to add modifications to an existing report that cannot be added with Report Designer.

• Aggregation Designer: A graphical tool that helps improve Mondrian cube efficiency.

• Metadata Editor: Enables you to add a custom metadata layer to an existing data source. Usually you would do this for a data source that you intend to use for reporting; it's not required, but it makes it easier for business users to parse the database when building a query.

• Pentaho Data Integration: The Kettle extract, transform, and load (ETL) tool, which enables you to access and prepare data sources for analysis, data mining, or reporting. This is generally where you will start if you want to prepare data for analysis.

• Schema Workbench: A graphical tool that helps you create ROLAP schemas for analysis. This is a required step in preparing data for analysis.

After they're installed, you can find all of these tools in their own directories in the /biserver-ce/client/ directory. The scripts that run them should be fairly self-explanatory. If you are using Windows, there should be a Pentaho program group with icons that will initialize the BI Server and run the client tools.

Installation

Follow the instructions below to download and install the Pentaho BI Suite Community Edition.

Hardware Requirements

The Pentaho BI Suite software does not have strict limits on computer or network hardware. As long as you meet the minimum software requirements (note that your operating system will have its own minimum hardware requirements), Pentaho is hardware agnostic. There is, however, a recommended set of system specifications:

It's possible to use a less capable system, but in most realistic scenarios, the too-limited system resources will result in an undesirable level of performance.

Your environment does not have to be 64-bit, even if your processor architecture supports it.

Software Requirements

The officially supported operating systems are Windows XP with Service Pack 2, modern Linux distributions (specifically SUSE Linux Enterprise Desktop and Server 10 and Red Hat Enterprise Linux 5, though most others should work), Solaris 10, and Mac OS X 10.4.

No matter which operating system you use, you must have the Sun Java Runtime Environment (JRE) version 1.5 (sometimes referenced as version 5.0) installed. Version 1.4.2 will not work, and 1.6 (6.0) is not fully supported at this time.
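One way to check a machine against this requirement is to classify the version string that `java -version` reports. The sketch below is illustrative only: the function name `jre_support_level` is hypothetical, and treating every 1.4.x or older release as unusable is an assumption (the guide itself only names 1.4.2).

```python
import re


def jre_support_level(version: str) -> str:
    """Classify a Java version string such as '1.5.0_22' against this guide.

    Returns 'supported' for 1.5, 'not fully supported' for 1.6,
    'will not work' for 1.4.x and older (assumption: the guide names 1.4.2),
    and 'unknown' for anything the guide does not mention.
    """
    match = re.match(r"^1\.(\d+)", version.strip())
    if not match:
        return "unknown"
    minor = int(match.group(1))
    if minor == 5:
        return "supported"
    if minor == 6:
        return "not fully supported"
    if minor <= 4:
        return "will not work"
    return "unknown"


# Example version strings (illustrative, not taken from a real system).
for example in ("1.5.0_22", "1.6.0_10", "1.4.2"):
    print(example, "->", jre_support_level(example))
```

The function deliberately returns "unknown" rather than guessing for versions this guide says nothing about.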

Note: The GNU Compiler for Java, or GCJ for short, interferes with the way many native Java programs work on Linux, including some of the components of the Pentaho BI Suite. If you are using a Linux distribution that installs GCJ by default (which includes all of the most popular distros), then before you begin installation you must remove, disable, or circumvent GCJ. If you cannot remove it, you can simply ensure that your JAVA_HOME variable is properly set, and add the Java Runtime Environment's /bin/ directory to the beginning of your PATH variable in ~/.bashrc or /etc/environment, then log out and back in before continuing.
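After logging back in, it can be worth confirming that the JRE's bin directory really is the first PATH entry, so that it takes precedence over GCJ. The following is a minimal sketch of that check; the function name `java_bin_is_first` and the example path /opt/jre1.5 are hypothetical.

```python
import os


def java_bin_is_first(path_value: str, java_home: str, sep: str = os.pathsep) -> bool:
    """Return True if <java_home>/bin is the first entry of a PATH string."""
    entries = [entry for entry in path_value.split(sep) if entry]
    if not entries or not java_home:
        return False
    expected = os.path.normpath(os.path.join(java_home, "bin"))
    return os.path.normpath(entries[0]) == expected


# Check the current session; both variables may be unset, hence the defaults.
print(java_bin_is_first(os.environ.get("PATH", ""), os.environ.get("JAVA_HOME", "")))
```

For example, with a hypothetical JAVA_HOME of /opt/jre1.5, a PATH of "/opt/jre1.5/bin:/usr/bin:/bin" passes the check, while "/usr/bin:/opt/jre1.5/bin" does not.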

Workstations will need to have reasonably modern Web browsers to access Pentaho's Web interface. Internet Explorer 6 or higher; Firefox 2.0 or higher (or the Mozilla or Netscape equivalent); and Safari 2.0.3 or higher will all work.

Your environment can be either 32-bit or 64-bit as long as it meets the above requirements.

The aforementioned configurations are officially supported by Pentaho. Other operating systems such as Windows Vista, FreeBSD, and OpenBSD; other Java virtual machines like Blackdown; and other Web browsers like Opera may work without any problems. However, the Pentaho support team may not be able to help you if you have trouble installing or using the BI Suite under these conditions.

Note: If you intend to install onto a headless Linux, Solaris, or BSD server, you will need to execute two extra steps. First of all, the installation utility requires a graphical environment, so you'll have to install onto a workstation and then copy over the /bi-suite-2.0.0/ directory to your server. You will also have to install the Xvfb package on your server to simulate a working X11R6 environment; the Pentaho Reporting engine requires an X server or Xvfb instance to generate charts in Report Designer or the ad hoc reporting interface in the BI Server.

Downloading and Installing the BI Suite

Follow the process below to download and install the Pentaho BI Suite Community Edition.

1. Open a Web browser and navigate to the Pentaho page on SourceForge.net: http://sourceforge.net/projects/pentaho/

If you cannot click on links in this document, type the address into your browser's address bar.

2. Click Download.

3. At the SourceForge download screen, click Business Intelligence Server.

4. In the Latest category at the top of the list, click either the .zip or .tar.gz file for the biserver-ce project.

This is an archive package of the Pentaho BI Platform, along with a Tomcat Java application server configured to run it. There is no functional difference between the zip and tar archives; they are merely in compression formats that are generally preferred by Windows and Linux users, respectively.

5. Once the file is downloaded, unpack it using your preferred archive utility.

Ideally you would be unpacking this on what you expect to be your server, though there is no reason why you can't install the Pentaho client tools on the same machine.

6. Repeat this process for any or all of the following Pentaho client tool projects:

• Report Designer

• Pentaho Metadata

• Design Studio

• Data Integration

• Schema Workbench

You may not need all of these tools, but it can't hurt to download all of them.

You have retrieved all of the relevant Pentaho software, and are ready to configure the BI Platform.

Starting the BI Platform

In order to use and configure the Pentaho BI Platform, you must start the BI Server, then the Pentaho Administration Console.

Starting the BI Server

To start the BI Server, run the start-pentaho script in the /biserver-ce/ directory.

Starting the Pentaho Administration Console

To start the Pentaho Administration Console, run the start script (on Windows) or startup script (on Linux) in the /biserver-ce/administration-console/ directory.
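Once both scripts have been run, a quick way to see whether the services have come up is to test the default ports this guide uses: 8080 for the BI Server (Pentaho User Console) and 8099 for the Pentaho Administration Console. The sketch below is an assumption-laden convenience, not part of the Pentaho distribution; the names `is_listening` and `check_default_ports` are hypothetical, and an open port only shows that something is listening, not that Pentaho has finished starting.

```python
import socket

# Default ports taken from the addresses given in this guide.
DEFAULT_PORTS = {"BI Server": 8080, "Administration Console": 8099}


def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_default_ports(host: str = "localhost") -> dict:
    """Map each service name to whether its default port accepts connections."""
    return {name: is_listening(host, port) for name, port in DEFAULT_PORTS.items()}


for name, up in check_default_ports().items():
    print(f"{name}: {'listening' if up else 'not reachable'}")
```

If you changed either port in your own configuration, adjust DEFAULT_PORTS accordingly.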

Configuring the BI Platform With the Administration Console

Follow the process below to log into the Pentaho Administration Console.

1. Open a Web browser and type in the Web or IP address of the Pentaho Administration Console server, which is http://localhost:8099/admin by default.

2. Type in your administrator credentials, then click Login.

The default credentials are admin for the user, and password for the password.

3. Click the Administration tab on the left side of the window.

4. Remove the default sample users and roles and create your own.

5. Click the Data Sources tab at the top of the window.

6. Enter the connection details for the data source you want to use for reporting and analysis.

By default, there is a sampledata source listed. If you intend to follow the examples later in this guide, you must leave this data source intact.

You are now logged into the Pentaho Administration Console and ready to finish configuring the BI Platform.

Getting Started

Your workflow will vary depending on your BI goals. Typically, Pentaho BI Suite users will start with Pentaho Data Integration to prepare a data source, then use Metadata Editor to create a metadata layer for that data source, then potentially Schema Workbench to create a ROLAP schema. At that point, you're ready to create reports and analysis views.

If you just want to create a quick report, the ad hoc reporting component of the Pentaho User Console is the best tool for the job. If you want to create a detailed report, go directly to Report Designer instead. If you have created a ROLAP schema, then you can do some data exploration first by using an analysis view, which allows you to drill down into the smallest of details in a data source.

Ideally, everything will end up being published to the BI Platform, which enables you to display, run, and share your reports with others, or to schedule them to run at given intervals.

Once you've got some reports and/or analysis views that you like, you might create some dashboards that display them in creative and useful ways for your business users.

Follow the instructions below to log into the Pentaho User Console and familiarize yourself with its graphical interface.

How to Log Into the Pentaho User Console

Follow the process below to log into the Pentaho User Console.

1. Open a Web browser and type in the Web or IP address of the Pentaho server, which is http://localhost:8080/pentaho/ by default.

You'll see an introductory screen with some Pentaho-related information and a Login button in the center of the screen.

2. Click Login.

The login dialog will appear.

3. For the locally installed version of the BI Suite, select Joe from the user drop-down box, type password into the password field, then click Login. For hosted demo users, select Guest and type guest as the password instead.

You are now logged into the Pentaho User Console and ready to start creating and running reports.

Navigating the Pentaho User Console

The first thing you will see when logging in is the quick launch screen, shown here:

If you'd like to experiment on your own before continuing on to the tutorials, click one of the three icons in the center of the screen to create a new ad hoc report, start a new analysis view, or edit existing solutions.

The button bar near the top of the page also contains icons for creating new ad hoc reports and analysis views, along with buttons to print the current report or analysis view and to open a previously saved solution.

Different user roles have different levels of access in the Pentaho User Console. The menu above the button bar performs the same functions as the buttons, plus administrative actions if you are logged in as an administrator, and also offers access to My Workspace and external links to help and support resources.

The three buttons in the quick launch screen will appear when you log into the Pentaho User Console for the first time, and when you close all tabs in the solution browser.

You can change views between My Workspace and the solution browser at any time by clicking the rightmost icons in the top button bar, or through the View menu.

Tutorials

The sections below offer, in no specific order, basic tutorials for the three major pillars of the BI Suite: reporting, analysis, and data integration. These tutorials assume that you are working with the included sample data source, that you have Report Designer and Pentaho Data Integration installed, and that you are logged into the Pentaho User Console.

Ad hoc Reporting Tutorial

You must be logged into the Pentaho User Console as Joe before continuing.

This walkthrough shows you how to create a simple, template-based report that shows which territory generates the most sales.

1. Click the Create New Report button in the middle of the Pentaho User Console screen.

The ad hoc query wizard will start.

2. In the first step of the wizard, select Orders in the Business Model Details pane.

A business model is another term for data set.

3. In the Apply a Template field, select a predefined report template that appeals to you.

A thumbnail preview of the template will appear in the Template Details field. A template specifies a variety of properties in the report that affect its appearance, like font size and background colors for various report elements.

4. Click Next.

5. In the Available Items list, click the Territory business column and drag it to the upper right into the Level 1 box.

This will determine how the data is grouped.

6. Drag and drop the Amount and Buy Price columns into the Details box on the right.

This determines which fields to display for the given groups.

7. Click Go to preview how these new items have affected the report, then close the preview tab when you're done.

8. Click Next.

9. Click the Territory item in the Groups list.

A list of general options will appear on the right.

10. Click Center.

This will center the territory name above each table, making it easier to read.

11. Click Amount, then click Add in the Sort Detail Columns area on the right.

This will sort the sales amounts from lowest to highest.

12. Click Go to test the new change, or Next to continue to the next part of the wizard.

13. To set the header, footer, description, paper type, and page orientation, change the on-screen values for these elements accordingly.

PDF is the only output type that has a concept of a page, so the Page portion of the Header and Footer sections only applies to PDFs.

14. Click the blue Save button in the top toolbar to save your report. In the ensuing file dialog, navigate to the location you want to save the report to, and type in a filename for the report.

You can continue to modify your report after it's been saved; just click Save to update the report file after you've made changes.

You now have a report that shows how much revenue is coming from each sales territory, and the itemized price of each purchased product. As you can see, ad hoc reporting is quicker and simpler than Report Designer, but doesn't offer nearly the same level of design detail, nor does it have advanced reporting features like conditional formatting or parameterization.
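The grouping and sorting choices made in the wizard (Territory as the Level 1 group, Amount and Buy Price as details, details sorted by Amount from lowest to highest) can be pictured in a few lines of code. This is only an illustration of the logic, not how the ad hoc reporting engine is implemented, and the rows are made-up placeholder values rather than figures from the sample data.

```python
from collections import defaultdict

# Hypothetical rows: (territory, amount, buy_price). Placeholder values only.
rows = [
    ("EMEA", 300.0, 55.0),
    ("APAC", 120.0, 40.0),
    ("EMEA", 150.0, 30.0),
    ("APAC", 200.0, 70.0),
]


def build_report(rows):
    """Group rows by territory and sort each group's details by amount, ascending."""
    groups = defaultdict(list)
    for territory, amount, buy_price in rows:
        groups[territory].append((amount, buy_price))
    return {
        territory: sorted(details, key=lambda detail: detail[0])
        for territory, details in sorted(groups.items())
    }


report = build_report(rows)
for territory, details in report.items():
    print(territory)  # group header, one per territory
    for amount, buy_price in details:
        print(f"  amount={amount:.2f}  buy_price={buy_price:.2f}")
```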

Analysis View Tutorial

Analysis views are similar to reports, except they're designed to be totally interactive and dynamic, whereas reports tend to be static or minimally interactive after they're created. Analysis views allow you to dynamically explore your data and drill down into it to discover previously hidden details.

In this example, you'll try to find out which product line is responsible for the highest number of cancelled orders.

1. In the File menu, select the New sub-menu, then click New Analysis View....

This is one of several ways to create a new analysis view; all methods lead to the same screen in the Pentaho User Console.

2. Select the SteelWheels schema and SteelWheels Sales cube from the drop-down lists.

3. Click OK to continue.

An analysis view will open in a new tab.

4. Open the OLAP Navigator by clicking the cube icon in the top button bar; it's the third from the left.

A new table will appear above the analysis view fields. The default basis for comparison is Measures, though that is not very useful for finding cancelled orders.

5. In the Rows section, click the two-tone square next to Product. This will move it up to the Columns section.

6. In the Columns section, click the funnel icon next to Measures. This will move it down to the Filters section.

7. Click OK to modify the analysis view.

8. Click the + next to All Products to drill down into it.

9. Click the + next to All Status Types to drill down into it.

Clearly the Ships category has the most cancelled orders.

10. Drill down further into Ships by opening the OLAP Navigator again.

11. Click Product in the Columns section.

12. Un-check all products except Ships.

13. Click OK.

14. Click Order Status.

15. Un-check all statuses except Cancelled.

16. Click OK, then OK again to close the OLAP Navigator and refine the analysis view.

17. Click the + next to Ships to show its constituent product lines.

18. Notice that the Carousel DieCast Legends series has the highest number of cancelled orders.

You now know which product line has the most cancelled orders.

Building a simple input-output transformation

You must have a database connection to complete this exercise.

This exercise walks you through the process of building a transformation that uses input from a database table (a list of customers), applies an SQL statement, and outputs the data stream to a plain text file. This exercise is useful for testing and tuning transformations to see how the fields are passed in the data stream; it also shows you how to examine the results of each step of a transformation, as well as the final output.
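Conceptually, the transformation built in this exercise reads rows with an SQL statement and writes each field to a delimited text file. The sketch below mirrors that flow in plain Python as a rough mental model only; it does not use Pentaho Data Integration, and the `customers` table, its columns, its rows, and the `table_to_text` helper are all hypothetical stand-ins.

```python
import csv
import io
import sqlite3


def table_to_text(conn, sql, out, separator=";"):
    """Run sql on conn and write a header line plus all rows to the stream out.

    Returns the number of data rows written.
    """
    cursor = conn.execute(sql)
    writer = csv.writer(out, delimiter=separator, lineterminator="\n")
    writer.writerow([column[0] for column in cursor.description])  # field names
    count = 0
    for row in cursor:
        writer.writerow(row)
        count += 1
    return count


# Hypothetical stand-in for the customer table; names and values are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT, country TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Alpha Ltd", "USA"), (2, "Beta GmbH", "Germany"), (3, "Gamma Inc", "USA")],
)

# The SELECT plays the role of the Table Input step; the writer plays the
# role of the Text File Output step (an in-memory buffer replaces the file).
buffer = io.StringIO()
written = table_to_text(
    conn, "SELECT id, name FROM customers WHERE country = 'USA' ORDER BY id", buffer
)
print(buffer.getvalue())
```

The SELECT list, table name, and WHERE clause correspond to the three things the exercise asks you to specify: the values to read, the input table, and the optional conditions.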

1. In the left pane, click Design, then click Input to expand the folder and view the input step options.

2. Click Table Input and drag the cursor to the right pane.

A Table Input step is added to the transformation. This step reads information from the database using the database connection and SQL.

3. Double-click Table Input to display a dialog box that allows you to view and edit the SQL statement.

As a general rule, it is a good idea to maximize this window. This is a relatively simple example, but for longer statements, maximizing the window ensures that you see all parts of the statement. You must edit the following parameters to specify the table to use as input, the values to get from the table, and (optionally) the conditions:

4. Click Preview to view the data stream flowing from the Table Input step. A dialog box appears that allows you to specify the number of rows you want to preview.

Preview shows you what is in the data stream coming into the step. If there are errors, a log file appears that includes a record of the errors to assist in troubleshooting.

5. In the left pane, click the Output folder to expand it and view the output step options. Click Text File Output and drag the cursor to the right pane.

A Text File Output step is added to the transformation. This step sends the fields in the incoming data stream to a text file.

6. Click Table Input, then press and drag the cursor to Text File Output.

This adds a hop (data connection) between the two steps and displays as an arrow connecting the steps. The hop represents the data stream flowing from the Table Input step to the Text File Output step.

7. Double-click the Text File Output step and type a path/file name in the File name box.

The Content tab allows you to specify optional settings such as the field separator and the format (DOS or UNIX).

8. Click the Fields tab, then click Get Fields.

Note: The file extension (in this case, .txt) is filled in for you.

The Fields tab displays all the fields in the data stream that is flowing through the hop from the Table Input step to the Text File Output step, and this data will in turn flow into the text file you defined. You can also type fields into this tab to add them to the data stream.

9. Right-click Text File Output to display the context menu, then click Show Input Fields.

This dialog box displays all the fields in the data input stream and their transformations. In this simple example, there are no intermediate transformation steps, such as filters, but in a transformation with intermediate steps, the details display here.

10. On the context menu, click Show Output Fields.

This dialog box displays all the fields in the data output stream and their transformations. In this simple example, there are no intermediate transformation steps, such as filters, but in a transformation with intermediate steps, the details display here. Together, the Show Input Fields and Show Output Fields dialog boxes display all the detail information on every field
