Friday, June 24, 2011

DAC Reporting in OBIEE

The purpose of this post is to show how to consume the data in the DAC repository via OBIEE reports.

What is DAC?

Oracle Business Intelligence Data Warehouse Administration Console (Oracle DAC) provides a centralized console for schema management, configuration, administration, loading, and monitoring of the Oracle Business Analytics Warehouse. DAC is used to create and schedule the Informatica ETL workflows that load the warehouse, and it manages the following data warehouse load processes:
  • Dynamic generation of subject areas and execution plans
  • Dynamic settings for parallelism and load balancing
  • Intelligent task queue engine based on user-defined and computed scores
  • Index management for ETL and query performance
  • Embedded high-performance OLTP change capture techniques
  • Restart from any point of failure
DAC / ETL Statistics
The ETL metadata is stored in the DAC repository, which is composed of approximately 120 tables. These tables store metadata such as tasks, task steps, execution plan details, connection details, run history, etc.
In this exercise we are only interested in the run history table, W_ETL_DEFN_RUN. This table holds the ETL execution details for every execution plan that was run through DAC (see screenshot below).
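Outside of OBIEE, the same run history can be pulled straight from the repository with a query and summarized with a few lines of scripting. The sketch below is illustrative only: the column names in the SQL (NAME, STATUS_DESC, START_TS, END_TS) are assumptions, so check your DAC repository version for the actual W_ETL_DEFN_RUN columns.

```python
from datetime import datetime

# Hypothetical column names for W_ETL_DEFN_RUN -- verify against your
# DAC repository version before running this query.
RUN_HISTORY_SQL = """
SELECT NAME, STATUS_DESC, START_TS, END_TS
FROM   W_ETL_DEFN_RUN
ORDER  BY START_TS DESC
"""

def summarize_runs(rows):
    """Count runs per status from (name, status, start, end) tuples."""
    summary = {}
    for _name, status, _start, _end in rows:
        summary[status] = summary.get(status, 0) + 1
    return summary

# Sample rows standing in for a fetch from the DAC repository.
sample = [
    ("HR_Full_Load", "COMPLETED",
     datetime(2011, 6, 1, 2, 0), datetime(2011, 6, 1, 3, 10)),
    ("HR_Full_Load", "FAILED",
     datetime(2011, 6, 2, 2, 0), datetime(2011, 6, 2, 2, 5)),
]
print(summarize_runs(sample))  # {'COMPLETED': 1, 'FAILED': 1}
```

The same counts-by-status logic is what the report in item 3 below surfaces in OBIEE.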

We will walk you through the steps for creating a simple solution that allows you to analyze the metadata in W_ETL_DEFN_RUN in OBIEE.
RPD
Below is a simple RPD model that is based on the single W_ETL_DEFN_RUN table from the DAC repository.
Reports
Using the subject area (Statistics – ETL) shown above, you can build useful reports to analyze the execution history (for each execution plan type).
1. Current month summary view with duration in minutes & hours.
2. Snapshot of run duration for each month.
Notice the failed steps in orange below.
3. Report showing Error / Status description with number of steps – Total, Failed, and Success.
4. Comparison of long-running ETL executions across years (2010 and 2011).
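For the current-month summary (report 1), the duration in minutes and hours can be computed from each run's start and end timestamps. A minimal Python sketch, independent of the actual W_ETL_DEFN_RUN column names:

```python
from datetime import datetime

def duration_minutes(start, end):
    """Run duration in whole minutes."""
    return int((end - start).total_seconds() // 60)

def duration_hhmm(start, end):
    """Format a run duration as 'Hh Mm' for a summary report."""
    minutes = duration_minutes(start, end)
    return f"{minutes // 60}h {minutes % 60}m"

# A run that started at 2:00 AM and finished at 4:35 AM.
start = datetime(2011, 6, 1, 2, 0)
end = datetime(2011, 6, 1, 4, 35)
print(duration_minutes(start, end))  # 155
print(duration_hhmm(start, end))     # 2h 35m
```

In OBIEE this calculation would typically live in the RPD (or in a report column formula) rather than in external code, but the arithmetic is the same.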
By adding other repository tables into your RPD you will be able to develop more sophisticated reports. You can even have error reports delivered to you with iBots whenever ETL errors occur.

Friday, June 17, 2011

Informatica - Beyond ETL

Informatica has long been one of the business intelligence / data warehouse industry's leaders in data integration. Its PowerCenter and PowerExchange products have consistently been positioned in the leader quadrant for data integration by Gartner. However, Informatica is not resting on its laurels. Since 2009, Informatica has been on an acquisition spree, buying up best-of-breed companies with a similar data-centric DNA while continuing to innovate and broaden its base products. The Informatica Platform now enables you to integrate data within the traditional enterprise (Enterprise Data Integration), control data in the cloud (Cloud Data Integration), share data with your partners (B2B Data Exchange), archive data for efficiency and eDiscovery (Application ILM), and deliver data at zero latency (Ultra Messaging). To unlock the business value of all this data, the platform enables you to cleanse data until it is trustworthy (Data Quality), to detect and act on operational intelligence (Complex Event Processing), and to gain a 360-degree view of your strategic, authoritative master data assets (Master Data Management).

Informatica’s CEO stated that “only Informatica delivers the comprehensive, unified and open data integration platform to empower the data-centric enterprise.”


Clearly, Informatica has moved beyond ETL to embrace these key data-centric technologies.

Acquisitions since January 2009:
  • Applimation (Jan. 2009) – Leader in Application Information Lifecycle Management, helping customers manage their data assets from cradle to grave:
      • Data Archive for storing inactive and rarely accessed data on a lower-cost storage medium.
      • Data Privacy for masking personally identifying data to remain in compliance with today’s stringent personal-information and privacy laws.
      • Data Subset for creating purpose-built data sets for environment-based (DEV/QA/SYSTEM) or functional (CRM/ERP/FINANCE/HR) testing.
  • AddressDoctor (mid-2009) – Leader in worldwide name and address validation for over 200 countries, providing customers with the highest-quality content for name and address standardization and validation.
  • Agent Logic (Oct. 2009) – Leader in Complex Event Processing and operational intelligence, helping customers turn event-detection patterns and human analysis patterns into automated actions and response alerts.
  • Siperian (Jan. 2010) – Leader in Master Data Management, helping customers achieve a single view of ‘X’ across all mission-critical data domains.

Wednesday, June 8, 2011

Data Modeling: Schema Generation Issue with ERwin Data Modeler 7.3

We are using Computer Associates' ERwin Data Modeler 7.3 for data modeling. In one of our engagements, we are pushing data model changes to the physical database (Oracle 11g R2) via ERwin's Forward Engineering / Schema Generation functionality.



The Forward Engineering / Schema Generation wizard offers a "Physical Order" option under "Column". Checking this box is supposed to force the ERwin DDL generation engine to preserve the column order (i.e., the column order in the physical database should match the column order in the logical/physical model).

Problem: Even though our data modelers pay very close attention to the column order in the logical and physical model in ERwin, the Forward Engineering / Schema Generation process loses the physical column order when generating the DDL, even with the "Physical Order" option checked. Thus, the column order in the Oracle database never matches the column order in the ERwin data model.

Solution / Workaround: This appears to be a bug: the behavior of the option is inverted. If you want to preserve the column order, "Physical Order" must be un-checked.

Friday, June 3, 2011

Upgrading to Informatica PowerCenter 9.1 from Version 8.6

We have completed the upgrade from Informatica PowerCenter 8.6 to PowerCenter 9.1. The Informatica Upgrade Guide for PowerCenter 8.5.x and 8.6.x describes an in-place upgrade of an existing hardware and software environment. However, since we wanted to upgrade our hardware at the same time, our plan was to set up a brand-new Informatica PowerCenter 9.1 environment and import the code from our legacy PowerCenter 8.6 server.

Installing Informatica PowerCenter 9.1 on a Windows 64-bit server was straightforward, but we ran into the following two issues:
  1. There seems to be a bug in the Pre-Installation System Check Tool (i9Pi), which verifies that the server meets all installation requirements. The tool kept raising the error “Make sure that open cursors are at least set to 1000” for the repository connection to our Oracle 11g database. Even after our DBA had set this value to 1500 and restarted the Oracle 11g database service, the tool continued to raise the error. This seems to be a bug in i9Pi, because the installation completed successfully.
  2. If you are planning on running Informatica PowerCenter 9.1 on a Windows platform, please be aware that Informatica has released an Advisory for Windows 32-bit Server and 64-bit Server Platforms. Informatica has identified problematic behavior of the PowerCenter Server, which a user would observe when monitoring multiple concurrent tasks from PowerCenter Workflow Monitor. This behavior is tracked as CR 252994 within Informatica’s bug tracking system. Informatica has released an emergency fix that can be downloaded at the following location:
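As an aside on issue 1, the i9Pi check boils down to comparing Oracle's open_cursors parameter against a minimum of 1000. A DBA can confirm the value directly with a query against v$parameter; the helper below is a small sketch of the same comparison (the function name is ours, not part of any Informatica or Oracle tooling):

```python
# Query a DBA can run to confirm the setting that i9Pi checks; the
# threshold below mirrors the error message's minimum of 1000.
OPEN_CURSORS_SQL = "SELECT value FROM v$parameter WHERE name = 'open_cursors'"

def meets_minimum(open_cursors_value, minimum=1000):
    """True when the reported open_cursors value satisfies the installer's check."""
    return int(open_cursors_value) >= minimum

print(meets_minimum("1500"))  # True (our DBA's setting)
print(meets_minimum("300"))   # False
```

In our case the value was verifiably above the minimum, which is why we concluded the persistent error was a bug in i9Pi itself.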
After installing Informatica PowerCenter 9.1 and starting the service, log into the Administration console (http://server-name:6008) and create the Repository Service and the Integration Service. Once done, install the Informatica PowerCenter 9.1 client on a workstation and re-create all connection information in Workflow Manager. This is a critical task because dependent objects cannot be validated during import unless all corresponding source and target connections are defined.

We then exported all Informatica content from the legacy 8.6 environment into an XML file and imported it into the PowerCenter 9.1 environment with Workflow Manager (Repository ==> Import Objects). After this, we were able to run the imported code in version 9.1 without problems.
Please let us know about your experiences with upgrading to PowerCenter 9.1.