Friday, September 18, 2009

Value Add: You just need three letters: WHY?

As “experts” in the field of business intelligence, we are called upon day in and day out to help customers find ways to use the data they capture to solve problems, find opportunities, and add value to their companies. However, too many times in the process of understanding the customer, the approach taken is to figure out what they do now and make it better. While finding more efficient ways of doing business does add value to the customer, it’s only the tip of the iceberg. By incorporating three little letters, W-H-Y, into the process of understanding our customer, we can often find numerous other ways to help them be successful.

The Pitfalls of Traditional Requirements Gathering:

Traditional requirements gathering often takes the approach of examining what the customer is currently doing. For example:

• What types of reports do you currently run?
• What type of data do you currently use?
• What type of analysis do you currently do?

While these are all relevant questions to ask as part of the process of understanding our customer’s line of business, too many times this is where the discovery process ends. Often we leave out the word “WHY” when we ask the questions. For example, when asking the question, “What types of reports do you currently run?”, the real value in the question comes when you ask why they need to run these reports. Not only does it give you, as the advisor, better insight into what things are important to their business, it also opens doors to other avenues of data or other applications of existing data that may be of use to the customer. Likewise, the more you can get at the root of why the data is important to the customer, the more you can come to understand their pain points, and in doing so, become less prone to be seen as a window dresser and more as a problem solver. As simple as it seems, on more than one occasion I’ve had the business area say, ‘it’s really refreshing to feel like someone is taking the opportunity to truly understand our business and help us figure out ways to make it better.’ The amazing thing is that to this point, all you’ve really done is ask questions and listen to what they had to say, and you’ve established yourself in a position of trust.

Helping the end user community determine how to make it better:

It doesn’t take a variety of independent studies to conclude that including your customers in the solution development process will likely result in a much higher adoption rate. Now, there is a time and place for everything. For example, bringing an end user into a meeting to discuss hardware specifications and communications protocols will often leave the end user feeling lost and confused. This type of session tends to leave them feeling like they have nothing to bring to the table, and it is typically a waste of their time.

However, bringing an end user into a meeting to discuss report layout and navigation will be time well spent. First of all, they probably didn’t have much of a say in how they currently do things, and giving them a say in how the new system will work is very empowering to them. Not only will the end user feel like they have had a say in the process, but it should also expedite time to market by reducing back and forth between developers and end users at user acceptance testing. Likewise, the process of coming up with those specifications will often lead to other discussions that will result in finding other useful ways to unlock the power of the data they have at their disposal.

Result = BI app that was built by and for the end user

At the end of the day, the end user will be the one that has to live with the system that is built. While we as advisors often take a fair amount of pride in the solutions we craft, 9 times out of 10 we get to walk away from a customer to move onto something new. The customer, though, is left with something that they will work with day in and day out. The relationships we build with our customers are the foundation for future opportunities, and nothing serves us better than to walk away from a project where the end user is excited about what they’ve helped create versus something that’s been dropped in their lap. The more we take the time to realize this, the better we set ourselves up to work with the customer in the future.

In summary, it’s not enough as an advisor to survey what’s already in place for a customer and figure out a way to make it look better or run faster. The real value we add is when we help them uncover unique new ways of harnessing the power of their data. Taking the time to understand the customer’s business and how they go about managing the success of that business goes a long way in helping us make those types of discoveries. Taking that information and involving the end users in the development of the solution helps cement the value of the work that is being done. It all results in a BI application that is built by and for the customer. Let’s face it, that’s the end game we should all be playing for. If all you wind up doing is making the same reports run faster or with a different wrapper, at the end of the project your customers may be the ones saying WHY???

Wednesday, September 2, 2009

Cleansing Essential Data, In a Hurry

Recently, a Guident team completed a data cleansing effort that helped a large federal agency centralize data for a new Business Intelligence system. The team took data from dozens of agency sources and put it through a systematic process of cleansing and matching to produce a consistent data set of the millions of companies that interact with the agency daily.

The Problem

The new BI system for Guident’s agency client was designed to show different activities for each company the various agency departments deal with, including inspections, permits, adverse events, infractions, etc., all of which could come from dozens of different sources, in varying formats. All of this data was centralized and shown to personnel from all the different departments, allowing them to coordinate their activities, and to assign resources much more efficiently, since there was a limited number of agency personnel and millions of companies.

In order for the system to work, the name and location of each company needed to be consistent throughout the agency, and each activity, regardless of which department it originated from, would need to be properly associated to the correct cleansed company information.

Oh, and of course, time was limited because the new BI system was to go into User Acceptance Testing in 3 months for a scheduled spring launch.



Dirty Data

You’ve seen the situation before: different departments of an organization capture the same information in different ways. Some departments allow for free form data entry, which means the name of a company may be written as “IBM” or “I.B.M.” or “International Business Machines” or even “Int’l Biz Mach” – and those are the versions without misspellings. Some departments allow selection from dropdown lists, but those lists may not be complete enough, forcing personnel to make judgments about just which location to select. Other departments use codes to stand for the real data – but the hidden data may not have been updated in a while.

After checking the many source databases, team analysts found themselves asking: just how many different versions of the word “Corporation” are there?

Cleansing and Matching

Company information was gathered primarily through each activity record. For example: an inspection had the name and address of the company facility inspected, primary contact information, the date of the inspection, the results of the inspection, any associated infraction information, the inspector’s information, etc. The team’s plan involved a two phase cleansing and matching method:

  1. All company information from new activities was first matched against an existing table of unique company names, TB_CENTRAL.
      1. To determine a match, the company facility name and address information was “standardized” – each word went through a function that converted it into a standard format. For example: if four separate records had “Ave.”, “avenue”, “Av”, and “AVN”, they would all be standardized as “AVE”. The full standardized company information would then be compared to the existing standardized company information already existing in TB_CENTRAL.
      2. If a match was found, then that new activity would be associated with the existing matched company.
      3. If no match was found, then the company information for the new activity was added to TB_CENTRAL, and the new TB_CENTRAL record received a code of AWAITING_CLEANSING.
  2. All records in TB_CENTRAL with the code of AWAITING_CLEANSING were sent to an outside data matching and cleansing service (such as Dun & Bradstreet or Lexis Nexis), which would return the records with the best match they could find.
      1. Each returned company facility record went through steps 1.1 and 1.2 again.
      2. Companies that still found no match in steps 1.1 and 1.2 were added as new cleansed records into TB_CENTRAL.



Figure 1: Centralized company information cleansing process
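
The first phase of the process above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the team's actual code (the project used Oracle SQL procedures and Informatica): the abbreviation map, the function names, and the dictionary standing in for TB_CENTRAL are all hypothetical.

```python
import re

# Hypothetical abbreviation map; the real standardization rules are not
# described in detail, so these entries are illustrative only.
STANDARD_TOKENS = {
    "AVE": "AVE", "AVENUE": "AVE", "AV": "AVE", "AVN": "AVE",
    "CORPORATION": "CORP", "CORP": "CORP",
}

def standardize(text):
    """Upper-case the text, strip punctuation, and map each word to a standard form."""
    words = re.sub(r"[^A-Za-z0-9\s]", "", text).upper().split()
    return " ".join(STANDARD_TOKENS.get(word, word) for word in words)

def match_activity(name, address, tb_central):
    """Phase 1: associate an activity with a company, adding the company if it is new.

    tb_central is a plain dict standing in for the TB_CENTRAL table, keyed by
    the standardized (name, address) pair.
    """
    key = (standardize(name), standardize(address))
    if key not in tb_central:
        # No match: add the company and flag it for the outside cleansing service.
        tb_central[key] = {"status": "AWAITING_CLEANSING", "activities": 0}
    tb_central[key]["activities"] += 1
    return key
```

With this sketch, “Acme Corporation, 12 Main Ave.” and “ACME Corp, 12 main avenue” standardize to the same key, so the second activity attaches to the company record created by the first instead of adding a duplicate.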

Great Results…and Lingering Issues

This method of data cleansing was very effective, producing a huge company cleansed data center that handled an initial batch of millions of records, and continued to handle hundreds of new activities on a daily basis. This new data center was so effective that owners of systems other than the BI system want to use it as their official source of company data.

There were, however, two important issues to deal with. First, even with the high record of matches, there were still activities whose company info was just not going to be cleansed. The team had decided that any important missing address information (missing state or zip code) would disqualify that record from the matching process. But the activities that matched those dirty company records were still important to the various departments of the federal agency. The team and the agency client decided that the best way to handle this data quality issue was to deliver report details and statistics to the departmental source owners for handling.
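
The disqualification rule described above could look something like the following Python sketch. The field names (`state`, `zip_code`, `source`) and both function names are assumptions made for illustration; the post does not describe how the actual reports were produced.

```python
from collections import Counter

# Hypothetical field names for the address parts the team treated as required.
REQUIRED_FIELDS = ("state", "zip_code")

def split_for_matching(records):
    """Separate records eligible for matching from those missing required fields."""
    eligible, rejected = [], []
    for rec in records:
        missing = [f for f in REQUIRED_FIELDS if not (rec.get(f) or "").strip()]
        if missing:
            # Keep the record, noting which fields disqualified it.
            rejected.append({**rec, "missing": missing})
        else:
            eligible.append(rec)
    return eligible, rejected

def rejects_by_source(rejected):
    """Count rejected records per source department for a data quality report."""
    return Counter(rec["source"] for rec in rejected)
```

Keeping the rejected records, rather than dropping them, is what makes it possible to hand each departmental source owner the details and counts for their own data.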

The second issue was that from a testing point of view, data had changed. For example, previously there may have been a list of 3 company facilities, all of which had different spellings of the same address (“Saint Paul, MN”, “St. Paul, MN”, “Seint Poll, MN”), but with any close scrutiny, anyone would conclude that they were the same address. If record 1 had 5 activities, record 2 had 20 activities, and record 3 had 1 activity, under the new company cleansed data center there would be just one company record with 26 associated activities. This was expected, but it also introduced the need to carefully train the client personnel during User Acceptance Testing to be aware of changes in certain expected results.
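
The roll-up in that example can be made concrete with a short Python sketch. The hand-written `CLEANSED_ADDRESS` mapping is an assumption that stands in for whatever the cleansing and matching process returned; the function name is hypothetical too.

```python
from collections import defaultdict

# Hand-written stand-in for the cleansing result: each dirty spelling maps
# to the single cleansed address it was matched to.
CLEANSED_ADDRESS = {
    "Saint Paul, MN": "SAINT PAUL, MN",
    "St. Paul, MN": "SAINT PAUL, MN",
    "Seint Poll, MN": "SAINT PAUL, MN",
}

def merge_activity_counts(records):
    """Sum activity counts for records that resolve to the same cleansed address."""
    totals = defaultdict(int)
    for address, activities in records:
        totals[CLEANSED_ADDRESS[address]] += activities
    return dict(totals)

before = [("Saint Paul, MN", 5), ("St. Paul, MN", 20), ("Seint Poll, MN", 1)]
after = merge_activity_counts(before)
# after == {"SAINT PAUL, MN": 26}
```

A tester who remembers a facility with 20 activities now sees one with 26, which is exactly the kind of expected change users needed to be told about in advance.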

But having to explain to your user community why their new system really is better is a good problem.

Tools Involved in this Project
  • Oracle Business Intelligence – agency BI system, reporting.
  • Informatica – ETL from agency department data sources into the company cleansed data center, and data analysis.
  • Oracle 10g – databases, SQL procedures and functions.
  • Erwin – data modeling of the company cleansed data center.


Best Practices for Managing Successful BI Deployments/Implementations

Successful Business Intelligence (BI) deployments demand a sound and flexible implementation methodology. The methodology should blend traditional system development lifecycle techniques with newer and more iterative processes to bring immediate value to customers. Guident's Business Intelligence Guide (BIG) is a proven framework that has stood up hundreds of successful implementations at government and commercial sector clients. The BIG consists of iterative Initiate, Analyze, Design, Develop and Implement phases, braced with the guidance and support of Project and Change Management. Key differentiators of the BIG include allowing requirements to evolve and change, providing quick 120-day wins to demonstrate a working prototype, emphasizing the quality and clarity of metadata construction, and instilling flexible project management principles to deliver the solution.

The BIG has evolved by leveraging best practices and lessons learned from past client successes. Having the right mix of processes, technologies, and expertise is not enough to ensure a successful implementation. They all must come together under a solid methodology that clearly defines each phase, the specific tasks, deliverables, templates, and guidelines to provide actionable information across enterprise touchpoints.

Learn more about Guident's BIG and its application at federal and commercial sector clients in this presentation. It was presented at several key BI events including:
  • Oracle COLLABORATE Conference ('08)
  • Business Objects Insight Conference ('06,'07)
  • Oracle BIWA Summit ('07)
  • Oracle OpenWorld ('08)
  • DC Area Business Objects / Crystal User Group ('08).

Crystal Reports 2008 for Web Intelligence XI Users

Crystal Reports 2008 is a powerful feature-rich business intelligence and reporting tool. It allows users to transform data into interactive and actionable information through canned or operational reports. Crystal Reports are flexible enough to access multiple data sources and can integrate with legacy web and Windows applications. The tool is available in three editions to support the demands of the enterprise: a web-based edition for medium-sized organizations that can deliver reports via the web, email, and MS Office format; a desktop installation for dedicated report designers and power users with a drag-and-drop environment; and a version bundled with Xcelsius Engage that allows users to create highly-formatted reports and interactive dashboards.

Crystal Reports 2008 differentiates itself from Business Objects Web Intelligence in that it has built-in wizards that provide step-by-step instructions for report creation, Crystal Reports can directly access multiple data sources, and development can be conducted offline. In addition, Crystal Reports provide a number of out-of-the-box features that allow for more robust reporting. Report designers can export reports in multiple formats including PDF, Excel and Word, create dynamic prompts that allow users to choose report criteria and values, create trigger notifications that alert users when specified conditions are met, and display a wide variety of charts and graphs. Learn more about Crystal Reports 2008 in this presentation to the DC Area Business Objects / Crystal User Group (’08).

Business Objects XI 3.0: Getting Acquainted

Business Objects XI (BOXI) is an enterprise information management, query, reporting and analysis tool. Release 3.0 on the XI platform comes with new out-of-the-box features and enhancements that improve the capabilities of previous releases. Key new features include the ability to track changes between report refreshes, Web Intelligence support for stored procedures in universes, migration enhancements when porting reports from Desktop Intelligence to Web Intelligence, and new BI Widgets that provide personalized information on users’ desktops.

User demands for a more robust Web Intelligence tool have also been addressed with a new rich client. The Web Intelligence Rich Client now allows users to develop and format reports, drill down and slice and dice information online and off. It bridges the functionality gap between Desktop Intelligence and Web Intelligence while improving overall performance. In addition, 3.0 offers a much improved Central Management Console (CMC) for administration. BOXI administrators can more easily navigate through CMC’s menu structure with its enhanced interface, and perform more streamlined tasks through new modal dialog windows. Get better acquainted with Business Objects XI 3.0 with this presentation to the DC Area Business Objects / Crystal User Group (’08).

OBIEE – Custom vs. Pre-Packaged

Oracle Business Intelligence (OBIEE) is a suite of powerful and highly integrated business intelligence tools for enterprise-wide reporting and analytics. Prebuilt components include:

  • Answers - Oracle BI’s ad-hoc query and analysis tool that allows users to create reports, charts and pivot tables
  • Dashboards – interactive homepages with personalized reports, charts and graphics for enhanced decision-making
  • Delivers – a monitoring tool that promptly alerts users of specified business activity via Dashboards

In addition, Oracle offers add-on applications and modules to extend OBIEE’s capabilities to key organizational departments. Oracle BI Analytical Applications include Sales, Service & Contact Center, Marketing, Order Management & Fulfillment, Supply Chain, Financials, and Human Resources.


Deploying the OBIEE suite and applications requires a prudent strategy with the organization’s business intelligence goals in mind. Three deployment strategies include:

  1. Customizing the OBIEE suite as a standalone product (no additional applications installed)
  2. Implementing the full suite of packaged applications
  3. A hybrid approach with custom development and configuration of pre-built components.

Each deployment path has its pros and cons. Customization should be chosen when business requirements are unique and add-on applications cannot support these needs. This creates a solution that is fine-tuned to the needs of the business but also requires the most time to develop. On the other hand, utilizing the pre-built OBIEE applications and components typically has a shorter turnaround and also leverages out-of-the-box metadata and data warehousing content. This strategy works best when the organization is a COTS shop using products from Siebel, PeopleSoft, or SAP since only the configuration of OBIEE is required. Finally, the hybrid strategy allows organizations to address standard and unique business requirements by blending the customization and configuration of the OBIEE suite and applications.


In order to fully realize the investment of an OBIEE implementation, organizations should weigh their business requirements against the suite’s out-of-the-box capabilities. For more tips on whether a custom or pre-built OBIEE solution is right for your organization, see this Guident presentation conducted at Oracle OpenWorld (’08).