September 29, 2010, 1:21 PM — by Mike Biere - Many BI tools discussions center upon how best to determine optimal technology for a particular application or set of end users. A common, and often vigorous one, is whether an OLAP solution is appropriate or would a ROLAP (Relational OLAP) option make more sense.
There are some basic rules of thumb that can be applied to assist in this determination. But first it helps to look at the greater picture of OLAP compared to ROLAP. The 'traditional' OLAP engine follows a simple paradigm: connect to the data source(s), extract the appropriate data, run through a builder process to create a cube, provide access to the cube. The beauty of a pure OLAP solution is the resulting cube provides extremely fast access and response time (drill, slice, dice, etc.). There is even greater benefit in that the 'math' the end users might have to apply (sums, calculations, aggregations, etc.) is already contained in the cube. Assuming the calculations are accurate, this frees users or potential users from the onus of having to learn much about a particular query environment. Conversely, OLAP solutions suffer from having to constantly rebuild or add on to the existing cube(s) to remain current. Most cubes contain far more data than the users ever look at.
ROLAP solutions provide a means to build cubes "on-the-fly" with smaller result sets that contain only the data the end users have requested. They contain more current information. They will typically not be as blindingly fast in response as pure OLAP, but this may not be a requirement or inhibitor. Many modern database engines now offer a ROLAP capability. The caution here is that the math (calculations etc.) in a ROLAP solution may not be as robust as the pure OLAP approach.
A ROLAP solution has significant advantages in processing compared to traditional queries if it provides a caching mechanism. For example, if numerous people share the same data and tend to execute similar queries, a ROLAP solution could cache the results (after the first person executes it) and make them available for subsequent users. Once built, a 'virtual cube' is fairly fast. The cache will prevent additional queries from going back to the source database if it contains the appropriate data.
So how do you choose one versus the other? Here are some simple evaluation points for these two unique solutions as well as which one is more appropriate or has an advantage.
- How 'fresh' does the data have to be? - ROLAP if up to date data required.
- How big is our batch window for cube building? - ROLAP if the window is too narrow.
- Do we need persistent connectivity to the database/cube or do our end users require “cubes to go”? - many OLAP solutions provide cubes to go.
- Can the ROLAP solution handle the math? - OLAP typically has a better calc engine.
- Do we need to write back capability? - toss up, find one that works to your satisfaction ... some solutions do not provide write back.
- How much data in any cube do we really look at? - ROLAP if there is too much data left untouched.
This Tip is based on the book, 'The New Era of Enterprise Business Intelligence: Using Analytics to Achieve a Global Competitive Advantage' by Mike Biere; www.ibmpressbooks.com; Aug. 2010, ISBN 0137075421, Copyright 2011 by International Business Machines Corp.