In this article I am trying to brief about OLAP types and their importance.
OLAP systems have been traditionally categorized using the following taxonomy.
- MOLAP – Multidimensional
- ROLAP – Relational
- HOLAP – Hybrid
Multidimensional – MOLAP:
- MOLAP (multi-dimensional online analytical processing) is the classic form of OLAP and is sometimes referred to as just OLAP.
- MOLAP stores this data in optimized multi-dimensional array storage, rather than in a relational database.
- Some MOLAP tools require the pre-computation and storage of derived data, such as consolidations – the operation known as processing.
- Such MOLAP tools generally utilize a pre-calculated data set referred to as a data cube.
- The data cube contains all the possible answers to a given range of questions.
- As a result, they have a very fast response to queries.
- On the other hand, updating can take a long time depending on the degree of pre-computation.
- Pre-computation can also lead to what is known as data explosion.
- The multidimensional data model is an integral part of On-Line Analytical Processing, or OLAP.
- Because OLAP is on-line, it must provide answers quickly; analysts pose iterative queries during interactive sessions, not in batch jobs that run overnight.
- And because OLAP is also analytic, the queries are complex.
- The multidimensional data model is designed to solve complex queries in real time.
- The multidimensional data model is composed of logical cubes, measures, dimensions, hierarchies, levels, and attributes.
- The simplicity of the model is inherent because it defines objects that represent real-world business entities.
- Analysts know which business measures they are interested in examining, which dimensions and attributes make the data meaningful, and how the dimensions of their business are organized into levels and hierarchies.
- Other MOLAP tools, particularly those that implement the functional database model do not pre-compute derived data but make all calculations on demand other than those that were previously requested and stored in a cache.
Advantages of MOLAP
- Excellent performance: MOLAP cubes are built for fast data retrieval, and are optimal for slicing and dicing operations due to optimized storage, multidimensional indexing and caching.
- Smaller on-disk size of data compared to data stored in relational database due to compression techniques.
- Effective data extraction achieved through the pre-structuring of aggregated data i.e. all calculations have been pre-generated when the cube is created. Hence, complex calculations are not only doable, but they return quickly.
- Automated computation of higher level aggregates of the data.
- It is very compact for low dimension data sets.
- Array models provide natural indexing.
Disadvantages of MOLAP
- This is usually remedied by doing only incremental processing, i.e., processing only the data which have changed (usually new data) instead of reprocessing the entire data set.
- Limited in the amount of data it can handle: Because all calculations are performed when the cube is built, it is not possible to include a large amount of data in the cube itself. This is not to say that the data in the cube cannot be derived from a large amount of data. Indeed, this is possible. But in this case, only summary-level information will be included in the cube itself.
- Some MOLAP methodologies introduce data redundancy.
- Requires additional investment: Cube technologies are often proprietary and do not already exists in the organization. Therefore, to adopt MOLAP technology, chances are additional investments in human and capital resources are needed.
- ROLAP works directly with relational databases.
- The base data and the dimension tables are stored as relational tables and new tables are created to hold the aggregated information.
- It depends on a specialized schema design. This methodology relies on manipulating the data stored in the relational database to give the appearance of traditional OLAP’s slicing and dicing functionality.
- In essence, each action of slicing and dicing is equivalent to adding a “WHERE” clause in the SQL statement.
- ROLAP tools do not use pre-calculated data cubes but instead pose the query to the standard relational database and its tables in order to bring back the data required to answer the question.
- ROLAP tools feature the ability to ask any question because the methodology does not limit to the contents of a cube.
- ROLAP also has the ability to drill down to the lowest level of detail in the database.
Advantage of ROLAP
- Can handle large amounts of data: The data size limitation of ROLAP technology is the limitation on data size of the underlying relational database. In other words, ROLAP itself places no limitation on data amount.
- Can leverage functionalities inherent in the relational database: Often, relational database already comes with a host of functionalities. ROLAP technologies, since they sit on top of the relational database, can therefore leverage these functionalities.
Disadvantage of ROLAP
- Performance can be slow: Because each ROLAP report is essentially a SQL query (or multiple SQL queries) in the relational database, the query time can be long if the underlying data size is large.
- Limited by SQL functionalities: Because ROLAP technology mainly relies on generating SQL statements to query the relational database, and SQL statements do not fit all needs (for example, it is difficult to perform complex calculations using SQL).
- ROLAP technologies are therefore traditionally limited by what SQL can do.
- ROLAP vendors have mitigated this risk by building into the tool out-of-the-box complex functions as well as the ability to allow users to define their own functions.
Hybrid – HOLAP
- HOLAP technologies attempt to combine the advantages of MOLAP and ROLAP.
- For summary-type information, HOLAP leverages cube technology for faster performance.
- When detail information is needed, HOLAP can drill through from the cube into the underlying relational data.
- There is no clear agreement across the industry as to what constitutes “Hybrid OLAP”, except that a database will divide data between relational and specialized storage.
- Sometimes HOLAP database will use relational tables to hold the larger quantities of detailed data, and use specialized storage for at least some aspects of the smaller quantities of more-aggregate or less-detailed data.
- HOLAP addresses the shortcomings of MOLAP and ROLAP by combining the capabilities of both approaches.
- HOLAP tools can utilize both pre-calculated cubes and relational data sources.
Each type has certain benefits, although there is disagreement about the specifics of the benefits between providers.
- Some MOLAP implementations are prone to database explosion, a phenomenon causing vast amounts of storage space to be used by MOLAP databases when certain common conditions are met: high number of dimensions, pre-calculated results and sparse multidimensional data.
- MOLAP generally delivers better performance due to specialized indexing and storage optimizations. MOLAP also needs less storage space compared to ROLAP because the specialized storage typically includes compression techniques.
- ROLAP is generally more scalable. However, large volume pre-processing is difficult to implement efficiently so it is frequently skipped. ROLAP query performance can therefore suffer tremendously.
- Since ROLAP relies more on the database to perform calculations, it has more limitations in the specialized functions it can use.
- HOLAP encompasses a range of solutions that attempt to mix the best of ROLAP and MOLAP. It can generally pre-process swiftly, scale well, and offer good function support.