- A multidimensional structure is a variation of the relational model that uses multidimensional structures to organize data and express the relationships between data.
- It allows metrics to be analyzed across different dimensions such as time, geography, gender, and product.
- The structure is broken into cubes, and each cube stores and provides access to the data within its confines.
- Each cell within a multidimensional structure contains aggregated data related to elements along each of its dimensions.
- Even when data is manipulated, it remains easy to access and continues to form a compact database format.
- The data remains interrelated. Multidimensional structures are popular for analytical databases that support online analytical processing (OLAP) applications.
- Analytical databases use this structure because it delivers answers to complex business queries swiftly.
- Data can be viewed from different angles, which gives a broader perspective on a problem than other models offer.
- It has been claimed that for complex queries OLAP cubes can produce an answer in around 0.1% of the time required for the same query on OLTP relational data.
- The most important mechanism in OLAP which allows it to achieve such performance is the use of aggregations.
- Aggregations are built from the fact table by changing the granularity on specific dimensions and aggregating up data along these dimensions.
- The number of possible aggregations is determined by every possible combination of dimension granularities.
- The combination of all possible aggregations and the base data contains the answers to every query which can be answered from the data.
- Because there are usually many aggregations that can be calculated, often only a predetermined number are fully calculated; the remainder are computed on demand.
- The problem of deciding which aggregations (views) to calculate is known as the view selection problem.
- View selection can be constrained by the total size of the selected set of aggregations, the time to update them from changes in the base data, or both.
- The objective of view selection is typically to minimize the average time to answer OLAP queries, although some studies also minimize the update time.
- View selection is NP-complete.
- Many approaches to the problem have been explored, including greedy algorithms, randomized search, genetic algorithms and the A* search algorithm.
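The aggregation lattice and the view selection problem described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two dimensions, their levels, the member counts in `CARD`, and the cost model (a query costs as many rows as the smallest materialized view that can answer it) are all hypothetical, and the greedy heuristic is just one of the approaches listed, not a definitive algorithm.

```python
from itertools import product

# Hypothetical dimension hierarchies, finest level first; "ALL" means the
# dimension is aggregated away completely.
LEVELS = {
    "time": ["day", "month", "year", "ALL"],
    "geo": ["city", "country", "ALL"],
}

# Every combination of one level per dimension is a possible aggregation.
VIEWS = list(product(*LEVELS.values()))

# Assumed number of distinct members per level (illustrative numbers only).
CARD = {"day": 1095, "month": 36, "year": 3, "city": 50, "country": 5, "ALL": 1}


def size(view):
    """Estimated row count of a view: product of its level cardinalities."""
    rows = 1
    for level in view:
        rows *= CARD[level]
    return rows


def can_answer(source, target):
    """A view can answer a query if it is at least as fine on every dimension."""
    return all(
        LEVELS[d].index(s) <= LEVELS[d].index(t)
        for d, s, t in zip(LEVELS, source, target)
    )


def query_cost(target, materialized):
    """Assumed cost model: rows in the smallest materialized view that answers it."""
    return min(size(v) for v in materialized if can_answer(v, target))


def total_cost(materialized):
    """Sum of query costs over one query per possible aggregation."""
    return sum(query_cost(q, materialized) for q in VIEWS)


def greedy_select(k):
    """Add k views, each time choosing the one that lowers total cost the most."""
    chosen = [VIEWS[0]]  # the finest view stands in for the base data
    for _ in range(k):
        candidates = [v for v in VIEWS if v not in chosen]
        best = min(candidates, key=lambda v: total_cost(chosen + [v]))
        chosen.append(best)
    return chosen


selected = greedy_select(2)
print(len(VIEWS), "possible aggregations")
print("materialized:", selected)
print("total cost:", total_cost(selected), "base only:", total_cost([VIEWS[0]]))
```

Here the constraint is simply the number of extra views `k`; a size or update-time budget, as mentioned above, could replace it by changing the stopping condition in `greedy_select`.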
Online Analytical Processing (OLAP)
- OLAP is an approach to answering multi-dimensional analytical (MDA) queries quickly.
- The first attempt to define OLAP was by Dr. Codd, who proposed 12 rules for OLAP.
- The OLAP Report has proposed the FASMI test: Fast Analysis of Shared Multidimensional Information.
- OLAP is part of business intelligence, which also encompasses relational databases, report writing and data mining.
- OLAP enables users to analyze multidimensional data interactively from multiple business perspectives.
- Databases configured for OLAP use a multidimensional data model, allowing for complex analytical and ad hoc queries with a rapid execution time.
- They borrow aspects of navigational databases, hierarchical databases and relational databases.
OLAP consists of three basic analytical operations:
- Consolidation (Roll-up): aggregation of data that can be accumulated and computed in one or more dimensions. For example, all sales offices are rolled up to the sales department or sales division to anticipate sales trends.
- Drill-down: by contrast, a technique that allows users to navigate through the details. For instance, users can view the sales by individual products that make up a region’s sales.
- Slicing and Dicing: a feature whereby users can take out (slicing) a specific set of data from the OLAP cube and view (dicing) the slices from different viewpoints.
- These viewpoints are sometimes called dimensions (such as looking at the same sales by salesperson, by date, by customer, by product, or by region).
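The operations above can be illustrated with a small, self-contained Python sketch. The fact rows, dimension names, and helper functions (`aggregate`, `select`) are hypothetical and exist only for illustration. The sketch also assumes the common reading in which a slice fixes a single value on one dimension and a dice restricts several dimensions at once.

```python
from collections import defaultdict

# Hypothetical fact rows: (region, office, product, quarter, sales).
FACTS = [
    ("East", "Boston", "Widget", "Q1", 100),
    ("East", "Boston", "Gadget", "Q1", 150),
    ("East", "Albany", "Widget", "Q2", 80),
    ("West", "Seattle", "Widget", "Q1", 120),
    ("West", "Seattle", "Gadget", "Q2", 200),
]
DIMS = ("region", "office", "product", "quarter")


def aggregate(rows, keep):
    """Sum the sales measure, grouped by the dimensions named in `keep`."""
    idx = [DIMS.index(d) for d in keep]
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)


def select(rows, **wanted):
    """Keep rows whose value on each named dimension is in the given set."""
    return [
        r for r in rows
        if all(r[DIMS.index(d)] in vals for d, vals in wanted.items())
    ]


# Roll-up: offices are consolidated into their regions.
by_region = aggregate(FACTS, ["region"])

# Drill-down: break one region's total back into offices and products.
east_detail = aggregate(select(FACTS, region={"East"}), ["office", "product"])

# Slice: fix a single value on one dimension.
q1_slice = select(FACTS, quarter={"Q1"})

# Dice: restrict several dimensions, then view the result by another dimension.
diced = aggregate(
    select(FACTS, region={"East", "West"}, product={"Widget"}), ["quarter"]
)

print(by_region)
print(east_detail)
print(diced)
```

A production OLAP engine would answer these requests from precomputed aggregations rather than rescanning the fact rows each time; the rescan here keeps the sketch short.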
Overview of OLAP Systems
- At the core of any OLAP system is an OLAP cube (also called a multidimensional cube or a hypercube).
- It consists of numeric facts called measures that are categorized by dimensions.
- The measures are placed at the intersections of the hypercube, which is spanned by the dimensions as a vector space.
- The usual interface for manipulating an OLAP cube is a matrix interface, like pivot tables in a spreadsheet program, which performs projection operations along the dimensions, such as aggregation or averaging.
- The cube metadata is typically created from a star schema, snowflake schema, or fact constellation of tables in a relational database.
- Measures are derived from the records in the fact table and dimensions are derived from the dimension tables.
- Each measure can be thought of as having a set of labels, or metadata, associated with it.
- A dimension is what describes these labels; it provides information about the measure.
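As a rough sketch of how measures and dimensions relate, the Python example below builds a tiny two-dimensional cube from a star schema and lays it out as a pivot-style matrix. The table contents, column names, and the `build_cube` and `pivot` helpers are all hypothetical, chosen only to show the idea; real OLAP servers use their own storage and metadata formats.

```python
from collections import defaultdict

# Hypothetical star schema: one fact table plus two dimension tables.
DIM_PRODUCT = {
    1: {"name": "Widget", "category": "Hardware"},
    2: {"name": "Manual", "category": "Books"},
}
DIM_DATE = {
    10: {"month": "Jan", "year": 2020},
    11: {"month": "Feb", "year": 2020},
}
# Fact rows hold foreign keys into the dimension tables and a numeric measure.
FACT_SALES = [
    {"product_id": 1, "date_id": 10, "amount": 30.0},
    {"product_id": 1, "date_id": 11, "amount": 45.0},
    {"product_id": 2, "date_id": 10, "amount": 12.5},
    {"product_id": 1, "date_id": 10, "amount": 20.0},
]


def build_cube(facts):
    """Place the summed measure at each (category, month) intersection."""
    cube = defaultdict(float)
    for row in facts:
        category = DIM_PRODUCT[row["product_id"]]["category"]
        month = DIM_DATE[row["date_id"]]["month"]
        cube[(category, month)] += row["amount"]
    return dict(cube)


def pivot(cube):
    """Lay the cube out as a matrix: rows are categories, columns are months."""
    rows = sorted({c for c, _ in cube})
    cols = sorted({m for _, m in cube})
    matrix = [[cube.get((r, c), 0.0) for c in cols] for r in rows]
    return rows, cols, matrix


cube = build_cube(FACT_SALES)
rows, cols, matrix = pivot(cube)
print("columns:", cols)
for label, values in zip(rows, matrix):
    print(label, values)
```

The dimension tables supply the labels (`category`, `month`) and the fact table supplies the measure (`amount`), matching the division of roles described above.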
OLAP systems have been traditionally categorized using the following taxonomy.
- MOLAP – Multidimensional
- ROLAP – Relational
- HOLAP – Hybrid