Data Modeling in Qlikview
As you probably may have experienced, QlikView performs better when the data source is a perfect star schema. However, many times, a star schema is not feasible or preferred given the requirements of the BI application, the amount of time budgeted for development, or the resources of the team tasked with building an app. There are often challenges in dealing with dimensions that have multiple granularity levels and large data volumes or multiple fact tables that have different dimensions and/or multiple relationship levels.
Below are a few tips for getting the most out of your QlikView data model.
When the source data are from tables that have more than one common column, QlikView will end up adding synthetic keys to tables and create a more complicated data model when it translates relational data into associative data. Common columns here are meant to be those columns having identical column names.
The scenario below looks at a simplified case where 1 fact table and 4 dimension tables from the source are in a relational database. The relationships are mapped out nicely in diagram 1.
However, importing the data as is into Qlikview yields the data model shown in Diagram 2. As you can see, not only it is not a star schema any more, but also there are multiple synthetic keys that are going to negatively impact the QlikView application front end features and performance.
Diagram 2. Qlikview data model (after a star schema data source is loaded as is)
The solution in this case can be concatenating the common keys to produce an unique key:
1) Concatenate SalesPersonID and TerritoryID to resolve the relationship between Sales fact and SalesPerson dimension.
2) Concatenate CustomerID and TerritoryID to resolve the relationship between Sales fact and Customer dimension.
As a result, QlikView data model is now a simple star schema as shown in diagram 3.
When source data from a relational database is modeled as a start schema, it is not necessarily that
a star schema is the right data model for QlikView applications.
The simple star schema in the scenario above is rarely implemented in real QlikView applications because most of the BI applications have to deal with more than one fact table in the same application, horizontal hierarchy, or more than one level of relationships for associated dimensions.
QlikView can work with data sources from different databases. When the source data are brought in from different databases each modeled with a star schema, star schema is not necessarily the data model feasible or best suited for the QlikView application. When there are more than one fact table each with multiple dimensions, the best option is to consolidate all facts into one table if possible, which often resulted in snow flake instead of star schema. If it is not possible to consolidate all facts needed into one fact table, a link table may have to be used, which will often resulted in a modified star schema or snow flake data model.
A simple star schema will usually result in the best performance for Qlikview applications. However, when denormalizing dimensions in order to achieve star schema resulted in millions of repeated data in the dimension table(s), the impact of that on the performance may overweight the benefits of a star schema. Then, it is time to consider snow flake instead. If multiple fact tables can be consolidated into one fact table, even if it resulted in a snow flake, usually it is better than a link table that is a modified star schema unless the link table is very simple, in which case, a link table is probably not necessary to start with.