Archive | Architecture RSS for this section

Normalizing your data model

A positive thing about Qlikview is that it is oriented to technical and business users. However, having inexperienced people creating data models will, with no doubt, create headaches in the near future.

I thought it would be useful to share with you how a good data model should look like, by normalizing the tables or data we have.

Normalization is the processes of organizing data in a database. When we say “organizing” we mean, creating proper table structures in a way that we can then stablish relationships among them. One of the main reasons for this is to avoid data redundancy.

In this article we are going to cover Read More…

Advertisements

Working with data at different levels of granularity

I’m currently working in a project for a client that reports on data with different levels of granularity. Of course this becomes a problem as the architecture becomes a bit messy. However, it becomes really interesting the way you can play with Qlikview even when you have a de-normalize schema.
I’ll try to explain what I want to achieve with an example.
Let’s suppose we have two fact tables. I will not include any dimension tables to make the example simpler, but I will still have the dimension values within the fact table.
My first fact table contains data at the day/city level. It looks like this:

Date
Continent
Country
State
City
Downloads

From this table I can know the number of downloads for a song I uploaded to my site. This is by day and City. Having the city I can then create a hierarchy up to the continent level.

My second table, on the other hand, represents the number of times my song was played online (streaming) on my website. This is by day, and country.

Date
Country
NumOfTimesPlayed

Knowing that we cannot de-aggregate this last table to the city/day level, I needed to find a way to show the number of times my song was played when the users choose continent.

Joining the tables by day is not an option as we might have data in the first table for a certain day that we don’t have on the second table and vice versa. For this example, we assume that we have the same countries in the first and second table.

Within our physical model, there’s no connection between my first and second fact table. The good thing is Qlikview allows us to create logical data models.

A Logical data is based on business requirements and doesn’t have to follow strict DB definition. To clarify, joining two tables with different levels of granularity is not correct when working on your physical model, but it can be done in a logical model.

After joining my two tables I have this:

And the data in my tables would look like this:

If I select America, both tables will filter USA and Argentina but I will see different dates. What I need is my second table to show the exact same dates and countries as my first one.

Naturally, I can’t achieve what I want with this data model, so I need to find a way to make the second table show whatever I select in my first table and for this we can use the P() function. The P() function represents all possible values and allows a field name as a parameter.

I will now change my expression in the second table from:

sum(NumOfTimesPlayed)

To:

Sum({<DatePlayed P(DateDownload)>} NumOfTimesPlayed)

My model will now respond as expected and the second table will show only those countries and dates filtered in my first table.

Conclusion

It is ideal that our model contains data with the same level of granularity, but there are many cases that this won’t happen. Qlikview allow us to easily find a work around to manage this complex, de-normalize models.

JV