Power BI is a business intelligence tool that combines data visualization and business analytics to make data-driven decisions. Hybrid Tables is a Premium-only feature that strikes a correct balance between query performance and data freshness. Chances are that you have been musing about: What exactly is Hybrid Tables and why do you need it? Well, before answering this question in my blog today, allow me to explain a concept in Power BI called Storage Mode.
When you start building a report, you get the following three storage modes:
- Import mode (commonly used, fast and full featured but caches a lot of memory)
- DirectQuery (uses no memory for storage but it showers and limits many reporting features)
- Dual (data is imported into cache memory and can be served directly from the data source at the query time)
The tables in the data model can either be stored in Import storage mode or in DirectQuery mode. Although the import mode provides the optimum performance, data freshness can be a problem if new data arrives at a very rapid rate. Importing data into a dataset in extremely short intervals is both resource-consuming and inefficient. On the flipside, DirectQuery offers data freshness at the risk of report performance.
Recently, Microsoft announced a new feature: public preview of Hybrid Tables in Power BI Premium. In Hybrid Tables, you can mix Direct Query and Imported data on a table level. By dynamically integrating in-memory caches of historical data with real-time data requested in DirectQuery mode, Hybrid Tables will considerably increase performance. This implies that data changes in the data warehouse can now be represented in report visualizations without the need to refresh the dataset, resulting in lightning-fast performance across heavy data. When used in conjunction with Power BI’s incremental refresh capability, Hybrid Tables open enormous datasets for interactive analysis.
In the import mode, incremental refresh uses an advanced partitioning method to expedite refreshes compared to importing the whole data volume into a single table partition each time. Now with this advanced updated feature, you can quickly add a DirectQuery partition to receive the most up-to-date data in real time using Hybrid Tables.
So how to configure an incremental refresh policy in real time with Power BI Desktop?
1) Load the fact and dimension tables from SQL Database through Import mode. You can use Hybrid Tables only for Import mode and add “real-time” partition on top using DirectQuery for the latest data.
After loading the table, go to the edit query to set the parameters.
2) You have to configure the parameters in the query editor to filter the table based on date. Click on the parameter option to set up incremental refresh based on Hybrid Tables by creating RangeStart and RangeEnd parameter.
RangeStart-
RangeEnd-
3) Now you have to set up the custom filter, which is after RangeStart time and before RangeEnd time, within this time Power BI will import the data. The data after the RangeEnd time will be picked by DirectQuery by Incremental refresh based on Hybrid Tables.
4) Now you have to set up Incremental Refresh for your fact table-
In this window, you can archive data from 11 years and data from 2 years scheduled for incremental refresh. Now to set up Hybrid Tables you need to enable the ‘get data latest in real time with Direct Query’ option.
Note that Hybrid Tables is a Power BI Premium feature. You can only upload this dataset in Premium workspace only.
5) After setting incremental policy, you need to publish report to Power BI service with the premium account.
6) Now you can see your published report and dataset in the Power BI services workspace. (Name-Hybrid Table)
7) So now to get the incremental data, you need to set the schedule refresh by going to the dataset setting >> Schedule Refresh and Performance Optimization>>Enable Daily Refresh Frequency to daily (depends on your requirement)
8) After updating the settings, you need to refresh the report to see the real-time changes that have been done in database. So go to the dataset, click on the manual refresh icon (optional) and refresh the dataset. It may take a while for table partitions to be created and loaded to historical data.
9) After completing the refresh, go back to the report where you can see that you have loaded 11 years of historical data and 2 years of incremental data in the report with the help of this advanced incremental refresh feature.
So, what are the benefits of Hybrid Tables?
Let us see how Hybrid Tables are beneficial for businesses-
1) You can optimize resource use, mainly for massive datasets. Few data refresh cycles are required to keep up the most recent changes with new real time update data feature.
2) With Hybrid tables, you can improve report performance and user experience with more available premium capacity resources.
3) Data imported during data refresh is immediately available in local memory, so the import-mode partitions provide extraordinarily rapid query performance.
4) Using Hybrid Tables, you can quickly add a DirectQuery partition to access the most up-to-date data in real time.
Some considerations before choosing Hybrid Tables:
There are certain prerequisites that you should keep in mind before choosing Hybrid Tables-
1) To work with Hybrid Tables, you need one of the P/EM/A SKUs, or PPU (Premium Per User) subscriptions.
2) You cannot republish the same dataset from Power BI desktop after it’s been published to Power BI service since that would delete existing partitions and data.
3) You can’t download the PBIX containing the dataset back to Power BI desktop after publishing it to Power BI service.
I hope with the help of my blog you will be able to configure the incremental refresh policy in your Power BI desktop in real-time. If you are fascinated with big data, do explore Nitor Infotech’s big data services to drive business growth. Reach out to us at Nitor Infotech if you have any suggestions, queries and comments related to this topic.