Skip to main content
cancel
Showing results for 
Search instead for 
Did you mean: 

Get Fabric Certified for FREE during Fabric Data Days. Don't miss your chance! Learn more

tharunkumarRTK

Overcome Initial Full Load Failures in Power BI Incremental Refresh with Bootstrapped Load (1/2)

Building an insightful Power BI dashboard for large volumes of transactional data often requires balancing performance and efficiency. Loading the complete dataset into the semantic model each time new records are added to the data warehouse can be time-consuming and resource-intensive.

 

In most transactional systems, historical data remains unchanged, meaning there is little value in reloading it repeatedly. To address this, Power BI provides an Incremental Refresh policy, which allows only recent data to be refreshed while preserving the historical portion. You can learn more about this feature in the official Microsoft documentation.

 

Once the policy is configured and the model is published to the Power BI Service, the first refresh performs a full data load. Subsequent refreshes are incremental, appending new data to the existing historical data. While this approach works well in most cases, certain scenarios make the initial full load difficult or leads to failure.

 

Scenarios That Cause Initial Full Load Failures
Power BI Refresh Time Limits Power BI imposes refresh time limits. In Pro workspaces, the limit is two hours, while Premium capacities allow up to five hours. If your database cannot return the entire dataset within this window, the refresh will fail with a timeout error. Reference: Power BI Refresh Time Limits

 

Power BI Command Memory Limit Power BI enforces a maximum amount of memory that a semantic model can use during refresh. If the refresh process exceeds this limit, it will fail. Reference: Command Memory Limit

 

Power BI Command Timeout Limit Each Power Query (M) expression has a 10-minute command timeout limit. If the underlying data source cannot return data within this period, the refresh will fail. Reference: Power Query Timeouts

 

Database Workload Management (WLM) Rules Database administrators often enforce workload management policies to balance resource usage across teams. Two such common rules that affect Power BI incremental refresh are:

1. Concurrent Query Limit: If Power BI sends more queries than the allowed threshold, the database may terminate them automatically.

2. Query Scan Volume Limit: This rule restricts the amount of data a single query or user can scan at a time. It often affects refresh operations when the RollingWindowGranularity in the incremental refresh policy is large.

During the initial full load, Power BI sends multiple queries to the data source. Some of these may scan a significant amount of data, triggering one or more of the limits above and causing the load to fail.

Modifying these limits may not be possible for model developers. This raises an important question — can we load all data incrementally instead of performing one large initial load? The answer is yes, using a technique called Bootstrapped Initial Refresh

What is Bootstrapped Initial Refresh?
Bootstrapped Initial Refresh allows you to create all necessary partitions in your semantic model without performing a full data load. Instead, you load an empty dataset during the initial refresh, avoiding timeout and memory issues.

Let’s look at how this can be implemented.

 

Step 1: Prepare the Semantic Model
In this example, we connect to a SQL Server database containing a table of sales transactions.

Base Query

SELECT [profit_margin],
[quantity_sold],
[customer_name],
[customer_email],
[purchase_date],
[payment_method],
[shipping_address],
[product_name],
[product_category],
[product_price]
FROM [dbo].[sales_rawtransactions]


We then create two parameters, RangeStart and RangeEnd, required for the incremental refresh policy, and add them to the WHERE clause.

WHERE [purchase_date] >= 'RangeStart'
AND [purchase_date] < 'RangeEnd'


Step 2: Bootstrap the Table
To bootstrap the table, we introduce a filter condition that always evaluates to false, such as 1 = 2. This ensures the query returns no rows during the initial load.

WHERE 1 = 2
AND [purchase_date] >= 'RangeStart'
AND [purchase_date] < 'RangeEnd'


After applying this condition, the table becomes empty.

tharunkumarRTK_1-1760954936537.png


You can now load it into Power BI and configure the incremental refresh policy.

 

tharunkumarRTK_2-1760954963394.png
When the model is published to the Power BI Service and the initial refresh is triggered, it completes quickly because no rows are returned.

tharunkumarRTK_3-1760955001827.png


However, Power BI still creates all necessary partitions based on the defined refresh policy.

 

tharunkumarRTK_6-1760955221747.png

 


This approach ensures the model structure is created without consuming significant resources or triggering the limits mentioned earlier.

 

Step 3: Remove the Filter Using the XMLA Endpoint
Once the initial load completes successfully, the next step is to remove the false condition (1 = 2). You cannot do this from Power BI Desktop because republishing the model would overwrite it and re-trigger the full load. Instead, you can use SQL Server Management Studio (SSMS) and connect to the XMLA Endpoint of the workspace.

In SSMS, choose Analysis Services as the connection type and authenticate using Microsoft Entra credentials.

tharunkumarRTK_7-1760955263604.png
2. In Object Explorer, right-click your semantic model and select:

tharunkumarRTK_8-1760955287602.png

3. Locate the M expression containing your query, remove the 1=2 condition,

tharunkumarRTK_9-1760955312960.png


and execute the script.

tharunkumarRTK_10-1760955359710.png

 



This updates the dataset definition without replacing the model.

Step 4: Load Data into All Partitions
Since the initial load produced an empty table, none of the partitions contain data. Triggering a standard refresh now will only refresh the most recent partition, as defined by the incremental policy. To populate the historical partitions, you must refresh them in batches.

Several options are available for this:

Refreshing all partitions simultaneously may again exceed resource limits. Therefore, batching refresh operations is essential.

 

In my next blog I will explain how you can automate the process of refreshing all partitions sequentially using a python script 

Comments

this is so helpful