Power BI Premium becomes a world-class capacity-based Analytics SaaS product powering mission-critical, enterprise-grade analytics solutions
The new platform assures reliable, unmatched support for large-scale analytics, simple low-overhead administration, and introduces “Autoscale” — an optional add-on providing automatic temporary upscaling to address the ever-dynamic demand for computing power. Now more than ever before, Power BI Premium capacities enable unmatched analytics solutions delivery on an enterprise-scale, both efficient and cost-effective. The new platform serves as the bedrock on top of which new and exciting Power BI capabilities will be built from hereon, and using this platform will be the key to unlocking the full potential in all upcoming new features.
Administrators of next-gen premium capacities will find it easier than ever to track the essential resources a Premium capacity uses, know when opting-in to overload slowdown protection via Autoscale is advisable, or if getting a larger capacity is recommended.
After a year in preview, the next-gen platform is now battle-proven to meet the demands of the broadest range of analytics solutions: self-service business datasets, centrally curated and distributed pixel-perfect reports, and anything in between.
Your capacity’s CPU power is derived from how many backend cores it has, with each backend core adding 30 seconds of CPU processing time per evaluation cycle to your total.
During the preview period, Autoscale was enabled free of charge to allow early adopters to grow accustomed to the new platform and adapt any operational practices to its capabilities. Now that the platform is Generally Available, Power BI will begin charging for Autoscale cores that are added to each capacity, per the previously announced pricing for Autoscale. Autoscale charges will begin on November 4, 2021.
Power BI Premium Generation 2, referred to as Premium Gen2 for convenience, is an improved and architecturally redesigned generation of Power BI Premium.

Premium Gen2 provides the following updates or improved experiences:
- Ability to license Premium Per User in addition to by capacity.
- Enhanced performance on any capacity size, anytime: Analytics operations run up to 16X faster on Premium Gen2. Operations will always perform at top speed and won’t slow down when the load on the capacity approaches the capacity limits.
- Greater scale:
  - No limits on refresh concurrency, no longer requiring you to track schedules for datasets being refreshed on your capacity
  - Fewer memory restrictions
  - Complete separation between report interaction and scheduled refreshes
- Improved and streamlined metrics with clear and normalized capacity utilization data that depends only on the complexity of the analytics operations the capacity performs, not on its size, the level of load on the system while performing analytics, or other factors. With the improved metrics, utilization analysis, budget planning, chargebacks, and the need to upgrade are visible with built-in reporting.
- Autoscale is an optional feature that automatically adds one v-core at a time for 24-hour periods when the load on the capacity exceeds its limits, preventing slowdowns caused by overload. Additional v-cores are charged to your Azure subscription on a pay-as-you-go basis. See using Autoscale with Power BI Premium for steps on how to configure and use Autoscale.
- Reduced management overhead with proactive and configurable admin notifications about capacity utilization level and load increasing.
Enabling Premium Gen2
Enable Premium Gen2 to take advantage of its updates. To enable Premium Gen2, take the following steps:
- In the admin portal, navigate to Capacity settings.
- Select Power BI Premium.
- If you have already allocated capacity, select it.
- A section appears titled Premium Generation 2, and in that section is a slider to enable Premium Generation 2.
- Move the slider to Enabled.
The following short video shows how to enable Premium Gen2.
Optionally, you can also configure and use Autoscale with Power BI Premium to ensure capacity and performance for your Premium users.
Workspaces and Premium Gen2
Workspaces reside within capacities. Each Power BI user has a personal workspace known as My Workspace. Additional workspaces can be created to enable collaboration. By default, workspaces, including personal workspaces, are created in the shared capacity. When you have Premium capacities, both My Workspaces and workspaces can be assigned to Premium capacities.
Capacity administrators automatically have their My Workspaces assigned to Premium capacities.
Capacity nodes for Premium Gen2
With Premium Gen2 and Embedded Gen 2, the amount of memory available on each node size is the limit on the memory footprint of a single artifact, not on the cumulative consumption of memory. For example, in a Premium Gen2 P1 capacity, each individual dataset is limited to 25 GB, whereas in the original Premium the total memory footprint of the datasets being handled simultaneously was limited to 25 GB.
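The difference between the two memory models can be sketched in a few lines of Python. This is only an illustration of the rule described above, not how the service is implemented; the function names and the dataset sizes are hypothetical, and only the 25 GB figure for P1 comes from the text.

```python
P1_MEMORY_GB = 25  # P1 figure quoted in the text


def fits_original_premium(dataset_sizes_gb, limit_gb=P1_MEMORY_GB):
    """Original Premium: the combined footprint of the datasets is capped."""
    return sum(dataset_sizes_gb) <= limit_gb


def fits_premium_gen2(dataset_sizes_gb, limit_gb=P1_MEMORY_GB):
    """Premium Gen2: each dataset is checked against the limit on its own."""
    return all(size <= limit_gb for size in dataset_sizes_gb)


datasets = [10, 12, 20]  # hypothetical dataset sizes in GB (42 GB combined)
print(fits_original_premium(datasets))  # False
print(fits_premium_gen2(datasets))      # True
```

Under the per-artifact rule, the same three datasets that would have exceeded an original P1 capacity are all acceptable, because none of them is larger than 25 GB by itself.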
Refresh in Premium Gen2
Premium Gen2 and Embedded Gen 2 don’t require cumulative memory limits, and therefore concurrent dataset refreshes don’t contribute to resource constraints. There is no limit on the number of refreshes running per v-core. However, the refresh of individual datasets continues to be governed by existing capacity memory and CPU limits. You can schedule and run as many refreshes as required at any given time, and the Power BI service will run those refreshes at the time planned as a best effort.
Monitoring in Gen2
Monitoring in Premium Gen2 intends to simplify monitoring and management of Premium capacities. Premium Gen2 customers can adapt their monitoring approach from a tool to ensure their Premium capacities are running correctly, into a tool that alerts them if attention should be applied to correct over usage or if more resources are required. In other words, rather than constantly having to monitor for issues and adjust, Premium Gen2 aims to assure that everything is running correctly and only alerts users if they must act.
Updates for Premium Gen2 and Embedded Gen2 — Premium Gen2 and Embedded Gen 2 only require monitoring a single aspect: how much CPU time your capacity needs to serve the load at any moment.
This reduction in the need for monitoring is a departure from the many metrics that the original version of Power BI Premium required. As a result, organizations that created a cadence of monitoring and reporting on their original Premium capacities will need to adapt that rhythm for their Premium Gen2 capacities, to match the streamlined metrics and monitoring requirements of Premium Gen2.
In Premium Gen2, if you exceed your CPU time per the SKU size you purchased, your capacity either autoscales to accommodate the need (if you’ve optionally enabled autoscale), or throttles your interactive operations, based on your configuration settings.
In Embedded Gen 2, your capacity throttles your interactive operations based on your configuration settings if you exceed your CPU time per the SKU size you purchased. To autoscale in Embedded Gen 2, see Autoscaling in Embedded Gen2.
Updates for Premium Gen2
Premium Gen2 and Embedded Gen 2 capacities use the Capacity Utilization App.
You can download and install the metrics app for Premium Gen2 and Embedded Gen2 using the following link.
Paginated reports and Premium Gen2
In Premium Gen2 and Embedded Gen2, there is no memory management for Paginated reports. With Premium Gen2 and Embedded Gen2, Paginated reports are also supported on the EM1-EM3 and A1-A3 SKUs.
When using Premium Gen2, Paginated reports in Power BI benefit from the architectural and engineering improvements reflected in Premium Gen2. The following sections describe the benefits of Premium Gen2 for Paginated reports.
- Broader SKU availability — Paginated reports running on Premium Gen2 can run reports across all available embedded and Premium SKUs. In addition, billing is calculated per CPU hour, across a 24-hour period. This greatly expands the SKUs that support Paginated reports.
- Dynamic scaling — With Premium Gen2, challenges associated with spikes in activity, or need for resources, can be handled dynamically as the need arises.
- Improved caching — Before Premium Gen2, Paginated reports were required to perform many operations in the context of memory allocated on the capacity for the workload. Now, using Premium Gen2, reductions in the required memory for many operations enhance customers’ ability to perform long-running operations without impacting other user sessions.
- Enhanced security and code isolation — With Premium Gen2, code isolation can occur at a per-user level rather than per-capacity, as was the case in the original Premium offering.
To learn more, see Paginated reports in Power BI Premium. To learn more about enabling the Paginated reports workload, see Configure workloads.
Subscriptions and licensing
Power BI Premium Gen2 is a tenant-level Microsoft 365 subscription available in two SKU (Stock-Keeping Unit) families:
- P SKUs (P1-P5) for embedding and enterprise features require a monthly or yearly commitment, billed monthly, and include a license to install Power BI Report Server on-premises.
- EM SKUs (EM1-EM3) for organizational embedding, requiring a yearly commitment, are billed monthly. EM1 and EM2 SKUs are available only through volume licensing plans. You can’t purchase them directly.
In addition, Premium Per User has the benefits available with Premium Gen2, but on an individual user basis.
Purchasing
Administrators purchase Power BI Premium subscriptions in the Microsoft 365 admin center. Specifically, only Global administrators or Billing Administrators can purchase SKUs. When purchased, the tenant receives a corresponding number of v-cores to assign to capacities, known as v-core pooling. For example, buying a P3 SKU provides the tenant with 32 v-cores. To learn more, see How to purchase Power BI Premium.
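As a rough mental model of v-core pooling, the purchased v-cores can be thought of as a pool that capacities draw from. The sketch below is an assumption-laden illustration: the `VCorePool` class, the capacity names, and the sizes assigned are all hypothetical, and only the figure of 32 v-cores for a P3 purchase comes from the text.

```python
class VCorePool:
    """Hypothetical model of a tenant's pool of purchased v-cores."""

    def __init__(self, purchased_vcores):
        self.total = purchased_vcores
        self.assigned = {}  # capacity name -> v-cores assigned

    @property
    def available(self):
        return self.total - sum(self.assigned.values())

    def assign(self, capacity_name, vcores):
        """Assign v-cores to a capacity, refusing to exceed the pool."""
        if vcores > self.available:
            raise ValueError(
                f"only {self.available} v-cores left in the pool"
            )
        self.assigned[capacity_name] = (
            self.assigned.get(capacity_name, 0) + vcores
        )


pool = VCorePool(32)        # a P3 purchase provides 32 v-cores
pool.assign("Finance", 8)   # hypothetical capacity names and sizes
pool.assign("Sales", 16)
print(pool.available)       # 8
```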
Limitations in Premium Gen2
The following known limitations currently apply to Premium Gen2:
- If you’re using XMLA on Premium Gen2, make sure you’re using the most recent versions of the data modeling and management tools.
- There’s a 225-second limitation for rendering Power BI visuals. Therefore, visuals that take longer to generate will be timed-out and will not display.
- Analysis services features in Premium Gen2 are only supported on the latest client libraries. Estimated release dates for dependent tools to support this requirement are:
- Memory restrictions are different in Premium Gen2 and Embedded Gen 2. In the first generation of Premium and Embedded, memory was restricted to a limited amount of RAM used by all artifacts simultaneously running. In Gen2, there is no memory limit for the capacity as a whole. Instead, individual artifacts (such as datasets, dataflows, paginated reports) are subject to the following RAM limitations:
  - A single artifact cannot exceed the amount of memory the capacity SKU offers.
  - The limitation includes all the operations (interactive and background) being processed for the artifact while in use (for example, while a report is being viewed, interacted with, or refreshed).
  - Dataset operations like queries are also subject to individual memory limits, just as they are in the first version of Premium.
  - To illustrate the restriction, consider a dataset with an in-memory footprint of 1 GB, and a user initiating an on-demand refresh while interacting with a report based on the same dataset. Three separate actions determine the amount of memory attributed to the original dataset, which may be larger than two times the dataset size:
    - The dataset needs to be loaded into memory.
    - The refresh operation will cause the memory used by the dataset to double, at least, since the original copy of data is still available for active queries while the refresh is processing an additional copy. However, once the refresh transaction commits, the memory footprint will reduce.
    - Report interactions will execute DAX queries. Each DAX query consumes a certain amount of temporary memory required to produce the results. Therefore, each query may consume a different amount of memory and be subject to the query memory limitation described.
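The 1 GB example above can be worked through as a small calculation. This is a minimal sketch under stated assumptions: the function name is hypothetical, the refresh is modeled as exactly one extra copy of the dataset (the text says "at least"), and the per-query memory figures are made up for illustration.

```python
def estimated_memory_gb(dataset_gb, refreshing, query_memory_gb):
    """Lower-bound estimate of the memory attributed to one dataset."""
    total = dataset_gb              # the dataset is loaded into memory
    if refreshing:
        total += dataset_gb         # refresh builds a second copy (at least)
    total += sum(query_memory_gb)   # temporary memory for each DAX query
    return total


# 1 GB dataset, refresh in progress, two hypothetical DAX queries
print(estimated_memory_gb(1, True, [0.25, 0.5]))   # 2.75
# After the refresh commits, the extra copy is released
print(estimated_memory_gb(1, False, [0.25, 0.5]))  # 1.75
```

The first result is larger than two times the dataset size, which is the situation the illustration describes.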
The following table summarizes all the limitations that are dependent on the capacity size:
Power BI Premium Gen2 architecture
Architectural changes in Premium Gen2, especially around how CPU resources are allocated and used, enable more versatility in offerings and more flexibility in licensing models. For example, the new architecture enables offering Premium on a per-user basis, as Premium Per User. The architecture also provides customers with better performance, and better governance and control over their Power BI expenses.
The most significant update in the architecture of Premium Gen2 is the way capacities’ back-end v-cores (CPUs, often referred to as v-cores) are implemented:
- In the original version of Power BI Premium, backend v-cores were reserved physical computing nodes in the cloud, with differences in the number of v-cores and the amount of onboard memory according to the customer’s licensing SKU. Customer administrators were required to keep track of how busy these nodes were, using the Premium metrics app. They had to use the app and other tools to determine how much capacity their users required to meet their computing needs.
- Each administrator had the ability to tweak and configure capacities to avoid resource contention between workloads (datasets, dataflows, paginated reports, and AI) or other performance impactful effects to make sure capacity performance remained tuned or acceptable.
In Premium Gen2, backend v-cores are implemented on regional clusters of physical nodes in the cloud, which are shared by all tenants using Premium capacities in that Power BI region. The regional cluster is further divided into specialized groups of nodes, where each group handles a different Power BI workload (datasets, dataflows, or paginated reports). These specialized groups of nodes help avoid resource contention between fundamentally different workloads running on the same node.
The contents of workspaces assigned to a Premium Gen2 capacity are stored in your organization’s capacity storage layer, which is implemented on top of capacity-specific Azure storage blob containers, similar to the original version of Premium. This approach enables features like BYOK to be used for your data.
When the content needs to be viewed or refreshed, it is read from the storage layer and placed on a Premium Gen2 backend node for computing. Power BI uses a placement mechanism that assures the optimal node is chosen within the proper group of computing nodes. The mechanism typically places new content on the node with the most available memory at the time the content is loaded, so that the view or refresh operation can gain access to the most resources and can perform optimally.
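The placement rule described here (pick the node with the most available memory, within the group of nodes for the right workload) can be sketched as follows. This is an illustration only, not the service's actual mechanism; the `Node` type, the node names, and the memory figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    workload: str          # e.g. "datasets", "dataflows", "paginated"
    free_memory_gb: float


def place(content_gb, workload, nodes):
    """Pick the node in the workload's group with the most free memory."""
    candidates = [
        n for n in nodes
        if n.workload == workload and n.free_memory_gb >= content_gb
    ]
    if not candidates:
        return None  # no node in the group can currently hold the content
    return max(candidates, key=lambda n: n.free_memory_gb)


cluster = [
    Node("ds-1", "datasets", 40.0),
    Node("ds-2", "datasets", 95.0),
    Node("df-1", "dataflows", 120.0),
]
print(place(10.0, "datasets", cluster).name)  # ds-2
```

Note that `df-1` has the most free memory overall but is never considered for a dataset, because it belongs to a different workload group.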
As your capacity renders and refreshes more content, it uses more computation nodes, each with enough resources to complete operations fast and successfully. This means your capacity may use multiple computational nodes and in some cases, content might even move between nodes due to the Power BI service performing internal load-balancing across nodes or resources. When such load balancing occurs, Power BI makes sure content movement doesn’t impact end-user experiences.
There are several positive results from distributing backend processing of content (datasets, dataflows, and paginated reports) across shared backend nodes:
- The shared nodes are at least as large as an original Premium P3 node, which means there are more v-cores to perform any operation. This can increase performance by up to 16x compared to an original Premium P1.
- Whatever node your processing lands on, the placement mechanism makes sure memory remains available for your operation to complete, within the applicable memory constraints of your capacity. (See the limitations section of this document for full details of the memory constraints.)
- Internal noisy neighbor problems in your capacity don’t occur, since each of the view and refresh operations uses its own set of physical v-cores, with their own memory, on different computing nodes.
- Cross-workloads resource contention is prevented by separating the shared nodes into specialized workload groups. As a result of this separation, there are no controls for paginated report workloads.
- The limitations on different capacity SKUs are not based on the physical constraints as they were in the original version of Premium; rather, they are based on an expected and clear set of rules that the Power BI Premium service enforces:
- Total capacity CPU throughput is at or below the throughput possible with the v-cores your purchased capacity has.
- Memory consumption required for viewing and refresh operations remains within the memory limits of your purchased capacity.
- Because of this new architecture, customer admins do not need to monitor their capacities for signs of approaching the limits of their resources, and instead are provided with clear indication when such limits are met. This significantly reduces the effort and overhead required of capacity administrators to maintain optimal capacity performance.
Premium Gen2 capacity load evaluation
To enforce CPU throughput limitations, Power BI evaluates the throughput from your Premium Gen2 capacity on an ongoing basis.
Power BI evaluates throughput every 30 seconds. It allows operations to complete, collects execution time on the shared pool physical node’s CPUs, and then for all operations on your capacity, aggregates them into 30-second CPU intervals and compares the results to what your purchased capacity is able to support.
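A simplified way to picture this evaluation is to bucket completed operations into 30-second windows and compare each window's CPU total with what the capacity supports. The sketch below is a deliberate simplification: it attributes an operation's whole CPU cost to the window in which it completed, whereas the real aggregation uses workload-specific algorithms. The function names and sample records are hypothetical.

```python
from collections import defaultdict

WINDOW_SECONDS = 30


def cpu_seconds_per_window(operations):
    """operations: iterable of (completion_time_s, cpu_seconds) pairs."""
    windows = defaultdict(float)
    for completed_at, cpu_seconds in operations:
        windows[int(completed_at // WINDOW_SECONDS)] += cpu_seconds
    return dict(windows)


def windows_over_quota(operations, backend_vcores):
    """Return the indexes of windows whose CPU total exceeds the quota."""
    quota = backend_vcores * WINDOW_SECONDS  # 30 CPU seconds per backend core
    usage = cpu_seconds_per_window(operations)
    return sorted(w for w, used in usage.items() if used > quota)


ops = [(5, 50), (20, 80), (40, 10)]  # hypothetical (time, CPU seconds) records
print(cpu_seconds_per_window(ops))   # {0: 130.0, 1: 10.0}
print(windows_over_quota(ops, 4))    # [0]
```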
The following image illustrates how Premium Gen2 evaluates and completes queries.
The aggregation is complex. It uses specialized algorithms for different workloads, and for different types of operations, as described in the following points:
- Slow-running operations, such as dataset and dataflow refresh, are considered background operations since they typically run in the background and users don’t actively monitor them or look at them visually. Background operations are lengthy and require significant CPU power to complete during the long process. Power BI spreads CPU costs of background operations over 24 hours, so that capacities don’t hit maximum resource usage due to too many refreshes running simultaneously. This allows Power BI Premium Gen2 subscribers to run as many background operations as allowed by their purchased capacity SKU, and doesn’t limit them like the original Premium generation.
- Fast operations like queries, report loads, and others are considered interactive operations. The CPU time required to complete those operations is aggregated, to minimize the number of 30-second windows that are impacted following that operation’s completion.
Premium Gen2 background operation scheduling
Refreshes are run on Premium Gen2 capacities at the time they are scheduled, or close to it, regardless of how many other background operations were scheduled for the same time. Datasets and dataflows being refreshed are placed on a physical processing node that has enough memory available to load them, and then begin the refresh process.
While processing the refresh, datasets may consume more memory to complete the refresh process. The refresh engine makes sure no artifact can exceed the amount of memory that its base SKU allows it to consume (for example, 25 GB on a P1 subscription, 50 GB on a P2 subscription, and so on).
How capacity size limits are enforced when viewing reports
Premium Gen2 evaluates utilization by aggregating utilization records every 30 seconds. Each evaluation consists of 2 different aggregations:
- Interactive utilization
- Background utilization
Interactive utilization is evaluated by considering all interactive operations that completed on or near the current 30-second evaluation cycle.
Background utilization is evaluated by considering all the background operations that completed during the past 24 hours. Each background operation contributes only 1/2880 of its total CPU cost (2880 is the number of evaluation cycles in a 24-hour period).
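The 1/2880 factor follows directly from the number of 30-second cycles in a day, and the smoothing can be worked through in a few lines. This is a sketch of the arithmetic only; the function name and the sample CPU costs are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60
CYCLE_SECONDS = 30
CYCLES_PER_DAY = SECONDS_PER_DAY // CYCLE_SECONDS  # 2880


def background_utilization(background_cpu_seconds):
    """CPU seconds charged to one cycle for the past 24 hours of
    background operations: each contributes 1/2880 of its total cost."""
    return sum(background_cpu_seconds) / CYCLES_PER_DAY


print(CYCLES_PER_DAY)                        # 2880
# Two hypothetical refreshes costing 2880 and 5760 CPU seconds in total
print(background_utilization([2880, 5760]))  # 3.0
```

In this example, 8,640 CPU seconds of refresh work add only 3 CPU seconds to each evaluation cycle, which is why many simultaneous refreshes do not by themselves push a capacity to its limit.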
Each capacity consists of an equal number of frontend and backend v-cores. The CPU time measured in utilization records reflects the backend v-cores’ utilization, and that utilization drives the need to autoscale. Utilization of frontend v-cores is not tracked, and you cannot convert frontend v-cores to backend v-cores.
If you have a P1 subscription with 4 backend v-cores, each evaluation cycle quota equates to 120 seconds (4 x 30 = 120 seconds) of CPU utilization. If the sum of both interactive and background utilizations exceeds the total backend v-core quota in your capacity, and you have not optionally enabled autoscale, the workload for your Gen2 capacity will exceed your available resources, also called your capacity threshold. The following image illustrates this condition, called overload, when autoscale is not enabled.
In contrast, if autoscale is optionally enabled and the sum of both interactive and background utilizations exceeds the total backend v-core quota in your capacity, your capacity is automatically autoscaled (raised) by one v-core for the next 24 hours.
The following image shows how autoscale works.
Autoscale always considers your current capacity size to evaluate how much you use, so if you already autoscaled into one v-core, that v-core is spread evenly at 50% for frontend utilization and 50% for backend utilization. This means your maximum capacity is now at (120 + 0.5 * 30 = 135 seconds) of CPU time in an evaluation cycle.
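The quota arithmetic above can be expressed as a short calculation. This is a sketch under assumptions: the function names are hypothetical, and extending the 50/50 split to more than one autoscale v-core is an extrapolation from the single-v-core example in the text.

```python
CYCLE_SECONDS = 30


def cycle_quota_seconds(backend_vcores, autoscale_vcores=0):
    """CPU seconds available per evaluation cycle.

    Each autoscale v-core is split evenly, so only half of it
    counts toward backend utilization.
    """
    return (backend_vcores + 0.5 * autoscale_vcores) * CYCLE_SECONDS


def is_overloaded(interactive_s, background_s, quota_s):
    """Overload: interactive plus background utilization exceeds the quota."""
    return interactive_s + background_s > quota_s


print(cycle_quota_seconds(4))                 # 120.0 (P1, no autoscale)
print(cycle_quota_seconds(4, 1))              # 135.0 (P1 plus one v-core)
print(is_overloaded(110, 20, 120.0))          # True
print(is_overloaded(110, 20, 135.0))          # False
```

The last two lines show the same load being over quota on a plain P1 and within quota once one autoscale v-core has been added.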
Autoscale always ensures that no single interactive operation can account for all of your capacity, and you must have two or more operations occurring in a single evaluation cycle to initiate autoscale.
Using Premium Gen2 without autoscale
If a capacity’s utilization exceeds 100% of its resources, and it cannot initiate autoscale because autoscale is turned off or is already at its maximum v-core value, the capacity enters a temporary interactive request delay mode. During the interactive request delay mode, each interactive request (such as a report load, visual interaction, and others) is delayed before it is sent to the engine for execution.
The capacity stays in interactive request delay mode as long as the previous evaluation showed greater than 100% resource utilization.
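The rule for entering and leaving delay mode can be pictured as a simple per-cycle check against the previous evaluation. The sketch below is only a model of that rule as described: the function name and the utilization figures are hypothetical, and the length of the delay applied to each request is not modeled because the text does not specify it.

```python
def delay_mode_timeline(utilization_pct, can_autoscale=False):
    """For each evaluation cycle, report whether interactive requests
    are delayed, based on the previous cycle's utilization."""
    delayed = []
    previous_overloaded = False
    for pct in utilization_pct:
        delayed.append(previous_overloaded and not can_autoscale)
        previous_overloaded = pct > 100
    return delayed


# Hypothetical utilization (percent of quota) over five cycles
print(delay_mode_timeline([90, 120, 130, 80, 70]))
# [False, False, True, True, False]
```

Delay mode starts in the cycle after the first overloaded evaluation and ends in the cycle after utilization drops back to 100% or below.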
Using Autoscale with Power BI Premium
Power BI Premium offers scale and performance for Power BI content in your organization. With Power BI Premium Gen2, many improvements are introduced, including enhanced performance, greater scale, and improved metrics. In addition, Premium Gen2 enables customers to automatically add compute capacity to avoid slowdowns under heavy use, using Autoscale.
Autoscale uses an Azure subscription to automatically use more v-cores (virtual CPU cores) when the computing load on your Power BI Premium subscription would otherwise be slowed by its capacity. This article describes the steps necessary to get Autoscale working for your Power BI Premium subscription. Autoscale only works with Power BI Premium Gen2.
To enable Autoscale, the following steps need to be completed:
- Select and configure an Azure subscription to use with Autoscale
- Configure Power BI Premium to use the selected Azure subscription for Autoscale
The following sections describe the steps in detail.
Note — Autoscale isn’t available for Microsoft 365 Government Community Cloud (GCC), due to the use of the commercial Azure cloud.
Embedded Gen 2 does not provide an out-of-the-box vertical autoscale feature. To learn about alternative autoscale options for Embedded Gen2, see Autoscaling in Embedded Gen2.
Configure an Azure subscription to use with Autoscale
To select and configure an Azure subscription to work with Autoscale, you need to have contributor rights for the selected Azure subscription. Any user with Account admin rights for the Azure subscription can add a user as a contributor. In addition, you must be an admin for the Power BI tenant to enable Autoscale.
To select an Azure subscription to work with Autoscale, take the following steps:
1. Log in to the Azure portal and select Subscriptions from the left pane. In the following image, the highlighted subscription is called Pay-As-You-Go.
2. Select a subscription. Once selected, you need to create a Resource group to use with Autoscale. Select Resource group from the Settings selections for your selected subscription. Then select the Add button to create a new Resource group.
3. The Create a resource group window appears, where you can name the resource group. In the following image, the resource group is called powerBIPremiumAutoscaleCores. You can name your resource group whatever you prefer. Just remember the name of the subscription, and the name of your resource group, since you’ll need to select it again when you configure Autoscale in the Power BI Admin Portal.
4. When you’re satisfied with the name of the resource group, select the Review + create button in the bottom left corner of the portal pane. Azure validates the information, after which you select the Create button to create the resource group. Once created, you receive a notification in the upper-right corner of the Azure portal, similar to the following:
Okay, you’ve selected the Subscription in the Azure portal that you’ll use for Autoscale, and created a Resource group for that subscription. The next step is to enable Autoscale in the Power BI Admin portal, and link it to the resource group you just created.
Considerations for preview release
When Autoscale is launched in preview, customers are given an initial window in which to become accustomed to their usage levels and CPU core utilization. During that initial window, charges are not applied to the Azure subscription configured for Autoscale. The window is anticipated to be 30 days. The best way to become accustomed to your organization’s level of usage is to sign up for utilization alert notifications in the Power BI Admin portal, and to monitor alerts for utilization levels.
Paginated Reports are not included in the process of determining the level of utilization, and whether to Autoscale, during the initial window.
Enable Autoscale in the Power BI Admin portal
Once you’ve selected the Azure subscription to use with Autoscale, and created a resource group as described in the previous section, you’re ready to enable Autoscale and associate it with the resource group you created. The person configuring Autoscale must be at least a contributor for the Azure subscription to successfully complete these steps. You can learn more about assigning a user to a contributor role for an Azure subscription.
The following steps show you how to enable Autoscale and associate it with the resource group.
1. Open the Power BI Admin portal and select Capacity settings from the left pane. Information about your Power BI Premium capacity is displayed.
2. Autoscale only works with Power BI Premium Gen2. Enabling Gen2 is easy: just move the slider to Enabled in the Premium Generation 2 box.
3. Select the Manage auto-scale button to enable and configure Autoscale, and the Auto-scale settings pane appears. Select Enable auto scale.
4. You can then select the Azure subscription to use with Autoscale. Only subscriptions available to the current user are displayed, which is why you must be at least a contributor for the subscription. Once your subscription is selected, select the Resource group you created in the previous section, from the list of resource groups available to the subscription.
5. Next, assign the maximum number of v-cores to use for Autoscale, and then select Save to save your settings. Power BI applies your changes, then closes the pane and returns the view to Capacity settings, where you can see your settings have been applied. In the following image, there were a maximum of two v-cores configured for Autoscale.
Here’s a short video that shows how quickly you can configure Autoscale for Power BI Premium Gen2:
And that’s it — your Power BI Premium Gen2 subscription is now configured to use Autoscale, so users in your organization automatically get the responsiveness they need from their Power BI content and insights, even under periods of heavy use.
Plan your transition to Power BI Premium Gen2
Over the last several months, we’ve been working to make many improvements to Power BI Premium. Changes include updates to licensing, performance, scaling, management overhead, and improved insight to utilization metrics. This next generation of Power BI Premium, referred to as Power BI Premium Gen2, has officially moved from preview to general availability as of October 4, 2021. You can read the announcement about this release in the Power BI blog.
If your organization is using the previous version of Power BI Premium, you’re required to migrate capacities to the modern Gen2 platform. The key dates for you to be aware of are listed below:
- October 4, 2021 — Power BI Premium Gen2 is generally available.
- November 15, 2021 — We start sending notifications reminding customers to migrate.
- January 15, 2022 — Microsoft begins migration of Premium capacities to the modern Gen2 platform for all organizations.
Self-migration to Premium Generation 2
If you want to perform your own migration to the latest platform before January 15, 2022, it’s easy to transition. You simply need to enable Premium Gen2 in the Power BI admin portal. Migrating doesn’t interrupt your Power BI service. The change typically completes within a minute and won’t take more than 10 minutes.
Ready for the next generation? Follow these steps:
1. Sign in to the Power BI service as a Power BI capacity admin.
2. From the navigation bar, select Settings > Admin portal > Capacity settings.
3. Select Power BI Premium.
4. If you have already allocated capacity, select it.
5. The section Premium Generation 2 appears.
6. Select the slider to switch the setting to Enabled. This step is demonstrated in the following animation:
Transition from preview to Premium Gen 2 general availability
Customers using Power BI Premium Gen2 in preview don’t need to take any action to transition to the general availability release. However, there are some key dates to consider if you’ve been using Autoscale to balance your capacity needs.
To date, organizations that have enabled Autoscale for capacities have gotten the burst processing benefits of Autoscale for free. Beginning November 4, 2021 we’ll begin charging for Autoscale cores. Take one of the following actions:
- You can continue to use Autoscale to enable the automatic use of additional cores during periods of higher-than-normal demand on your capacities. Review the pricing details for Premium per capacity add-ons so that you’re aware of upcoming charges.
- Or, to avoid Autoscale charges, disable the feature. Autoscale is an optional feature and benefit of the Premium Gen2 platform. You can choose to not use it.
Migration timeline summary
Power BI Premium Gen2 FAQ
What is Power BI Premium Generation 2?
Microsoft recently released a new version of Power BI Premium, called Premium Gen2. Premium Gen2 simplifies the management of Premium capacities and reduces management overhead. For more information about Premium Gen2, see Power BI Premium Generation 2.
How can I control the costs of autoscaling?
Autoscaling is an optional feature of Premium Gen2, and is subject to two limits, each of which is configured by Power BI administrators:
- Proactive limit — a proactive limit sets the rate of expenses that Autoscale can generate, by limiting the number of autoscale v-cores a capacity can use. For example, by setting a maximum autoscale of v-cores to one v-core, you ensure that the maximum charge you can incur is 30 days of autoscaling with one v-core.
- Reactive limit — you can also set a reactive limit to the cost for autoscaling, by setting an expenditure limit on the Azure subscription used with autoscale. If the subscription’s budget is exhausted, Power BI is prevented from using the v-core resources for that subscription, and autoscale shuts off. You can set a budget for the Azure subscription that autoscale uses by following the Azure budget tutorial.
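The proactive limit can be turned into a worst-case budget figure with simple arithmetic. The following is a minimal sketch, not an official pricing calculator: the function name is hypothetical, it assumes each autoscale v-core is billed per 24-hour period, and the price used in the example is a placeholder rather than a real rate.

```python
def max_autoscale_charge(max_vcores, price_per_vcore_day, days=30):
    """Worst-case autoscale charge for a billing window.

    Assumption: every allowed autoscale v-core is active, and billed,
    on every day of the window.
    """
    if max_vcores < 0 or price_per_vcore_day < 0 or days < 0:
        raise ValueError("inputs must be non-negative")
    return max_vcores * price_per_vcore_day * days


# Placeholder price of 100.0 per v-core per day (not a real rate).
print(max_autoscale_charge(max_vcores=1, price_per_vcore_day=100.0))
```

Look up the real per-v-core price in the Premium per capacity add-on pricing details before you rely on a figure like this.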
How does resource utilization cause Gen2 to autoscale?
Power BI Premium Gen2 evaluates your level of utilization by aggregating utilization records every 30 seconds. Each evaluation is composed of two aggregations: interactive utilization and background utilization.
Interactive utilization is evaluated by considering all the interactive operations that completed on or near the current half-minute evaluation cycle.
Background utilization is evaluated by considering all the background operations that completed during the past twenty-four hours, where each background operation contributes only 1/2880 of its total CPU cost (there are 2880 evaluation cycles in each 24-hour period).
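The two aggregations can be sketched in a few lines of Python. This is an illustration of the arithmetic described above, not the service's implementation; the function and parameter names are hypothetical, and the inputs are assumed to be CPU seconds per operation.

```python
# 24 hours * 60 minutes * 2 half-minute cycles = 2880 evaluation cycles per day
BACKGROUND_CYCLES_PER_DAY = 24 * 60 * 2


def cycle_utilization(interactive_cpu_seconds, background_cpu_seconds_24h):
    """CPU seconds attributed to a single 30-second evaluation cycle.

    interactive_cpu_seconds: CPU cost of each interactive operation that
        completed in the current cycle (counted in full).
    background_cpu_seconds_24h: CPU cost of each background operation that
        completed in the past 24 hours (each contributes 1/2880 of its cost).
    """
    interactive = sum(interactive_cpu_seconds)
    background = sum(background_cpu_seconds_24h) / BACKGROUND_CYCLES_PER_DAY
    return interactive + background


# Two interactive queries plus two refreshes from the past day.
print(cycle_utilization([12.0, 8.5], [28800.0, 57600.0]))  # 20.5 + 30.0 = 50.5
```

Spreading each background operation over 2880 cycles means that a long refresh raises utilization slightly for a full day instead of spiking a single cycle.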
A capacity consists of an equal number of frontend and backend v-cores. The CPU time measured in utilization records reflects backend v-core utilization, and this utilization drives the need to autoscale. Utilization of frontend v-cores is not tracked. You cannot convert frontend v-cores to backend v-cores.
If you have a P1 capacity with four backend v-cores, the quota for each evaluation cycle is 4*30 = 120 seconds of CPU utilization. If the sum of interactive and background utilization exceeds the total backend v-core quota of your capacity, the capacity autoscales by one v-core for the next 24 hours.
Autoscale always looks at your current capacity size to evaluate how much resource you use. If you have already autoscaled with one v-core, that v-core is spread evenly between frontend and backend at 50% each, meaning your maximum capacity is now at 120+0.5*30 = 135 seconds of CPU time in an evaluation cycle.
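The quota arithmetic from the two paragraphs above can be written out as follows. This is a sketch under the stated rules (30 CPU seconds per backend v-core per cycle, and each autoscale v-core split 50/50 between frontend and backend); the function names are hypothetical.

```python
CYCLE_SECONDS = 30  # length of one evaluation cycle


def cycle_quota(backend_vcores, autoscale_vcores=0):
    """CPU seconds available in one evaluation cycle.

    Only half of each autoscale v-core counts, because it is spread
    evenly between frontend and backend.
    """
    return (backend_vcores + 0.5 * autoscale_vcores) * CYCLE_SECONDS


def exceeds_quota(utilization_cpu_seconds, backend_vcores, autoscale_vcores=0):
    """True when a cycle's utilization is above the capacity's quota."""
    return utilization_cpu_seconds > cycle_quota(backend_vcores, autoscale_vcores)


print(cycle_quota(4))                # 120 (four backend v-cores, no autoscale)
print(cycle_quota(4, 1))             # 135.0 (one autoscale v-core added)
print(exceeds_quota(130.0, 4))       # True
print(exceeds_quota(130.0, 4, 1))    # False
```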
Autoscale always makes sure that no single interactive operation can consume all of your capacity, and you must have two or more interactive operations taking place in a single evaluation cycle to initiate autoscale.
What happens to traffic during overload if I don’t autoscale?
If a capacity’s utilization exceeds 100% and it cannot autoscale, either because Autoscale is turned off or because the capacity is already at its maximum autoscale v-core value, the capacity enters a temporary interactive request delay mode. In this mode, each interactive request (such as a report load or a visual interaction) is delayed before it is sent to the engine for execution. The delay is proportional to the amount of overload detected: an overload of 100% incurs a delay of 20 seconds, while overloads smaller than 10% are allowed.
The capacity stays in interactive request delay mode as long as the previous evaluation showed resource usage above 100%.
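To get a feel for the numbers, the delay rule can be approximated as below. This is a rough sketch built on assumptions the text does not confirm: "overload" is taken to mean utilization above 100%, the delay is assumed to scale linearly through the one stated data point (100% overload gives 20 seconds), and overloads under 10% are assumed to incur no delay. The function name is hypothetical.

```python
def interactive_delay_seconds(utilization_pct):
    """Approximate delay applied to each interactive request.

    Assumptions (not confirmed by the source): overload is the part of
    utilization above 100%, and the delay is linear in the overload.
    """
    overload_pct = max(0.0, utilization_pct - 100.0)
    if overload_pct < 10.0:
        return 0.0  # small overloads are allowed
    return 20.0 * overload_pct / 100.0


print(interactive_delay_seconds(105.0))  # 0.0
print(interactive_delay_seconds(150.0))  # 10.0
print(interactive_delay_seconds(200.0))  # 20.0
```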
Which operations contribute to interactive utilization, and which to background utilization?
The following are interactive operations:
- Datasets workload: report view, query, XMLA read
- Dataflows workload
- Paginated reports workload: paginated report render
The following are background operations:
- Datasets workload: scheduled refresh, on-demand refresh, background query (after refresh)
- Dataflows workload: scheduled dataflow refresh
- Paginated reports workload: data-driven subscription renders
- AI workloads
How can I use my utilization data to predict my capacity needs?
Your metrics report dataset retains 30 to 45 days of data. You can use the report to see how close you are to your capacity’s maximum resources. If you save monthly snapshots, you can compare them to identify growth trends and extrapolate the rate at which you will reach 100% utilization of your resources.
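One simple way to extrapolate from saved snapshots is a straight-line fit. The sketch below assumes you have recorded one peak-utilization percentage per month, oldest first, and that growth is roughly linear; the function name and input format are hypothetical, and the metrics report does not produce this figure for you.

```python
def months_until_full(monthly_peak_pct):
    """Estimate months from the latest snapshot until utilization hits 100%.

    Fits a least-squares line through the snapshots. Returns None when the
    trend is flat or declining, and 0.0 when the fitted line is already at
    or above 100%.
    """
    n = len(monthly_peak_pct)
    if n < 2:
        raise ValueError("need at least two monthly snapshots")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(monthly_peak_pct) / n
    slope = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(xs, monthly_peak_pct)
    ) / sum((x - mean_x) ** 2 for x in xs)
    if slope <= 0:
        return None
    intercept = mean_y - slope * mean_x
    month_at_full = (100.0 - intercept) / slope
    return max(0.0, month_at_full - (n - 1))


print(months_until_full([50.0, 60.0, 70.0]))  # 3.0
```

A linear fit is the simplest possible model; seasonal workloads or step changes, such as onboarding a new business unit, will need a closer look at the underlying data.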
How can my utilization data inform me I should turn on autoscale?
Utilization data does not currently indicate whether requests were throttled due to capacity being in interactive request delay mode. The information will be added to the utilization app so admins can determine whether users experienced delays, and to what extent the delays are due to overload without autoscaling.
How can I get notified that I’m approaching my max capacity?
The Capacity management page in the Power BI admin portal has a utilization notification checkbox. Users can choose the threshold at which an alert will be triggered (default is 80%) and the email address to which utilization alerts should be sent.
How much data is Power BI storing? How can I retain more?
The Power BI service stores over 90 days of utilization data. Users who need longer data retention can use Bring Your Own Log Analytics (BYOLA) to store more utilization data.
How do I get visibility into resources of Gen2 beyond CPU time?
Today, customers don’t have visibility through utilization data to the memory footprint of their operations, and cannot know ahead of time whether any of their operations is subject to failures.
How do I use utilization data to perform chargebacks?
On the left side of the utilization report, a bar chart visual displays how utilization is distributed between workspaces for the time span of the report. You can use this visual for chargebacks, provided that each workspace represents a different business unit, cost center, or other entity to which chargebacks can apply.
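If you export the per-workspace figures from that visual, a proportional split is one straightforward chargeback model. The sketch below assumes you have the CPU seconds per workspace for the reporting period and a total capacity cost to distribute; the function name, workspace names, and amounts are hypothetical.

```python
def chargeback_shares(cpu_seconds_by_workspace, total_cost):
    """Split total_cost across workspaces in proportion to CPU seconds used."""
    total_cpu = sum(cpu_seconds_by_workspace.values())
    if total_cpu == 0:
        # Nothing was used, so nothing is charged back.
        return {name: 0.0 for name in cpu_seconds_by_workspace}
    return {
        name: total_cost * cpu / total_cpu
        for name, cpu in cpu_seconds_by_workspace.items()
    }


usage = {"Finance": 300.0, "Sales": 100.0}  # hypothetical workspaces
print(chargeback_shares(usage, total_cost=1000.0))
# {'Finance': 750.0, 'Sales': 250.0}
```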
Read more at: Medium