About using DirectQuery in Power BI

You can connect to many different data sources when using Power BI Desktop or the Power BI service, and make those data connections in different ways. You can import data to Power BI, which is the most common way to get data, or connect directly to data in the original source repository, which is known as DirectQuery. This article describes DirectQuery capabilities:

  • Different connectivity options for DirectQuery
  • Guidance for when you should consider using DirectQuery rather than import
  • Drawbacks of using DirectQuery
  • Best practices for using DirectQuery

Follow best practices for using import versus DirectQuery:

  • You should import data to Power BI wherever possible. Importing takes advantage of the high-performance query engine of Power BI, and provides a highly interactive and fully featured experience.
  • If your goals can't be met by importing data, consider using DirectQuery. For example, if the data is changing frequently and reports must reflect the latest data, DirectQuery may be best. However, using DirectQuery is only feasible when the underlying data source can provide interactive queries, less than 5 seconds for the typical aggregate query, and can handle the query load that will be generated. Additionally, the list of limitations for the use of DirectQuery should be considered carefully.

The set of capabilities offered by Power BI for import and DirectQuery evolves over time. Changes will include providing more flexibility when using imported data, such that import can be used in more cases and eliminates some of the drawbacks of using DirectQuery. Regardless of improvements, when using DirectQuery, the performance of the underlying data source always remains a major consideration. If that underlying data source is slow, using DirectQuery for that source will remain unfeasible.

This article covers DirectQuery with Power BI, and not SQL Server Analysis Services. DirectQuery is also a feature of SQL Server Analysis Services, and many of the details described in this article apply to that feature, but there are also important differences. For information about using DirectQuery with SQL Server Analysis Services, see DirectQuery in SQL Server 2022 Analysis Services.

This article focuses on the recommended workflow for DirectQuery, where the report is created in Power BI Desktop, but also covers connecting directly in the Power BI service.

Power BI connectivity modes

Power BI connects to a large number of varied data sources, including:

  • Online services (Salesforce, Dynamics 365, others)
  • Databases (SQL Server, Access, Amazon Redshift, others)
  • Simple files (Excel, JSON, others)
  • Other data sources (Spark, websites, Microsoft Exchange, others)

For these sources, it's possible to import the data to Power BI. For some, it's also possible to connect using DirectQuery. For a summary of the sources that support DirectQuery, see Data sources supported by DirectQuery. More sources will be DirectQuery enabled in the future, focusing primarily on sources that can be expected to deliver good interactive query performance.

SQL Server Analysis Services is a special case. When connecting to SQL Server Analysis Services, you can choose to import the data or use a live connection. Using a live connection is similar to DirectQuery: no data is imported, and the underlying data source is always queried to refresh a visual. A live connection is different in many other regards, so a different term, live connection versus DirectQuery, is used.

There are three options for connecting to data: import, DirectQuery, and live connection.

Import connections

For import, when using Get Data in Power BI Desktop to connect to a data source like SQL Server, the behavior of that connection is as follows:

  • During the initial Get Data experience, each table selected defines a query that will return a set of data. Those queries can be edited before loading the data, for example, to apply filters, aggregate the data, or join different tables.
  • Upon load, all of the data defined by those queries will be imported into the Power BI cache.
  • Upon building a visual within Power BI Desktop, the imported data is queried. The Power BI store ensures the query is fast, and all changes to the visual are reflected immediately.
  • Any changes to the underlying data aren't reflected in any visuals. It's necessary to Refresh to reimport data.
  • Upon publishing the report as a .pbix file to the Power BI service, a dataset is created and uploaded to the Power BI service. The imported data is included with that dataset. It's then possible to schedule refresh of that data, for example, to reimport the data every day. Depending upon the location of the original data source, it might be necessary to configure an on-premises data gateway.
  • When opening an existing report in the Power BI service, or authoring a new report, the imported data is queried again, ensuring interactivity.
  • Visuals, or entire report pages, can be pinned as dashboard tiles. The tiles automatically refresh whenever the underlying dataset refreshes.

DirectQuery connections

For DirectQuery, when using Get Data in Power BI Desktop to connect to a data source, the behavior of that connection is as follows:

  • During the initial Get Data experience, the source is selected. For relational sources, a set of tables is selected, and each still defines a query that logically returns a set of data. For multidimensional sources, like SAP BW, only the source is selected.
  • However, upon load, no data is imported into the Power BI store. Instead, upon building a visual within Power BI Desktop, queries are sent to the underlying data source to retrieve the necessary data. The time taken to refresh the visual depends on the performance of the underlying data source.
  • Any changes to the underlying data aren't immediately reflected in any existing visuals. It's still necessary to refresh. The necessary queries are resent for each visual, and the visual is updated as necessary.
  • Upon publishing the report to the Power BI service, it will again result in a dataset in the Power BI service, the same as for import. However, no data is included with that dataset.
  • When opening an existing report in the Power BI service, or authoring a new one, the underlying data source is again queried to retrieve the necessary data. Depending upon the location of the original data source, it might be necessary to configure an on-premises data gateway, as is needed for import mode if the data is refreshed.
  • Visuals, or entire report pages, can be pinned as dashboard tiles. To ensure that opening a dashboard is fast, the tiles are automatically refreshed on a schedule, for example, every hour. The frequency of this refresh can be controlled, to reflect how often the data is changing, and how important it is to see the latest data. When opening a dashboard, the tiles reflect the data at the time of the last refresh, and not necessarily the latest changes made to the underlying source. You can refresh an open dashboard to ensure it's current.

Live connections

When connecting to SQL Server Analysis Services, there's an option to either import data from, or connect live to, the selected data model. If you use import, you define a query against that external SQL Server Analysis Services source, and the data is imported as normal. If you use connect live, there's no query defined, and the entire external model is shown in the field list.

The situation described in the previous paragraph applies to connecting to the following sources as well, except that there's no option to import the data:

  • Power BI datasets, for example, when connecting to a Power BI dataset that has previously been created and published to the service, to author a new report over it.
  • Microsoft Dataverse.

The behavior of reports over SQL Server Analysis Services, upon publishing to the Power BI service, is similar to DirectQuery reports in the following ways:

  • When opening an existing report in the Power BI service or authoring a new report, the underlying SQL Server Analysis Services source is queried, possibly requiring an on-premises data gateway.
  • Dashboard tiles are automatically refreshed on a schedule, such as every hour.

There are also important differences. For instance, for live connections, the identity of the user opening the report is always passed to the underlying SQL Server Analysis Services source.

With these comparisons out of the way, let's focus solely on DirectQuery for the rest of this article.

When is DirectQuery useful?

The following scenarios describe cases where connecting with DirectQuery could be especially useful, including cases where leaving the data in the original source would be considered beneficial. The description includes a discussion about whether the specified scenario is available in Power BI.

Data is changing frequently, and near real-time reporting is needed

Models with imported data can be refreshed at most once per hour (more often with Power BI Pro or Power BI Premium subscriptions). If the data is continually changing, and it's necessary for reports to show the latest data, using import with scheduled refresh might not meet those needs. You can stream data directly into Power BI, though there are limits on the data volumes supported for this case.

Using DirectQuery, by contrast, means that opening or refreshing a report or dashboard always shows the latest data in the source. Additionally, the dashboard tiles can be updated more frequently, as often as every 15 minutes.

Data is very large

If the data is very large, it wouldn't be feasible to import it all. DirectQuery, by contrast, requires no large transfer of data, because it's queried in place.

However, large data might also imply that the performance of queries against that underlying source is too slow, as discussed in Implications of using DirectQuery. You don't always have to import the full detailed data. Instead, the data can be pre-aggregated during import, and the Power Query Editor makes it easy to do so. In the extreme, it would be possible to import exactly the aggregate data needed for each visual. While DirectQuery is the simplest approach to large data, importing aggregate data might offer a solution if the underlying source is too slow.

Security rules are defined in the underlying source

When the data is imported, Power BI connects to the data source using the current user's credentials from Power BI Desktop, or the credentials defined as part of configuring scheduled refresh from the Power BI service. In publishing and sharing such a report with data in import mode, be careful to only share with users allowed to see the same data, or to define row-level security as part of the dataset.

DirectQuery allows for a report viewer's credentials to be passed through to the underlying source, where security rules are then applied. Single sign-on is supported to Azure SQL data sources, and through the data gateway to on-premises SQL servers. This is covered in more detail in Overview of single sign-on (SSO) for gateways in Power BI.

Data sovereignty restrictions apply

Some organizations have policies around data sovereignty, meaning that data can't leave the organization premises. A solution based on import would clearly present issues. By contrast, with DirectQuery the data remains in the underlying source.

However, even with DirectQuery, some caches of data at the visual level are kept in the Power BI service because of scheduled refresh of tiles.

Underlying data source is an OLAP source, containing measures

If the underlying data source contains measures, such as SAP HANA or SAP Business Warehouse, then importing the data brings other issues. It means that the data imported is at a particular level of aggregation, as defined by the query: for example, measures TotalSales by Class, Year, and City. Then if a visual is built asking for data at a higher-level aggregate, such as TotalSales by Year, it's further aggregating the aggregate value. This aggregation is fine for additive measures, such as Sum and Min, but it's an issue for non-additive measures, such as Average and DistinctCount. For example, if one city's imported average price is 10 and another's is 20, averaging those two aggregates gives 15, regardless of how many sales each city actually had.

To make it easy to get the correct aggregate data, as needed for the particular visual, directly from the source, it would be necessary to send queries per visual, as in DirectQuery.

When connecting to SAP Business Warehouse (BW), choosing DirectQuery allows for this treatment of measures. For information about SAP BW, see DirectQuery and SAP BW.

However, currently DirectQuery over SAP HANA treats it the same as a relational source, and provides similar behavior to import. This approach is covered further in DirectQuery and SAP HANA.

In summary, given the current capabilities of DirectQuery in Power BI, it offers benefits in the following scenarios:

  • Data is changing frequently, and near real-time reporting is needed.
  • Handling very large data, without the need to pre-aggregate.
  • Data sovereignty restrictions apply.
  • The source is a multidimensional source containing measures, such as SAP BW.

The details in the previous list relate to the use of Power BI alone. For more information about using large models in Power BI, see large datasets in Power BI Premium. There's no restriction on how frequently the data can be refreshed.

Implications of using DirectQuery

Use of DirectQuery does have potentially negative implications, as detailed in this section. Some of those limitations are slightly different depending upon the exact source that is being used. We address limitations where applicable, and separate articles cover those sources that are substantially different.

Performance and load on the underlying source

When using DirectQuery, the overall experience depends very much on the performance of the underlying data source. If refreshing each visual, for example, after changing a slicer value, takes a few seconds, usually less than 5 seconds, the experience would be reasonable, though it might feel sluggish compared to the immediate response when importing the data to Power BI. If the slowness of the source causes individual visuals to take longer than tens of seconds, the experience becomes extremely poor. Queries may even time out.

Along with the performance of the underlying source, pay attention to the load placed upon the source, as load impacts performance. Each user who opens a shared report, and each dashboard tile that refreshes, sends at least one query per visual to the underlying source. This fact requires that the source can handle such a query load, while still maintaining reasonable performance.

Security implications when combining data sources

It's possible to use multiple data sources in a DirectQuery model, just as when you import data, by using the composite models feature. When you use multiple data sources, it's important to understand how data is moved back and forth between the underlying data sources, and the security implications it brings.

Limited data transformations

Similarly, there are limitations in the data transformations that can be applied within Power Query Editor. With imported data, a sophisticated set of transformations can easily be applied to clean and reshape the data before using it to create visuals, such as parsing JSON documents, or pivoting data from a column to a row form. Those transformations are more limited in DirectQuery.

First, when connecting to an OLAP source like SAP Business Warehouse, no transformations can be defined at all, and the entire external model is taken from the source. For relational sources, like SQL Server, it's still possible to define a set of transformations per query, but those transformations are limited for performance reasons.

Any such transformation will need to be applied on every query to the underlying source, rather than once on data refresh, so transformations are limited to those that can reasonably be translated into a single native query. If you use a transformation that is too complex, you receive an error that either it must be deleted or the model switched to import.

Additionally, the query that results from the Get Data dialog or Power Query Editor will be used in a subselect within the queries generated and sent to retrieve the necessary data for a visual. The query defined in Power Query Editor must be valid within this context. In particular, it's not possible to use a query using common table expressions, nor one that invokes stored procedures.

Modeling limitations

The term modeling in this context means the act of refining and enriching the raw data, as part of authoring a report using it. Examples include:

  • Defining relationships between tables
  • Adding new calculations (calculated columns and measures)
  • Renaming and hiding columns and measures
  • Defining hierarchies
  • Defining the formatting, default summarization and sort order for a column
  • Grouping or clustering values

When using DirectQuery, many of these model enrichments can still be made, and certainly there's still the principle that the raw data is being enriched, so as to improve later consumption. However, there are some modeling capabilities that aren't available, or are limited, when using DirectQuery. The limitations are generally applied to avoid performance issues. The set of limitations that are common to all DirectQuery sources is listed here. Additional limitations might apply to individual sources, as described in Next steps.

  • No built-in date hierarchy: When importing data, every date/datetime column also has a built-in date hierarchy available by default. For example, if importing a table of sales orders including a column OrderDate, then upon using OrderDate in a visual, it's possible to choose the appropriate level (year, month, day) to use. This built-in date hierarchy isn't available when using DirectQuery. If there's a Date table available in the underlying source, as is common in many data warehouses, then the DAX time intelligence functions can be used as normal.
  • Date/time support only to second accuracy: When using time columns in your dataset, Power BI only issues queries to the underlying source to a level of detail of seconds. Queries aren't sent to the DirectQuery source for milliseconds. Remove this part of the times from your source columns.
  • Limitations in calculated columns: Calculated columns are limited to being intra-row; that is, they can only refer to values of other columns of the same table, without the use of any aggregate functions. Additionally, the DAX scalar functions that are allowed, such as LEFT(), are limited to those functions that can be pushed to the underlying source. The functions vary depending upon the exact capabilities of the source. Functions that aren't supported aren't listed in autocomplete when authoring the DAX for a calculated column, and would result in an error if used. (See the sketch after this list.)
  • No support for parent-child DAX functions: When in DirectQuery mode, it's not possible to use the family of DAX PATH() functions that typically handle parent-child structures, such as charts of accounts or employee hierarchies.
  • Calculated tables: You can use calculated tables in DirectQuery when using composite models.
  • Relationship filtering: For information about bidirectional filtering, see Bidirectional cross-filtering. The whitepaper presents examples in the context of SQL Server Analysis Services, but the key points apply equally to Power BI.
  • No clustering: When using DirectQuery, it's not possible to use the clustering capability to automatically find groups.
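
As a minimal sketch of two of these limits, using hypothetical table and column names (not taken from this article): a calculated column can only combine values from its own row, and time intelligence works when the underlying source provides a Date table.

    -- Allowed: an intra-row calculated column on a hypothetical Sales table
    Margin = Sales[SalesAmount] - Sales[TotalProductCost]

    -- Not allowed in DirectQuery: a calculated column using an aggregate
    -- Sales Pct = DIVIDE ( Sales[SalesAmount], SUM ( Sales[SalesAmount] ) )

    -- Works when a Date table exists in the underlying source:
    Sales YTD = TOTALYTD ( SUM ( Sales[SalesAmount] ), 'Date'[Date] )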

Reporting limitations

Almost all reporting capabilities are supported for DirectQuery models. As such, so long as the underlying source offers a suitable level of performance, the same set of visualizations can be used. There are some important limitations in some of the other capabilities offered in the Power BI service after a report is published:

  • Quick Insights isn't supported: Power BI Quick Insights searches different subsets of your dataset while applying a set of sophisticated algorithms to discover potentially interesting insights. Given the need for very high-performance queries, this capability isn't available on datasets using DirectQuery.
  • Using Explore in Excel will likely result in poorer performance: You can explore your data by using the Explore in Excel capability on a dataset. This approach allows PivotTables and PivotCharts to be created in Excel. While this capability is supported on datasets using DirectQuery, the performance is generally slower than creating visuals in Power BI, and therefore if the use of Excel is important for your scenarios, this fact should be accounted for in your decision to use DirectQuery.
  • Hierarchies aren't shown in Excel: When connecting using DirectQuery from Excel to an Azure Analysis Services model or Power BI dataset, for example using Analyze in Excel, any hierarchies defined in the model or dataset aren't shown.
  • Maximum length for text columns: The maximum length of the data in a text column for datasets using DirectQuery is 32,764 characters. Reporting on longer texts results in an error.

Security

As discussed earlier in this article, a report in DirectQuery always uses the same fixed credentials to connect to the underlying data source after it's published to the Power BI service. This behavior applies to DirectQuery, not to live connections to SQL Server Analysis Services, which are different in this respect. Immediately after publishing a DirectQuery report, it's necessary to configure the credentials of the user that will be used. Until you configure the credentials, opening the report in the Power BI service results in an error.

Once the user credentials are provided, those credentials are used whichever user opens the report. In this way, it's exactly like imported data: every user sees the same data, unless row-level security has been defined as part of the report. The same attention must be paid to sharing the report, if there are any security rules defined in the underlying source.

Additionally, 'alternate credentials' aren't supported when making DirectQuery connections to SQL Server from Power BI Desktop. You can use your current Windows credentials or database credentials.

Behavior in the Power BI service

This section describes the behavior of a DirectQuery report in the Power BI service, to explain the degree of load that will be placed on the back-end data source, given the number of users that the report and dashboard will be shared with, the complexity of the report, and whether row-level security has been defined in the report.

Reports – opening, interacting with, editing

When a report is opened, all the visuals on the currently visible page refresh. Each visual generally requires at least one query to the underlying data source. Some visuals might require more than one query. For example, a visual might show aggregate values from two different fact tables, or contain a more complex measure, or contain totals of a non-additive measure like Count Distinct. Moving to a new page refreshes those visuals. Refreshing sends a new set of queries to the underlying source.

Every user interaction on the report might result in visuals being refreshed. For example, selecting a different value on a slicer requires sending a new set of queries to refresh all of the affected visuals. The same is true for clicking a visual to cross-highlight other visuals, or changing a filter.

Similarly, editing a new report requires queries to be sent for each step on the path to produce the final visual.

There's some caching of results. The refresh of a visual is instantaneous if the exact same results have recently been obtained. If row-level security is defined, such caches aren't shared across users.

Dashboard Refresh

Individual visuals, or entire pages, can be pinned to a dashboard as tiles. Tiles based on DirectQuery datasets refresh automatically according to a schedule, sending queries to the back-end data source. By default, datasets refresh every hour, but can be configured as part of dataset settings to refresh between weekly and every 15 minutes.

If no row-level security is defined in the model, each tile is refreshed once, and the results are shared across all users. Otherwise, there can be a large multiplier effect: each tile requires separate queries per user to be sent to the underlying source.

A dashboard with 10 tiles, shared with 100 users, created on a dataset using DirectQuery with row-level security, and configured to refresh every 15 minutes, would result in at least 1,000 queries being sent every 15 minutes to the back-end source.

Pay careful attention to the use of row-level security, and the configuration of the refresh schedule.

Time-outs

A time-out of four minutes is applied to individual queries in the Power BI service; queries taking longer than that will fail. As stressed earlier, we recommend that you use DirectQuery for sources that provide near-interactive query performance. This limit is intended to prevent issues from overly long execution times.

Other implications

Other general implications of using DirectQuery are as follows:

  • If data is changing, it's necessary to refresh to ensure the latest data is shown: Given the use of caches, there's no guarantee that the visual is always showing the latest data. For example, a visual might show the transactions in the last day. Because of a slicer being changed, it might refresh to show the transactions for the last two days, including recent, newly arrived transactions. Returning the slicer to its original value would result in it again showing the cached value previously obtained.

    Selecting Refresh clears any caches and refreshes all the visuals on the page to show the latest data.

  • If data is changing, there's no guarantee of consistency between visuals: Different visuals, whether on the same page or on different pages, might be refreshed at different times. If the data in the underlying source is changing, there's no guarantee that each visual shows the data at the exact same point in time. Indeed, given that sometimes more than one query is required for a single visual, for example, to obtain the details and the totals, consistency even within a single visual isn't guaranteed. To guarantee this consistency would require the overhead of refreshing all visuals whenever any visual refreshed, in tandem with the use of costly features like Snapshot Isolation in the underlying data source.

    This issue can be mitigated to a large extent by again selecting Refresh to refresh all of the visuals on the page. Even in import mode, there's a similar problem of guaranteeing consistency while importing data from more than one table.

  • Refresh in Power BI Desktop is needed to reflect any metadata changes: After a report is published, Refresh will refresh the visuals in the report. If the schema of the underlying source has changed, then those changes aren't automatically applied to change the available fields in the field list. If tables or columns have been removed from the underlying source, it might result in query failure upon refresh. Opening the report in Power BI Desktop, and choosing Refresh, updates the fields in the model to reflect the changes.

  • Limit of 1 million rows returned on any query: There's a fixed limit of 1 million rows placed on the number of rows that can be returned in any single query to the underlying source. This limit generally has no practical implications, and visuals themselves aren't going to display that many points. However, the limit can occur in cases where Power BI isn't fully optimizing the queries sent, and there's some intermediate result being requested that exceeds the limit. It can also occur while building a visual, on the path to a more reasonable final state. For example, including Customer and TotalSalesQuantity would hit this limit if there were more than 1 million customers, until some filter were applied.

    The error that would be returned is: "The resultset of a query to external data source has exceeded the maximum allowed size of '1000000' rows."

  • Can't change from import to DirectQuery mode: While it's possible to switch a model from DirectQuery mode to import mode, all the necessary data must be imported. It's not possible to switch back, primarily because of the set of features not supported in DirectQuery mode. DirectQuery models over multidimensional sources, like SAP BW, also can't be switched from DirectQuery to import, because of the different handling of external measures.

DirectQuery in the Power BI service

All sources are supported from Power BI Desktop. Some sources are also available directly from within the Power BI service. For example, it's possible for a business user to use Power BI to connect to their data in Salesforce, and immediately get a dashboard, without use of Power BI Desktop.

Only two of the DirectQuery-enabled sources are available directly in the service:

  • Spark
  • Azure Synapse Analytics (formerly SQL Data Warehouse)

However, we recommend that any use of DirectQuery over those two sources start within Power BI Desktop. The reason is that when the connection is initially made in the Power BI service, many key limitations apply. While the start was easy, starting in the Power BI service, there are limitations on enhancing the resulting report any further. For example, it's not possible then to create any calculations, or use many analytical features, or even refresh the metadata to reflect any changes to the underlying schema.

Guidance for using DirectQuery successfully

If you're going to use DirectQuery, this section provides you with some high-level guidance on how to ensure success. The guidance in this section is derived from the implications of using DirectQuery that have been described in this article.

Back-end data source performance

Validate that simple visuals refresh in a reasonable time. A refresh time should be within 5 seconds to give a reasonable interactive experience. If visuals are taking longer than 30 seconds, it's highly likely that further issues will occur following publication of the report. These problems can make the solution unworkable.

If queries are slow, examine the queries being sent to the underlying source, and the reason for the query performance. This article doesn't cover the broad range of database optimization best practices across the full set of potential underlying sources. This article does cover the standard database practices that apply to most situations:

  • Relationships based on integer columns generally perform better than joins on columns of other data types.
  • The appropriate indexes should be created. Index creation generally means the use of columnstore indexes in those sources that support them, for example, SQL Server.
  • Any necessary statistics in the source should be updated.

Model Design Guidance

When defining the model, consider following this guidance:

  • Avoid complex queries in Power Query Editor. Power Query Editor translates a complex query into a single SQL query, which appears in the subselect of every query sent to that table. If that query is complex, it might result in performance issues on every query sent. The actual SQL query for a set of steps can be obtained by selecting the last step in Power Query Editor, and choosing View Native Query from the context menu.

  • Keep measures simple. At least initially, we recommend limiting measures to simple aggregates. Then if the measures operate in a satisfactory manner, more complex measures can be defined, paying attention to the performance of each.
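
    As a hedged sketch with hypothetical names (not taken from this article), begin with a plain aggregate, and add complexity only after confirming the simple measure performs well:

        Total Sales = SUM ( Sales[SalesAmount] )

        -- Added later, once the simple aggregate proves responsive:
        Sales Share = DIVIDE ( [Total Sales], CALCULATE ( [Total Sales], ALLSELECTED () ) )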

  • Avoid relationships on calculated columns. This guidance is relevant to databases where you need to do multi-column joins. Power BI today doesn't allow a relationship to be based on multiple columns as the FK/PK. The common workaround is to concatenate the columns together using a calculated column, and base the join on that column. While this workaround is reasonable for imported data, for DirectQuery it results in a join on an expression. That result commonly prevents use of any indexes, and leads to poor performance. The only real workaround is to materialize the multiple columns into a single column in the underlying database.
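
    For illustration, the concatenation workaround is a calculated column like this hypothetical one, defined on both tables and used as the relationship column. In DirectQuery, the generated SQL join is then on this expression rather than on indexed columns:

        OrderLineKey = Orders[OrderID] & "-" & Orders[LineNumber]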

  • Avoid relationships on uniqueidentifier columns. Power BI doesn't natively support a datatype of uniqueidentifier. Defining a relationship between columns of type uniqueidentifier results in a query with a join involving a cast. Again, this approach commonly leads to poor performance. Until this case is specifically optimized, the only workaround is to materialize columns of an alternative type in the underlying database.

  • Hide the to column on relationships. The to column on relationships is commonly the primary key on the to table. That column should be hidden, so that it doesn't appear in the field list and can't be used in visuals. Often the columns on which relationships are based are in fact system columns, for example, surrogate keys in a data warehouse; it's good practice to hide such columns anyway. If the column does have meaning, then introduce a calculated column that is visible, and that has a simple expression of being equal to the primary key, as in the following example:

        ProductKey_PK   (Destination of a relationship, hidden)
        ProductKey      (= [ProductKey_PK], visible)
        ProductName
        ...
  • Examine all uses of calculated columns and data type changes. Use of these capabilities isn't necessarily harmful, but they do result in the queries sent to the underlying source containing expressions rather than simple references to columns. That again might result in indexes not being used.

  • Avoid use of bi-directional cross filtering on relationships. Use of bi-directional cross filtering can lead to query statements that don't perform well.

  • Experiment with setting Assume referential integrity. The Assume referential integrity setting on relationships enables queries to use INNER JOIN rather than OUTER JOIN statements. This generally improves query performance, though it does depend on the specifics of the data source.

  • Don't use relative date filtering in Power Query Editor. It's possible to define relative date filtering in Power Query Editor, for example, to filter to the rows where the date is in the last 14 days.

    Filter rows for last 14 days

    However, this filter is translated into a filter based on the fixed date, as at the time the query was authored. This result can be seen from viewing the native query.

    Filter rows in native SQL query

    This result is probably not what you wanted. To ensure the filter is applied based on the date at the time the report runs, instead apply the filter in the report as a report filter. Currently, this approach would be done by creating a calculated column calculating the number of days ago, using the DAX DATE() function, and then using that calculated column in a filter.
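
    A minimal sketch of that workaround, with hypothetical names; TODAY() is shown as one way to obtain the current date, assuming the source supports the functions used in a DirectQuery calculated column:

        Days Ago = INT ( TODAY () - Sales[OrderDate] )

    A report-level filter of Days Ago less than or equal to 14 then gives a window that moves with the date the report runs.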

Report Design Guidance

When creating a report using a DirectQuery connection, follow this guidance:

  • Consider use of Query Reduction options: Power BI provides options in the report to send fewer queries, and to disable certain interactions that would result in a poor experience if the resulting queries take a long time to run. To access these options in Power BI Desktop, go to File > Options and settings > Options and select Query reduction.

    Query reduction options

    Checkbox selections on the Query reduction screen let you disable cross-highlighting throughout your entire report. You can also show an Apply button on slicers or filter selections. This approach lets you make many slicer and filter selections before applying them. No queries are sent until you select the Apply button on the slicer. Your selections can then be used to filter the data.

    These options apply to your report while you interact with it in Power BI Desktop. They also apply when your users consume the report in the Power BI service.

  • Apply filters first: Always apply any applicable filters at the start of building a visual. For example, rather than drag in TotalSalesAmount and ProductName, then filter to a particular year, apply the filter on Year at the very start. Each step of building a visual sends a query. Although it's possible to then make another change before the first query has completed, this approach still leaves unnecessary load on the underlying source. Applying filters early generally makes those intermediate queries less costly. Also, failing to apply filters early can result in hitting the 1 million row limit.

  • Limit the number of visuals on a page: When you open a page or change a page-level slicer or filter, all of the visuals on the page are refreshed. There's also a limit on the number of queries that are sent in parallel. As the number of visuals increases, some of the visuals are refreshed serially, increasing the time taken to refresh the entire page. For this reason, we recommend that you limit the number of visuals on a single page, and instead have more, simpler pages.

  • Consider switching off interaction between visuals: By default, visualizations on a report page can be used to cross-filter and cross-highlight the other visualizations on the page. For example, having selected 1999 on the pie chart, the column chart is cross-highlighted to show the sales by category for 1999.

    Multiple visuals with cross-filtering and cross-highlighting

    Cross-filtering and cross-highlighting in DirectQuery require queries to be submitted to the underlying source. The interaction should be switched off if the time taken to respond to users' selections would be unreasonably long. You can switch off this interaction for either the entire report, as described earlier for query reduction options, or on a case-by-case basis. For more information, see How visuals cross-filter each other in a Power BI report.

In addition to the previous suggestions, each of the following reporting capabilities can cause performance issues:

  • Measure filters: Visuals containing measures, or aggregates of columns, can contain filters on those measures. For example, the following graphic shows SalesAmount by Category, but only including those categories with more than 20M of sales.

    Visual showing measures that contain filters

    This approach results in two queries being sent to the underlying source:

    • The first query retrieves the categories meeting the condition, SalesAmount greater than 20 million.
    • The second query then retrieves the necessary data for the visual, including the categories that met the condition in the WHERE clause.

    This approach generally works well if there are hundreds or thousands of categories, as in this example. Performance can degrade if the number of categories is much larger. The query fails for more than a million categories meeting the condition. The 1 million row limit was discussed earlier.

  • TopN filters: Advanced filters can be defined to filter on only the top or bottom N values ranked by some measure, for example, the top 10 categories in the previous visual. This approach again results in two queries being sent to the underlying source. However, the first query returns all categories from the underlying source, and the top N are then determined based on the returned results. Depending on the cardinality of the column involved, this approach can lead to performance issues or query failures because of the 1 million row limit.

  • Median: Generally, any aggregation, such as Sum or Count Distinct, is pushed to the underlying source. However, this isn't true for median, as this aggregate is generally not supported by the underlying source. In such cases, the detail data is retrieved from the underlying source, and the median calculated from the returned results. This approach is reasonable when the median is to be calculated over a relatively small number of results. Performance issues or query failures because of the 1 million row limit occur if the cardinality is large. For example, Median Country Population might be reasonable, but Median Sales Price might not be.
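
    For example, a measure like the following hypothetical one causes the detail rows to be retrieved so the median can be computed from the returned results:

        Median Price = MEDIAN ( Sales[SalesPrice] )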

  • Advanced text filters (contains and similar): When filtering on a text column, the advanced filtering allows filters like contains and begins with, and so on. These filters can certainly result in degraded performance for some data sources. In particular, the default contains filter shouldn't be used if what is required is an exact match. Although the results might be the same, depending on the actual data, the performance might be drastically different because of indexes.

  • Multi-select slicers: By default, slicers only allow a single selection to be made. Allowing multi-selection in filters can cause some performance issues, because the user selects a set of items in the slicer. For example, if the user selects the 10 products of interest, each new selection results in queries being sent to the source. Although the user can select the next item before the query completes, this approach results in extra load on the underlying source.

  • Consider switching off totals on visuals: By default, tables and matrices display totals and subtotals. In many cases, separate queries must be sent to the underlying source to obtain the values for such totals. This applies whenever using DistinctCount aggregation, or in all cases when using DirectQuery over SAP BW or SAP HANA. Such totals should be switched off by using the Format pane.

Maximum number of connections option for DirectQuery

You can set the maximum number of connections DirectQuery opens for each underlying data source, which controls the number of queries concurrently sent to each data source.

DirectQuery opens a default maximum of 10 concurrent connections. You can change the maximum number for the current file in Power BI Desktop. Go to File > Options and Settings > Options. In the Current File section in the left pane, select Published dataset settings.

Setting maximum DirectQuery connections

The setting is only enabled when there's at least one DirectQuery source in the current report. The value applies to all DirectQuery sources, and to any new DirectQuery sources added to the same report.

Increasing Maximum connections per data source ensures more queries, up to the maximum number specified, can be sent to the underlying data source. This approach is useful when many visuals are on a single page, or many users access a report at the same time. Once the maximum number of connections is reached, further queries are queued until a connection becomes available. Increasing this limit does result in more load on the underlying source, so the setting isn't guaranteed to improve overall performance.

Once a report is published, the maximum number of concurrent queries sent to the underlying data source also depends upon fixed limits. The limits depend on the target environment to which the report is published. Different environments, such as Power BI, Power BI Premium, or Power BI Report Server, can impose different limits. The table below lists the upper limits of the active connections per data source for each Power BI environment. These limits apply to cloud data sources and on-premises data sources such as SQL Server, Oracle, and Teradata.

Environment                Upper limit
Power BI Pro               10 active connections per data source
Power BI Premium           30 active connections per data source
Power BI Report Server     10 active connections per data source

Note

The maximum number of DirectQuery connections setting applies to all DirectQuery sources when enhanced metadata is enabled, which is the default setting for all models created in Power BI Desktop.

Diagnosing performance problems

This section describes how to diagnose performance issues, and how to get more detailed information to allow the reports to be optimized.

We recommend that you start diagnosis of performance issues in Power BI Desktop, rather than in the Power BI service. Performance issues are often based on the performance of the underlying source, and you can more easily identify and diagnose them in the more isolated environment of Power BI Desktop. This approach initially eliminates certain components, such as the Power BI gateway. If the performance issues are absent in Power BI Desktop, investigate the specifics of the report in the Power BI service. The Performance Analyzer is a useful tool for identifying issues throughout this process.

Similarly, we recommend first trying to isolate any issues to an individual visual, rather than many visuals on a page.

Let's say the steps in the previous paragraphs have been taken, and we now have a single visual on a page in Power BI Desktop that is still sluggish. Use the Performance Analyzer to determine the queries that Power BI Desktop sends to the underlying source. It's also possible to view traces and diagnostic information that might be emitted by the underlying data source. Traces might also contain useful details of how the query was executed, and how it can be improved.

Further, even in the absence of such traces from the source, it's possible to view the queries sent by Power BI, along with their execution times, as described in the next section.

Determining the queries sent by Power BI Desktop

By default, Power BI Desktop logs events during a given session to a trace file called FlightRecorderCurrent.trc.

For some DirectQuery sources, this log includes all queries sent to the underlying data source. The remaining DirectQuery sources will be included in the future. The following sources send queries to the log:

  • SQL Server
  • Azure SQL Database
  • Azure Synapse Analytics (formerly SQL Data Warehouse)
  • Oracle
  • Teradata
  • SAP HANA

The trace file can be found in the AppData folder for the current user:

<User>\AppData\Local\Microsoft\Power BI Desktop\AnalysisServicesWorkspaces

To get to this folder, in Power BI Desktop, select File > Options and settings > Options, and then select Diagnostics. The following dialog appears:

A link to open traces folder

When you select Open crash dump/traces folder, under Diagnostic Options, the following folder opens: <User>\AppData\Local\Microsoft\Power BI Desktop\Traces.

Navigating to that folder's parent folder displays the folder containing AnalysisServicesWorkspaces, which contains one workspace folder for every open instance of Power BI Desktop. These folders are named with an integer suffix, such as AnalysisServicesWorkspace2058279583.

Inside that folder is a \Data folder. It contains the trace file FlightRecorderCurrent.trc for the current Power BI session. The corresponding workspace folder is deleted when the associated Power BI Desktop session ends.

The trace files can be read using the SQL Server Profiler tool, available as part of the free download SQL Server Management Studio.

Once you download and install SQL Server Management Studio, run SQL Server Profiler.

SQL Server Profiler

To open the trace file, take the following steps:

  1. In SQL Server Profiler, select File > Open > Trace file.

  2. Enter the path to the trace file for the currently open Power BI session, such as: C:\Users\<user>\AppData\Local\Microsoft\Power BI Desktop\AnalysisServicesWorkspaces\AnalysisServicesWorkspace2058279583\Data.

  3. Open FlightRecorderCurrent.trc.

All events from the current session are displayed. An annotated example is shown here, which highlights groups of events. Each group has the following events:

  • A Query Begin and Query End event, which represent the start and end of a DAX query generated by the UI, for example, from a visual, or from populating a list of values in the filter UI.
  • One or more pairs of DirectQuery Begin and DirectQuery End events, which represent a query sent to the underlying data source, as part of evaluating the DAX query.

Multiple DAX queries can run in parallel, so events from different groups can be interleaved. The value of the ActivityID can be used to determine which events belong to the same group.
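
For orientation, the TextData of a Query Begin event contains a DAX query of roughly the following shape; this is a hypothetical illustration with made-up names, not the exact text Power BI generates:

    EVALUATE
    SUMMARIZECOLUMNS (
        'Item'[Category],
        "Total Sales", SUM ( 'Web Sales'[Sales_Amount] )
    )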

SQL Server Profiler with Query Begin and Query End events

Other columns of interest are as follows:

  • TextData: The textual detail of the event. For Query Begin/End events, the detail is the DAX query. For DirectQuery Begin/End events, the detail is the SQL query sent to the underlying source. The TextData for the currently selected event is also displayed in the region at the bottom.
  • EndTime: The time when the event completed.
  • Duration: The duration, in milliseconds, taken to execute the DAX or SQL query.
  • Error: Indicates if an error occurred, in which case the event is also displayed in red.

In the image above, some of the less interesting columns have been narrowed, to allow other columns to be seen more easily.

We recommend the following approach to capturing a trace to help diagnose a potential performance issue:

  • Open a single Power BI Desktop session, to avoid the confusion of multiple workspace folders.
  • Do the set of actions of interest in Power BI Desktop. Include a few additional actions, to ensure that the events of interest are flushed into the trace file.
  • Open SQL Server Profiler and examine the trace, as described previously. Remember that closing Power BI Desktop deletes the trace file. Also, further actions in Power BI Desktop don't immediately appear; the trace file should be closed and reopened to see the new events.
  • Keep individual sessions reasonably small, perhaps 10 seconds of actions, not hundreds. This approach makes it easier to interpret the trace file. There's also a limit on the size of the trace file, so for long sessions, there's a chance of early events being dropped.

Understanding the form of query sent by Power BI Desktop

The general format of queries created and sent by Power BI Desktop uses subselects for each of the tables referenced, where the Power Query Editor query defines the subselect. For example, assume the following TPC-DS tables in SQL Server:

TPC-DS tables in SQL Server

Consider the following query:

A sample query

That query results in the following visual:

Visual result of a query

Refreshing that visual results in the SQL query shown here. As you can tell, there are three subselects, for Web Sales, Item, and Date_dim, that each return all the columns of the respective table, even though only four columns are actually referenced by the visual. These queries in the subselects (shaded) are exactly the result of the queries defined in Power Query Editor. Use of subselects in this manner hasn't been found to impact performance for the data sources so far supported for DirectQuery. Data sources like SQL Server optimize away the references to the other columns.

Power BI employs this pattern because the SQL query can be provided directly by the analyst. It's used "as provided", without an attempt to rewrite it.

SQL query used as provided

Next steps

This article describes aspects of DirectQuery that are common across all data sources. There are certain details that are specific to individual sources. See the following articles covering specific sources:

  • DirectQuery and SAP HANA
  • DirectQuery and SAP BW

For more information about DirectQuery, see the following resources:

  • Data sources supported by DirectQuery