0 ratings0% found this document useful (0 votes) 36 views12 pagesQuery Folding
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content,
claim it here.
Available Formats
Download as PDF or read online on Scribd
Query Folding in Power Query to Improve Performance
[Link]/) MENU
i] :@x
By: Koen Verbeeck (/sqlserverauthor/114/koen-ve ) | Comments (11) | Related:
Microsoft Excel Integration (/sal-server-tip-category/114/microsoft-excelintegration/) (Chat With This Website
SQL Server Scripts
Need a better way to manage and monitor your SQL Server backups? Download this free setup of SQL Server backups
scripts now.
Download SQL Server Backup Scripts
([Link]
Problem
Power Query allows you to extract and manipulate data from various sources. When you define transformations on the data, itis
possible that those transformations are sent back to the source to improve performance. This feature is called query folding and its
very important for in Power Query. In this tip we wil discuss query folding, how you can take advantage from it and how to make sure
query folding takes place.
Solution
Power Query isa lightweight ETL-lke tool inside the Power BI stack, You can find itn multiple software applications:
+ As the Query Editor in Power BI Desktop
+ As the Get Data editor in Excel (previously Power Query was a seperate add-in for Excel)
+ As the Modern Get Data experience in Analysis Services Tabular (since SQL Server 2017)
Most of the screenshots in this tip are made in Excel, however all functionality is very similar between those applications.
Depending on the source, Power Query can send some of the transformations to the source. This means that those transformations
take place at the source, before any data is sent to Power Query. This is a big performance improvement, since Power Query has to
import less data, This process is called query folding
However, not all sources support query folding — it's impossible to push transformations to a flatfile for example — and not all
transformations can be sent back to the source. For example, filtering rows can be sent back to SQL Server (in the form of a WHERE
lause), but a transformation in Power Query that removes the 5 bottom rows cannot,
There isn't yet an official list of sources that support query folding, but I've composed the following list from various resources:
+ Relational sources (SQL Server, Oracle, ...). They support most Power Query functionality.
+ OData sources (such as @ SharePoint list for example and the Azure Marketplace)
+ Active Directory
+ Exchange+ HDFS, Folder-Files and [Link] (for basic operations on paths)
‘There also isn't documentation available for transformations that can fold back, but these transformations can when using the right,
source type:
+ Filtering (on rows or columns)
+ Joins
+ Aggregates and GROUP BY
+ Pivot and unpivot
+ Numeric calculations
+ Simple transformations, such as UPPER
Using Query Folding
The user interface in Power Query doesn't tell you if Power Query uses query folding or not. This means you have to use a monitoring
tool to capture the statements sent to the source. Let's itillustrate with a case where we read the data from SQL Server and monitor
the queries with SOL Server Profiler (/sql-server-tip-category/83/profler-and-trace)),
We're reading data from the FactinteretSales table from the AdventureWorksDW2012 sample database. The first step is getting the
CategoryName into the table, We can do this by expanding the DimProduct column and retrieving the DimSubCategory column,DimDate(OrderDatekey) 4°) DimDate(ShipDatexey) |4| B
value!
value!
ie ( LargePhoto
=~ (1 EnalishDescription
ie [1 FrenchDescription
tue chineseDescription [ve
ome [0 ArabicDescription [va
Se (1 HebrewDescription [va
eee (1 thaiDescription |va
Value (1 GermanDescription lva
oie] (11 JapaneseDescription lve
value (7 turkishDescription lva
value} CO startDate |va
value’ C1 Endbate |va
value C1 Status hve
Vatwe DimProductSubcategory lve
value Co FactinternetSales lva
value C1 FactProductinventory ba! |va
Value Co FactResellerSales Iva
value) [¥| Use original column name as prefix |ve
value! |va
oe OK Cancel [V8
Value ve
‘Next we expand this newly retrieved column and get the DimProductCategory column. Then we repeat this and retrieve the
EnglishProductCategoryName column. The applied steps in the query settings pane should look like this:
4 APPLIED STEPS
Source 2
Navigation %
Expand DimProduct ca
Expand DimProductDimProduc... ¢
% Expand DimProduct DimProduc..
Now we do the same with the DimDate (OrderDateKey) column to retrieve the CalendarYear column from the DimDate dimension.
‘Then we filter this column for the values 2007 and 2008. I's possible the filter menu only shows the value 2005 because the data is
sampled.$| Sort Ascending
Z| Sort Descending
%
Number Filters
=
Yi (Select All)
Z 2005
My Ust may be incomplete. Load more
OK Cancel
In that case, click on "Load more to sample more rows.
‘Next we are going to choose the columns we want to keep. Click Choose Columns in the home ribbon and select the columns
‘SalesAmount, OrderQuantity, CalendarYear and EnglishProductCategoryName,Choose Columns
Choose the columns to keep
“| DiscountAmount
_] ProductStandardCost
_| TotalProductCost
“ SalesAmount
_| TaxAmt
_] Freight
—] CarrierTrackingNumber
_| Customer?ONumber
(J OrderDate
| DueDate
~] ShipDate
“| DimCurrency
_] DimCustomer
_| DimDate(DueDateKey)
| Calendarvear
_] DimDate(ShipDateKey)
/ EnglishProductCategoryName
(1 Dim?romotion
| DimSalesterritory
_] FactinternetSalesReason
OK Cancel
‘The next step is to group by the columns CalendarYear and CategoryName and aggregate on SalesAmount and OrderQuantity. This
can be done with the Group By transformation which can be found in the Transform ribbon,Group By...
Specify the columns to group by.
Group by *
CalendarYear ~|-
New column name Operation Column +
OrderQuantity (sum >| | OrderQuantity =i) *
SalesAmount (sum =| | SalesAmount =
OK Cancel
Furthermore, we are keeping only rows where the sum of orderquantiy is greater than 5000.
Filter Rows
Show rows where: OrderQuantity
[Rarentertien
® And © Or
As the last step, we are going to change the ProductCategory to uppercase.
OK Cancel= [Link](#"Grouped Rows", each [OrderQuantity] > 5000)
(Y! SalesAmount |
Esl Remove
Remove Other Columns
Duplicate Column
Remove Duplicates, bea
Remove Errors 5393 201524.54
Change Type
dense >| towercase
2) Replace Values. UPPERCASE
Replace Errors, Capitalize Each Word
Trim
Split Column »
Group By...
Fil »
Clean
Length
JSON
XML
ao Wee
Unpivot Columas
‘The Applied Steps should now ook tke this:
4 APPLIED STEPS
Source
Navigation
Expand DimProduct
Expand [Link]...
Expand DimProduct. DimProduc.
Expand DimDate(OrderDateKey)
Filtered Rows
Removed Other Columns
Grouped Rows
X Filtered Rowst
Beeaeeeeere s
When we load the data into Excel, a query is sent to SQL Server to fetch the data and we can monitor this with SQL Server Profiler
{(scl-servertin-category/83/profler-and-trace/). Alot of queries will pop-up, most of them to retrieve metadata, but one query is,
responsible for fetching the data in our example.seL:varchstarting select [Link] [INDECMWE], [Link] ... 53. 2015-04-29, adventure
Recs searting ence sp_reset_connection 53. 2015-04-29 00000... Adventure
Audit: Logout 53. 2015-04-28 adventure
audit Login network protocol: Upc set quote... 53 2015-04-29 20: adventure
sqLsbarchstarting gelect ().[CalendarVear] a [Oinda...] 53. 2025-04-29 Adventure
audit Login mietwork protecol: LPC set quate... 96 2015-04-29
sec startin ies arse eerie Se patente enone Advantuwe
Ft] {Engligherociectuegar dre] 25 [Sinoroscebharrocuceovbeatagory dint
_ ames Seen
ce 8 aera
fotaerquanety2
roe ae
The query looks like this:
‘select [_].[Calendarvear] as [DinDate(OrderDateKey).Calendarvear],
upper ([_]. [englishProductCategoryNane]) as [DinProduct .[Link]..EngLishProdt
[1.[salesanount2] as [SalesAnount],
[1 [orderquantity2] as [Orderquantity]
fron
G
select [_]-[Calendarvear],
LJ. LenglishproductCategoryNane],
[L).(salesanount2],
[L]-[orderquantity2]
fron
‘
select [rows]. [CalendarYear] as [CalendarYear],
[rows]. [EnglishProductCategoryNane] a5 [EnglishProductCategoryNane],
sun([rows]. [SalesAnount]) a5 [SalesAnount2],
sun([rows].[Orderquantity]) as [Orderquantity2]
fron
te
select [_].[Orderquantity],
(LJ. [salestnount],
(LJ. LenglishProductcategoryNane],
(L].(calendarvear]
fre
‘
select [Souter].[onderquantity],
[Souter] .[SalesAnount],
I's a bit unwieldy and in fact alot of columns of the inner subqueries are unnecessary because we filtered them out in a later step,
But you can see that all of the transformations are present in the query:
+ The final SELECT only selects the columns we need
+ Expanding to the ProductCategory and Date dimension was translated into LEFT OUTER JOINS
‘The function UPPER Is used on the ProductCategory column
‘The filters on CalendarYear on OrderQuantity are implemented as WHERE clauses
+ The Group By obviously became a GROUP BY in SQL and the corresponding aggregates are calculated with SUM of coursePreventing Query Folding
‘Some actions can prevent query folding from taking place. Sometimes it's even possible you want to prevent query folding for some
reason, These are the most common reasons why query folding isn’t taking plat
+ Using a source that doesn't support query folding.
+ Using the .Buffer() M function, either on a list or on a table. This function reads all data in
+ Using a custom SQL statement. If you write your own SQL statement to fetch the data, a
will not use query folding.
+ Some transformations. For example, filtering with the date filter “This Month” prevents query 1o1aing as in we current release inere
is no SQL equivalent fiter implemented yet. This might change though in future releases,
+ Some privacy level settings can prevent query folding,
+ Using "Removing rows with errors” prevents query folding.
+ Defining and using your own functions in Power Query can also prevent query folding,
Let's illustrate the date fiter problem with an example. When reading the FactintemetSales table, expand the DimDate
{OrderDateKey) to retrieve the FullDateAlternateKey. Filter this column for This Month,
| bimDate(shipdatexey) 4%) DimProduct 4a] DimPromotion [4] imé
$1 Sort Ascending Yeive Value value valu
il somnenaaiy Value Value Value Valu
Velue Value Value Valu
Value Value Value Valu
% Yelue Value Value Valu
> eHiaie Value valu
Value valu
Be Betate, Value valu
(Select Al ater. Value valu
a
7/1/2005 Between, Value valu
(| 7/2/2005 ata Value valu
(4 7/3/2005 eee Value valu
i Taos in the Previous. —— ;
7/5/2008 ; ext Mont f
F 7/6/2005 7 This Month l,
T7105 7 Last Month C
Zee Quarter > January hn
(4 7/9/2005 me i
. 5 ebruan
A 7/10/2005 Yen 4
Next remove all columns except OrderQuantity and FullDateAltemateKey. The Applied Steps look tke this:4 APPLIED STEPS
Source
Navigation
Expand DimDate(OrderDateKey)
Filtered Rows
X Removed Other Columns
When we take a look at Profiler, we can see that no transformations were pushed down to SQL Server, even though removing
columns can be folded back.
SQL: Batchstarting Select [outer] Tproductkey], } $2. 2015-04-29 [Link]
SeTecy [Souter (rroducekeyT,
[Souter]. [or derbateKey!
[Souter]. [DueateKey],
[Souter]. [Shi ppat exey]
[Souter]: [Customerkeyl
ffouter | fpronctonkey]
[Souter]: [currencykeyl
Souter] [salestetritorykey! ,
Bouter]: featecorderngnber
[Souter]: [Salesordertineunber]
fouter}- fev sionnunber J,
Souter]: [orderquantity)
[Souter]: [unitprice],
[Souter]: [exterdedamount] ,
[Souter]: [Unitericepiscouncrct)
[Souter] [piscountamount ]
[Souter]. [productstandardcost] ,
[Souter |: [roralprogucrcost]
Souter] [salesanoune]
[Souter]. [Taxame],
fggucer | fereiche),
Souter]: [carrier Track’ ngNunber]
[Souter]: [Customer roNunber |
[Souter]: forderpatel.
[Souter] [Duebatel .
[Souter] [shippate]
[Stoner]. [rul ipateaiternatekey]
fron [dbo]. [Factinternersales] as [Souter]
aft outer join [dbo] [Dimbate] as [Sinner] on ([Souter]. [orderDatekey] = [Stnner]. [oatekey])
Power Query will read the entire fact table and then perform the transformations inside its own engine.
I's important you try to do steps first where query folding can take place and that you put steps that prevent query folding as late in
the chain as possible, in order to maximize performance. If one of your first steps prevent query folding, all the subsequent steps will
not be folded back
Update - Native Query
In the most recent versions of Power Query, the option has been added to view the "native query". This means you can view for
‘example the generated SQL statement without running a trace as showed in the previous sections. Checking if query folding takes
place is simply done by rightclicking on a step in the query editor and by verifying if the "view native query” option isn't greyed outPeat ais)
eae
EMC}
ees
Edit Settings
fare ron)
Rename
Delete
Delete Until End
Insert Step After
Move Up
Move Down
Extract Previous
View Native Query
Propel
If*View Native Query" is greyed out, it means query folding isn't taking place for that step. The tip Power BI Native Query and Query
olding_{/salservertin/4563/power-bi-native-query-and-query-folding/) goes deeper into this concept.
Conclusion
In this tip we introduced the concept of query folding in Power Query and how important itis for performance. We showed when itcan
take place and what possible limitations there are, Important to remember is putting steps that can be folded back tothe source fst
in the lst of applied steps. Steps thal prevent query folding should be applied as late as possible.
Next Steps
+ Check out the following tips about Power Query:
© Extracting a SharePoint List with Power Query (/salservertip/3625/extractin,
shows how to read OData sources.
‘+ The tip Using the New OData Source in SOL Server Integration Services (/sq
server-integration-services/) explains how you can read a SharePoint list wth SSIS and OData
+ There are some resources available on Query Folding:
© Join Conditions in Power Query, Part 2: Events-In-Progress, Performance and Query Folding
{ttps://[Link]/2014/06/04/join-conditions-i
folding!) and Data Explorer (Power Query) -Where Does The Real Work Get Done?
(htfps//ewebbbi [Link]/2013/03/06/data-explorarwhere-does-the-realwork-get-done/) by Chris Webb. In his Power
‘Query book (http:/ www. [Link]/Power-Query-BI-Excel/dp/1430266910/ref=sr_1_17ie=UTFB&gid=14303398308sr=B-
© Filtering in Data E jata-explorer!) by Matt Masson
© Function Folding in #PowerQuery (htto:/igeekswithblogs netidarrengosbellarchive/2014/05/16/function folding
[Link]) by Darren Gosbell
© The tip Power BI Native Query and Query Folding (/sal
you can easily check if query folding is taking place.
ervertAbout the author
(slserverauthort!4Ikoon-verbesc!} Koon Verbeock is a seasoned business intligence consultant al AE. He has over a decade of experience
wih the Microsoft Data Platform in numerous industries. He holds several cetfcaions and ea profic writer contibuting content abaut SSIS,
ADF, SSAS, SSRS, MDS, Power Bl, Sowfake and Azuc services. He has spoken at PASS, SQLs, dataMinds Connect and delivers
webinars on [Link], Koen has bean awarded the Microsoft MVP data platform awar fr many years
a View all my tips (/salserverauthor/t14ikown-verbeeck)