Nested functions and Table.Buffer()

In this blog, we will explain nested functions in Power Query, as well as how to optimize these functions with buffering. By using nested functions and buffering, we will combine most of the advanced M coding.

Business Case

We will demonstrate the business problem we want to solve with the following snapshot from the Contoso database.

As we can see in the tables above, there is no key column on which we could do the merging, but we still need to merge those 2 tables. We are going to merge them using “partial match”, meaning we will search for keywords from the Categories table in each name in the [Product] column of the FactSales table.

Following is the M logic we will try to implement.
For each product in the FactTable, we want to iterate through the Categories table and compare if the product name contains any of the brand keywords from the Categories table. If it does, then those rows of the Categories table should be kept in a nested table of the current outer row iteration (the one of the FactTable). (Image below)

I know that we can try fuzzy merging or other techniques, but I find the following technique best for explaining both each keyword and nested functions.

Each Keyword

Before revealing the code for the partial search technique, let’s explain the each keyword.
Let’s say that we want to filter the FactTable to only show products with [SalesAmount] <1000. We can use filter option from within the PowerQuery UI. Filtering will produce the following M code as shown in the formula bar in the picture below.

If we observe the M code of the step, we can see that it consist of Table.SelectRows() function which has 2 arguments.
The first argument is the name of the table, and the second is the condition as a function. At the moment the second argument seems unclear, but we will soon demystify what it represents.

The keyword each represents the introduction of iteration. Table.SelectRows() is an “iterative” M function which creates a sort of a row context (similar to DAX row context) in which the same condition is compared for each row in the table. What we need to understand at this point is that each keyword is a sugar syntax for “Iterative function invoke”. If we remove the sugar syntax, the formula above becomes:

= Table.SelectRows(FactTable, (_)=> _[SalesAmount] < 1000)

This is how PowerQuery reads each keyword.

Each creates a “temporary” record type variable in which the underscore symbol represents the current row of the table iteration.
After the variable containing the current row is created, it is then called from within the current iteration with the use of the underscore symbol followed by the name of the column that is being used as a condition. Because underscore is a default variable for each syntax, it can be omitted when calling a column name from the record. So the following syntax is also acceptable.
= Table.SelectRows(FactTable, (_)=> [SalesAmount] < 1000)

Most of the time we will invoke iterations only in one environment (we discussed environments in the M part of the course), meaning only one table iteration in a step. In that case, the each sugar syntax is completely acceptable and much easier to understand and use. The issue starts when we need to introduce another, nested row iteration.

In our example, we need to, after we start the iteration over the FactTable, for each row in the FactTable, introduce another iteration, this time over the Categories table. If we try to create such an expression using the each keyword, we will be out of luck.

Table.AddColumn() and Table.SelectRows() are both “iterative” M functions, meaning they create row context iterations over the tables supplied as the first argument to the function.

With the following M formula, we created a nested iteration. The outer iteration is introduced with Table.AddColumn() function and the inner one is introduced with Table.SelectRows(). Keyword each in the 3rd row is iterating over the sales table, and the inner each (4th row) is iterating over the Category table. The problem becomes evident in the Table.Contains() function, where we need to search the [BrandName] of the current inner iteration (which is a part of the Categories table), against the current product name of the outer iteration over the FactTable.

This becomes even more evident if we remove the sugar syntax!

After the sugar syntax is removed, we can clearly see that now both [Product] and [BrandName] columns are trying to access the inner, category table record. Because of the underscore variable name conflict, the outer underscore, which is holding a record of the outer iteration, is not reachable from within the inner iteration. Therefore, the column [Product], which is the outer iteration record column, is non existing in the inner environment, producing an error as shown in the picture above.

Correct Approach to Nested Function Calls

Now let’s show the correct approach.

= Table.AddColumn(
           FactTable, "Category",
                      (OuterRow)=> // introduced with Table.AddColumn function
                     Table.SelectRows(CategoryTable,
                                     (InnerRow) => //Nested iteration introduced with Table.SelectRows function
                                     Text.Contains( OuterRow[Product] , InnerRow[BrandName] )
                                      )
                  )

Even though underscore is a default variable holding a record of the current row iteration, you are allowed to supply any other name as a variable that will enter the function.
In case we supply a different name than the default _, then we must always prefix that name before calling a column from the record.

This time, instead of writing each for the inner iteration, we created a function invoke with the InnerRow argument. This means that each row of the inner iteration will be bound to the InnerRow variable. In the condition part of the Text.Contains() we also bound the [BrandName] column to the InnerRow variable.

After we change the outer variable from default underscore to OuterRow, we are able to clearly see that the [Product] column is now bound to the current row of the outer record iteration created by Table.AddColumn() function, while the [BrandName] column is bound to the inner iteration of the Table.SelectRows() function.

Now there is no conflict in defining function argument names, so the outer environment variable (OuterRow) can be reached from within the inner iteration (inner environment).

This is extremely useful in case we need to introduce nested functions, but it also comes at the expense of performance. This is an extremely expensive operation, especially in case the inner table is big. This is also an excellent place to use Table.Buffer() function to load the inner context table into memory for a better execution time!

Using Table.Buffer() to provide better performance

The last part of this article is about the Table.Buffer() function. Nested iterations are very problematic in PowerQuery, since they have a huge impact on the query loading speed. If we were to load this query to the data model as is, it will take a couple of minutes to lead.

What we can do, and should do every time we have nested iterations, is to try to use Table.Buffer() function to push nested iteration into a memory, optimizing the performance.
There are also List.Buffer() and Binary.Buffer() functions that work on the same principle. And what is the principle? It’s the principle of storing the table, list, or binary object in RAM, making it a stable reference isolated from external changes during evaluation. That means faster reads of data and faster performance.

In our case, it would be a good idea to push the category table to the memory by encapsulating it with a Table.Buffer() function. Following is the whole M code of the solution.

let
    FactTable = Fact,
    CategoryTable=Table.Buffer(Category),

    SegmentColumn = 
        Table.AddColumn(
            FactTable, "Category",             
            (OuterVariable)=>            
                Table.SelectRows( 
                    CategoryTable, (InnerVariable) => 
                        Text.Contains( OuterVariable[Product] , InnerVariable[BrandName] )
                )
            )
in
    SegmentColumn

Remember not to simply wrap the table inside of the Table.Buffer() when the table is already nested, because that will push the table to memory on each iteration, therefore we will experience no performance gain. Instead, you have to create a variable outside of the inner iteration and push it to the buffer function before starting the iteration. The table is now properly buffered and we can expect a huge performance boost from it.

Table.Buffer() use cases

In which situations can we benefit from Table.Buffer() function?

When our table or list will be referenced multiple times (for example in List.Generate() function).
Whenever using a recursive functions
Before merging in certain occasions
Before passing tables to functions

Be careful when using buffering with foldable sources since buffering breaks query folding which can result in worsen performance. Also, if your tables are big, think twice before buffering, because PowerQuery works with a “limited working set” meaning it limits the amount of data it stores in RAM and writes the remaining part on the hard drive.

Conclusion

Table.Buffer() is kind of a wild beast, you never know if it will produce a faster result, so it’s a bit of a trial and error. But when it does hit, it hits hard and you can expect to increase loading speed multiple times. So whenever you encounter a decrease in performance, try to utilize the buffering option.

Where Buffering is not a solution, or you should think twice before using it?

If you use Table.Buffer() as a first step before referencing it from multiple queries. The data that is buffered cannot be shared between separate query executions – so the buffering happens multiple, once for each time the referenced query is evaluated.
When you are working with foldable sources.

That’s all for this article! Hope you enjoyed reading it. In case you have any questions, please post them in the comment section!

Table of Content

LEARN POWER QUERY

Improve your PowerQuery skills with
Exceed Academy

COMMENTS

4 Comments

Newest

Oldest Most Voted

Inline Feedbacks

View all comments

GET LATEST BI NEWS TO YOUR INBOX

Receive the latest updates on all business analyst news across all platforms.

New Power BI training held for client Bugatti Rimac

In the past 2 weeks, we held a new Power BI training for the client Bugatti Rimac for a total duration of 24 hours. The training was held again at the client’s on Campus, but this time for another department, the Project Logistics team. The participants of the training with...

U protekla 2 tjedna održali smo novu Power BI edukaciju za klijenta Bugatti Rimac u sveukupom trajanju od 24 sata. Edukacija je ponovno održana kod klijenta na Kampusu, ali ovaj put za drugi odjel, tim Projektne Logistike. Polaznici edukacije uz pomoć Power BI sustava žele automatizirati svakodnevne zadatke i stvoriti...

Last week we went to Osijek to hold a PowerApps training for Žito d.d. The training took place at the client’s office, lasting 2 days, with a total duration of 8 hours. The participants of the training have been using PowerApps in their business for some time, but they wanted...

Prošli tjedan svratili smo do Osijeka kako bismo održali PowerApps edukaciju za Žito d.d. Edukacija se održala u 2 dana kod klijenta u uredu, u sveukupnom trajanju od 8 sati. Polaznici edukacije već neko vrijeme koriste PowerApps u svom poslovanju, no željeli su usavršiti svoje vještine u samostalnoj izradi aplikacija,...

Nested functions and Table.Buffer()

Business Case

Each Keyword

Correct Approach to Nested Function Calls

Using Table.Buffer() to provide better performance

Table.Buffer() use cases

Conclusion

Table of Content

Table of Content

LEARN POWER QUERY

COMMENTS

GET LATEST BI NEWS TO YOUR INBOX

Related blog posts

New Power BI training held for client Bugatti Rimac

Nova održana Power BI edukacija za klijenta Bugatti Rimac

Held PowerApps training for Žito d.d.

Održana PowerApps edukacija za Žito d.d.

Usluge

Privatnost

Newsletter