Share on facebook
Share on twitter
Share on linkedin

Split packs into single products in Power Query

Krešimir Ledinski

Krešimir Ledinski

Microsoft certified expert in the field of Business Intelligence. His biggest passions are DAX, M, and data modeling.

Share on facebook
Share on twitter
Share on linkedin

This is going to be an easy to follow article on how to use PowerQuery to easily change source data granularity for only a subset of data. We used a client’s request to split packs into its components as an inspiration to write this post. You can also download Excel workbook and follow along.

Sometimes in the sales data, we can have sales quantity and value of packs with multiple products included. We will show how to split those packs into individual products and how to assign correct sales values and quantities to each individual product within the pack, based on a specific key.

The problem

The Fact table contains multiple products, but it also contains packs of products. For example, we can have Pack X which consists of 3 products: Product A, Product B, and Product C. The quantity of each product within a pack is different. There are 5 A Products, 10 B Products, and 15 C Products. We want to remove packs from the Fact table and only show granularity on the product level. We also need a method to equally split the total value of the pack into individual product values.

We will add another pack named Pack Y, with 10 Products B and 10 Products C. Now, let’s demonstrate how to split packs using our small sample (one day, two packs).

This is our Fact table. Instead of Pack X and Pack Y, we want to have individual products, sales quantity, and sales value for each product inside the pack. At the moment, in the ProductKey column, we have mixed product and pack keys. We want to isolate PackKeys and manipulate them exclusively while leaving real products intact.

Most of the time, when dealing with packs, clients have components list where we can find which products are a part of the pack SKU. Usually, they are in a table similar to the one below.


In this table, we have pack descriptions with pack quantity (sum of individual product quantities inside of a single pack), and a description of each component of the pack with their quantities and prices.

The idea

We would want to merge those 2 tables to get individual products for each pack that is present. To do that, first, we need to get both tables inside a Power Query (queries named FactTable and PacksTable).

The first step is to merge queries by using ProductKey (from the Fact table) and PackKey (from the Components list) columns. Merge is done using the left outer join. Then, we will expand the table with columns [PackQTY], [Product], [ProductKey], and [ProductQTY]. Now, we have a table that contains two columns with product key (ProductKey and ProductKey.1) and product name (ProductName and Product).


We will use expanded columns to create a new conditional column that will populate null values of the ProductKey.1 column with the value of ProductKey. That way, we will get a new column consisting of only ProductKeys.

This techique fails if you have a null value as a standard value in ProductKey column, which is something that should not happen and if it does, it should be cleaned before this step.

Now, we have the correct product keys and product names. To get the correct quantities, we will create a custom column in which we will multiply product quantities and sales quantities if they are packs.


The if part checks if the product is a pack, and if it is, multiplies SalesQTY by ProductQTY. ProductKey.1 is an expanded column, so it contains nulls for all products that are not packs. After we split the sales quantity, we get this table:


We can see that we have the correct quantities and product keys/names. The only thing left to do is to split the SalesValue column to get the correct sales value for each product. There are multiple ways of doing it, depending on the logic you need to apply. If you have products with the same or similar price, you can simply multiply sales value by quantity share of different products inside the pack.

For example, Pack X contains 3 products with quantities 5,10, and 15. We will divide SalesValue (4000) and PackQTY (30), then multiply by ProductQTY (5,10 or 15). This means that each product gets the value proportional to its quantity share. In our case, this would be wrong because products A, B, C in Pack A all have different prices (10,15 and 20).

This means we need to apply different logic. We want to see what is the value share (instead of quantity share) of each product in our packs. This means that we need to get individual product values (multiply columns ProductQTY and ProductPrice) and divide it by pack value (sum of all individual product values). The easiest way to get Value share of each product is to duplicate the Pack table (Components list) and do a group by PackKey to receive totals of each pack.

let
    Source = Excel.CurrentWorkbook(){[Name="Table2"]}[Content],
    #"Added Custom" = Table.AddColumn(Source, "ProductValue", each [ProductPrice]*[ProductQTY]), //Get total value of each product
    #"Grouped Rows" = Table.Group(#"Added Custom", {"PackKey"}, {{"TotalValue", each List.Sum([ProductValue]), type number}}), //Sum values on PackKey granularity
    #"Changed Type" = Table.TransformColumnTypes(#"Grouped Rows",{{"PackKey", type text}})
in
    #"Changed Type"



Now we can merge totals into PacksTable to create [ValueShare] column.


When we get this value share in our fact table, we can now use it to split SalesValue column into individual product sales value.


Finally, we have the correct sales values. It would also be good to add a column that will determine whether the product is coming from packs or is it a regular product. This would be useful to compare previous quantities and values with the new ones. We will add an additional column called PackOrRegular:


After we remove redundant columns and change column types, we get the new fact table:


We can see that we have the correct key, name, quantity, and sales value. As you can see in the comparison below, the sum of old sales value and new sales value needs to be the same, while the new quantity needs to be greater than the old quantity (since 1 pack contains more than 1 product).


The whole M code for this solution:

let
    Source = Excel.CurrentWorkbook(){[Name="FactTable"]}[Content],
    #"Changed Type" = Table.TransformColumnTypes(Source,{{"Date", type date}, {"ProductKey", type text}, {"ProductName", type text}, {"SalesQTY", Int64.Type}, {"SalesValue", type number}}),
    #"Merged Queries" = Table.NestedJoin(#"Changed Type", {"ProductKey"}, PacksTable, {"PackKey"}, "PacksTable", JoinKind.LeftOuter),
    #"Expanded PacksTable" = Table.ExpandTableColumn(#"Merged Queries", "PacksTable", {"PackQTY", "Product", "ProductKey", "ProductQTY", "ValueShare"}, {"PackQTY", "Product", "ProductKey.1", "ProductQTY", "ValueShare"}),
    #"Added Conditional Column" = Table.AddColumn(#"Expanded PacksTable", "NewProductKey", each if [ProductKey.1] = null then [ProductKey] else [ProductKey.1]),
    #"Added Conditional Column1" = Table.AddColumn(#"Added Conditional Column", "NewProductName", each if [Product] = null then [ProductName] else [Product]),
    #"Added Custom" = Table.AddColumn(#"Added Conditional Column1", "NewSalesQTY", each if [ProductKey.1] <> null then [SalesQTY]*[ProductQTY] else [SalesQTY]),
    #"Added Custom1" = Table.AddColumn(#"Added Custom", "NewSalesValue", each if [ProductKey.1] <> null then [SalesValue]*[ValueShare] else [SalesValue]),
    #"Added Conditional Column2" = Table.AddColumn(#"Added Custom1", "PackOrRegular", each if [ProductKey.1] = null then "Regular" else "Pack"),
    #"Removed Other Columns" = Table.SelectColumns(#"Added Conditional Column2",{"Date", "NewProductKey", "NewProductName", "NewSalesQTY", "NewSalesValue","PackOrRegular"}),
    #"Changed Type1" = Table.TransformColumnTypes(#"Removed Other Columns",{{"NewProductKey", type text}, {"NewProductName", type text}, {"NewSalesQTY", Int64.Type}, {"NewSalesValue", type number}, {"PackOrRegular", type text}})
in
    #"Changed Type1"


Thank you for reading!
If you have any comments, please leave them below!

Share on facebook
Share on linkedin
Share on twitter

LEARN POWER QUERY

Learn and improve your PowerQuery skills with
Exceed Academy

COMMENTS

Subscribe
Notify of
guest
3 Comments
Newest
Oldest Most Voted
Inline Feedbacks
View all comments

GET LATEST BI NEWS TO YOUR INBOX

Receive the latest updates on all business analyst news across all platforms.

By subscribing you are agreeing to our Privacy Policy.

Related blog posts

Why should you start using Power Apps?

In this article, we will take a look at what is Power Apps, why should we use it, and when organizations can benefit from custom apps created in the Power Apps platform.Many organizations are not yet familiarized with Power Apps capabilities, and the ways it can improve their agility, modernize and automate manual processes.   What is Power...

Read more

Merging with date range using Power Query

In this blog, we will see how to merge two tables in Power Query based on key column and a date range. Merging with multiple matching columns is straightforward and can be achieved in Power Query by selecting merge queries option and holding CTRL key when choosing matching columns like...

Read more

How to process multiple folders with a single PowerQuery script

Let us imagine the following scenario. We have multiple clients sharing their sales data with us. They store the data in dedicated folders that reside in different places. The number of clients, as well as the number of files for each client, will vary across periods. We wish to create...

Read more

Dynamically expand table or record columns in Power Query

In this article, we will show how to solve a common problem of expanding nested tables or records in Power Query, when the column names of the nested objects vary in numbers. In the second part of the article, we will show a more robust technique for dealing with changing...

Read more
3
0
Would love your thoughts, please comment.x
()
x
Scroll to Top