Share on facebook
Share on twitter
Share on linkedin

Using M to dynamically change column names in PowerQuery

Krešimir Ledinski

Krešimir Ledinski

Microsoft certified expert in the field of Business Intelligence. His biggest passions are DAX, M, and data modeling.

In this article, we are going to talk about cleaning and transforming column names dynamically and in a bulk. The article was inspired by a request from a client who had issues exporting data from its data warehousing program to .csv files.

Out of an unknown reason, program was adding random spaces and non-printable characters before or after column names. There was no predictable logic on where it could introduce additional characters in the export. Since the export was scheduled to run every day, it would be a nightmare to have to manually find all column name misspells and correct them.

Using the general approach of demoting headers -> transposing tables -> cleaning the first column -> transposing again was not an option because files were big, therefore double transpose would take forever to process.

Following is the solution we created that could be further enhanced to provide even more flexibility with dynamic column names renaming. We used records, lists, and an iteration to reach the solution, so this should be a pretty educational topic.

For the demonstration purposes, we will use an extract from the Contoso database. If you wish to follow along or use the final function in your solutions, you can download the Excel file here.

The problem

Every time program had exported the data, column names exported with a random number of spaces. All other structures of the file were correct.

Since we needed to consolidate every daily export in a single database (we are talking about 365 different exports for a single year of data), we needed to create a function that will dynamically clean column names from additional spaces.

The solution

We will need to use Table.RenameColumns() function, but with a dynamic argument holding column names to change. If we try to change the column names manually, this is the code we get:

#"Renamed Columns" = Table.RenameColumns(Source,{{"       ProductName", "ProductName"}, {"      Manufacturer", "Manufacturer"}})

Now, if we focus on the second argument of the Table.RenameColumns() function, we can see that it is returning a nested list of lists! If we run that expression, this is how it looks like in PowerQuery:

Each nested list is holding the original and new name of the column. If we want to make our rename columns function dynamic we need to alter the nested lists argument so that it accepts wrong and correct column names for each column.

The following M code will give us the old and new, cleaned column names.

let
    Source = Excel.CurrentWorkbook(){[Name="Table1"]}[Content],
    DemoteHeaders = Table.DemoteHeaders(Source){0},
    CreateTableFromRecord = Record.ToTable(DemoteHeaders),
    DuplicateOldColumnNames = Table.DuplicateColumn(CreateTableFromRecord, "Value", "Value - Copy"),
    RenameColumns = Table.RenameColumns(DuplicateOldColumnNames,{{"Value", "OldColumnName"}, {"Value - Copy", "NewColumnName"}}),
    TrimmedText = Table.TransformColumns(RenameColumns,{{"NewColumnName", Text.Trim, type text}})
in
    TrimmedText

Let’s explain the most important parts about the script so far:

  • {0} in DemoteHeaders step is used to create a record from a table holding only the first row from the demoted headers source table. Table indexing starts from 0, so running this expression will yield the following record

  • After we transformed that record to a table, we duplicated the values column which is holding the old column names.

As a side note, the same could be achieved by using Table.ColumnNames() function which would give us a list of all table column names.

 
  • In the last 2 steps, we renamed both columns and cleaned values in the NewColumnName column.

I only used Text.Trim transformation over the NewColumn, but you can also use any other transformation to further clean column names, like Clean, Text.Proper, etc.

Creating a nested lists argument

Now we have 2 columns holding wrong and correct column names that we need to transform to list of nested lists.
The easiest way to achieve that through the PowerQuery user interface is to add a new custom column and write the following expression.

={[OldColumnName],[NewColumnName]}

{} is a syntax for creating a list object, and inside of that object, we added old and new column names separated with a comma.

 

Now, let’s focus on the formula bar, specifically each keyword.
The keyword each is a sugar syntax for iterating function invoke that is holding a single underscore argument that is entering the function.

Without the sugar syntax, this would be the equivalent of the each keyword.

#"Added Custom" = Table.AddColumn(TrimmedText, "NestedLists", (_)=> {_[OldColumnName],_[NewColumnName]})

The underscore is a record type argument holding all the column values of the current row of an iteration.

So for each row in a table, the underscore will push the currently iterating row old and new column name inside of the list structure that was introduced with the {} bracket syntax.

To get a nested list of lists, we also need to transform that NestedLists column to a list.

The easiest way of drilling to a single column from a table and transforming it to a list is to write the name of the column inside of round brackets right after the expression that is returning a table.

 

 

Remember! We can transform rows from a table to a record type using the curly brackets syntax.
On the other hand, we transform tables to lists by writing the name of the column in round brackets.
If we were to combine round and curly brackets, we would drill into a single cell from that table (picture below).


Creating a function from the query


Receiving an adequate list argument for the Table.RenameColumns() function was the harder part of development. Now that we have the correct argument form, we need to create a function that would accept a table variable with messy column names and clean it with the logic we already set in the list argument.

Following is the complete M function code:

(TableWithDirtyNames as table) as table=>  //InputTable variable
let

    ListObjectWithCleaningLogic =  //We put the whole logic of list creation inside of this variable
        let
            Source = TableWithDirtyNames, // Source should point to the InputTable variable
            DemoteHeaders = Table.DemoteHeaders(Source){0},
            CreateTableFromRecord = Record.ToTable(DemoteHeaders),
            DuplicateOldColumnNames = Table.DuplicateColumn(CreateTableFromRecord, "Value", "Value - Copy"),
            RenameColumns = Table.RenameColumns(DuplicateOldColumnNames,{{"Value", "OldColumnName"}, {"Value - Copy", "NewColumnName"}}),
            TrimmedText = Table.TransformColumns(RenameColumns,{{"NewColumnName", Text.Trim, type text}}),
            #"Added Custom" = Table.AddColumn(TrimmedText, "NestedLists", (_)=> {_[OldColumnName],_[NewColumnName]})[NestedLists]
        in
            #"Added Custom",
    CleanColumnNames = Table.RenameColumns(TableWithDirtyNames,ListObjectWithCleaningLogic) //this step is taking InputTable as a starting point and using list variable 
                       to transform all column names at once
in
    CleanColumnNames


Additional explanations of the function:

  • We have only a single table argument entering our function (TableWithDirtyNames).
  • We put the whole logic of nested lists inside of a ListObjectWithCleaningLogic step. This way we are making the code easier to maintain and change in case we need to add more transformation logic to the new column names.
  • The Source step of the inner environment (environment of the ListObjectWithCleaningLogic variable) should also point to the function input variable.
  • In CleanColumnNames step we called the table as it was at the source step (before we created the nested lists step), and as a second argument to the Table.RenameColumns() function we provided the nested lists applied step.

Wrapping up


We used this technique to dynamically change column names without transposing the table structure. You can further enrich the function in case you need more control over the new column names. In that case, you would need to adjust the inner environment variable. That means that other transformations of the values in the NewColumnName column should be added after the TrimmedText applied step.

We used list and record objects in creating the solution so hopefully you got a clearer picture on how to use them and where. We also glimpsed at the each keyword so overall I hope this article was clear enough and educational.

We also made a video for easier understanding. You can check it below.


In case you have any questions, please feel free to post them down below!

Share on facebook
Share on linkedin
Share on twitter

LEARN POWER QUERY

Learn and improve your PowerQuery skills with
Exceed Academy

COMMENTS

Subscribe
Notify of
guest
2 Comments
Newest
Oldest Most Voted
Inline Feedbacks
View all comments

GET LATEST BI NEWS TO YOUR INBOX

Receive the latest updates on all business analyst news across all platforms.

By subscribing you are agreeing to our Privacy Policy.

Related blog posts

Sorting in Matrix created with calculation groups (PowerBI)

This is the second part of the blog series dedicated to showing workarounds for currently unsupported matrix visual features. In the first part, we explained how to add conditional formatting only for the specified columns, and in this article, we will explain how to introduce sorting. Below, you can test...

Read more

How to track changes in the table in Power Query

If you ever had a need to check if anything changed in the new table compared to the previous export, no matter in which column, then this is the article for you. We have created an M script in Excel that uses the left anti join to check for data...

Read more

Split packs into single products in Power Query

This is going to be an easy to follow article on how to use PowerQuery to easily change source data granularity for only a subset of data. We used a client’s request to split packs into its components as an inspiration to write this post. You can also download Excel...

Read more
2
0
Would love your thoughts, please comment.x
()
x
Scroll to Top