New Ways To Concatenate Text In Excel 2016 With CONCAT() And TEXTJOIN()

I don’t normally blog about Excel topics outside of Power Query and Power Pivot, but I think anyone who has ever done any serious work in Excel has found how difficult it is to concatenate more than two text values (although there are plenty of blog posts like this one showing how to do it). Well, not any more – Microsoft have finally got around to addressing this problem with two new Excel functions, TEXTJOIN() and CONCAT(). The announcement, plus news of other new Excel functions and a new funnel chart type, is here:

https://support.office.com/en-us/article/What-s-new-in-Excel-2016-for-Windows-5fdb9208-ff33-45b6-9e08-1f5cdb3a6c73

First, the bad news: at the time of writing you’ll only be able to use these functions in Excel on the desktop if you’re using the click-to-run version of Excel 2016 that you get through an Office 365 subscription (ie the version installed through the Office 365 portal that gets updated by Microsoft automatically – probably not the same version that you’re running on your desktop if you work for a big company). It is available in Excel Online and Excel Mobile too. I guess they’ll appear in the regular, Windows-installer version of Excel 2016 in a service pack at some point in the future.

How about some examples? First of all, CONCAT() is a successor function for the old CONCATENATE() function – as far as I can see it does everything CONCATENATE() does but crucially also supports ranges as inputs as well as individual text values. So

=CONCAT("The ", "cat ", "sat ", "on ", "the ", "mat.")

Returns

The cat sat on the mat.

image

…but also, if you have each of your words in different cells (nb I’ve added spaces at the end of each word here), you can concatenate all the values in a range like this:

=CONCAT(B2:B7)

image

The TEXTJOIN() function is more flexible and I suspect will be very popular. It gives you two benefits over CONCAT():

  1. The ability to specify a delimiter – a character or characters (for example a comma or a space) to insert between each item of text you want to concatenate
  2. The ability to ignore empty values

Some examples…

First, using a space in the first parameter:

=TEXTJOIN(" ",FALSE,"The","cat","sat","on","the","mat.")

Returns once again

The cat sat on the mat.

…without having to add spaces to the end of each word, as I did with the CONCAT() example above:

image

And

=TEXTJOIN(",",FALSE,"The","cat","sat","on","the","mat.")

Returns

The,cat,sat,on,the,mat.

image

Finally, here’s an example of how TEXTJOIN() can be used with a range with the second parameter being set to TRUE to ignore empty cells in a range:

=TEXTJOIN(",", TRUE,B2:B9)

With the data in the screenshot below again returns

The,cat,sat,on,the,mat.

image

You can see an Excel Online worksheet with all these examples here.

Power BI Desktop, Sort By Column And DAX Calculations That Use The All() Function

Recently I came across a problem where a DAX measure gave different results in Excel and Power BI Desktop. It turned out not to be a bug or even a feature, but since it confused me for a few minutes I thought it was worth writing up in case anyone else ran into it.

Consider a model built in Excel using Power Pivot from the following two tables:

image

image

With two measures defined in the model as follows:

Sum of Sales:=SUM(Sales[Sales])

Share:=DIVIDE([Sum of Sales], CALCULATE([Sum of Sales], ALL(Month[Month Name])))

…and, importantly, the Sort By Column property on the Month Name column set to Month Number:

image

…it’s possible to build a PivotTable that looks like this:

image

However, when you import the same model into Power BI Desktop and recreate the PivotTable above in the Report view you’ll see that the Share calculation no longer gives the same values:

image

What’s the problem here? It’s all down to the way Power BI Desktop generates DAX queries when you set the Sort By Column property. The Excel PivotTable above generates the following MDX:

SELECT 
{[Measures].[Sum of Sales],[Measures].[Share]} 
DIMENSION PROPERTIES PARENT_UNIQUE_NAME,MEMBER_VALUE,
HIERARCHY_UNIQUE_NAME ON COLUMNS , 
NON EMPTY 
Hierarchize(
{DrilldownLevel({[Month].[Month Name].[All]},,,INCLUDE_CALC_MEMBERS)}) 
DIMENSION PROPERTIES PARENT_UNIQUE_NAME,MEMBER_VALUE,
HIERARCHY_UNIQUE_NAME ON ROWS  
FROM [Model] 
CELL PROPERTIES VALUE, FORMAT_STRING, LANGUAGE, 
BACK_COLOR, FORE_COLOR, FONT_FLAGS

On the rows axis, as you would expect, the only hierarchy you see is Month Name.

However, if you run a Profiler trace (you can find out how to do this here although it’s much easier to get the Process ID from DAX Studio) to look at the DAX query generated by Power BI you’ll see

EVALUATE
TOPN (
    102,
    SUMMARIZECOLUMNS (
        ROLLUPADDISSUBTOTAL (
            ROLLUPGROUP ( 'Month'[Month Name], 'Month'[Month Number] ), 
            "IsGrandTotalRowTotal"
        ),
        "Share", 'Sales'[Share],
        "Sum_of_Sales", 'Sales'[Sum of Sales]
    ),
    [IsGrandTotalRowTotal], 0,
    'Month'[Month Number], 1,
    'Month'[Month Name], 1
)
ORDER BY
    [IsGrandTotalRowTotal] DESC,
    'Month'[Month Number],
    'Month'[Month Name]

The difference here is that the Month Number and Month Name fields are both present – they have to be since the query has to sort by Month Number. In MDX the order of members on a hierarchy can be set inside the model; in a DAX query you can only sort using an ORDER BY clause and for that to work, the field you’re ordering by must be present in the query.

The Share measure calculation needs to be changed in order to fix this, then. Here’s one way of doing this:

Share =

DIVIDE([Sum of Sales],

CALCULATE([Sum of Sales], ALL(Month[Month Name], ‘Month'[Month Number])))

Drillthrough On Multiselect Now Works In Excel 2016 And SSAS 2016

One unadvertised – but still very welcome – feature of Excel 2016 is that it is now possible to do a drillthrough in a PivotTable when there is a multiselect on a filter or a slicer. It only works if you are using SSAS 2016 on the server, or if you’re querying the Excel Data Model/Power Pivot, because the fix needed changes both in Excel and on the server.

In Excel 2013 and earlier, when you try to do a default drillthrough where there is a multiselect on a filter or a slicer, you get the following error message:

image

Show Details cannot be executed when multiple items are selected in a report filter or in a slicer. Select a single item for each field in the report filter area and for each slicer connected to this PivotTable before performing a drillthrough.

For drillthrough actions, where there is a multiselect, you won’t see the action listed under the Additional Actions right-click menu at all.

This is the result of two limitations. First, there’s the issue with the MDSCHEMA_ACTIONS schema rowset that I blogged about here. In SSAS 2016 you can now pass in multiple members from the same hierarchy in the COORDINATE restriction column, as shown in this example I captured in Profiler:

<RestrictionList xmlns="urn:schemas-microsoft-com:xml-analysis" 
xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
<CUBE_NAME>Model</CUBE_NAME>
<ACTION_TYPE>401</ACTION_TYPE>
<COORDINATE>
([DimDate].[DateKey].&amp;[20010701],
[DimDate].[DateKey].&amp;[20010702],
[Measures].[Sum of SalesAmount])
</COORDINATE>
<COORDINATE_TYPE>6</COORDINATE_TYPE>
</RestrictionList>

Second, subselects on a drillthrough MDX query are ignored in SSAS 2014 and earlier. For example, here’s a drillthrough query generated by an Excel 2016 PivotTable with a multiselect slicer connected to an SSAS Tabular model:

DRILLTHROUGH MAXROWS 1000 
SELECT [Measures].[Sum of SalesAmount] ON COLUMNS 
FROM 
(SELECT  FROM (
SELECT 
({[DimDate].[DateKey].&[20010702],[DimDate].[DateKey].&[20010701]}) 
ON COLUMNS  
FROM [Model]))

When run against SSAS 2014, this drillthrough returns records that are not filtered by date; when run against the same model in SSAS 2016, the subselect is respected and the resultset is filtered by the selected dates.

Excel 2016 BI Branding Changes

Office 2016 is on the verge of being released, and although Power BI is the cool new thing Excel 2016 has added several new BI-related features too. What is also interesting – and less well publicised – is that several of the BI features in Excel 2016 have been rebranded. Specifically:

  • Power Query (which is now no longer an add-in, but native Excel functionality) is not called Power Query any more, but “Get & Transform”. It has also been squeezed onto the Data tab, next to the older data import functionality:
    image

    image

  • Power Map is not called Power Map any more, but “3D Maps”
    image
  • Power View is still Power View, but as John White points out here it is no longer visible on the ribbon by default, hidden from new users, although it’s easy to add the Power View button back onto the ribbon. Power View in Excel 2016 is unchanged from Power View in Excel 2013. Read into this what you will.
  • Although Power Pivot still has its own tab on the ribbon (and has finally got a space in the middle of its name), there’s also a “Manage Data Model” button on the Data tab in the ribbon that is visible even when the Power Pivot add-in has not been enabled:
    image
    Clicking this button opens the Power Pivot window. There’s a subtle distinction between Power Pivot the add-in and the Excel Data Model (which is the database engine behind Power Pivot, and which is present in all Windows desktop editions of Excel regardless of whether the add-in is enabled or not) that has existed since Excel 2013 and which is generally unknown or misunderstood. The fact this button is titled “Manage Data Model” rather than “Power Pivot” is telling.
  • All the add-ins now have the collective name “Data Analysis add-ins” and can be enabled with a single click:
    image

So, clearly Excel has moved away from branding all its BI functionality as Power-something. My guess, informed by various conversations with various people in the know, is that this has happened for a couple of reasons:

  • The ‘Power’ prefix was intimidating for regular Excel users, who thought it represented something difficult and therefore not for them; it also made it look like this was functionality alien to Excel rather than a natural extension of Excel.
  • Having separate Power add-ins led to a disjointed experience, rather than giving the impression that all of these tools could and should be used together. It also made comparisons, by analysts and in corporate bake-offs, with other competing tools difficult – were the Power-addins separate tools, or should they be considered a single tool along with Excel?
  • Previously there was a lot of confusion about whether these add-ins are anything to do with ‘Power BI’ or not. Up to now, depending on who you talked to, they either were or weren’t officially part of Power BI. Now there is a clear distinction between Excel and Power BI, despite the close technical relationships that remain.

The new names certainly address these problems and on balance I think this change was the right thing to do, even if I was quite annoyed when I first found out about them. There are significant downsides too: for example, changing the names means that several years of books, blog posts, articles and conference presentations about Power Query and Power Map now won’t be found by new users when they search the internet for help. Similarly, it won’t be obvious to new users that a lot of content is relevant for both Power BI Desktop and Excel. Now that the Power Query name has been de-emphasised, why should anyone looking at my old blog posts on that subject know that what I’ve written is still relevant for Excel 2016’s “Get & Transform” and Power BI Desktop? What would I call a second edition of my Power Query book, if I wrote one, given that Power Query exists only as the relatively nondescript “Get & Transform” in Excel 2016 and “Get Data” in Power BI Desktop?

Exporting All M Code From Power Query In Excel 2013

Here’s a tip that I found out about on the Power Query Technet forum that I thought was worth repeating. If you ever need a quick way of exporting all the M code for all of the queries in an Excel 2013 workbook, just open the Power Query window and click the Send a Frown button shown here:

image

Then, when the Send Feedback dialog appears, make sure Include Formulas is ticked then click OK:

image

When you do that, you’ll get an email created for you that contains a whole lot of debugging information, plus all of the M code for your queries:

image

Obviously, don’t send this email to Microsoft!

It’s quite easy to see the individual queries. You then need to go to your new workbook, create a new query by selecting the Blank Query option under Other Sources, and then open the Advanced Editor window and paste the code for each query in. However, when you do that you will need to modify the code a bit. There are three pieces of code you will need to delete:

  • At the beginning of the M code, where it says
    section Section1;
  • At the beginning of each query, a piece of code that contains the original name of the query:
    shared MyOriginalQueryName =
  • At the very end, a semi-colon

Of course in Excel 2016 we’ll have much better options for copying, pasting and moving queries in VBA, but until then we’ll have to live with hacks like this.

Checking Columns Are Present In Power Query

Something I was meaning to mention in my previous post (but forgot about…) was that in a lot of cases you don’t really care if your output contains all the required columns – it’s enough just to check that your input contains all the required columns. Luckily M has a function called Table.HasColumns() to help you do this. For example, using the csv source file from my previous post, which should have three columns called Product, Month and Sales, the following query will return true if the source file has these columns and false if it doesn’t:

let
    Source = Csv.Document(File.Contents("C:\MissingColumnDemo.csv"),[Delimiter=",",Encoding=1252]),
    PromotedHeaders = Table.PromoteHeaders(Source),
    CheckColumns = Table.HasColumns(PromotedHeaders, {"Product", "Month", "Sales"})
in
    CheckColumns

Ensuring Columns Are Always Present In A Table Returned By Power Query

Disappearing or renamed columns in your data source can cause all kinds of problems when you’re importing data using Power Query: errors when you try to refresh the query, broken calculations in Power Pivot, PivotTables that reformat themselves and then need to be manually recreated. As a result, it can be a very good idea to build some logic into your Power Query queries that ensures that a table always contains the columns you’re expecting.

Consider the following csv file:

image

In Power Query, if you connect to it and create a query you’ll end up with something like this:

let
    Source = Csv.Document(File.Contents("C:\Demo.csv"),null,",",null,1252),
    #"First Row as Header" = Table.PromoteHeaders(Source),
    #"Changed Type" = Table.TransformColumnTypes(#"First Row as Header",{{"Sales", Int64.Type}})
in
    #"Changed Type"

Let’s assume that this query is called GetSourceData. Let’s also assume that your output from Power Query should always be a table that has the three columns Product, Month and Sales, and that Product and Month should be text columns and Sales should be numeric. The basic steps to take to ensure that this always happens, even if the columns in the csv file change, are as follows:

  1. Create a query that connects to your data source, for example like GetSourceData above
  2. Create a query that will always return a table with the columns you want, but which contains no rows
  3. Append the second table onto the end of the first table. This will result in a table that contains all of the columns from both tables.
  4. Remove any unwanted columns.

There are a number of ways to create the empty table needed in step 2. You could use the #table() function if you’re confident writing M code, and the following single line query (no Let needed) does the job:

#table(
 type table [Product=text, Month=text, Sales=number],
 {})

image

Alternatively, if you wanted something that an end user could configure themselves, you could start with a table in Excel like this:

image

then transpose it, use the first row of the resulting table as the header row, then set the data types on each table to get the same output:

let
    Source = Excel.CurrentWorkbook(){[Name="Columns"]}[Content],
    #"Transposed Table" = Table.Transpose(Source),
    #"First Row as Header" = Table.PromoteHeaders(#"Transposed Table"),
    #"Changed Type" = Table.TransformColumnTypes(#"First Row as Header",
	{{"Product", type text}, {"Month", type text}, {"Sales", Int64.Type}})
in
    #"Changed Type"

Assuming that this query is called ExpectedColumns, it’s then a trivial task to create a third query that appends the ExpectedColumns query onto the end of the GetSourceData query. If GetSourceData includes all the columns it should then this append will have no effect at all; if some of the columns have changed names or disappeared, you’ll see all of the columns present from both GetSourceData and ExpectedColumns in the output of the append. For example if the Month column in GetSourceData is renamed Months then the output of the append will look like this:

image 

Finally, in this third query you need to select all the columns you want (ie all those in the ExpectedColumns query) and right click/Remove Other Columns, so you remove all the columns you don’t want. In the previous example that gives you:

image

The point here is that even though the Month column only contains nulls, and the actual month names have been lost, the fact that the columns are all correct means that you won’t get any errors downstream and your PivotTables won’t be reformatted etc. Once you’ve fixed the problem in the source data and refreshed your queries, everything will go back to normal.

Here’s the code for this third query:

let
    Source = GetSourceData,
    Append = Table.Combine({Source,ExpectedColumns}),
    #"Removed Other Columns" = Table.SelectColumns(Append,{"Product", "Month", "Sales"})
in
    #"Removed Other Columns"

For bonus points, here’s another query that compares the columns in GetSourceData and ExpectedColumns and lists any columns that have been added to or are missing from GetSourceData:

let
    //Connect to Excel table containing expected column names
    ExcelSource = Excel.CurrentWorkbook(){[Name="Columns"]}[Content],
    //Get list of expected columns
    ExpectedColumns = Table.Column(ExcelSource, "ColumnName"),
    //Get a list of column names in csv
    CSVColumns = Table.ColumnNames(GetSourceData),
    //Find missing columns
    MissingColumns = List.Difference(ExpectedColumns, CSVColumns),
    //Find added columns
    AddedColumns = List.Difference(CSVColumns, ExpectedColumns),
    //Report what has changed
    OutputMissing = if List.Count(MissingColumns)=0 then
                     "No columns missing" else
                     "Missing columns: " & Text.Combine(MissingColumns, ","),
    OutputAdded = if List.Count(AddedColumns)=0 then
                     "No columns added" else
                     "Added columns: " & Text.Combine(AddedColumns, ","),
    Output = OutputMissing & "   " & OutputAdded
in
    Output

image

You can download the sample workbook for this post here.