The Ellipsis/Not Implemented Error In M

Recently I was creating a parameter in Power BI Desktop and had it configured something like this:

image

I didn’t bother to choose anything in the Default Value dropdown box, and when I looked at the code for the parameter in the Advanced Editor I saw this:

"a" 
meta 
[IsParameterQuery=true, 
 List={"a", "b", "c"}, 
 DefaultValue=..., 
 Type="Text", 
 IsParameterQueryRequired=true]

I was interested to know what the ellipsis symbol (three dots …) in the DefaultValue field of the record meant. Looking in the Language Reference I found that in M it can be used for two purposes: as an open record marker (maybe something for a future blog post) and, as in this case, as a quick way of returning an error. The Language Reference says that it is directly equivalent to the following expression, which returns a “Not Implemented” error:

error Error.Record("Expression.Error", "Not Implemented")

But from what I can see, it actually returns a “Value was not specified” error instead. Here’s another example of how it can be used in a function definition, and what it returns:

let
    MyFunction = (x,y) => if x>y then true else ...,
    Output = MyFunction(0,1)
in
    Output

image
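If you want to examine that error as data rather than just seeing the error message, here’s a quick sketch (my own addition, reusing the function above) that wraps the call in a try expression; the result is a record whose Error field contains the reason and message of the error returned by the ellipsis:

let
    MyFunction = (x,y) => if x>y then true else ...,
    //try turns the error into a record with HasError and Error fields
    Output = try MyFunction(0,1)
in
    Output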

It’s not something I think I’ll be using in my own code, but it’s good to know what it means!

Creating Sequences Of Integers And Characters In Power BI/Power Query Lists

When you’re writing your own M code you often find yourself needing to create a list containing a sequence of numbers or characters. Luckily the M language allows you to do this very easily when you are defining lists by using expressions in the format

{lowest_integer..highest_integer}

For example, imagine you want to create a list with all of the integers between 1 and 5. Instead of writing

{1, 2, 3, 4, 5}

You can write the following:

{1..5}

and it will return the same list:

image

You can also use this format in more complex list definitions, for example

{1..3, 5, 7..9}

Returns the list

{1, 2, 3, 5, 7, 8, 9}

image

When you’re using this technique you must always put the lowest integer first and the highest integer last; if you don’t do this you get an empty list. So, for example, the expression

{5..1}

Returns an empty list:

image

It’s also possible to use this technique to create lists of characters. For example, the expression:

{"a".."z"}

Returns a list containing all of the lowercase letters of the alphabet:

image

The first character in the expression has to have the lowest Unicode value and the second character has to have the highest Unicode value, and the sequence of characters returned is the list of all characters with Unicode values in that range. As a result, the expression

{"#".."%"}

Returns the list

{"#", "$", "%"}

image

And the expression

{"a".."Z"}

Returns an empty list because the Unicode value of “a” is greater than the Unicode value of “Z”.
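Incidentally, you can check the Unicode values yourself using the Character.ToNumber() function (a quick aside, not part of the original examples); the following expression returns the list {97, 90}, confirming that “a” has a higher Unicode value than “Z”:

{Character.ToNumber("a"), Character.ToNumber("Z")}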

This technique doesn’t work for decimal numbers, dates or other data types. If you want a sequence of values of these types you need to use functions like List.Dates() and List.Numbers().
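For example, here’s a rough sketch (my own, so check the parameters against the documentation) of how these two functions work: the first expression returns a list of five consecutive dates starting on 1st January 2016, and the second returns the list {1, 1.5, 2, 2.5, 3}:

//five dates, starting on 1st January 2016, increasing in steps of one day
List.Dates(#date(2016,1,1), 5, #duration(1,0,0,0))

//five numbers, starting at 1, increasing in steps of 0.5
List.Numbers(1, 5, 0.5)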

Lists are, of course, used all over the place in M. Building on my recent post on using #table() to create tables with no data source, here’s one last example of using lists containing sequences to create a simple table with three columns and three rows:

#table({"A".."C"}, {{1..3}, {7..9}, {11..13}})

image

Two Upcoming Power BI Webinars

This is just a quick post to let you know about two webinars I’m presenting soon. First, on the 21st of June (today!) at 1pm PST I’m presenting a webinar on “Introduction to M” as part of the Power BI community webinar series; more details on it and how to register can be found here:

https://powerbi.microsoft.com/en-us/blog/week-of-june-20th-webinars-power-bi-security-and-introduction-to-m-by-chris-webb/

Second, I’m presenting a webinar with Pyramid Analytics about their on-premises BI solution and how it integrates with Power BI on the 30th of June called “The Public Cloud Is Not For Everyone”:

http://pages.pyramidanalytics.com/ThePublicCloudIsNotForEveryone.html

[Full disclosure – I’m being paid for this – but it won’t be marketing fluff, just honest discussion, similar to the webinars I did with Pyramid last year]

Power BI Diagnostics, Trace Logs And Query Execution Times (Again)

I’ve blogged a few times now about how to monitor Power BI query execution times, most recently here, and while using functions like DateTime.FixedLocalNow() is all very well I’ve never been happy with the fact that you have to alter your queries in order to be able to time them. Using the Power BI trace logs always seemed a much better option – and since the log files are in a reasonably easy to understand text format, you can use a Power BI query to load data from them. I wrote a post on importing data from the trace logs (using Power Query) here back in 2014 and Rui Romano has a great post on doing the same thing for Power BI here. The big problem with the Power BI trace logs, though, is that there is too much information in them: they’re really meant for Microsoft internal use, I guess, and not for the likes of you or me. In fact it’s all low-level data in there and there’s no way to see (as far as I know) the overall time taken for a query to run.

However, last month when I was looking for new M functions using the old #shared trick I came across the Diagnostics.Trace() function which allows you to write to the Power BI trace log direct from your query.  After a bit of experimentation, I found out that this function is the key to a truly elegant way of logging query execution times. Here’s a proof-of-concept that I’ve put together to show how it could be used…
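As an aside, in case you haven’t come across the #shared trick before: it is, roughly, the following quick sketch of my own, a query that converts the built-in #shared record into a table so you can browse every function, query and constant currently in scope.

let
    //#shared is a record containing everything in scope, including all the library functions
    SharedRecord = #shared,
    //convert it to a table with Name and Value columns so it can be browsed and filtered
    Output = Record.ToTable(SharedRecord)
in
    Output

Anyway, back to the proof-of-concept.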

First of all, consider the following query called LongQuery:

let
    Source = Function.InvokeAfter(
                     ()=>"Hello!", 
                     #duration(0,0,0,1 + Number.RandomBetween(0,5)))
in
    Source

image

It returns the text “Hello!” and uses the Function.InvokeAfter() and Number.RandomBetween() functions to wait anything from one to six seconds before doing so. Let’s pretend that this is a real query that does something complex and that we’re trying to tune it.

Rather than alter this query, instead the best thing to do is to create a new query that calls this query through Diagnostics.Trace(). Here’s the code:

let
    Executing = Diagnostics.Trace(
                    TraceLevel.Information, 
                    #"Trace Message", 
                    ()=>LongQuery, 
                    true)
in
    Executing

image

I’ve called this query Diagnostics. All it does is return the same value as LongQuery but in doing so it also writes a single message to the trace log. #”Trace Message” is a text parameter that allows you to specify what message to write – you might want to put some kind of version number in here, or a description of any changes you’ve made to your query while tuning it:

image

You should also make sure that the Diagnostics query doesn’t get loaded into the data model by deselecting its Enable Load option:

image

One important thing to remember is that for the Diagnostics query to actually log anything, you need to have the Enable Tracing option turned on in the main Options dialog (you can find this on the File menu in the main Power BI window under Options and settings/Options):

image

image

You will also need to remember to turn this on every time you open Power BI Desktop.

With that done, you just need to hit the Refresh Preview button to refresh the Diagnostics query in the Query Editor window:

image

The Diagnostics query will run, calling the LongQuery query and making an entry in the Power BI trace log. You can find the directory containing the trace logs by clicking on the Open Traces Folder link in the Options dialog shown above, and you can open any of the trace log files using your favourite text editor.

image

Loading this trace file into Power BI is pretty straightforward using code similar to that in Rui’s or my blog post. The difference is that now, instead of loading everything, you can filter the rows to load only the events created by Diagnostics.Trace(). As far as I can see you can identify these rows because they have the value

"Action":"Engine/User","Message"

Somewhere inside them, as you can see from the last screenshot. Most importantly of all, the duration logged for this event is the amount of time taken to execute the query.
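To give an idea of what that filtering might look like, here’s a very rough sketch (the folder path is just an example from my machine, and the parsing is deliberately minimal, so treat it as a starting point rather than finished code):

let
    //point this at your own trace log folder, as shown in the Options dialog
    Source = Folder.Files("C:\Users\MyUser\AppData\Local\Microsoft\Power BI Desktop\Traces"),
    //read each trace file as lines of text
    AddLines = Table.AddColumn(Source, "Lines", each Lines.FromBinary([Content])),
    ExpandLines = Table.ExpandListColumn(AddLines, "Lines"),
    //keep only the rows for events written by Diagnostics.Trace()
    FilterRows = Table.SelectRows(ExpandLines, each Text.Contains([Lines], "Engine/User"))
in
    FilterRows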

Therefore, with some Power BI M magic, you can load all the data from all of the log files in the log directory, filter the rows to only those created by Diagnostics.Trace(), and end up with a query that gives you something like this:

image

The Execution Index and Date Time columns tell you when the query was executed, the Message column contains the value of the Trace Message parameter at the time the query was executed, and the Duration column tells you how long the query took to execute. You can then, of course, load this table into the data model and build a report that helps you analyse these execution times:

image

To sum up, the workflow for tuning your query is:

  • Make some changes to the LongQuery query that hopefully make it faster
  • Update the Trace Message parameter with some notes about which version of the LongQuery query you’re testing
  • Click the Refresh Preview button for the Diagnostics query to see how long LongQuery now takes to run
  • Refresh, or load, the query that reads the data from the trace logs so you can see how all of your changes have affected query execution times

Some other closing comments:

  • As I said before, this is just a proof-of-concept and although I haven’t tested it thoroughly, it seems to work.
  • Microsoft may well change the format of its trace logs or make other changes in the future that could break the code shown here.
  • Although I haven’t tried it, this approach should work for Power Query too with some minor modifications.
  • Since it relies on you turning on tracing in Power BI Desktop, this won’t work after you have published your reports. In fact I’ve deliberately structured the queries here so that logging only takes place when you click the Refresh Preview button for the Diagnostics query in the Query Editor window – I found this was the best way to reduce noise in the log files, especially given the fact that a query can be executed multiple times when data is actually being loaded into the data model.
  • That said, the Power BI Enterprise Gateway has its own log files, so maybe you could adapt this approach somehow to work with them?
  • I suspect there are lots of other uses that Diagnostics.Trace() could be put to for even lower-level debugging and analysis of queries, but that’s something for a future blog post.

You can download the sample code as a Power BI template from here. When you open the template you’ll be prompted to enter two parameter values: the message to record for your events in the trace log and the path of your trace log directory.

Creating Tables In Power BI/Power Query M Code Using #table()

After my post earlier this week on creating current day/week/month/year reports in Power BI a few people asked me for a more detailed explanation of the way I was creating tables without using a data source in my M code. This is something I find myself doing quite a lot when I’m loading data with Power BI and Power Query, and while there are several ways of doing this I find that using the #table() intrinsic function is the most elegant option.

Let’s look at some examples. The following query returns a table with two columns (called “First Column” and “Second Column”) and two rows containing the values from 1 to 4:

#table({"First Column", "Second Column"}, {{1,2},{3,4}})

image

No data source is needed – this is a way of defining a table value in pure M code. The first parameter of the function takes a list of column names as text values; the second parameter is a list of lists, where each list in the list contains the values on each row in the table.

In the last example the columns in the table were of the data type Any (the ABC123 icon in each column header tells you this), which means that they can contain values of any data type including numbers, text, dates or even other tables. Here’s an example of this:

#table(
 {"First Column", "Second Column"}, 
 {
  {1,"Hello"},
  {#date(2016,1,1),3}
 }
)

image

While this is flexible it’s not exactly practical: in almost all cases the Any data type is a bad choice for loading data, and you need to explicitly set the data type for each column. You can set data types for columns quite easily as a separate step, but it is also possible to set column data types using #table():

#table(
 type table
    [
        #"Number Column"=number, 
        #"Text Column"=text,
        #"Date Column"=date
    ], 
 {
  {1,"Hello",#date(2016,1,1)},
  {2,"World",#date(2017,12,12)}  
 }
)

image

In this example the first parameter is no longer a list of column names but a declaration of a table type that not only has column names in but also column types. You can see from the icons in the column headers in the screenshot above that the column called “Number Column” has a data type of number, “Text Column” has a data type of text, and “Date Column” has a data type of date.

Of course if you need a fixed table value in Power BI you could use the “Enter Data” button or, if you’re using Excel and Power Query, you could create an Excel table and then use the Excel.CurrentWorkbook() function to load its contents; if you or your end users need to edit the values in your table easily then you should use one of these two options. On the other hand if you don’t want users to be able to edit the values in the table or, more likely, you are generating the contents of your table using functions that return lists (as in my previous post) then #table() is the way to go.
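For completeness, here’s what the Excel approach typically looks like in M (the table name MyTable is just a placeholder for whatever your Excel table is actually called):

Excel.CurrentWorkbook(){[Name="MyTable"]}[Content]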

Creating Current Day, Week, Month And Year Reports In Power BI Using Bidirectional Cross-Filtering And M

One very common requirement when creating a Power BI report is the ability to apply a filter for the current day, week, month, quarter or year. There are several ways of implementing this: you could add relative date columns to your Date table as I show here (I used DAX calculated columns, but you could do this in M quite easily too); you could also build the filter into the DAX for your measures, although that could mean you end up with a lot of quite complex measures.

Last week I received an email asking for help with an interesting variation on this problem: how can you create a report with a single slicer that allows you to switch between showing data for the current day, week, month or year? The requirement to have a single slicer is important here: adding relative period columns to the date table would let you build a slicer for selecting the current day or any relative day, or the current week or any relative week, or the current month or any relative month, or the current year or any relative year, but it wouldn’t allow you to select weeks, months and years together in the same slicer.

Here are some screenshots showing what we want to achieve: a report with two measures and a single slicer that allows the user to switch between displaying data for a variety of relative time periods:

image

image

image

The way to achieve this is not all that different from the calculated column approach, but it requires a separate table to model the many-to-many relationship between all the required relative period selections and the dates in them, as well as the use of bidirectional cross-filtering between tables (which I blogged about here). The data model I used for this report looks like this:

image

The Sales table just contains sales data; the Date table is a normal Power BI date table. The Period table is the interesting table here: it contains one row for each combination of relative time period (eg “Today”, “Current Week To Date”, “Rolling Month”) and date:

image

It’s the Period column on this table that is used to create the slicer in the screenshots above. The Sort column is used along with Power BI’s Sort By Column functionality to make the values in the Period column appear in a sensible order in the report.

Notice also that on the relationship between the Period table and the Date table the Cross filter direction property is set to Both:

image

This means that a selection on the Period table travels up the relationship to the Date table and then back down the relationship from Date to Sales. For example, selecting “Current Week” in the Period table will select the dates in the current week on the Date table, which in turn selects the rows for those dates on the Sales table.

The challenge, though, is to write the query to populate the Period table. I’ve done this in two parts. First, here’s a function called CreatePeriodTable that returns a table for a single time period selection. It takes the name of the time period, a start date, an end date and a sort order value, and returns a table with one row for each date in the date range:

(
PeriodName as text, 
StartDate as date, 
EndDate as date, 
SortOrder as number
) as table =>
let
    DayCount = Duration.Days(EndDate-StartDate)+1,
    DateList = List.Dates(StartDate,DayCount,#duration(1,0,0,0)),
    AddPeriodName = List.Transform(DateList, 
        each {PeriodName,_,SortOrder}),
    CreateTable = #table(
        type table[Period=text, Date=date, Sort=number], 
        AddPeriodName)
in
    CreateTable

For example, calling this function like so:

CreatePeriodTable("Demo", #date(2016,1,1), #date(2016,1,5),1)

Returns the following table (with the dates shown in dd/mm/yyyy format):

image

Second, here’s a query that calls this function once for each of the time periods you want to be able to select by and creates a single table that contains all of the rows returned by each of the function calls:

let
    TodaysDate = Date.From(DateTimeZone.FixedUtcNow()),
    Ranges = {
                {"Today", 
                TodaysDate, 
                TodaysDate, 
                1},
                {"Current Week To Date", 
                Date.From(Date.StartOfWeek(TodaysDate)), 
                TodaysDate, 
                2},
                {"Current Month To Date", 
                Date.From(Date.StartOfMonth(TodaysDate)), 
                TodaysDate, 
                3},
                {"Current Year To Date", 
                Date.From(Date.StartOfYear(TodaysDate)), 
                TodaysDate, 
                4},
                {"Rolling Week", 
                Date.AddWeeks(TodaysDate,-1) + #duration(1,0,0,0), 
                TodaysDate, 
                5},
                {"Rolling Month", 
                Date.AddMonths(TodaysDate,-1) + #duration(1,0,0,0), 
                TodaysDate, 
                6},
                {"Rolling Year", 
                Date.AddYears(TodaysDate,-1) + #duration(1,0,0,0), 
                TodaysDate, 
                7}
             },
    GetTables = List.Transform(Ranges, 
            each CreatePeriodTable(_{0}, _{1}, _{2}, _{3})),
    Output = Table.Combine(GetTables)
in
    Output

In this query the Ranges step contains a list of lists, where each list in the list represents a time period with its name, start date, end date and sort order, in the same order that you’d pass these values as parameters to the CreatePeriodTable() function. I’ve deliberately structured the code to make it easy to add a new time period to the list. Hopefully the example time periods in the query above give you a good idea of what’s possible in M and all the functions it gives you for calculating different dates. The GetTables step loops over this list and calls the CreatePeriodTable() function for each list in the list, and the Output step combines all the data into a single table.
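For example (this is my own illustration rather than something taken from the workbook), to add a “Current Quarter To Date” period you would just add one more list to the Ranges step, following the same pattern as the existing entries:

                {"Current Quarter To Date", 
                Date.From(Date.StartOfQuarter(TodaysDate)), 
                TodaysDate, 
                8},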

All of the date ranges here end with today’s date, as returned by the DateTimeZone.FixedUtcNow() function, but you may want to check out Ken’s post here on handling time zones in M depending on your exact requirements. Because this happens in M when the data is loaded, the value of today’s date will be fixed at the point in time that the data refresh took place.
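If, say, you wanted today’s date for a fixed UTC-8 time zone rather than UTC, a minimal tweak (the offset here is just an example) would be to change the TodaysDate step to something like:

    TodaysDate = Date.From(DateTimeZone.SwitchZone(DateTimeZone.FixedUtcNow(), -8)),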

You can download the example workbook for this post here.

Understanding Let Expressions In M For Power BI And Power Query

When you start writing M code for loading data in Power Query or Power BI, one of the first things you’ll do is open up the Advanced Editor for a query you’ve already built using the UI. When you do that you’ll see a very scary chunk of code (and at the time of writing there’s no intellisense or colour coding in the Advanced Editor, making it even more scary) and you’ll wonder how to make sense of it. The first step to doing so is to understand how let expressions work in M.

Each query that you create in Power BI Desktop or Power Query is a single expression that, when evaluated, returns a single value – and that single value is usually, but not always, a table that then gets loaded into the data model. To illustrate this, open up Power BI Desktop (the workflow is almost the same in Power Query), click the Edit Queries button to open the Query Editor window and then click New Source/Blank Query to create a new query.

image

Next, go to the View tab and click on the Advanced Editor button to open the Advanced Editor dialog:

image

You’ll notice that this doesn’t actually create a blank query at all, because there is some code visible in the Advanced Editor when you open it. Delete everything there and replace it with the following M expression:

"Hello " & "World"

image

Hit the Done button and the expression will be evaluated, and you’ll see that the query returns the text value “Hello World”:

image

Notice how the ABC icon next to the name of the Query – Query1 – indicates that the query returns a text value. Congratulations, you have written the infamous “Hello World” program in M!

You might now be wondering how the scary chunk of code you see in the Advanced Editor window for your real-world query could possibly be a single expression – but in fact it is. This is where let expressions come in: they allow you to break a single expression down into multiple parts. Open up the Advanced Editor again and enter the following expression:

let
    step1 = 3,
    step2 = 7,
    step3 = step1 * step2
in
    step3

image

Without knowing anything about M it’s not hard to guess that this bit of code returns the numeric value 21 (notice again that the 123 icon next to the name of the query indicates the data type of the value the query returns):

image

In the M language a let expression consists of two sections. After the let comes a list of variables, each of which has a name and an expression associated with it. In the previous example there are three variables: step1, step2 and step3. Variables can refer to other variables; here, step3 refers to both step1 and step2. Variables can be used to store values of any type: numbers, text, dates, or even more complex types like records, lists or tables; here, all three variables return numbers. The Query Editor is usually clever enough to display these variables as steps in your query and so displays them in the Applied Steps pane on the right-hand side of the screen:

image

The value that the let expression returns is given in the in clause. In this example the in clause returns the value of the variable step3, which is 21.

It’s important to understand that the in clause can reference any or none of the variables in the variable list. It’s also important to understand that, while the variable list might look like procedural code, it isn’t: it’s just a list of variables that can appear in any order. The UI will always generate code where each variable/step builds on the value returned by the previous variable/step, but when you’re writing your own code the variables can be in whatever order suits you. For example, the following query also returns the value 21:

let
    step3 = step1 * step2,
    step2 = 7,
    step1 = 3
in
    step3

image

The in clause returns the value of the variable step3, which in order to be evaluated needs the variables step2 and step1 to be evaluated; the order of the variables in the list is irrelevant (although it does mean the Applied Steps pane no longer displays each variable name). What is important is the chain of dependencies that can be followed back from the in clause.

To give another example, the following query returns the numeric value 7:

let
    step3 = step1 * step2,
    step2 = 7,
    step1 = 3
in
    step2

image

In this case, step2 is the only variable that needs to be evaluated for the entire let expression to return its value. Similarly, the query

let
    step3 = step1 * step2,
    step2 = 7,
    step1 = 3
in
    "Hello" & " World"

image

…returns the text value “Hello World” and doesn’t need to evaluate any of the variables step1, step2 or step3 to do this.

The last thing to point out is that if the names of the variables contain spaces, then those names need to be enclosed in double quotes and have a hash # symbol in front. For example here’s a query that returns the value 21 where all the variables have names that contain spaces:

let
    #"this is step 1" = 3,
    #"this is step 2" = 7,
    #"this is step 3" = #"this is step 1" * #"this is step 2"
in
    #"this is step 3"

image

How does all this translate to queries generated by the UI? Here’s the M code for a query generated by the UI that connects to SQL Server and gets filtered data from the DimDate table in the Adventure Works DW database:

let
    Source = Sql.Database("localhost", "adventure works dw"),
    dbo_DimDate = Source{[Schema="dbo",Item="DimDate"]}[Data],
    #"Filtered Rows" = Table.SelectRows(dbo_DimDate, 
                                    each ([DayNumberOfWeek] = 1))
in
    #"Filtered Rows"

Regardless of what the query actually does, you can now see that there are three variables declared here, #”Filtered Rows”, dbo_DimDate and Source, and the query returns the value of the #”Filtered Rows” variable. You can also see that in order to evaluate the #”Filtered Rows” variable the dbo_DimDate variable must be evaluated, and in order to evaluate the dbo_DimDate variable the Source variable must be evaluated. The Source variable connects to the Adventure Works DW database in SQL Server; dbo_DimDate gets the data from the DimDate table in that database, and #”Filtered Rows” takes the table returned by dbo_DimDate and filters it so that you only get the rows where the DayNumberOfWeek column contains the value 1.

image

That’s really all there is to know about let expressions. It explains why you can do the kind of conditional branching that Avi Singh describes here; and also why, when I first tried to come up with a way to time how long a query takes to execute, I had to bend over backwards to ensure that all the variables in my let expression were executed in the correct order (though it turns out there’s an easier way of doing this). I hope you find this useful when writing your own M code.
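To illustrate the conditional branching point (a minimal sketch of my own, not the code from Avi’s post): because the in clause can return any expression, it can choose between variables based on a condition, and only the variables that the chosen branch depends on will actually be evaluated:

let
    UseDetail = true,
    SummaryTable = #table({"Total"}, {{100}}),
    DetailTable = #table({"Item", "Sales"}, {{"A", 60}, {"B", 40}}),
    //only the table returned by the if expression needs to be evaluated
    Output = if UseDetail then DetailTable else SummaryTable
in
    Output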