Creating A Partitioned Table In SSAS Tabular 2017 And SSDT Using M Functions

One of the coolest new features in SSAS Tabular 2017 and Azure Analysis Services is the integration of Power Query and M for data loading. Over the last year or so the Analysis Services team blog has posted a lot of fairly complex examples of how to use this functionality, but now that the latest release of SSDT has proper support for shared expressions I thought it would be a good idea to show a simple example of how to use it to create a partitioned table using M functions.

For this example I’ll be using the FactInternetSales fact table from the Adventure Works DW sample database, and the aim is to create a table in an SSAS Tabular project that has one partition for each year of data in FactInternetSales. Assuming that a new SSAS Tabular project has been created at the 1400 compatibility level with an integrated workspace:

image

…the first thing to do is to right-click on the Data Sources folder in the Tabular Model Explorer pane and select Import From Data Source:

image

This brings up the Get Data dialog:

image

Select SQL Server database and then click Connect. Enter the server name and database name in the SQL Server database dialog:

image

Choose how SSAS is to authenticate when it connects to the SQL Server database and click Connect:

image

Select the FactInternetSales table from the list of tables in the Adventure Works DW database:

image

This opens the Query Editor window; in it there is one query called FactInternetSales:

image

Here’s where it gets interesting. The first thing to do is to create a function that returns a filtered subset of the rows in the FactInternetSales table using the technique I blogged about here for Power BI. On the Query Editor menu bar, click Query/Parameters/New Parameter and create two new parameters called StartDate and EndDate that return the numbers 20010101 and 20011231. Here’s what they should look like:

image

image

These parameters are going to be used to filter the OrderDateKey column on the FactInternetSales table. Do this by clicking on the down arrow on the column header of OrderDateKey then selecting Number Filters and then Between:

image

In the Filter Rows dialog use the StartDate parameter for the start of the filter range and the EndDate parameter for the end of the filter range, then click OK:

image

Because the OrderDateKey contains dates stored as numbers in the YYYYMMDD format the result is a table that only contains sales where the order date is in the year 2001. This table should not be loaded into SSAS though, so right click on the FactInternetSales in the Queries pane and make sure that the Create New Table is not checked:

image

Next, on the same right-click menu, select Create Function:

image

In the Create Function dialog name the new function GetFactData then click OK:

image

The new GetFactData function will now be visible in the Queries pane; enter 20010101 for the StartDate parameter and 20011231 for the EndDate parameter and click Invoke:

image

This creates yet another new query called Invoked Function which should be renamed Internet Sales:

image

Right-click on this query and make sure Create New Table is selected. Next, click the Import button on the toolbar to close the Query Editor and load the Internet Sales table into SSAS.

At this point the Tabular Model Explorer will show all of the queries created above listed under the Expressions folder, and a single table called Internet Sales with a single partition:

image

Next, right-click on the Internet Sales table in the Tables folder and select Partitions:

image

This opens the Partition Manager dialog. Rename the existing partition to Internet Sales 2001:

image

Note that the M query for this partition calls the GetFactData() function to get the rows from FactInternetSales where OrderDateKey is between 20010101 and 20011231:

let
    Source = GetFactData(20010101, 20011231)
in
    Source

Click the New button to create new partitions, one for each year of data in the FactInternetSales table. Each new partition will initially contain the same M code shown above and should be edited so that the query gets data for the appropriate year:

image

Click OK, and the end result is a table with one partition per year:

image

What’s the point of using M functions to return the data for a partition, rather than the traditional method of using a SQL query embedded in each partition? One reason to do this would be to make maintenance easier: if you need to do something like add a new column to a fact table, rather than editing lots of partitions you just need to edit the function and all the partitions will reflect that change. I can think of a few others, but I’ll save them for future blog posts…

21 thoughts on “Creating A Partitioned Table In SSAS Tabular 2017 And SSDT Using M Functions

  1. Reading this makes me wonder what would go into extending M to allow an expression to iterate over a set of parameter members and generate the partitons internally. Basically doing the last step for you.

    As it is, we need to do that last step using AMO, which is not always natural for BI developers

    • That’s not possible right now – while M can be used to load data, it can’t be used to alter the structure of the SSAS Tabular model itself. In the future… well, it would be possible without any changes to the M language, but the dev team would have to be sure it was the right approach.

  2. While this is very interesting, I am having some problems getting to work.
    I could not get this to work on an existing project. I had to change to integrated workspace mode, but there was no way to select the created parameters in the filter of the initial fact query, so I could not make a function.

    I then created a new project as described and created everything perfectly for my date dimension (filtered to just show the days I will be loading).
    I then tried to import a fact table and apply the parameters, and the option to use a parameter in the filter rows dialog box was gone again.
    It turns you need to create new parameters for every single table you wish to filter, which is not great as this will create a proliferation of expressions and make keeping track of things quite difficult.

    • Hi Michael, I agree the UI still needs a lot of work – I get confused by what’s going on too, and I get frustrated by the kind of scenarios you describe… It’s still being worked on though so hopefully these problems will be dealt with soon.

  3. Thank you for your great posts here ! I have learnt much from your blog. now I am thinking a problem, that is: how to get max value from all cells in powerpivot value area? And this max value will change accordingly with the layout of powerpivot. It’s really difficult for me, could you give me some clue?

  4. Chris,
    This is great. I found an issue after testing. I would like to manage the parameter dynamically, means to set it from another query. I did see options to set parameter as Any value/List of values/Query. But i got error when set parameter as query and refer to it from the funcation.
    Do you have chance to try the parameter as Query option?
    Anyway, it’s quite difficult to use comparing to PQ in Excel/PBI Desktop. Most of time, i have to build in PQ/PBI, then copy M code to SQL 2017. 😦
    Hope MSFT can improve the usability soon.
    Thanks, MD

    • No, I haven’t tried this yet – I would not be surprised to find that it isn’t supported yet though… I agree the usability is not great, but it gets better with each monthly release of SSDT.

  5. I’m very new to SSAS and feel lucky to have found your blog, I think it’s going to be very useful.
    I came across this post as I’m struggling with a 490 million row Fact table.

    I followed the instructions and I initially created a single partition for a 3 month period (I’m going to work in quarters, not years). This deployed successfully and I am now creating a 2nd partition for the following quarter.

    The two quarters I have created so far are for 2016 and my question is, what happens when I reach the current quarter and my partition will be 01-Oct-17 to 31-Dec-17? I will be adding to this partition will each daily load of data so how can I reprocess this partition only each day until Jan-18 and I create the next one?

    Thanks.

    • Dylan is correct – there are a lot of options out there for automating partition creation and processing, either using .NET code, using SSIS or in some other way. If you choose a naming convention for your partitions that includes some indication of the date range of data it contains, it should be easy to work out which one contains the latest data when you need to process it.

      • Thanks very much both, I’ll be looking into this more soon. There was a lot going on in that github post so may look at the SSIS option first.

        Before I look into automation I have the issue of getting the earlier data into my model in Azure AS. I created some partitions that spanned a quarter but when trying to process it received the error “Memory error: You have reached the maximum allowable memory allocation for your tier. Consider upgrading to a tier with more available memory.” I tried creating smaller partitions, reducing down to a single day (approx. 300k records) but processing that single partition is still failing with the same error. Any help or guidance gratefully received!

      • Azure SSAS is an in-memory database, so you need enough memory to store the data you want to analyse. It sounds like you have more data than is currently allowed in the tier you have purchased. As the message says, you can upgrade to a more expensive tier which will allow you to work with more data; you should also look at the data you are loading and delete any columns you don’t need for your reports, because that will reduce your memory usage.

  6. You’re right. I just looked at the metrics and am at 9.5gb of my 10gb cache limit. This is going to be a problem as I’ve hardly any of the data I need in there yet.

    Looks like I’m going to have to re-evaluate my approach.

    Thanks Chris.

      • I had to make a number of changes to reduce the number of columns in my fact table and am now at a point of creating my SSAS model. I followed the steps below again and created a partition for a single day which was worked as expected.

        Having deployed the model, I then created further partitions within SSMS which also worked fine however when I made a further change to my model within SSAS and re-deployed, the partitions I had created within SSMS had been lost. Is this expected behavior or have I done something wrong? I am about to start looking at the partition automation again but am unsure whether I need to be creating them in SSAS, or SSMS (as I originally thought).

        Hopefully this is should be a quick answer!

        Thanks.

  7. Not completely related to this topic but there’s hardly any currently that seem to cover the new get data feature in SQL 2017 so posting it here.
    When I import a table, I am seeing that is is automatically creating a partition and the name of the query changes to TableName/Partition. Is that expected behaviour?
    Also, if I add a calculated column to this table in the model and then go and edit the query to say add a new column then the calculated column goes away. Seems to me like I’m doing something wrong but can’t seem to figure out what that is.

  8. Also, after one make a change to the Table “design” i.e. say add a column, are we supposed to use “Import” or “File -> Close & Update”. What is the difference between the two?

  9. Is it possible to dynamically create partitions at the time of processing? For example, I want to partition my fact table based on an ID column. This list of ID’s I want to filter on comes from another table. Is it possible to dynamically partition the fact table, for each value comming from the ID table?

  10. ive written something in c# to process my partitions and which also creates new partitions when needed. i am using monthly partitions which each day checks whether a partition already exists for that date and creates one it necessary. im not c# expert but its not tricky code.

  11. Great blog once again Chris, I thought I would let you know that I was struggling with create an Expression and then it was never loading to a table.

    Fortunately I read through this blog post, which had the hidden gem of making sure that the “Create New Table” was enabled. And now it works!

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s