
Filtering Data Loaded Into A Workspace Database In Analysis Services Tabular 2017 And Azure Analysis Services

The first mistake that all new Analysis Services Tabular developers make is this one: they create a new project in SSDT, connect to their source database, select the tables they want to work with, click Import, and then realise that trying to load a fact table with several million rows of data into their Workspace Database (whether that’s a separate Workspace Database instance or an Integrated Workspace) is not a good idea, because they either end up waiting for several hours or SSDT crashes when it runs out of memory. You need to filter your data down to a manageable size before you start developing in SSDT. Traditionally this has been done at the database level, for example using views, but the modern data sources in SSAS 2017 and Azure Analysis Services allow for a new approach using M.

Here’s a simple example of how to do this using the Adventure Works DW database. Imagine you are developing a Tabular model and you have just connected to the relational database, clicked on the FactInternetSales table and clicked Edit to open the Query Editor window before importing. You’ll see something like this:


…that’s to say there’ll be a single query visible in the Query Editor with the same name as your source table. The M code visible in the Advanced Editor will be something like this:

[sourcecode language="text" padlinenumbers="true"]
let
    Source =
        #"SQL/localhost;Adventure Works DW",
    dbo_FactInternetSales =
        Source{[Schema="dbo",Item="FactInternetSales"]}[Data]
in
    dbo_FactInternetSales
[/sourcecode]

At this point the query is importing all of the data from this table, but the aim here is to:

  1. Filter the data down to a much smaller number of rows for the Workspace Database
  2. Load all the data in the table after the database has been deployed to the development server

To do this, stay in the Query Editor and create a new Parameter by going to the menu at the top of the Query Editor and clicking Query/Parameters/New Parameter, and creating a new parameter called FilterRows of type Decimal Number with a Current Value of 10:

The parameter will now show up as a new query in the Queries pane on the left of the screen:

Note that at the time of writing there is a bug in the Query Editor in SSDT that means that when you create a parameter, close the Query Editor, then reopen it, the parameter is no longer recognised as a parameter – it is shown as a regular query that returns a single value with some metadata attached. Hopefully this will be fixed soon, but it’s not a massive problem for this approach.
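
This is also why the bug is only a cosmetic annoyance: a parameter really is just a query that returns a single value with some metadata attached, so it still works even when the Query Editor doesn’t display it as one. The M code behind the FilterRows parameter will look something like the sketch below; the exact metadata record that SSDT generates may differ slightly:

[sourcecode language="text"]
// The Current Value of the parameter, followed by the metadata record
// that marks this query as a parameter of type Decimal Number
10 meta
[
    IsParameterQuery = true,
    Type = "Number",
    IsParameterQueryRequired = true
]
[/sourcecode]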

Anyway, with the parameter created you can now use the number that it returns to filter the rows in your table. You could, for example, decide to implement the following logic:

  • If the parameter returns 0, load all the data in the table
  • If the parameter returns a value larger than 0, interpret that as the number of rows to import from the table

Here’s the updated M code from the FactInternetSales query above to show how to do this:

[sourcecode language="text" highlight="6,7,8,9,10,11,12"]
let
    Source =
        #"SQL/localhost;Adventure Works DW",
    dbo_FactInternetSales =
        Source{[Schema="dbo",Item="FactInternetSales"]}[Data],
    FilterLogic =
        if
            FilterRows <= 0
        then
            dbo_FactInternetSales
        else
            Table.FirstN(dbo_FactInternetSales, FilterRows)
in
    FilterLogic
[/sourcecode]

The FactInternetSales query will now return just 10 rows because the FilterRows parameter returns the value of 10:

And yes, query folding does take place for this query.

You now have a filtered subset of rows for development purposes, so you can click the Import button and carry on with your development as usual. Only 10 rows of data will be imported into the Workspace Database:

What happens when you need to deploy to development though?

First, edit the FilterRows parameter so that it returns the value 0. To do this, in the Tabular Model Explorer window, right-click on the Expressions folder (parameters are classed as Expressions, ie queries whose output is not loaded into Analysis Services) and select Edit Expressions:

Once the bug I mentioned above has been fixed it should be easy to edit the value that the parameter returns in the Manage Parameters pane; for now you need to open the Advanced Editor window by clicking the button shown below on the toolbar, and then edit the value in the M code directly:
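
The edit itself is just a change to the literal value at the start of the parameter’s M code. Assuming the parameter definition looks like the sketch shown earlier, after editing it would read something like this:

[sourcecode language="text"]
// Changing the value from 10 to 0 means the FilterLogic step
// will now load all of the rows in the table
0 meta
[
    IsParameterQuery = true,
    Type = "Number",
    IsParameterQueryRequired = true
]
[/sourcecode]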

Then close the Advanced Editor and click Import. Nothing will happen now – the data for FactInternetSales stays filtered until you manually trigger a refresh in SSDT – and you can deploy to your development server as usual. When you do this, all of the data will be loaded from the source table into your development database:

At this point you should go back to the Query Editor and edit the FilterRows parameter so that it returns its original value, so that you don’t accidentally load the full dataset next time you process the data in your Workspace Database.

It would be a pain to have to change the parameter value every time you wanted to deploy, however, and luckily you don’t have to do this if you use BISM Normalizer – a free tool that all serious SSAS Tabular developers should have installed. One of its many features is the ability to do partial deployments, and if you create a new Tabular Model Comparison (see here for detailed instructions on how to do this) it will show the differences between the project and the version of the database on your development server. One of the differences it will pick up is the difference between the value of the parameter in the project and in the development database, and you can opt to Skip updating the parameter value when you do a deployment from BISM Normalizer:

9 thoughts on “Filtering Data Loaded Into A Workspace Database In Analysis Services Tabular 2017 And Azure Analysis Services”

  1. Hi Chris,
    Thanks a lot for an informative post.
    However, after changing the parameter back to 0 nothing really changes. Build & deploy doesn’t transfer the 1 billion rows but still just 10. Any idea why that is?

    1. Chris Webb says:

      Can you script out the database on the server and see if the parameter has actually changed?

  2. Hi Chris,

    I have a 10GB csv in Blob Storage that I’d like to get into AS. When I try to expand the Content of the Blob on my workstation I get a message saying: The evaluation reached the allowed cache limit. Try increasing the allowed cache size. I think this means I don’t have enough RAM in my workstation to bring in this data (which makes sense).

    I wanted to use a method similar to the one you wrote about above but it seems like the second I try to expand the Content I run into this error and don’t get a chance to use Table.FirstN.

    I have a process behind the scenes that takes the 2000 or so CSVs and combines them into that single file. If I’m better served by combining them all in M (so I can get around this lack of memory issue) I can do that as well.

    Thanks,

    Mike

    1. Chris Webb says:

      I haven’t come across this error before, but I don’t think it’s related to memory. Can you go to the Options dialog in Power BI Desktop and on the Data Load tab change the Maximum allowed (MB) setting to a larger value? I think this might help.

  3. Hi Chris,

    Thanks for the post! We implemented this on our big tables. Since we are working with 3 people on one model and deploy through DevOps, people sometimes forget to change the expression back to 0.

    Is there a way to change this expression with an XMLA script? I tried a few variations, but didn’t get it to work.

    Btw, after changing the expression in VS, I need to process a table before VS sees the change in the expression. After this we can use BISM Normalizer or commit through DevOps.

    Kind regards,

    Arthur

  4. Hi Chris,

    Great one!
    One question: can we use the parameters defined here in Power BI reports (Power BI Desktop and then the service)?
