BeginRange and EndRange connection string properties

Using the Timeout connection string property is a good way of making sure that your queries don’t run for too long, but sometimes – for example when you’re using SSRS – you want to restrict the amount of data that a query returns. You can’t properly do this with Analysis Services, but it is almost possible…

Consider the following query on Adventure Works:

SELECT
{[Measures].[Internet Sales Amount],[Measures].[Internet Tax Amount]}
ON 0,
[Date].[Date].[Date].MEMBERS
ON 1
FROM [Adventure Works]

It returns 1189 rows and 3 columns. If you click on any of the cells containing data in SQL Management Studio, to see the cell properties, you’ll see that the CellOrdinal property contains the index of each cell in the cellset. So the top left hand cell is ordinal 0, the one to its right is 1, and so on until the last column where it starts again one row down:

image

Using the BeginRange and EndRange connection string properties, you can limit the cells in a cellset that actually get populated with data. Note that you can’t restrict the overall number of cells though, which would be more useful. Both these properties take an integer value which represents a cell ordinal: BeginRange is the first cell ordinal you want to contain data, EndRange is the last cell ordinal. Their default value is –1, which for BeginRange means start at the first cell ordinal and for EndRange means end at the last cell ordinal. So, for example, with BeginRange=4 and EndRange=7, running the query above would give the following output:

As I said, the overall number of cells in the cellset remains the same, but only the cells in the range we specified actually contain data. This ‘filtering’ happens after the query axes have been resolved, as far as I can see, so adding NON EMPTY on Rows for example does not filter out any of the empty rows. If you were using SSRS, however, you could do this filtering at the DataSet level.

If you look in Profiler you’ll see that these properties have an affect on the amount of work SSAS does at query time. On a cold cache, with no BeginRange and EndRange set, the query scans all of the year partitions in the Internet Sales measure group as you would expect. But with BeginRange and EndRange set as above, on a cold cache SSAS only reads data from the 2001 partition.

BTW, remember that if you’re experimenting with these connection string properties in SQLMS, when you’re finished you’ll need to either close and reopen SQLMS or set BeginRange=-1 and EndRange=-1 as a result of this bug (which still doesn’t seem to be fixed in SP1).

14 thoughts on “BeginRange and EndRange connection string properties

  1. In my case users connect to the cube with Excel (2003/2007). Is there a way to force these connection string properties to be used for all connections (i.e. without relying on users manually entering it in Excel)? My intention here is to encourage (i.e. force) the use of cubes for high-level analysis and prevent users from designing queries that result in 1000s of cells being returned, which results in support problems (server overloaded, Excel hanging etc).

  2. Would you agree this is mostly commonly a design time problem generated part way through query design and before the full scope of the query is understood?In Intelligencia we automatically Subset the axes of queries in design time to limit the amount of data returned. This improves query building performance without affecting the final result. The user can "page" through the data if they need to see parts of the query that are not returned by default.

  3. Hi,I want to restrict the Cube 2005 to show limited number of rows even User select more than 1000 record.How can i do it.Please help me in it.

  4. They\’re connection string properties – you need to add them to the connection string that is used whenever you connect to Analysis Services. How you do that will depend on the client tool you\’re using – which one are you using?

  5. If you set the EndRange to a value greater than the number of cells returned then it throws an error. This makes it useless when trying to restrict the number of cells returned to avoid large queries hitting the server because I can’t determine the number of cells beforehand. It’s very disappointing that there is no equivalent of the SQL Top N.

    1. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
      Chris Webb says:

      There are several MDX functions you could use to do this, such as Head().

      1. Yes but they are evaluated only in the context of the current axis. For example the following query returns 6 cells.

        SELECT
        NON EMPTY {[Product].[Product Categories].[Category].&[1], [Product].[Product Categories].[Category].&[3]} * [Measures].[Internet Sales Amount] on 0,
        NON EMPTY {[Customer].[Customer Geography].[Country].Members} on 1
        FROM [Adventure Works]
        WHERE [Sales Territory].[Sales Territory].[Group].&[Europe]

        If I want to restrict it to a max of 2 cells or 2 rows/columns I could try SUBSET/HEAD/TOPCOUNT

        SELECT
        NON EMPTY {[Product].[Product Categories].[Category].&[1], [Product].[Product Categories].[Category].&[3]} * [Measures].[Internet Sales Amount] on 0,
        NON EMPTY HEAD({[Customer].[Customer Geography].[Country].Members},2) on 1
        FROM [Adventure Works]
        WHERE [Sales Territory].[Sales Territory].[Group].&[Europe]

        But this returns no data.

        Ultimately I want to prevent massive queries from reducing performance for other users because users can generate their own queries (using a OLAP client tool). I can use the timeout properties but doesn’t necessarily stop large queries from using up all the RAM on the server.

      2. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
        Chris Webb says:

        Your second query returns no data because the NON EMPTY is evaluated after the HEAD() function. What you need to do here is not use NON EMPTY and instead use the NONEMPTY() function inside the HEAD() function, so the non empty filtering takes place first.

        Anyway… this would only work if you have control over the MDX generated by your client tool, and to be honest there is no foolproof way of preventing end users from running massive queries. The only thing you can do is design your cube (particularly your hierarchies) to make it difficult for this to happen, but even then that’s not always going to be possible. Until SSAS gets a proper resource governor (if it ever does) you could try something like this to kill expensive queries automatically: http://cwebbbi.wordpress.com/2008/12/04/killing-sessions-automatically-with-ssis/

      3. I still don’t think NonEmpty() works unless I move things to the slicer. I’m guessing NonEmpty() only takes the context of the current axis plus the slicer, not all axes. If I only make that one change in my example I get just Australia and Canada with no data. I have to do the following but then get the error “The cell range provided is out of the acceptable range.”

        SELECT
        NON EMPTY {[Product].[Product Categories].[Category].&[1], [Product].[Product Categories].[Category].&[3]} on 0,
        HEAD(NonEmpty({[Customer].[Customer Geography].[Country].Members} ), 2) on 1
        FROM [Adventure Works]
        WHERE CROSSJOIN([Sales Territory].[Sales Territory].[Group].&[Europe] ,[Internet Sales Amount])

        I do have control over the generated MDX but since the generated MDX can cell calculations, calc members, MDX functions, etc I would need a solution that works all the time.

        Your SSIS example looks good. I’ve had a play with SELECT * FROM $System.Discover_Commands etc by running large queries and looking at the results but while the elapsed time updates the read KB doesn’t seem to be refreshed. It also seems hard to get the memory in use by a session at a point in time.

        I did see a SSAS property called SessionMemoryLimit which sounds promising but the only information I can find about it is “An advanced property that you should not change, except under the guidance of Microsoft support.”

        To summarize you’re right; there is no foolproof way of preventing end users from running massive queries.

      4. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
        Chris Webb says:

        I think this query does what you want (you need to set the second parameter of the NonEmpty() function appropriately):

        SELECT
        NON EMPTY
        {[Product].[Product Categories].[Category].&[1], [Product].[Product Categories].[Category].&[3]}
        *
        {[Measures].[Internet Sales Amount]}
        on 0,
        HEAD(
        NONEMPTY(
        {[Customer].[Customer Geography].[Country].Members},
        {[Product].[Product Categories].[Category].&[1], [Product].[Product Categories].[Category].&[3]}
        *
        {[Measures].[Internet Sales Amount]})
        ,2)
        on 1
        FROM [Adventure Works]
        WHERE [Sales Territory].[Sales Territory].[Group].&[Europe]

  6. That does look promising, it at least works here. I’ll try that on some more examples and then check the impact on server memory consumption. Thanks for the suggestion.

    Note: I can also shorten this a little using Axis(0):
    SELECT
    NON EMPTY
    {[Product].[Product Categories].[Category].&[1], [Product].[Product Categories].[Category].&[3]}
    *
    {[Measures].[Internet Sales Amount]}
    on 0,
    HEAD(
    NONEMPTY(
    {[Customer].[Customer Geography].[Country].Members}, Axis(0))
    ,2)
    on 1
    FROM [Adventure Works]
    WHERE [Sales Territory].[Sales Territory].[Group].&[Europe]

Leave a Reply to LaurenceCancel reply