More fun with Reporting Services. The other week I was working with a customer who I suspected was experiencing query performance problems as a result of the issue described here:
http://blogs.msdn.com/sqlcat/archive/2007/10/19/ssas-small-big-query-interaction.aspx
Certainly in Profiler I could see some fairly trivial queries that were taking much longer than they should have done, and which ran much faster when I tried them on a test system. And certainly when I ran a ‘big query’ and a ‘small query’ simultaneously I saw the ‘small query’ run slow, and when I got the secret ini file settings from PSS my ‘small queries’ ran much faster. But there was still some doubt in my mind over the extent to which the customer was suffering from this problem – just by looking at my Profiler traces from the Production servers I couldn’t tell whether lots of small queries were overlapping with the big queries.
What I then thought was that it would be useful to be able to visualise the information from a Profiler trace in some way. I did some experiments with different chart types and different tools, mainly Excel 2007, but in the end I went with Reporting Services 2005 and the Dundas Gantt chart control. I’m not sure whether the Gantt chart has made it into the current RS2008 CTP but I guess it will soon if it hasn’t; in any case the data was coming from a 2005 system so that’s what I used.
The first step was to get the data out of Profiler into some useful format; the easiest way turned out to be to get Profiler to write directly to a SQL Server table, and for the problem here all I needed was the Query End events, and specifically the Duration, SPID, TextData, ConnectionID, EndTime, EventClass, EventSubClass and NTCanonicalUserName columns. I could now run my Profiler trace against the Production system and capture some data.
The next problem was to manipulate the data into a usable format. Gantt charts need a start time and a duration for each event, and the problem with trace data is that you only know how long something took when it has finished, ie when you have an ‘End’ event. Subtracting the Duration from the EndTime gives a derived start time for each event, which is one way of dealing with the problem of working out when queries started and finished. There are Query Begin and Query End events too, and perhaps a better solution would have been to match each Query End event to its Query Begin event, relying on the fact that only one query can be executing on any given session at any one time. In fact I started off more ambitiously by including other types of events, such as Progress Reports, but the problem I found there was that the StartTimes and EndTimes given by Profiler are rounded to the nearest second, which meant that subtracting the Duration from the EndTime didn’t give accurate derived start times, and events were coming out in what was clearly the wrong order when I tried to order them by derived start time (some of the SQL I’ll give below is more complex than necessary for just the Query End event because it was written for these other event types too). It would be nice to visualise the various internal events that are generated when you run an individual query, such as reads from partitions and cache, but I need to do some more work before that’s possible.
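Incidentally, if you wanted to try the matching approach, something like the following should work, though I haven’t tested it properly – note that I’m assuming EventClass 9 is Query Begin and 10 is Query End in the trace table, which is worth double-checking against your own trace definition:
--Sketch of the matching alternative: pair the nth Query Begin on a
--connection with the nth Query End, since a session can only run one
--query at a time (assumes EventClass 9 = Query Begin, 10 = Query End,
--and that the trace didn't start in the middle of a query)
with Ordered as
(
select
    a.RowNumber, a.EventClass, a.ConnectionID,
    a.StartTime, a.EndTime, a.Duration, a.TextData,
    row_number() over(partition by a.ConnectionID, a.EventClass
        order by a.RowNumber) as Seq
from dbo.[102QE] a
where a.EventClass in (9, 10)
)
select
    b.ConnectionID,
    b.StartTime, --the true start time, from the Query Begin event
    e.EndTime,
    isnull(e.Duration, 0) as DurationMSecs,
    e.TextData
from Ordered b
inner join Ordered e
    on b.ConnectionID = e.ConnectionID
    and b.Seq = e.Seq
where b.EventClass = 9
and e.EventClass = 10
order by b.StartTime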
Anyway, with my Query End events I wrote the following SQL query to use in Reporting Services which finds the derived start time for each query, then makes the first query to start at time 0 and calculates all other derived start times relative to that first query. As always I need to apologise for my SQL in advance:
with RelativeStartTimes (OriginalRowNumber, RelativeStartTime, EventClass, DurationMSecs,
TextData, ConnectionID, EventSubClass, NTCanonicalUserName)
as
(
select
    a.RowNumber,
    --Find the Relative Start Time of each event by finding the
    --difference in seconds between the End Time of the event and
    --the earliest End Time, converting to ms, then subtracting the Duration
    (datediff(ss,
        c.EarliestEnd,
        isnull(a.EndTime, a.StartTime)
    ) * 1000) - isnull(a.Duration, 0) as RelativeStartTime,
    a.EventClass,
    isnull(a.Duration, 0) as DurationMSecs,
    a.TextData,
    a.ConnectionID,
    a.EventSubclass,
    a.NTCanonicalUserName
from dbo.[102QE] a
cross join
    --Get the earliest End Time in the trace
    (select min(isnull(b.EndTime, b.StartTime)) as EarliestEnd
    from dbo.[102QE] b
    where b.RowNumber > 0
    ) c
where a.RowNumber > 0
)
select
    row_number() over(order by (r.RelativeStartTime - x.EarliestRelStartTime) asc) as row,
    r.OriginalRowNumber,
    r.RelativeStartTime,
    x.EarliestRelStartTime,
    r.RelativeStartTime - x.EarliestRelStartTime as BeginTime,
    r.RelativeStartTime - x.EarliestRelStartTime + r.DurationMSecs as EndTime,
    r.EventClass,
    r.DurationMSecs,
    r.TextData,
    r.ConnectionID,
    r.EventSubclass,
    r.NTCanonicalUserName
from RelativeStartTimes r
cross join
    (--Find the earliest Relative Start Time
    select min(RelativeStartTimes.RelativeStartTime) as EarliestRelStartTime
    from RelativeStartTimes
    ) x
I could then use this query as the basis for my Reporting Services report. As always with these things I spent most of my time working out how to use the Gantt chart, but once I had, and after I’d added some other funky stuff that allowed me to filter by row number and start/end time, and to click on an item in the Gantt chart to view the MDX for the query that a particular line represented, I got a report looking like this:
Another problem to mention here is that, as with this report, when you’ve got a lot of queries to look at the Gantt chart takes a loooooong time to render. Anyway, from this initial view you can see that almost all queries executed very quickly but fairly early on there was a very long query, represented by the longish brown line at around event 250 that I’ve circled. Filtering the view down to show this query and those around it in more detail gets you this:
Although the vast majority of queries here execute immediately (they’re just small queries the front end uses to retrieve metadata etc and are represented as single blobs) I’ve circled a few queries around the main query that look like they are taking a few seconds to run. And indeed, running at least one of these queries on the test server showed it runs faster on its own than it did when this Profiler trace was running. Which proved my point.
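Incidentally, the row number and start/end time filtering I mentioned above is nothing clever: if you save the big query above as a view, the report dataset can simply wrap it with some report parameters, something along these lines (the view name and parameter names here are made up for illustration):
--Filter the Gantt chart data by row number and by relative time in ms;
--dbo.QueryTimeline is the earlier query saved as a view (hypothetical name)
select row, BeginTime, EndTime, DurationMSecs, TextData, ConnectionID
from dbo.QueryTimeline
where row between @StartRow and @EndRow
and BeginTime >= @FromMs
and EndTime <= @ToMs
order by row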
Moving away from this particular exercise, I’m fairly surprised that none of the big SQL Server tool vendors have tackled the problem of monitoring Analysis Services performance and activity. The only tool I know of that does this is Companion for Analysis Services:
http://www.sqlminds.com/Products/CompanionforMSAnalysisServer/tabid/119/Default.aspx
Where are the others, though? I would have thought Analysis Services would be the obvious next step for them, given that many SQL Server relational shops use the MS BI stack too; certainly, whenever I talk to enterprise customers they always ask me what tools are available. I guess there are three things going on here: first of all, the all-too-common ignorance of Analysis Services in the wider SQL Server community; second, the problem that no-one (not even the SSAS experts) has really sat down and thought about what needs to be monitored in an Analysis Services deployment; and thirdly, at least before AS2008, the hooks for getting at data that can be used for monitoring either haven’t been available or have been well hidden. This post and my last post on viewing and killing queries in Reporting Services have given a few ideas on what’s possible, though, and hopefully within the next few years Microsoft and/or third party vendors will give us the tools we need.
Hi Chris – very nice post.
My thoughts about SSAS monitoring:
In SQL Server 2005 it is very easy to get a lot of performance information by using DMVs, and I believe that many SQL Server vendors use DMVs to report on SQL Server performance and activity. There are no DMVs in SSAS 2005. This might change with SSAS 2008, as there are already some DMVs that make it easy to find long-running queries, memory usage, etc. Hopefully Microsoft will add more DMVs in future SSAS releases. SQL 2005 has DMVs that show missing index statistics – imagine if SSAS had missing aggregation DMVs, that would be very cool. I do hope that with the new SSAS 2008 DMVs more vendors will start building SSAS monitoring tools.
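For example, something like this should work against SSAS 2008 – run it as a DMV query in an MDX query window; I’m going from memory on the exact column names, so verify them against the DISCOVER_COMMANDS schema in Books Online:
--Find the longest-running commands currently executing on the server
--(column names from memory - check the DISCOVER_COMMANDS rowset definition)
SELECT SESSION_SPID, COMMAND_START_TIME, COMMAND_ELAPSED_TIME_MS, COMMAND_TEXT
FROM $SYSTEM.DISCOVER_COMMANDS
ORDER BY COMMAND_ELAPSED_TIME_MS DESC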
Good blog; I’ve referred to it in my own blog: http://jenstirrup.blogspot.com/2009/07/using-intelligent-laziness-to-deliver.html
Cool way to look at trace data, I played around with your idea a little here: http://timlaqua.com/2009/10/charting-analysis-services-2008-trace-events/