PASS Summit Day 3

After the distraction of the last few days, here’s a quick post about what I saw on day three of the PASS Summit last week. It wasn’t quite as exciting as the previous two days, but I did find out a bit more about two products I was interested in: Impact Analysis & Lineage, and Data Quality Services.

In fact I didn’t find out much about Impact Analysis & Lineage; it didn’t get much exposure at all for some reason. That’s a shame, because I think it’s potentially a very useful service. It allows you to see what impact a change made in one place in your BI solution has elsewhere. For example, if you rename a column in a dimension table it could break SSIS packages, SSAS dimensions and SSRS reports, but today it’s pretty difficult to know in advance what will be affected. Impact Analysis & Lineage, as far as I gathered, is a service that crawls all the files associated with each part of your BI solution and builds up a list of dependencies between them to help solve this problem. I suspect it wasn’t demoed because it’s not finished yet.
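To make the dependency idea concrete, here is a minimal Python sketch of the kind of lookup such a service might perform. It is not how Impact Analysis & Lineage is actually implemented (I haven’t seen that). The object names, the `DEPENDENCIES` list and the `build_graph` and `impacted_by` functions are all made up for illustration, and the crawling step that would normally discover the dependencies is replaced by a hard-coded list.

```python
from collections import defaultdict, deque

# Hypothetical dependency edges: (upstream object, downstream object that reads it).
# A real crawler would discover these by parsing package, cube and report files.
DEPENDENCIES = [
    ("DimCustomer.CustomerName", "LoadCustomers.dtsx"),
    ("DimCustomer.CustomerName", "Customer dimension (SSAS)"),
    ("Customer dimension (SSAS)", "Sales by Customer.rdl"),
    ("FactSales.Amount", "Sales cube (SSAS)"),
]


def build_graph(edges):
    """Map each object to the set of objects that depend on it directly."""
    graph = defaultdict(set)
    for upstream, downstream in edges:
        graph[upstream].add(downstream)
    return graph


def impacted_by(graph, changed):
    """Return every object downstream of `changed`, found breadth-first."""
    seen = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(seen)


if __name__ == "__main__":
    graph = build_graph(DEPENDENCIES)
    for name in impacted_by(graph, "DimCustomer.CustomerName"):
        print(name)
```

Renaming the hypothetical `DimCustomer.CustomerName` column flags the SSIS package and the SSAS dimension directly, and the report indirectly through the dimension, which is exactly the kind of knock-on effect that is hard to spot by hand.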

We saw a lot more about Data Quality Services. DQS is a substantial new product that allows end users and BI developers to create rules about what data is valid in a given scenario, and then apply these rules to perform automated data cleansing. It’s not, apparently, a rebadged version of the Zoomix product that MS bought a while ago; it’s a lot more ambitious than that. Example scenarios include cleaning addresses by comparing them to a master address list, possibly sourced from the Azure DataMarket; using regular expressions to ensure that valid URLs and stock ticker symbols are stored in a table containing information about companies; and using fuzzy matching on names and addresses to find groups of customers who live together in the same household. Although some people I talked to were a bit put off by the bug-ridden demo, I was quite impressed by what I saw: a lot of thought seemed to have gone into it and the UI looks good. I think there’ll be an SSIS component that will allow you to apply your rules within a data flow too, but that wasn’t shown.
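For anyone who hasn’t seen this kind of rule before, here is a rough Python sketch of two of the scenarios above: regular-expression validation and fuzzy household matching. It uses only the Python standard library and has nothing to do with the actual DQS engine or its rule format. The patterns, the 0.8 threshold, the field names and the sample rows are all assumptions of mine, and a real product would use far more robust matching than `difflib`.

```python
import re
from difflib import SequenceMatcher

# Assumed, deliberately simple patterns; real ticker and URL rules are messier.
TICKER_RE = re.compile(r"[A-Z]{1,5}")
URL_RE = re.compile(r"https?://[\w.-]+\.[a-z]{2,}(/\S*)?", re.IGNORECASE)


def is_valid_company(row):
    """A row passes only if both its ticker and its URL match the patterns."""
    return bool(TICKER_RE.fullmatch(row["ticker"])) and bool(
        URL_RE.fullmatch(row["url"])
    )


def similarity(a, b):
    """Case-insensitive similarity score between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def same_household(c1, c2, threshold=0.8):
    """Treat two customers as one household if surname and address both match closely."""
    return (
        similarity(c1["surname"], c2["surname"]) >= threshold
        and similarity(c1["address"], c2["address"]) >= threshold
    )


if __name__ == "__main__":
    # Hypothetical sample data.
    print(is_valid_company({"ticker": "MSFT", "url": "http://www.microsoft.com"}))
    alice = {"surname": "Webb", "address": "12 High Street"}
    bob = {"surname": "Webb", "address": "12 High Str."}
    carol = {"surname": "Webb", "address": "98 Mill Lane"}
    print(same_household(alice, bob))
    print(same_household(alice, carol))
```

The threshold is the interesting design choice: set it too high and abbreviations like "Str." stop matching, set it too low and unrelated customers with the same surname get grouped together.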
