Power Query Book Published!

Looking for some summer holiday (or winter holiday, depending on which hemisphere you live in) reading? If so, may I suggest my new Power Query book? “Power Query for Power BI and Excel” is available now from the Apress site, Amazon.com, Amazon.co.uk and all good bookstores.

It’s an introductory level book. It covers all of the stuff you can do in the UI, it has a chapter on M, and it goes into a reasonable amount of detail on more advanced topics; it is not a 500-page exhaustive guide to the product. I’ve focused on readability and teaching the fundamentals of Power Query rather than every looking at every obscure M function, but at the same time if you’ve already used Power Query I think there’ll be plenty of material in there you’ll find interesting.

Now for the bad news: the book is out-of-date already, although not by much. One of the best things about Power Query is the monthly release cycle; unfortunately that makes writing a book on it a bit of a nightmare. I started off writing in January and had to deal with lots of added functionality and changes to the UI over the next few months; I had to retake pretty much all of the screenshots as a result. The published version of the book is based on the version of Power Query that was released in early June rather than the current version. Hopefully you can forgive this – the differences are minor – but it’s a good reason to buy the book as soon as you can! I want to do a second edition in a year’s time once (if?) the release cycle slows down.

I’ve been teased a bit for blogging and teaching so much about Power Query recently, so the final thing I want to say here is why an old corporate BI/SSAS guy like me is getting so excited about a self-service ETL tool. Well, the main reason is that Power Query is a great piece of software. It does what it does very well; it does useful things rather than what the marketing guys/analysts/journalists think is hot in BI; it is easy to use but at the same time is flexible enough for the advanced user to do really complex stuff; it is updated regularly based on feedback from its users. I only wish all Microsoft software was this good… Honestly, I wouldn’t be able to motivate myself to blog and write about Power Query if I didn’t think it was cool, and even though it hasn’t been hyped in the same way as other parts of the Power BI stack it is nonetheless the part that people get excited about when I show them Power BI. It’s not just me either – every day I see positive comments like Greg Low’s here. I think it is as important, if not more important, than Power Pivot and I think it will be a massive success.

Oh, and did I mention that I’m also teaching a Power Query course in London later this year….?

13 thoughts on “Power Query Book Published!

    1. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
      Chris Webb says:

      Hi David,

      The short answer is no, Power Query always has to reload all of the data every time. However you can get around this in the way I detail here: http://cwebbbi.wordpress.com/2013/05/13/accumulating-data-in-an-excel-table-using-data-explorer-and-powerpivot/ although I don’t think this is a particularly elegant solution and it’s not one I’d feel comfortable relying on.

      Chris

      1. Well, then I guess I will have to pull out my trusty VBA sword and get to work.

    2. There is a fairly simple method to get PowerQuery to do this work. I whipped a file together to do something like this that involves a little setup, but 3 quick and easy to expand on queries and a simple update process.
      Setup 3 queries in a book Query1, Query2, Query3. Query1 will not load to the workbook, but make it just a reference to statically named file that gets updated. The query text will be like:

      Csv.Document(File.Contents(“C:\Data 02.txt”),null,”,”,null,1252)
      // notice no need for ‘let … in’ in such short code

      Query2 is going to compare the existing (held in query 3) to the new file (query 1) I am going to explain this one last because it is the most complex and will change based on your situation.

      Query3 is going to function as alternatively a placeholder of old data for basing query2 off of, and a repository of the completed data after running Query2 is complete (a copy of the end result of Query2). Rather than reference Query2, we reference the output of query 2 as an Excel Table. The query code is simply:

      Excel.CurrentWorkbook(){[Name=”Query2″]}[Content]
      //Again no need for “Let … In”.
      //NOTE: this follows the convention that the Excel output tables are given the same name as the queries on which they are based.
      //This is an important technical distinction and procedural imperative for the example)

      So, back to Query2. This query takes Query2 as given rows that should not be duplicated. The table is transformed into a list of records for compatibility with the next function with Table.ToRecords. THen We use Table.RemoveMatchingRecords on Query1 (which presumably has the duplicates), this subtracts those duplicate rows leaving only the new data. Then we simply combine Query3 with that new data. The whole thing is as short as:

      Table.Combine( { Query3, Table.RemoveMatchingRows(Query1, Table.ToRecords(Query3)) } )
      //Again a one-liner with no need for “let … in” however the line could be decomposed if desired.

      I hope this helps to anyone interested. The key take-away is that the results of a query can be considered static for downstream queries by using Excel.CurrentWorkbook on the loaded results of a query. This is also sometimes desirable when a portion of a process takes a very long time to load but can be considered static at some intermediate step to separate the bottleneck from additional downstream processing that need not suffer from the bottleneck (especially useful during query development).

      As a result, I can also make the entire thing above self referencing via

      let

      Q1 = Csv.Document(File.Contents(“C:\Data 02.txt”),null,”,”,null,1252),

      Self = Excel.CurrentWorkbook(){[Name = “Query123”]}[Content],

      CombineQ1AndSelf = Table.Combine( { Self, Table.RemoveMatchingRows(Q1, Table.ToRecords(Self)) } )
      in
      CombineQ1AndSelf //assumes query & output table both named “Query123”

  1. Sample files for the book?

    Hi Chris,
    Book arrived from Amazon a couple of weeks ago … just starting to read it now.
    With other books … Marco & Alberto, Rob Collie etc there was an attached disk with example workbooks or else it was possible to download the files. I cannot find an equivalent for the Power Query book. It would make reading and understanding a lot easier if you can replicate what is going on in the book.

    How can I get my hands on the samples .. for example there is reference to 01_01_SimpleSales.csv in Chapter 1.

    Help!

    Ted

  2. Um….I’m trying to follow your book using Power Query Version: 2.14.3722.242.
    I’m at Figure 1-4 and I can’t can’t find the Navigator.

    1. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
      Chris Webb says:

      There have been a lot of changes in the Power Query UI in the three months since the book was published, unfortunately. In this case the Navigator pane is no longer visible for the query shown – but it will become visible in other scenarios.

      1. Chris Webb – My name is Chris Webb, and I work on the Fabric CAT team at Microsoft. I blog about Power BI, Power Query, SQL Server Analysis Services, Azure Analysis Services and Excel.
        Chris Webb says:

        I’ll apologize in advance because you’re going to run into a lot of issues like this as you read the book. A lot has changed. However the really important stuff (like the M language) is unchanged.

Leave a Reply to David HagerCancel reply