As Buck Woody notes here, when you are exploring a new data set it can be useful calculate some basic descriptive statistics. One new M function that appeared in Power BI recently can help you to do this: Table.Profile(). This function takes a value of type table and returns a table that displays, for each column in the original table, the minimum, maximum, average, standard deviation, count of values, count of null values and count of distinct values (but no mode or median?). So, given the following table:
…the Table.Profile() function returns the following table:
Of course you could create something similar yourself fairly easily (as I have done for a customer in the past), and it’s not as sophisticated as the Quick Insights feature, but it’s handy to have a single function that does all this.
You could even use it on all of the tables in a SQL Server database. Since the Sql.Database() function returns a table containing all of the tables and views in a database, like so:
All you need to do to use Table.Profile() on all these tables is to add a new custom column that calls this function for every value in the Data column:
Then finally expand the new custom column and you’ll see the stats for every column in every table:
Here’s the code:
let Source = Sql.Database("localhost", "adventure works dw"), #"Added Custom" = Table.AddColumn(Source, "Profile", each Table.Profile([Data])), #"Expanded Profile" = Table.ExpandTableColumn(#"Added Custom", "Profile", {"Column", "Min", "Max", "Average", "StandardDeviation", "Count", "NullCount", "DistinctCount"}, {"Column", "Min", "Max", "Average", "StandardDeviation", "Count", "NullCount", "DistinctCount"}) in #"Expanded Profile"
This is super helpful – thanks for posting!
A litte drawback on columns being “autodetected” as “any”: They won’t return Min, Max and Distinctcount. Text Columns do that – so this might have taken you by surprise like me.
Any comments on performance – doing that kind of analysis on large tables would be expensive I would think – especially from an IO perspective.
Yes, performance could be a problem on large tables – especially if you were working with a data source where query folding wasn’t taking place. I haven’t tested it on a really large table though, so I can’t say more than that.
[…] Bron : Chris Webb’s BI Blog Lees meer… […]
I’ve attempted this multiple times with a very basic table of data and I keep getting an expression.error: The field ‘xxxx’ of the record wasn’t found.
I imported a text file of data into excel, and gave it column headers, then formatted it as a table.
Table name: Table2
Column 1: Name
Column 2: State abbreviation
Column 3: Decision indicated with 1 or 0 (0 = yes, 1 = no)
Column 4: Another decision indication with 1 or 0
Column 7: percentage represented as a whole number
Column 8: year
All columns with numbers in them have been changed to type ‘whole number’ and ‘text’ and every time I try the Table.Profile([Table2]) I receive expression.error: The filed ‘Table2’ of the record wasn’t found. I get this error for every row.
Any suggestions? Thanks.
Can you post the full M code for your query (from the Advanced Editor window) please? It looks like you need to remove the square brackets from your expression and use Table.Profile(Table2), and use this expression on its own as a step and not inside a custom column
The additional Columns are all decision columns as well marked by 1 or 0.
let
Source = Excel.Workbook(File.Contents(“C:\Users\lpeterso\Desktop\Book2.xlsx”), null, true),
Table2_Table = Source{[Item=”Table2″,Kind=”Table”]}[Data],
#”Changed Type” = Table.TransformColumnTypes(Table2_Table,{{“Column 1”, type text}, {“Column 2”, type text}, {“Column 3”, Int64.Type}, {“Column 4”, Int64.Type}, {“Column 5”, Int64.Type}, {“Column 6”, Int64.Type}, {“Column 7”, Int64.Type}, {“Column 8”, Int64.Type}, {“Column 9”, Int64.Type}, {“Column 10″, Int64.Type}}),
#”Reordered Columns” = Table.ReorderColumns(#”Changed Type”,{“Column 1”, “Column 2”, “Column 3”, “Column 4”, “Column 5”, “Column 6”, “Column 7”, “Column 8”, “Column 9”, “Column 10″}),
#”Changed Type1″ = Table.TransformColumnTypes(#”Reordered Columns”,{{“Column 3”, Int64.Type}, {“Column 4”, Int64.Type}, {“Column 5”, Int64.Type}, {“Column 6”, Int64.Type}, {“Column 7”, Int64.Type}, {“Column 8”, Int64.Type}, {“Column 9”, Int64.Type}, {“Column 10″, Int64.Type}}),
#”Added Custom” = Table.AddColumn(#”Changed Type1″, “Custom”, each Table.Profile([Table2])),
Custom = #”Added Custom”{0}[Custom]
in
Custom
OK, does this do what you are expecting?
let
Source = Excel.Workbook(File.Contents(“C:\Users\lpeterso\Desktop\Book2.xlsx”), null, true),
Table2_Table = Source{[Item=”Table2″,Kind=”Table”]}[Data],
#”Changed Type” = Table.TransformColumnTypes(Table2_Table,{{“Column 1”, type text}, {“Column 2”, type text}, {“Column 3”, Int64.Type}, {“Column 4”, Int64.Type}, {“Column 5”, Int64.Type}, {“Column 6”, Int64.Type}, {“Column 7”, Int64.Type}, {“Column 8”, Int64.Type}, {“Column 9”, Int64.Type}, {“Column 10″, Int64.Type}}),
#”Reordered Columns” = Table.ReorderColumns(#”Changed Type”,{“Column 1”, “Column 2”, “Column 3”, “Column 4”, “Column 5”, “Column 6”, “Column 7”, “Column 8”, “Column 9”, “Column 10″}),
#”Changed Type1″ = Table.TransformColumnTypes(#”Reordered Columns”,{{“Column 3”, Int64.Type}, {“Column 4”, Int64.Type}, {“Column 5”, Int64.Type}, {“Column 6”, Int64.Type}, {“Column 7”, Int64.Type}, {“Column 8”, Int64.Type}, {“Column 9”, Int64.Type}, {“Column 10″, Int64.Type}}),
Profile = Table.Profile(#”Changed Type1”)
in
Profile
BTW it looks like some of the double quotes characters get changed here, so you might need to do a find/replace on them to get it all working properly.
I now have the following
let
Source = Excel.Workbook(File.Contents(“C:\Users\lpeterso\Desktop\Book2.xlsx”), null, true),
Table2_Table = Source{[Item=”Table2″,Kind=”Table”]}[Data],
#”Changed Type” = Table.TransformColumnTypes(Table2_Table,{{“Column 3”, type text}, {“Column 4”, type text}, {“Column 5”, type text}, {“Column 6”, type text}, {“Column 7”, type text}, {“Column 8”, type text}, {“Column 9”, type text}, {“Column 10″, type text}}),
#”Added Custom” = Table.AddColumn(#”Changed Type”, “Custom”, each Table.Profile(Table2)),
Custom = #”Added Custom”{0}[Custom]
in
Custom
and got this error
Expression.Error: A cyclic reference was encountered during evaluation.
I feel like I’m missing something incredibly simple that I’m looking over because I’ve been staring at it for so long, but I can’t find it. 🙂
@Lauren
I guess you should not use the name “Custom” twice:
Custom = #”Added Custom”{0}[Custom]