Power BI Custom Data Connector For Language Detection, Key Phrase Extraction And Sentiment Analysis

I’m pleased to announce that I’ve published my first Power BI custom data connector on GitHub here:

https://github.com/cwebbbi/PowerBITextAnalytics

Basically, it acts as a wrapper for the Microsoft Cognitive Services Text Analytics API and  makes it extremely easy to do language detection, sentiment analysis and to extract key phrases from text when you are loading data into Power BI.

Full documentation for the Text Analytics API can be found here and there is more detailed documentation available for the Detect Language, Key Phrases and Sentiment APIs. You can learn more about Power BI custom data connectors here and here.

Note: you will need to sign up for the Text Analytics API and obtain an access key before you use this custom data connector. You’ll be prompted to enter the access key in Power BI the first time you use the custom data connector. A number of pricing tiers are available, including a free tier that allows for 5000 calls per month. The custom data connector batches requests so that you can send up to 1000 individual pieces of text per call to the API.

Why build a custom data connector for this? Well, first of all, text analysis in Power BI and Power Query is something I’ve been interested in for a long time (see here for example), and I know a lot of other people want to do this too. However, calling any API – and the Microsoft Cognitive Services APIs in particular – involves a lot of tricky M code that is beyond most Power BI users. I certainly didn’t find it easy to write this custom data connector! I know Gil Raviv has blogged about how to use the Sentiment analysis API this data connector calls in two posts (here and here) but he doesn’t handle all the limitations of the API, including the 1MB limit per request, in his examples – which just goes to show what a complex task this is. Wrapping up the code for calling the Text Analytics API in a custom data connector hides this complexity from the developer, makes the code a lot more portable, and the fact that the code is open source means the community can work together to fix bugs and add new features. I welcome any contributions that anyone wants to make and I know there are a lot of improvements that can be made. Certainly the documentation is a bit sparse right now and I’ll be adding to it over the next week or so.

This is not quite a traditional custom data connector in the sense that it doesn’t act as a data source in its own right – you have to pass data to it in order to get data back. It exposes three M functions:

  • TextAnalytics.DetectLanguage(inputtext as list, optional numberoflanguages as number) as table
    This function takes a list of text values and returns a table containing the input text and the language detected in each piece of text
  • TextAnalytics.KeyPhrases(inputtext as list, optional languages as list) as table
    This function takes a list of text values (and an optional list of language identifiers for each piece of text) and returns a table containing the input text and key phrases detected in each piece of text. More than one key phrase may be returned for each piece of text.
  • TextAnalytics.Sentiment(inputtext as list, optional languages as list) as table
    This function takes a list of text values (and an optional list of language identifiers for each piece of text) and returns a table containing the input text and a score representing the sentiment detected for each piece of text.

Here are a few simple examples of how to use these functions:

First, the TextAnalytics.DetectLanguage() function. This query:

let
    input = {"hello all", "bonjour", "guten tag"},
    result = TextAnalytics.DetectLanguage(input)
in
    result

Returns the following table:

image

For the TextAnalytics.KeyPhrases() function, the following query:

let
    input = 
        {
        "blue is my favourite colour", 
        "what time it is please?", 
        "twinkle, twinkle little star, how I wonder what you are"
        },
    result = TextAnalytics.KeyPhrases(input)
in
    result

Returns this table:

image

And for the TextAnalytics.Sentiment() function, the following query:

 let
     input = 
        {
        "this is great", 
        "this is terrible", 
        "this is so-so"
        },
     result = TextAnalytics.Sentiment(input)
in
    result

Returns this table:

image

Because the first parameter of each of these functions is a list, it’s super-easy to pass in columns of data from existing tables. For example, here’s the output of a query that gets the last ten comments from the comments RSS feed of this blog:

image

If this query is called Comments, the following single line of code is all that’s needed to call the TextAnalytics.Sentiment() function for the Comment Text column on this table:

TextAnalytics.Sentiment(Comments[Comment Text])

image

You can download a .pbix file containing several examples of how to call these functions, including all the examples above and many more, here.

I hope you enjoy using these functions, and if you have any questions, find any bugs or want to make suggestions for how they can be improved please let me know via the Issues page on GitHub. Finally, this is my first time using GitHub and if I’ve done something really dumb while publishing the code please let me know what I need to do to fix it!

20 thoughts on “Power BI Custom Data Connector For Language Detection, Key Phrase Extraction And Sentiment Analysis

  1. Hi Chris, This is really exciting. I’ve played with the Cognitive Services and specifically the sentiment analysis in Power BI quite a lot, but I was never able to make it work when published to the Power BI Service – partly because of the terrible OData.Feed documentation. I’m new to custom connectors, but will this work in the Service, or is it limited to use by the Power BI Desktop at the moment…?

  2. Hello, Chris,
    I download this project from Git, add to .mez file and put it in My Documents/Power Bi Desktop/Custom Visuals and i can’t find it it list of Connectors in Power Bi Desktop.
    What i do wrong?
    Thank you

    • Probably nothing: this connector doesn’t show up as a data source in Get Data because it can’t act as a data source on its own, you have to pass data from somewhere else to it. That was a conscious decision on my part – I don’t know if it’s the right one! Do the sample queries work?

  3. @Chris – The Pbix files does not have the functions. Could you kindly include it in the Pbix file
    Write now I get an error that says
    “The import TextAnalytics.Sentiment matches no exports. Did you miss a module reference?”

  4. @Chris.
    I must be missing out on a crucial step – Because I still get the same message.
    I did the following
    1. Clicked on the TextAnalytics.pq file then clicked on Raw
    2. Copied the Code from the browser and pasted into a notepad and saved the file as TextAnalytiics.pq on the desktop
    3. I then zipped the file
    4. changed the extension of the zip archive to .mez
    5. Copied the .mez file to
    C:\Users\\Documents\Microsoft Power BI Desktop\Custom Connectors
    6. Prior to all the above I have ensured that Custom connectors is checked in Power BI Desktop
    7. I opened you sample pbix file
    8. Tried running one of the exmaples – got the below error

    “Expression.Error: The name ‘TextAnalytics.Sentiment’ wasn’t recognized. Make sure it’s spelled correctly.”

    Note : I was not prompted to enter the key when I tried running the query.
    I was connected

      • OK, I have an idea about what you’re trying to do here, but you’re making it more complicated than it needs to be. If you have Visual Studio and the Power Query SDK installed you just need to open the solution, build the project, copy the .mez file it produces over to the directory Power BI uses and you should be good to go. I’ve just realised that there are clearer instructions here in the “Quickstart” section: https://github.com/Microsoft/DataConnectors

  5. @Chris – Got it to work finally via Visual Studio – Thanks.
    Request you to kindly upload the .mez file on the Gtihub. We then don’t need VS to be installed

  6. Chris, do you have an idea what the optional ‘numberOfLanguagesToDetect’ parameter is for?

    “Set to 1 by default. Irrespective of the value, the language with the highest score is returned.”

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s