Accessing Jira from Power BI


If you use Jira to manage projects, you may want to analyse your Jira data in Power BI. Jira provides an API that allows data to be extracted from it, and Power BI can consume this data, but there are some issues (pun intended) that can cause problems.

There used to be a Power BI “content pack” for Jira that showed some nice charts and other stats about a Jira installation.

The problems with the content pack are:

  • It is no longer available on the Power BI app marketplace
  • It can’t be customised for new fields (custom fields) if you’ve added any to your Jira installation
  • If your Jira data is large, refreshing the data can take a long time and can regularly fail depending on your server and network configuration

Fortunately, the template for the content pack is available if you do a bit of searching and the version I found is available here.

I’ve used the code in this template, largely rewriting it, to simplify and improve the way it works. The resulting new template is available here.

A Power BI template is basically a report file but with no data and no stored credentials for data sources. When you open a template file in Power BI it prompts you for any required parameters and then becomes an “untitled” report that you can save.

Opening the original template prompts you to set your Jira URL (e.g. https://jira.example.com.au) and you then need to set your credentials for authentication (if necessary, and depending on your security environment). Once you do that you can refresh your Jira data, see the results and SAVE AS a report. While the original template suffers from the same data-volume-related problems as the content pack, it allows viewing and changing the M (mashup) code used to access Jira using the Jira REST API.

In theory it wouldn’t be too difficult to write the M code from scratch to access the Jira API, but seeing how it was done for the template (and hence for the content pack) is a helpful starting point.

Problems with the original template

Accessing a large Jira installation using a Power BI report based on the original template can encounter some “issues” that prevent it working. Basically, the results are too big … both “wide” and “long” … causing the refresh to be slow and even fail.

Your Jira server may have thousands of issues (making the API return data “long”), and the content pack M code extracts all fields (making the API return data “wide”). Results are returned from the Jira API as json which is somewhat verbose, so this produces a relatively large set of data which takes a long time to process, even if you don’t load all of the fields into your report.

Solution: Restricting fields returned

Fortunately, you don’t normally need ALL fields to be returned, and the Jira API allows you to specify which fields you want (or even don’t want), making the results narrower. This can have a significant effect on response time and reliability when refreshing the data. I found that for around 50,000 issues, returning all fields resulted in ~500Mb of json data that took nearly an hour to process in Power BI when accessing an in-house Jira server. Reducing the results to only eleven fields brought the data down to 75Mb and ~4 minutes to refresh.

While it’s helpful to limit the number of fields returned, it would be even more helpful (thought it seems, not possible in the Jira API) to only extract individual sub-fields. For example, when we request the Jira field “assignee” we get a json structure like this:

We normally would only want to use the Name and/or DisplayName sub-fields, but the API doesn’t allow us to specify that. Instead, we have to extract the whole field, the throw away the bits we don’t want to keep. We still get a great benefit by limited the results to include only the fields we need, but as you can see the results are still fairly verbose.

Here’s how we extract (in M) only the displayName sub-field and discard the rest using Table.ExpandRecordColumn:

#"Expanded assignee" = Table.ExpandRecordColumn(#"Expanded fields", "assignee", {"displayName"}, {"Assignee"}),

Or we could get two (or more) sub-fields like this:

#"Expanded project" = Table.ExpandRecordColumn(#"Replace null assignee", "project", {"key", "name"}, {"Project Key", "Project Name"})

Getting to know the Jira API search command

Jira issues can be extracted via the API using a browser or any API testing tool and the Jira API “search” command. You must be authenticated to Jira before using the search command, and in a browser that normally just means logging in.

By default, Jira limits the number of issues returned to 1,000. I recommend that when testing in a browser you limit results to a much smaller number via the maxResults parameter. Specifying more than 1,000 in maxResults will not override the server maximum. Note that, as described later, this limitation means Power BI must make multiple API search calls to get “chunks” of data, then combine them.

When using Chrome, the json returned will just show as text like this …

There are various Chrome extensions available to display json data in a more readable format, e.g. JSON Formatter from Callum Locke at https://chrome.google.com/webstore/detail/json-formatter/bcjindcccaagfpapjjmafapmmgkkhgoa

Using this extension in Chrome shows the returned data much more clearly:

There are many other ways to view json nicely, for example save as file something.json, open in Visual Studio Code then right-click and select “Format document”

Some sample API calls

In the browser, just enter this URL: (make sure you’re logged in to Jira first!)

For all results restricted to four output fields (again, with no filters but using maxResults=10 to limit returned data)
https://jira.example.com/rest/api/2/search?maxResults=10&fields=duedate,assignee,reporter,creator

Using Power BI to retrieve Jira issues using the Jira API

This is not intended to be a primer on any of Power BI, M, APIs or Jira. A reasonable knowledge of all of these is assumed.

Power BI can use the M function Web.Contents to call an API with URLs like those shown in the examples above and get the results, normally as json, for further processing.

As noted above, by default Jira limits the number of issues returned to 1,000. This means that Power BI needs to make multiple API search calls to get “chunks” of data (of any size you like as long as it’s not greater than 1,000) then combine them. We usually don’t know how many calls/chunks will be needed so we need some looping mechanism, which is not as natural in M as is might be in other languages. However, there are ways of doing this, for example using recursion, or as in the template, using the M function List.Generate which repeatedly calls a specified function and returns the cumulative results as a list.

So, List.Generate acts like a loop by specifying three functions: an initialiser, a continuation test, and a next step. In our case the initialiser and next step function can be the same function – a function that calls the Jira API to get a chunk of results starting at 0 for the initialiser and higher values for subsequent chunks in the next steps. We need to communicate between these function calls the place we’re up to in the results, and to do that we assign a metadata field to each returned table, storing the maxim row number of the results in that chunk. This value is used as the startAt value for the next chunk. The result of List.Generate is a list (of tables, since that’s what our function returns) that can be further mashed using M to get a single table of issues for analysis in the Power BI report.

I found the M code from the original template to be somewhat over-complicated and I’ve simplified it as shown below. The first two queries are just string constants in M that don’t require a “let” statement. Either or both of these could be turned into parameters if you wanted to turn the report into a template.

The four queries are:

  • FieldsToGet – this is a simple comma-delimited list of fields we want to get for each Jira issue. If you need an new field later, just add it to this list and change the Issues query to extract/transform the field as needed.
  • JiraURL – the URL for the Jira server.
  • JiraData – this query defines a function getChunk, then uses List.Generate to repeatedly call that function to get all of the results. It then joins the resulting list of tables into a single table. At this stage the fields requested from Jira are still in a single column called “fields” that will be expanded next. This query does NOT need to be loaded.
  • Issues – this query takes the output from JiraData and expands the fields required, sets types and does further mashups to get the final result table.

FieldsToGet

"assignee,created,updated,resolutiondate,project,priority,summary,resolution,issuetype,status,reporter"

JiraURL (note – in the template this has been converted to a parameter)

"https://jira.example.com"

JiraData

let
    Fields = FieldsToGet,
    jql = "",
    //jql = "?jql=assignee+%3D+%22RBurke+%22", // test ... restrict to one assignee
    //jql = "?jql=project+%3D+%22Project+Name%22", // test ... restrict to one project
    pageSize = 500,
    maxRows = 0, // 0 = all

    getChunk = (optional previousTable) => 
    let
        fPageSize = () => pageSize,
        fSkipRows = () => if previousTable = null then 0 else Value.Metadata(previousTable)[skipRows],
        contents = Web.Contents(JiraURL&"/rest/api/2/search"&jql,[Query = [fields=Fields, maxResults= Text.From(fPageSize()), startAt = Text.From(fSkipRows())]]),
        json = Json.Document(contents),
        Value = json[issues],
        tableX = Table.FromList(Value, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
        tableY = Table.AddColumn(tableX, "Timestamp", (_) => Time.ToText(Time.From(DateTime.LocalNow()),"hh:mm:ss fff")),
        table = Table.AddColumn(tableY, "Meta", (_) =>  fSkipRows() + fPageSize()) meta [skipRows = fSkipRows() + fPageSize()]
    in
       table,

    listOfPages = List.Generate(
            // how to get first chunk
            () => getChunk(null),
            // condition for returning chunk and continuing
            (previousChunk) => previousChunk <> null // we have data
                                and (maxRows = 0  // no row limit
                                        or Value.Metadata(previousChunk)[skipRows] <= maxRows) // row limit not reached
                                and Table.RowCount(previousChunk) > 0, // not at end of results
            // how to get the next chunk
            (previousChunk) => getChunk(previousChunk)
        ),
    tableOfPages1 = Table.FromList(listOfPages, Splitter.SplitByNothing(), {"Column1"}),
    tableOfPages2 = Table.ExpandTableColumn(tableOfPages1, "Column1", {"Column1", "Meta", "Timestamp"}),
    tableOfPages3 = Table.ExpandRecordColumn(tableOfPages2, "Column1", {"expand", "id", "self", "key", "fields"}),
    #"Changed Type" = Table.TransformColumnTypes(tableOfPages3,{{"Meta", Int64.Type}})
    // further expansion of the require fields is needed here .. see original template
in
    #"Changed Type"

Issues

let
    Source = JiraData,
    #"Expanded fields" = Table.ExpandRecordColumn(Source, "fields", Text.Split(FieldsToGet,",")),
    #"Expanded assignee" = Table.ExpandRecordColumn(#"Expanded fields", "assignee", {"displayName"}, {"Assignee"}),
    #"Expanded reporter" = Table.ExpandRecordColumn(#"Expanded assignee", "reporter", {"displayName"}, {"Reporter"}),
    #"Replace null assignee" = Table.TransformColumns(#"Expanded reporter", {"Assignee", each if _ is null then "Unassigned" else _}),
    #"Expanded project" = Table.ExpandRecordColumn(#"Replace null assignee", "project", {"key", "name"}, {"Project", "Project Name"}),
    #"Transform project name" = Table.ReplaceValue(#"Expanded project", each [Project Name], each [Project Name] & " (" & [Project] & ")", Replacer.ReplaceText, {"Project Name"}),
    #"Expanded priority" = Table.ExpandRecordColumn(#"Transform project name", "priority", {"name", "id", "iconUrl"}, {"Priority", "Priority Id", "Priority Icon URL"}),
    #"Expanded issuetype" = Table.ExpandRecordColumn(#"Expanded priority", "issuetype", {"name", "subtask"}, {"Issue Type", "Is Subtask"}),
    #"Expanded status" = Table.ExpandRecordColumn(#"Expanded issuetype", "status", {"name", "id", "iconUrl"}, {"Status", "Status Id", "Status Icon URL"}),
    #"Expanded resolution" = Table.ExpandRecordColumn(#"Expanded status", "resolution", {"name"}, {"Resolution"}),
    #"Replace null resolution" = Table.TransformColumns(#"Expanded resolution", {"Resolution", each if _ is null then "Unresolved" else _}),
    #"Changed Type" = Table.TransformColumnTypes(#"Replace null resolution",{{"created", type datetimezone}, {"updated", type datetimezone}, {"resolutiondate", type datetimezone}, {"id", Int64.Type}, {"Priority Id", Int64.Type}}),
    #"Add date created" = Table.AddColumn(#"Changed Type", "Date Created", each Date.From([created]), type date),
    #"Add week created" = Table.AddColumn(#"Add date created", "Week Created", each Date.EndOfWeek([Date Created]), type date),
    #"Add date updated" = Table.AddColumn(#"Add week created", "Date Updated", each Date.From([updated]), type date),
    #"Add week updated" = Table.AddColumn(#"Add date updated", "Week Updated", each Date.EndOfWeek([Date Updated]), type date),
    #"Add date resolved" = Table.AddColumn(#"Add week updated", "Date Resolved", each Date.From([resolutiondate]), type date),
    #"Add week resolution" = Table.AddColumn(#"Add date resolved", "Week Resolved", each Date.EndOfWeek([Date Resolved]), type date),
    #"Add Is Resolved" = Table.AddColumn(#"Add week resolution", "Is Resolved", each if [Date Resolved] <> null or List.Contains({"Cancelled","Closed","Done"}, [Status]) then 1 else 0, Int32.Type),
    #"Add Resolution Status" = Table.AddColumn(#"Add Is Resolved", "Resolution Status", each if [Is Resolved] = 1 then "Resolved" else "Unresolved", type text),
    #"Removed columns" = Table.RemoveColumns(#"Add Resolution Status", {"expand","Meta","Timestamp","created","updated","resolutiondate"}),
    #"Renamed Columns" = Table.RenameColumns(#"Removed columns",{{"id", "Issue Id"}, {"key", "Issue Key"}, {"self", "Issue URL"}, {"summary", "Summary"}}),
    #"Transform issue URL" = Table.ReplaceValue(#"Renamed Columns", each [Issue URL], each JiraURL & "/browse/" & [Issue Key], Replacer.ReplaceText, {"Issue URL"})
in
    #"Transform issue URL"