Query and Transform JSON data with JQ: real world examples

2020-01-03

JQ is a useful tool to have in your engineering toolbox, but in my experience only a small number of engineers know about it or use it.

The goal of this article is to increase JQ awareness and get at least a few more people to start using it.


What is JQ?

JQ is a command-line tool that lets you query and transform JSON data.

Many software engineers use regular expressions, SQL, GraphQL, or XPath as a declarative way to get data into the shape they need.

I like to think that JQ is yet another of these tools and needs to be in your engineering tool belt.


Real-life use cases for JQ

1. Pretty-print JSON data

Do you often google "JSON pretty print" and paste your JSON into some random website, or open your IDE (like VSCode) just to format JSON?

By default JQ pretty-prints its output, so applying the identity filter (.) to any JSON input returns the same data, pretty-printed.

For example:

echo '{"firstName":"John"}' | jq .
{
  "firstName": "John"
}

I even created a custom bash command on my Mac:

pbpaste | jq . | pbcopy

which takes the JSON from the clipboard, pretty-prints it, and puts the result back on the clipboard.

(I also created my own JSON beautifier web page that I know doesn't send data to the server https://www.devtoolsdaily.com/json_formatter/)


2. Share JSON data highlighting important parts

For example, say you want to share a list of repos from the octokit organization:

curl https://api.github.com/orgs/octokit/repos

The JSON output is large (more than 3000 lines). Imagine you want to share only the repo names, plus the command you used to get them, so the people you share with can do it themselves next time.

Example: repo names with more than 40 stargazers

curl https://api.github.com/orgs/octokit/repos | jq '.[] | select(.stargazers_count>40) | .full_name'
"octokit/octokit.rb"
"octokit/rest.js"
"octokit/octokit.net"
"octokit/octokit.objc"
"octokit/go-octokit"
"octokit/octokit.graphql.net"
"octokit/fixtures"
"octokit/webhooks.js"
"octokit/graphql-schema"
"octokit/routes"
"octokit/app.js"
"octokit/request.js"
"octokit/graphql.js"
"octokit/auth.js"
"octokit/auth-app.js"

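If the filter syntax is new to you, it may help to read it as three piped steps: .[] iterates over the array, select(...) keeps the items that match, and .full_name extracts one field. A rough Python equivalent might look like the sketch below. It runs against a small made-up sample rather than the real API response; the function name, repo names and star counts are hypothetical.

```python
import json

# Hypothetical sample shaped like the GitHub response; the values are made up.
sample_response = """
[
  {"full_name": "example-org/alpha", "stargazers_count": 120},
  {"full_name": "example-org/beta", "stargazers_count": 7},
  {"full_name": "example-org/gamma", "stargazers_count": 41}
]
"""


def popular_repo_names(repos, min_stars=40):
    """Mirror of: .[] | select(.stargazers_count > min_stars) | .full_name"""
    return [
        repo["full_name"]                         # .full_name
        for repo in repos                         # .[]
        if repo["stargazers_count"] > min_stars   # select(...)
    ]


names = popular_repo_names(json.loads(sample_response))
print(names)
```

The JQ version stays a one-liner you can paste into a chat, which is the point of this use case; the Python is only here to make the three steps explicit.
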
3. Use in combination with other command line tools

Building on the example above, you might want to format the returned list or fetch more information about each item. Combining curl, jq and xargs does the trick.

curl https://api.github.com/orgs/octokit/repos | jq '.[] | select(.stargazers_count>40) | .full_name' | xargs -n 1 echo 'repo name ='
repo name = octokit/octokit.rb
repo name = octokit/rest.js
repo name = octokit/octokit.net
repo name = octokit/octokit.objc
repo name = octokit/go-octokit
repo name = octokit/octokit.graphql.net
repo name = octokit/fixtures
repo name = octokit/webhooks.js
repo name = octokit/graphql-schema
repo name = octokit/routes
repo name = octokit/app.js
repo name = octokit/request.js
repo name = octokit/graphql.js
repo name = octokit/auth.js
repo name = octokit/auth-app.js

A more complex example would use xargs and curl to fetch more information about each repo.
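
As a sketch of that idea in Python rather than xargs and curl: the function below takes repo names like the ones printed above and looks each one up through GitHub's repos/{owner}/{repo} endpoint. The function names, the choice to report only the description field, and the injectable fetch parameter are my own assumptions for illustration.

```python
import json
import urllib.request


def fetch_json(url):
    """Download a URL and parse the body as JSON (no auth, no retries)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def describe_repos(full_names, fetch=fetch_json):
    """Return one summary line per repo name such as "octokit/rest.js"."""
    lines = []
    for full_name in full_names:
        # One extra request per repo, like running curl once per xargs item.
        repo = fetch(f"https://api.github.com/repos/{full_name}")
        lines.append(f"{full_name}: {repo.get('description')}")
    return lines
```

Calling describe_repos(["octokit/rest.js"]) makes a real network request, and unauthenticated GitHub requests are rate limited, so for tests you would pass your own fetch function instead.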


4. Use JQ in your API testing

Often we write integration tests that compare API results to expected values. In most cases we only need to compare certain fields, or exclude some.

So your code might end up polluted with field-by-field comparisons:

order_owner_expected = {"firstName": "John", "lastName": "Doe"}
orders = order_api.find_orders_by_owner(123)
assertEquals(orders[0]["owner"]["firstName"], order_owner_expected["firstName"])
assertEquals(orders[0]["owner"]["lastName"], order_owner_expected["lastName"])

Instead, you can make it more declarative:

order_owner_expected = {"firstName": "John", "lastName": "Doe"}
orders = order_api.find_orders_by_owner(123)
# apply_jq: a helper of your own that runs a JQ filter against the given data
owner_received = apply_jq(orders, ".[0].owner | {firstName, lastName}")
assertEquals(owner_received, order_owner_expected)
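
apply_jq is not something JQ or Python ships with; it is a small helper you would write yourself. One possible sketch shells out to the jq binary (assumed to be installed and on your PATH), passes the data on stdin, and returns the filter's first output. The signature, with the data as the first argument and the filter as the second, is an assumption on my part.

```python
import json
import shutil
import subprocess


def apply_jq(data, jq_filter):
    """Run a jq filter against a Python value and return its first output."""
    if shutil.which("jq") is None:
        raise RuntimeError("jq binary not found on PATH")
    result = subprocess.run(
        ["jq", "-c", jq_filter],   # -c: one compact JSON value per line
        input=json.dumps(data),
        capture_output=True,
        text=True,
        check=True,                # raise CalledProcessError on a bad filter
    )
    outputs = [json.loads(line) for line in result.stdout.splitlines() if line]
    if not outputs:
        raise ValueError(f"filter produced no output: {jq_filter}")
    return outputs[0]
```

With this helper, and orders being the list returned by the API, the call would be apply_jq(orders, ".[0].owner | {firstName, lastName}"). Extra fields on the owner are dropped by the filter, so the comparison only sees the two fields you care about.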

How to get started with JQ

  1. read the official documentation
  2. install JQ
  3. try and share your queries in our JQ playground
  4. follow the interactive tutorial to learn JQ