Quickly Get JSON Values Via Jayway JsonPath

I use a Java library called Jayway JsonPath to quickly get values from JSON objects, and it is fairly easy to use. More information is available on its GitHub page. It is still actively maintained, has had a number of contributors over time, and has gone through several releases over the years.

In my experience, it is most useful when I am faced with large JSON data, especially data with a deeply nested structure and lots of lists/collections/arrays, where I only need to extract specific parts, and/or where repeated patterns call for some basic filters/conditions to get the desired output.

For one, I don’t need to create POJOs that mirror the JSON schema and create loops to reach the deepest levels in the JSON. Using JsonPath, I can define a path to that attribute directly then grab its value straight away.

Say, for example, I have 126KB of JSON data, or something with 3000+ lines when pretty-printed (not the largest; I’ve seen many with several thousand lines more). It contains multiple large objects, and within them are arrays nested inside arrays inside arrays. This can get nasty quickly.

As an example, I want to get all the titles of chapters that are buried in 3 levels of arrays. The JSON path expression can be written like this:

$.store.report.financial.sections[0:].subjects[0:].chapters[0:].title

The output is a list of these titles, which I can then save to the database.
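To make that concrete, here is a minimal sketch of reading the titles in Java. It assumes the json-path dependency is on the classpath; the class name and the tiny SAMPLE document are made up for illustration and only mimic the shape of the path above.

```java
import com.jayway.jsonpath.JsonPath;
import java.util.List;

class ChapterTitles {

    // Hypothetical, heavily trimmed-down stand-in for the large document.
    static final String SAMPLE =
        "{\"store\":{\"report\":{\"financial\":{\"sections\":[{\"subjects\":[{\"chapters\":["
      + "{\"title\":\"The Quick Brown Fox\",\"word_count\":579},"
      + "{\"title\":\"Jumps Over the Lazy Dog\",\"word_count\":320}"
      + "]}]}]}}}}";

    // Collects every chapter title across all sections and subjects.
    static List<String> titles(String json) {
        return JsonPath.read(json,
            "$.store.report.financial.sections[0:].subjects[0:].chapters[0:].title");
    }

    public static void main(String[] args) {
        System.out.println(titles(SAMPLE));
    }
}
```

No POJOs or nested loops are involved; the path expression does the traversal.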

For quick context, the $ is the root element that starts all path expressions. The “.” in between is the separator between path segments. It is kinda like the “/” when I am going through nested directories in a Linux file system. Lastly, the [0:] in the example above means I am dealing with an array, and I want everything from index 0 up to the last element.

JsonPath supports both dot-notation and bracket-notation for expressing the JSON path. These are all discussed on their project page.
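As a quick illustration of the two notations, the sketch below reads the same value both ways. The class name and the one-book SAMPLE document are hypothetical and not from my actual data.

```java
import com.jayway.jsonpath.JsonPath;

class PathNotations {

    // Hypothetical one-book document, only for illustration.
    static final String SAMPLE = "{\"store\":{\"book\":[{\"title\":\"Moby Dick\"}]}}";

    // Dot-notation: segments separated by "."
    static String dotNotation(String json) {
        return JsonPath.read(json, "$.store.book[0].title");
    }

    // Bracket-notation: each property name quoted inside [ ]
    static String bracketNotation(String json) {
        return JsonPath.read(json, "$['store']['book'][0]['title']");
    }

    public static void main(String[] args) {
        System.out.println(dotNotation(SAMPLE));
        System.out.println(bracketNotation(SAMPLE));
    }
}
```

Both methods point at the same attribute, so they return the same title.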

To take it further, say I want to get the titles of chapters that have more than 500 words in them. Let’s assume each entry in chapters is defined with other attributes like the following:

{
	"chapters": [{
			"pages": 5,
			"title": "The Quick Brown Fox",
			"topic": "food-expenses",
			"word_count": 579
		},
		{
			"pages": 3,
			"title": "Jumps Over the Lazy Dog",
			"topic": "food-expenses",
			"word_count": 320
		}
	]
}

To get only the titles of chapters with over 500 words, I can add a filter to the path:

$.store.report.financial.sections[0:].subjects[0:].chapters[?(@.word_count > 500)].title

The above is an inline predicate, meaning the predicate is defined in the path itself. There are other ways to filter through the Filter API that also comes with the JsonPath library, or you can write a custom predicate. Again, all of these are detailed in their GitHub repository.
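Here is a rough sketch comparing the inline predicate with the Filter API. To keep it short, it assumes the chapters array sits at the root of the document (as in the snippet above) rather than under the full store path; the class name is made up.

```java
import static com.jayway.jsonpath.Criteria.where;
import static com.jayway.jsonpath.Filter.filter;

import com.jayway.jsonpath.Filter;
import com.jayway.jsonpath.JsonPath;
import java.util.List;

class LongChapters {

    // The two chapters from the snippet above, with "chapters" at the root.
    static final String SAMPLE =
        "{\"chapters\":["
      + "{\"pages\":5,\"title\":\"The Quick Brown Fox\",\"topic\":\"food-expenses\",\"word_count\":579},"
      + "{\"pages\":3,\"title\":\"Jumps Over the Lazy Dog\",\"topic\":\"food-expenses\",\"word_count\":320}"
      + "]}";

    // Inline predicate: the condition lives inside the path string.
    static List<String> inline(String json) {
        return JsonPath.read(json, "$.chapters[?(@.word_count > 500)].title");
    }

    // Filter API: the bare [?] placeholder is filled by the filter argument.
    static List<String> withFilterApi(String json) {
        Filter longChapter = filter(where("word_count").gt(500));
        return JsonPath.read(json, "$.chapters[?].title", longChapter);
    }

    public static void main(String[] args) {
        System.out.println(inline(SAMPLE));
        System.out.println(withFilterApi(SAMPLE));
    }
}
```

The Filter API version is handy when the condition is built at runtime instead of being hard-coded in the path.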

Multiple predicates are supported and can be combined. Inline predicates can use && and ||, and the Filter API offers the equivalent and() and or() methods.
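A small sketch of combining conditions, again assuming the chapters array is at the root of the document. The thresholds and the class name are arbitrary choices for illustration.

```java
import static com.jayway.jsonpath.Criteria.where;
import static com.jayway.jsonpath.Filter.filter;

import com.jayway.jsonpath.Filter;
import com.jayway.jsonpath.JsonPath;
import java.util.List;

class CombinedPredicates {

    // Same two sample chapters as before, with "chapters" at the root.
    static final String SAMPLE =
        "{\"chapters\":["
      + "{\"pages\":5,\"title\":\"The Quick Brown Fox\",\"word_count\":579},"
      + "{\"pages\":3,\"title\":\"Jumps Over the Lazy Dog\",\"word_count\":320}"
      + "]}";

    // Both conditions must hold (inline &&).
    static List<String> bothInline(String json) {
        return JsonPath.read(json,
            "$.chapters[?(@.word_count > 300 && @.pages < 5)].title");
    }

    // Either condition may hold (inline ||).
    static List<String> eitherInline(String json) {
        return JsonPath.read(json,
            "$.chapters[?(@.word_count > 500 || @.pages < 5)].title");
    }

    // Same "both" logic expressed with the Filter API.
    static List<String> bothWithFilterApi(String json) {
        Filter f = filter(where("word_count").gt(300)).and(where("pages").lt(5));
        return JsonPath.read(json, "$.chapters[?].title", f);
    }

    public static void main(String[] args) {
        System.out.println(bothInline(SAMPLE));
        System.out.println(eitherInline(SAMPLE));
        System.out.println(bothWithFilterApi(SAMPLE));
    }
}
```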

There is built-in support for a good number of functions and operators. JsonPath output can even be mapped directly to a POJO. Although when it comes to that last bit, I would rather use the well-known Jackson library to fully handle a JSON object with more complex business logic. But for simple or straightforward tasks, I can do away with a lot of boilerplate Java code and get on with other work. A simple task does not have to mean a small project; it can be one part of a bigger one.

So far, I have used it in a number of ways:

  • Prototyping of software applications
  • Hackathons/hackfests/codefests
  • Simple field mapping from a JSON datasource to a destination (downstream can be anything)
  • Learning/educational purposes

The 3rd point is where this becomes especially useful, in my humble opinion: when the schema of the source is not yet entirely known (because it comes from a 3rd party) and/or the mapping of source attributes to destination has not been fully fleshed out by the business side. This happens commonly, for example, in systems integrations. It means I can assign a placeholder path, even one that does not exist yet, and change it very easily later when the information is available. The paths can be defined together in a custom YAML configuration file, such as the one I’ve described in Read YAML In Spring Boot.
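One possible way to sketch that field mapping is below. The Map stands in for whatever the YAML file would supply, and the class name, field names, and paths are all hypothetical. It relies on the library's Option.SUPPRESS_EXCEPTIONS setting so that a path which does not exist yet yields null instead of throwing.

```java
import com.jayway.jsonpath.Configuration;
import com.jayway.jsonpath.DocumentContext;
import com.jayway.jsonpath.JsonPath;
import com.jayway.jsonpath.Option;
import java.util.LinkedHashMap;
import java.util.Map;

class FieldMapper {

    // pathsByField: destination field name -> JSON path (e.g. loaded from YAML).
    static Map<String, Object> map(String json, Map<String, String> pathsByField) {
        // Lenient configuration: missing paths are tolerated, not fatal.
        Configuration lenient = Configuration.defaultConfiguration()
                .addOptions(Option.SUPPRESS_EXCEPTIONS);

        // Parse once, then evaluate every configured path against the same document.
        DocumentContext doc = JsonPath.using(lenient).parse(json);

        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : pathsByField.entrySet()) {
            Object value = doc.read(e.getValue());
            out.put(e.getKey(), value);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> paths = new LinkedHashMap<>();
        paths.put("customerName", "$.customer.name");
        // Placeholder path for an attribute the source does not provide yet.
        paths.put("region", "$.customer.address.region");

        System.out.println(map("{\"customer\":{\"name\":\"Ada\"}}", paths));
    }
}
```

When the real path becomes known, only the configuration entry changes; the Java code stays the same.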
