Friday, July 26, 2013

Custom Deserializer in Jackson and validation

tl;dr: It is important to add input validation to custom JSON deserializers in Jackson.

In RHQ we parse JSON in a few places: directly in the as7/WildFly plugin, and indirectly in the REST API via RESTEasy 2.3.5, which already does the heavy lifting.

Now we have a bean Link that looks like this:

public class Link {
String rel;
String href;
}

The standard serialization of this bean is:

{ "rel":"edit", "href":"http://acme.org" }

As we need a different format, I have written a custom serializer and attached it to the class.

@JsonSerialize(using = LinkSerializer.class)
@JsonDeserialize(using = LinkDeserializer.class)
@Produces({"application/json","application/xml"})
public class Link {

private String rel;
private String href;

This custom format looks like:

{
"edit": {
"href": "http://acme.org"
}
}

As a client can also send links, some custom deserialization needs to happen. A first cut of the deserializer looked like this and worked well:

public class LinkDeserializer extends JsonDeserializer<Link> {

    @Override
    public Link deserialize(JsonParser jp,
            DeserializationContext ctxt) throws IOException
    {
        String tmp = jp.getText(); // {
        jp.nextToken(); // skip over { to the rel
        String rel = jp.getText();
        jp.nextToken(); // move on to the inner {
        […]

        Link link = new Link(rel, href);

        return link;
    }
}

The other day, I was sending data in some tests and our server blew up horribly. Memory usage grew, the garbage collector took huge amounts of CPU time, and the call eventually terminated with an OutOfMemoryError.

After some investigation I found that the client had not sent the Link object in our special format, but in the standard format that I showed first. Further investigation showed that the LinkDeserializer consumed the tokens from the stream as seen above and then also swallowed subsequent tokens from the input. So when it returned, the parser was in bad shape and kept copying large arrays around until we saw the OOME.

Once I understood this, I changed the implementation to validate the input and bail out early, so that invalid input can no longer leave the parser in a bad state:

public Link deserialize(JsonParser jp,
        DeserializationContext ctxt) throws IOException
{
    String tmp = jp.getText(); // {
    validate(jp, tmp, "{");
    jp.nextToken(); // skip over { to the rel
    String rel = jp.getText();
    validateText(jp, rel);
    jp.nextToken(); // move on to the inner {
    tmp = jp.getText();
    validate(jp, tmp, "{");
    […]

The validate*() methods simply compare the token text with the expected value that is passed in and throw an exception on unexpected input:

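private void validate(JsonParser jsonParser, String input,
        String expected) throws JsonProcessingException {
    // getText() returns null when there is no current token,
    // so compare from the expected side to avoid a NullPointerException
    if (!expected.equals(input)) {
        throw new JsonParseException("Unexpected token: " + input,
                jsonParser.getTokenLocation());
    }
}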

The validation could probably be improved further, but you get the idea.
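One possible improvement is to compare token kinds instead of token text, and to check the closing braces as well, so that the deserializer either consumes exactly one object or fails. The sketch below illustrates the idea without a Jackson dependency. Everything in it (Kind, Token, expect, readLink and the two token lists) is a hypothetical stand-in made up for this illustration and not Jackson API; the inner field name href is taken from the custom format shown earlier.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class LinkReaderSketch {

    // Hypothetical token kinds, loosely modeled on a streaming JSON parser.
    enum Kind { START_OBJECT, END_OBJECT, FIELD_NAME, STRING }

    // A token is a kind plus its text (null for braces).
    static final class Token {
        final Kind kind;
        final String text;

        Token(Kind kind, String text) {
            this.kind = kind;
            this.text = text;
        }
    }

    static final class Link {
        final String rel;
        final String href;

        Link(String rel, String href) {
            this.rel = rel;
            this.href = href;
        }
    }

    // Takes the next token; fails fast if input ended or the kind is wrong.
    static Token expect(Iterator<Token> tokens, Kind expected) {
        if (!tokens.hasNext()) {
            throw new IllegalArgumentException("Expected " + expected + " but input ended");
        }
        Token token = tokens.next();
        if (token.kind != expected) {
            throw new IllegalArgumentException(
                    "Expected " + expected + " but got " + token.kind);
        }
        return token;
    }

    // Reads exactly one link in the custom format { rel: { "href": url } }.
    static Link readLink(Iterator<Token> tokens) {
        expect(tokens, Kind.START_OBJECT);
        String rel = expect(tokens, Kind.FIELD_NAME).text;
        expect(tokens, Kind.START_OBJECT);
        Token name = expect(tokens, Kind.FIELD_NAME);
        if (!"href".equals(name.text)) {
            throw new IllegalArgumentException("Expected field href but got " + name.text);
        }
        String href = expect(tokens, Kind.STRING).text;
        expect(tokens, Kind.END_OBJECT); // closes the inner object
        expect(tokens, Kind.END_OBJECT); // closes the outer object
        return new Link(rel, href);
    }

    // Tokens for { "edit": { "href": "http://acme.org" } }
    static List<Token> customFormat() {
        return Arrays.asList(
                new Token(Kind.START_OBJECT, null),
                new Token(Kind.FIELD_NAME, "edit"),
                new Token(Kind.START_OBJECT, null),
                new Token(Kind.FIELD_NAME, "href"),
                new Token(Kind.STRING, "http://acme.org"),
                new Token(Kind.END_OBJECT, null),
                new Token(Kind.END_OBJECT, null));
    }

    // Tokens for { "rel": "edit", "href": "http://acme.org" }
    static List<Token> standardFormat() {
        return Arrays.asList(
                new Token(Kind.START_OBJECT, null),
                new Token(Kind.FIELD_NAME, "rel"),
                new Token(Kind.STRING, "edit"),
                new Token(Kind.FIELD_NAME, "href"),
                new Token(Kind.STRING, "http://acme.org"),
                new Token(Kind.END_OBJECT, null));
    }

    public static void main(String[] args) {
        Link link = readLink(customFormat().iterator());
        System.out.println(link.rel + " -> " + link.href);
        try {
            readLink(standardFormat().iterator());
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With the standard format as input, the sketch stops at the third token, because it finds a string where the inner object should start, and it never reads past that point. In a real deserializer the same checks would presumably be made against the parser's current token and reported with a parse exception that carries the token location, as validate() does above.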



