Tuesday, February 19, 2013

Best practice for paging in RESTful APIs (updated)?

In the RHQ REST API, we return individual objects as well as collections of objects. Some of those collections are rather small (the number of platforms), while others can grow large (the total number of resources or the number of alerts fired). For the latter it is advisable to page the results rather than return the full result set in one go. Some of the reasons are:
  • Memory consumption in server and client
  • Bandwidth needed to transfer the data
  • Latency to transfer huge amounts of data over slow networks


Inside RHQ we have the concept of a PageList<?> where an internal PageControl object defines the page size and other criteria like sorting. The PageList then only contains the objects from a certain page. I think this is a pretty common setup.
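
To illustrate the idea, here is a minimal sketch of such a paged result. The class SimplePage and its fields are hypothetical names made up for this example; this is not RHQ's actual PageList or PageControl, and it assumes the full result is already sorted and held in memory.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a paged result; NOT RHQ's actual PageList class.
final class SimplePage<T> {
    final List<T> items;   // only the objects of the requested page
    final int pageNumber;  // zero-based page index
    final int pageSize;    // maximum number of items per page
    final int totalSize;   // size of the full, unpaged result

    private SimplePage(List<T> items, int pageNumber, int pageSize, int totalSize) {
        this.items = items;
        this.pageNumber = pageNumber;
        this.pageSize = pageSize;
        this.totalSize = totalSize;
    }

    // Cut one page out of an already sorted full result list.
    static <T> SimplePage<T> of(List<T> all, int pageNumber, int pageSize) {
        if (pageNumber < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageNumber must be >= 0 and pageSize > 0");
        }
        // Clamp both ends so a page past the end is simply empty.
        int from = (int) Math.min((long) pageNumber * pageSize, all.size());
        int to = (int) Math.min((long) from + pageSize, all.size());
        return new SimplePage<>(new ArrayList<>(all.subList(from, to)),
                pageNumber, pageSize, all.size());
    }

    boolean hasPrevious() {
        return pageNumber > 0;
    }

    boolean hasNext() {
        return (long) (pageNumber + 1) * pageSize < totalSize;
    }
}
```

The hasPrevious() and hasNext() checks are what a REST layer would consult when deciding which paging links to emit.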

And this is where my question comes in:

What is the "best practice" for representing such a PageList in a RESTful API? So far I have seen two major ways:
  1. Add a Link: header that contains the prev and next relations. This is what RFC 5988 suggests and what projects like AeroGear use. The advantage here is that the body still contains the "raw" data and no metadata, and for both a single object and a collection the data sits at the root of the body. Paging information is also available for HEAD requests.

    On the other hand, it may be a bit harder for some client code (JavaScript, jQuery) to access the header and make use of the paging links.
  2. Put the prev and next relations in the body of the response, next to the collection. This has the advantage that there is no need to parse the HTTP header. The disadvantage is that the real payload is now shifted "one level down" for collections.


    I sort of see the paging links as metadata and think that they should not be mixed with the payload. Now a colleague of mine said: "Isn't that a state change link for the collection like the 'rel=edit' for a single object?". This sounds odd, but can't be denied.
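
To make the two options concrete, here is a rough sketch of what a server could produce for each. The class PagingLinks, the ?page= query parameter with one-based page numbers, and the JSON field names "links" and "items" are all assumptions chosen for illustration; none of them is prescribed by RHQ, AeroGear, or RFC 5988.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: class name, URL layout and JSON field names are assumptions.
final class PagingLinks {

    // Approach 1: the value of a single Link: header; the body stays untouched.
    static String linkHeader(String baseUrl, int page, int lastPage) {
        List<String> links = new ArrayList<>();
        if (page < lastPage) {
            links.add("<" + baseUrl + "?page=" + (page + 1) + ">; rel=\"next\"");
        }
        if (page > 1) {
            links.add("<" + baseUrl + "?page=" + (page - 1) + ">; rel=\"prev\"");
        }
        return String.join(", ", links);
    }

    // Approach 2: a JSON envelope; the collection moves "one level down" into "items".
    // Assumes baseUrl contains no characters that would need JSON escaping.
    static String envelope(String itemsJson, String baseUrl, int page, int lastPage) {
        List<String> links = new ArrayList<>();
        if (page < lastPage) {
            links.add("\"next\":\"" + baseUrl + "?page=" + (page + 1) + "\"");
        }
        if (page > 1) {
            links.add("\"prev\":\"" + baseUrl + "?page=" + (page - 1) + "\"");
        }
        return "{\"links\":{" + String.join(",", links) + "},\"items\":" + itemsJson + "}";
    }
}
```

In both variants a missing relation signals the first or last page, so a client never has to construct page URLs itself.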

Actually, I have also seen the use of cookies to send the paging information mentioned, but that looks very non-transparent to me, so I am not considering it at all.

Just to be clear: I am explicitly talking about paging of collections and not about affordances of individual objects.

So are there established best practices? How do others do it?

If going for the Link: header: would people rather see multiple Link headers (see RFC 2616), one for each relation:

Link: <http://foo/?page=3>; rel="next"
Link: <http://foo/?page=1>; rel="prev"

or rather the combined way:

Link: <http://foo/?page=3>; rel="next", <http://foo/?page=1>; rel="prev"

that is listed in RFC 5988?
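
On the client side, a combined header value could be picked apart roughly as follows. This is a simplified sketch and not a complete RFC 5988 parser: the class name LinkHeaderParser is made up, and it assumes that no quoted parameter contains a '<' character and that the text "rel=" only appears as the rel parameter. It accepts the rel value double-quoted, single-quoted, or as a bare token.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Simplified sketch; not a complete RFC 5988 parser.
final class LinkHeaderParser {
    // One link-value: <target> followed by everything up to the next '<'.
    private static final Pattern LINK = Pattern.compile("<([^>]*)>([^<]*)");
    // The rel parameter, either quoted or as a bare token.
    private static final Pattern REL =
            Pattern.compile("\\brel\\s*=\\s*(?:\"([^\"]*)\"|'([^']*)'|([^\\s;,]+))");

    // Returns a map from relation type (e.g. "next") to target URL.
    static Map<String, String> parse(String headerValue) {
        Map<String, String> byRel = new LinkedHashMap<>();
        Matcher link = LINK.matcher(headerValue);
        while (link.find()) {
            Matcher rel = REL.matcher(link.group(2));
            if (!rel.find()) {
                continue; // a link without a rel parameter is ignored here
            }
            String value = rel.group(1) != null ? rel.group(1)
                    : rel.group(2) != null ? rel.group(2)
                    : rel.group(3);
            // A rel value may list several space-separated relation types.
            for (String type : value.trim().split("\\s+")) {
                if (!type.isEmpty()) {
                    byRel.putIfAbsent(type, link.group(1));
                }
            }
        }
        return byRel;
    }
}
```

With such a helper, the client looks up parse(value).get("next") and treats a missing entry as "no further page".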

[update]

I just saw that URLConnection.getHeaderField(name) does not cope with multiple Link: headers, as it returns only the last occurrence:

If called on a connection that sets the same header multiple times with possibly different values, only the last value is returned.


While there may be other ways to access all the Link: headers, this is too obvious a pitfall, and it can be avoided by not using that style.
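
One such other way is URLConnection.getHeaderFields(), which returns a map from header name to the list of all values. The sketch below collects every Link value from such a map; the class and method names are made up for this example, and the null-key check reflects how HttpURLConnection commonly exposes the status line, which is worth verifying for the implementation in use.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper; only getHeaderFields() itself is part of the JDK API.
final class AllLinkHeaders {

    // headerFields is the kind of map returned by URLConnection.getHeaderFields().
    static List<String> linkValues(Map<String, List<String>> headerFields) {
        List<String> values = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : headerFields.entrySet()) {
            String name = entry.getKey();
            // The status line is commonly stored under a null key, so guard against it.
            // Header names are case-insensitive, hence equalsIgnoreCase.
            if (name != null && name.equalsIgnoreCase("Link")) {
                values.addAll(entry.getValue());
            }
        }
        return values;
    }
}
```

A caller would pass connection.getHeaderFields() after the request has been made. Even so, a client written against getHeaderField(name) would still miss links, which is the pitfall described above.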

3 comments:

matthiaswessendorf said...

Not 100% correct :) The AeroGear clients are supporting several options for "paging through" a REST interface... See here.

This means we can support both: body, webLinking and even custom headers...

Ollie said...

I think the most important best practice is to actually expose the previous/next links to your clients so that they do not have to build them manually. A client can then also reason about things like "Am I on the first page?" by the mere presence of the links.

In terms of headers vs. the body, the line becomes more blurry. The benefit of headers is that they are standardized, so you could switch media types and clients would not have to change their page navigation behavior, as it is still header based.

In case you're going with the response body option, it's probably best to go with a hypermedia-enabled media type (e.g. Atom in the XML world, HAL when it comes to JSON) so that clients have a standardized place to find the links.

One final word on affordances: pagination is a very good example of the concept actually. Without the links present, the client would essentially have no idea about the ability to navigate between pages. By exposing the links you make this explicit.

Hendy Irawan said...

OData uses approach #2.

I personally also prefer #2.