The Skew Shielding Principle: Designing Safer APIs

November 15, 2024 · 2 min read

A software and devops engineer, with a passion for data and audio.

While writing an unpublished article about Cargo Cults in software engineering, I started to reflect on the rules I enforce on myself.

There is one I really see no point in breaking. Surprisingly, not many people discuss it online, but I suspect that is because it is fairly obvious to anyone with a bit of experience. It is about always having a finite size limit on queries and responses. I figured it would be worth sharing as a blog article.

The Principle

When designing APIs that transmit data over a network, the following rules should always be enforced:

Never let an API request trigger the execution of a database query that has no limit enforced.
If your response data model includes lists, always have a reasonable size limit for them.
To enforce the list size limits, either:
- truncate the list and state it explicitly in the key name.
- replace it with an id that can be used to page through the list in another endpoint.
- produce an error if your database returns the maximum number of records allowed by the query limit.

If one of these rules is broken, your API will be vulnerable to skewed datasets, or the requester will be misled.
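
The truncate and error options above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes a SQLite database with a hypothetical comments table, and the names MAX_ITEMS, TooManyRows, first_comments and comments_truncated are made up for illustration. It also fetches one row more than the limit, a small variation on the third option, so that a list holding exactly the maximum number of records is not reported as an overflow.

```python
import sqlite3

MAX_ITEMS = 100  # hypothetical cap on list sizes in a response


class TooManyRows(Exception):
    """Raised when a query overflows its limit and truncating is not acceptable."""


def fetch_comments(conn, post_id, limit=MAX_ITEMS):
    # The query is always bounded. One extra row is requested so we can
    # tell "exactly `limit` rows" apart from "more rows than we return".
    rows = conn.execute(
        "SELECT id, body FROM comments WHERE post_id = ? ORDER BY id LIMIT ?",
        (post_id, limit + 1),
    ).fetchall()
    truncated = len(rows) > limit
    return rows[:limit], truncated


def comments_response(conn, post_id, limit=MAX_ITEMS):
    # Option 1: truncate, and say so in the key names (assumed naming).
    rows, truncated = fetch_comments(conn, post_id, limit)
    return {
        "first_comments": [{"id": r[0], "body": r[1]} for r in rows],
        "comments_truncated": truncated,
    }


def comments_or_error(conn, post_id, limit=MAX_ITEMS):
    # Option 3: refuse to answer rather than return a partial list.
    rows, truncated = fetch_comments(conn, post_id, limit)
    if truncated:
        raise TooManyRows(f"post {post_id} has more than {limit} comments")
    return {"comments": [{"id": r[0], "body": r[1]} for r in rows]}
```

Either way the requester can tell a complete list from a partial one, and the database never has to return more than limit + 1 rows for a single request.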

A note on data skews

A skew can be due either to a bug or to the underlying nature of the data, for example a single key that owns far more records than any other. If an API enforces no fetch limits, the database will become slow when "chewing through a skew", and your API will waste a lot of resources formatting responses.
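
To make the skew concrete, here is a small sketch in which one customer owns 10,000 orders while another owns 3, and the endpoint pages through them with a bounded query. Everything in it is an assumption for illustration: the orders table, the PAGE_SIZE value, and the orders_page and build_skewed_db helpers are hypothetical, and keyset paging on the row id is just one way to implement the paging option.

```python
import sqlite3

PAGE_SIZE = 50  # hypothetical maximum number of orders per response


def build_skewed_db():
    # Illustrative data only: customer 1 is the skew, customer 2 is typical.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
        [(1, 9.99)] * 10_000 + [(2, 5.0)] * 3,
    )
    return conn


def orders_page(conn, customer_id, after_id=0, page_size=PAGE_SIZE):
    # Bounded query: at most page_size + 1 rows are read per request,
    # however many orders the customer actually has.
    rows = conn.execute(
        "SELECT id, total FROM orders "
        "WHERE customer_id = ? AND id > ? ORDER BY id LIMIT ?",
        (customer_id, after_id, page_size + 1),
    ).fetchall()
    page = rows[:page_size]
    # The extra row only signals that another page exists.
    next_after_id = page[-1][0] if len(rows) > page_size else None
    return {
        "orders": [{"id": r[0], "total": r[1]} for r in page],
        "next_after_id": next_after_id,
    }
```

With this shape, a request for the skewed customer costs about the same as a request for the typical one: the caller gets PAGE_SIZE orders plus a next_after_id to pass back for the following page, and a null next_after_id means the list is complete.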
