Error Handling in REST APIs: The Shape I Settled On

RFC 7807 plus a code, requestId, errors array, and documentationUrl. The nine fields earning their keep, the status codes everyone confuses, and what changed my mind across four APIs.

error-handling

rest-api

api-design

http

backend

By @leoeriksson

January 14, 2026

Updated May 18, 2026

My first REST API returned errors as { error: "Invalid input" }. My second returned { status: "error", code: "INVALID_INPUT" }. My third tried to use the IETF Problem Details for HTTP APIs spec (RFC 7807, since obsoleted by RFC 9457) but added five custom fields without thinking. By the fourth, I had the shape I now use everywhere, which is RFC 7807 plus four deliberate extension fields. This article is that shape, the reasoning behind every field, and what changed my mind on each.

The thesis: error responses are part of your API contract, just like success responses. They deserve the same design care, the same documentation, and the same backward-compatibility guarantees. A well-designed error shape lets clients build rich UI, automate retries correctly, and debug their own integrations without paging your support team. A bad one does the opposite for years.

The shape I settled on

HTTP/1.1 422 Unprocessable Entity
Content-Type: application/problem+json

{
    "type": "https://api.example.com/errors/validation-failed",
    "title": "Validation failed",
    "status": 422,
    "detail": "The order has 2 invalid fields.",
    "instance": "/orders/abc-123/validate",
    "code": "VALIDATION_FAILED",
    "errors": [
        { "field": "shippingAddress.country", "code": "UNSUPPORTED_COUNTRY", "message": "We don't ship to MN yet." },
        { "field": "items[0].quantity", "code": "OUT_OF_STOCK", "message": "Only 3 units available." }
    ],
    "requestId": "req_2x9KqW9P",
    "documentationUrl": "https://docs.example.com/errors/validation-failed"
}

Nine fields, every one earning its keep. Let me walk through them.

type is a URI that uniquely identifies the kind of error. RFC 7807 makes this the primary identifier and encourages it to be dereferenceable to documentation. I use it as the stable machine identifier rather than code. URIs are namespaced and version-stable, and they force you to treat error types as part of your URL structure.

title is the short human-readable summary. Stable across instances. Not localized (use Accept-Language and a separate localized field if needed).

status is the HTTP status code, repeated in the body. Yes, redundant. It saves clients from threading the response status through their parsing pipeline; the body is self-describing.

detail is the specific human-readable message for this instance. Different from title because it can vary. Localizable.

instance is the URI of the specific occurrence. RFC 7807 likes this; I include it. It pairs with logging and gives you a path back to the request.

code is a short machine-readable code. Yes, this duplicates type somewhat. I include both because clients keep asking for short identifiers ("VALIDATION_FAILED") that fit well in switch statements, and type is a URL nobody wants to write into a case clause. The code is the "human-friendly machine readable", the type is the "spec-correct machine readable".

errors is the list of specific sub-errors. Validation errors are the canonical case: a single 422 response describes multiple field-level problems. The shape inside each error is { field, code, message }, sometimes with value (only when safe; do not echo back password fields).

requestId is critical and surprisingly often missing. Every response should include a request id (also in a header such as X-Request-Id; if you run distributed tracing, the W3C traceparent header carries the trace id alongside it). When a customer reports a problem, "give me your request id" is the fastest path to the relevant logs. Without it, you are searching by timestamp and IP, which is slow and unreliable.

documentationUrl is the deep link to the docs page for this error. Cheap, useful, and a sign of API maturity.
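A minimal sketch of assembling this shape in Python. The helper names build_problem and field_error and their parameters are assumptions made for illustration; only the JSON member names come from the shape above. Optional members (instance, errors, documentationUrl) are left out when unset rather than sent as null.

```python
import json


def field_error(field, code, message):
    """One entry of the errors array: { field, code, message }."""
    return {"field": field, "code": code, "message": message}


def build_problem(type_uri, title, status, detail, code, request_id,
                  instance=None, errors=None, documentation_url=None):
    """Assemble a problem document as a dict.

    Hypothetical helper for this sketch. Optional members are omitted
    when unset so clients never see null placeholders.
    """
    # RFC 7807 members first.
    body = {"type": type_uri, "title": title, "status": status, "detail": detail}
    if instance is not None:
        body["instance"] = instance
    # Extension members.
    body["code"] = code
    if errors:
        body["errors"] = errors
    body["requestId"] = request_id
    if documentation_url is not None:
        body["documentationUrl"] = documentation_url
    return body


problem = build_problem(
    type_uri="https://api.example.com/errors/validation-failed",
    title="Validation failed",
    status=422,
    detail="The order has 2 invalid fields.",
    code="VALIDATION_FAILED",
    request_id="req_2x9KqW9P",
    instance="/orders/abc-123/validate",
    errors=[
        field_error("shippingAddress.country", "UNSUPPORTED_COUNTRY",
                    "We don't ship to MN yet."),
        field_error("items[0].quantity", "OUT_OF_STOCK",
                    "Only 3 units available."),
    ],
    documentation_url="https://docs.example.com/errors/validation-failed",
)

# Serve this with Content-Type: application/problem+json.
print(json.dumps(problem, indent=4))
```

Funnelling every error through one builder like this is what keeps the shape consistent across endpoints; the specific signature matters much less than having a single place that produces it.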

Status codes, used correctly

The single most common mistake I see is misusing status codes. The cheat sheet I keep open:

Status	Meaning	Common confusion
200 OK	Success, response body returned	Sometimes used for failed operations with `error: true`, which is wrong
201 Created	Resource created, `Location` header should point to it	Often used as a generic success on POST; only for creation
204 No Content	Success, no body	Use for successful DELETE
400 Bad Request	The request itself is malformed (bad JSON, missing required fields at the protocol level)	Often confused with 422
401 Unauthorized	The request lacks valid credentials. The client is anonymous or its token is invalid/expired.	Mistakenly used for forbidden
403 Forbidden	The credentials are valid, but the user is not allowed to do this	Mistakenly used for unauthorized
404 Not Found	The resource does not exist (or you choose to lie that it doesn't)	Sometimes used for "you don't have permission", which leaks less info
409 Conflict	The request conflicts with current state (concurrent edit, duplicate key)	Underused; useful for optimistic concurrency
410 Gone	The resource was permanently removed	Useful for sunset endpoints
422 Unprocessable Entity	Valid syntax, invalid semantics (validation failed)	The right code for "your JSON parsed but the values are wrong"
429 Too Many Requests	Rate limit	Should include `Retry-After`
5xx Server Errors	Something went wrong on our side	Do NOT leak internal details

The 401-vs-403 confusion is the one I correct most often. 401 means "I don't know who you are" (no token, expired token, invalid signature). 403 means "I know who you are, but you can't do this" (valid token, insufficient permissions). The wire shape is the same, but the semantic difference matters: clients respond to 401 by triggering a re-login flow, and to 403 by showing a "you don't have access" message. Mixing them up confuses both client logic and human users.
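The 401/403 decision can be sketched as a single function. This assumes an authentication layer that has already resolved the token into a principal (or None when the token is missing, expired, or badly signed); the principal structure and scope names are hypothetical.

```python
def auth_status(principal, required_scope):
    """Return 401, 403, or None when the request may proceed.

    principal: None when no valid credentials were presented, otherwise a
    dict with a "scopes" list (an assumed structure for this sketch).
    """
    if principal is None:
        # "I don't know who you are": the client should re-authenticate.
        return 401
    if required_scope not in principal["scopes"]:
        # "I know who you are, but you can't do this": logging in again
        # will not help.
        return 403
    return None


print(auth_status(None, "orders:write"))
print(auth_status({"scopes": ["orders:read"]}, "orders:write"))
print(auth_status({"scopes": ["orders:read", "orders:write"]}, "orders:write"))
```

The order of the checks is the point: identity is settled first, and permission is only evaluated for a known caller.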

The 400-vs-422 confusion is the second most common. 400 means the request was malformed at the protocol level (broken JSON, missing required headers, invalid query string). 422 means the request was syntactically fine but semantically rejected (a number where a string was expected, an unknown country code, a price below zero). Most validation errors are 422, not 400. If your framework defaults all validation errors to 400, override it; the difference makes client-side handling cleaner.
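A rough sketch of where the 400/422 line falls in a handler, under assumed validation rules (the supported-country set and the two checked fields are invented for the example). Parsing failures return 400; anything that parsed but has bad values returns 422, with every field-level problem collected in one pass.

```python
import json

# Illustrative assumption, not a real shipping list.
SUPPORTED_COUNTRIES = {"SE", "NO", "DK"}


def check_order(raw_body):
    """Return (status, errors) for a raw JSON request body."""
    try:
        order = json.loads(raw_body)
    except json.JSONDecodeError:
        # Malformed at the protocol level: the body is not JSON at all.
        return 400, []

    if not isinstance(order, dict):
        # Parsed fine, but the value is the wrong kind: semantic, so 422.
        return 422, [{"field": "", "code": "EXPECTED_OBJECT",
                      "message": "The request body must be a JSON object."}]

    errors = []
    country = order.get("country")
    if not isinstance(country, str) or country not in SUPPORTED_COUNTRIES:
        errors.append({"field": "country", "code": "UNSUPPORTED_COUNTRY",
                       "message": f"We don't ship to {country} yet."})

    quantity = order.get("quantity")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(quantity, bool) or not isinstance(quantity, int) or quantity < 1:
        errors.append({"field": "quantity", "code": "INVALID_QUANTITY",
                       "message": "Quantity must be a positive integer."})

    # Report every problem at once instead of stopping at the first.
    return (422, errors) if errors else (200, [])


print(check_order("{not json"))
print(check_order('{"country": "MN", "quantity": 0}'))
print(check_order('{"country": "SE", "quantity": 2}'))
```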

What I do not put in error responses

Three things stay out of error bodies, every time.

Stack traces. Even in development. They leak file paths, library versions, and code structure to anyone who can hit your API. The stack belongs in your error tracker (Sentry, Honeycomb, Datadog), not the wire response.
SQL or query details. A WHERE clause in an error message is an information disclosure bug. The error tracker, again, is the right home.
Other users' data. Error messages occasionally include data from the wrong record. Audit your error paths for this; I have shipped at least one bug where a 500 included a snippet of another user's row in the message.

The general principle: an error response should help the developer fix their integration, not the attacker probe your system.

Server errors and the support gap

Server errors (5xx) are the ones where the gap between "what we tell the client" and "what we actually log" is widest. The client gets a sanitized message; the server logs the full detail.

HTTP/1.1 500 Internal Server Error
Content-Type: application/problem+json

{
    "type": "https://api.example.com/errors/internal",
    "title": "Internal server error",
    "status": 500,
    "detail": "Something went wrong on our side. We have been notified.",
    "code": "INTERNAL_ERROR",
    "requestId": "req_3y8MqW9P"
}

Internally, that same requestId is logged with:

Internal log entry
  request_id: req_3y8MqW9P
  user_id: u-9
  endpoint: POST /orders
  error: PrismaClientKnownRequestError
    code: P2002
    target: orders_idempotency_key_unique
    message: Unique constraint failed
  stack: ...
  query: INSERT INTO orders (...) VALUES (...)
  trace_id: 0af7651916cd43dd8448eb211c80319c

The customer sends "I got error req_3y8MqW9P at 14:32"; support pastes the request id into the log search and gets the full context. This is the path that scales. Trying to make every error message self-explanatory in the wire response is a losing battle, and including too much detail in production error bodies is a security smell.
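One way to sketch that split in Python, using only the standard library. The function name handle_unexpected, the logger name, and the generated request id format are assumptions for illustration; a real service would normally take the request id from middleware and ship the log record to its error tracker.

```python
import logging
import uuid

logger = logging.getLogger("api.errors")


def handle_unexpected(exc, endpoint, request_id=None):
    """Log full detail server-side; return a sanitized 500 problem body."""
    # Hypothetical id format; reuse the id assigned upstream when present.
    request_id = request_id or f"req_{uuid.uuid4().hex[:12]}"

    # Everything useful for debugging goes to the log, keyed by request id.
    # exc_info attaches the traceback to the log record, not the response.
    logger.error(
        "unhandled error request_id=%s endpoint=%s error=%s message=%s",
        request_id, endpoint, type(exc).__name__, exc,
        exc_info=exc,
    )

    # The wire response carries no exception text, stack, or query.
    body = {
        "type": "https://api.example.com/errors/internal",
        "title": "Internal server error",
        "status": 500,
        "detail": "Something went wrong on our side. We have been notified.",
        "code": "INTERNAL_ERROR",
        "requestId": request_id,
    }
    return 500, body


try:
    raise RuntimeError("Unique constraint failed: orders_idempotency_key_unique")
except RuntimeError as err:
    status, body = handle_unexpected(err, "POST /orders", "req_3y8MqW9P")

print(status, body["requestId"])
```

The only thing the two sides share is the request id, which is exactly what makes the support lookup work.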

Idempotency keys and 409 Conflict

A specific pattern worth calling out, because it interacts with error shape: idempotency keys for state-changing requests. The pattern is to require the client to send a unique Idempotency-Key header on POST requests that create resources. The server stores the response keyed by the idempotency key; if the client retries, it gets the same response back.

Two cases worth designing for:

Same key, same body: return the original response (whatever it was), idempotently.
Same key, different body: return 409 Conflict with a code: "IDEMPOTENCY_KEY_REUSE" and a clear message. This is almost always a client bug (the client retried with a fresh body but the same key), and you want them to notice immediately rather than silently get stale results.

{
    "type": "https://api.example.com/errors/idempotency-key-reuse",
    "title": "Idempotency key reuse",
    "status": 409,
    "detail": "Idempotency key 'abc-123' was previously used with a different request body.",
    "code": "IDEMPOTENCY_KEY_REUSE",
    "originalRequestId": "req_first",
    "requestId": "req_second"
}

The originalRequestId tells the client where the original request landed, which is a useful debugging hint.
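The two cases can be sketched with an in-memory store. This is an illustration under simplifying assumptions: the class name IdempotencyStore is invented, bodies are compared by a hash of their canonical JSON, and there is no expiry, persistence, or locking, all of which a production version would need.

```python
import hashlib
import json


class IdempotencyStore:
    """Hypothetical in-memory store: key -> (fingerprint, request id, response)."""

    def __init__(self):
        self._seen = {}

    @staticmethod
    def _fingerprint(body):
        # Canonical JSON so that key order does not change the hash.
        canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def execute(self, key, body, request_id, handler):
        fingerprint = self._fingerprint(body)
        if key in self._seen:
            seen_fingerprint, original_request_id, response = self._seen[key]
            if seen_fingerprint == fingerprint:
                # Same key, same body: replay the original response.
                return response
            # Same key, different body: almost always a client bug.
            return 409, {
                "type": "https://api.example.com/errors/idempotency-key-reuse",
                "title": "Idempotency key reuse",
                "status": 409,
                "detail": f"Idempotency key '{key}' was previously used "
                          "with a different request body.",
                "code": "IDEMPOTENCY_KEY_REUSE",
                "originalRequestId": original_request_id,
                "requestId": request_id,
            }
        response = handler(body)
        self._seen[key] = (fingerprint, request_id, response)
        return response


calls = []


def create_order(body):
    calls.append(body)
    return 201, {"id": f"order-{len(calls)}"}


store = IdempotencyStore()
first = store.execute("abc-123", {"sku": "A", "qty": 1}, "req_first", create_order)
replay = store.execute("abc-123", {"qty": 1, "sku": "A"}, "req_retry", create_order)
conflict = store.execute("abc-123", {"sku": "A", "qty": 2}, "req_second", create_order)
print(first, replay == first, conflict[0])
```

Note that the replayed request never reaches create_order: the handler runs once per key, which is the whole guarantee.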

Retry guidance, in the response

Some errors are retryable, some are not. Telling the client which is which inside the response saves them from guessing.

The rule of thumb I use:

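4xx errors (except 408 Request Timeout, 425 Too Early, and 429 Too Many Requests): not retryable. The client did something wrong; retrying without changes will fail the same way.
5xx errors and 408/425/429: generally retryable. The condition is usually transient (a server fault, a timeout, a rate limit that resets); retry with backoff, and for non-idempotent requests only when an idempotency key makes the retry safe.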

Some clients honor this convention automatically. Others need an explicit signal. I include a retryable boolean in the body when the convention is not enough:

{
    "type": "https://api.example.com/errors/upstream-timeout",
    "title": "Upstream service timed out",
    "status": 504,
    "detail": "The payments provider did not respond within 30 seconds.",
    "code": "UPSTREAM_TIMEOUT",
    "retryable": true,
    "retryAfter": 5,
    "requestId": "req_4z9NqW9P"
}

The retryAfter is in seconds, matching the delay-seconds form of the Retry-After header (the header can also carry an HTTP date). Including it in both is redundant but harmless; clients that read headers get the value, and clients that parse the body get it too.
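On the client side, the convention and the explicit signal might combine like this. A sketch only: the function names, the backoff base and cap, and the plain-dict headers are assumptions, and the header lookup here handles only the numeric-seconds form with an exact "Retry-After" key.

```python
RETRYABLE_4XX = {408, 425, 429}


def is_retryable(status, body=None):
    """An explicit retryable flag in the body wins; otherwise use the convention."""
    if body is not None and isinstance(body.get("retryable"), bool):
        return body["retryable"]
    if status in RETRYABLE_4XX:
        return True
    return 500 <= status <= 599


def retry_delay(attempt, headers=None, body=None, base=0.5, cap=30.0):
    """Seconds to wait before retry number `attempt` (0-based)."""
    header_value = (headers or {}).get("Retry-After")
    if header_value is not None and header_value.isdigit():
        return float(header_value)
    hint = (body or {}).get("retryAfter")
    if isinstance(hint, (int, float)) and not isinstance(hint, bool):
        return float(hint)
    # No server hint: exponential backoff, capped.
    return min(cap, base * (2 ** attempt))


timeout_body = {"status": 504, "code": "UPSTREAM_TIMEOUT",
                "retryable": True, "retryAfter": 5}
print(is_retryable(504, timeout_body), retry_delay(0, body=timeout_body))
print(is_retryable(422), is_retryable(429))
print(retry_delay(3), retry_delay(10))
```

In practice you would also add jitter to the backoff so that many clients do not retry in lockstep; it is left out here to keep the delays deterministic.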

Versioning the error shape

Just like the success shape, the error shape is a contract. Adding a new optional field is fine. Renaming or removing a field is breaking. I keep the error shape stable across API versions and treat any changes to it under the same versioning policy as the rest of the API.

The two fields I have gone back and forth on:

code vs only type. RFC 7807 defines type as its only machine-readable identifier (and even type is optional, defaulting to "about:blank"). I added code because clients kept asking for it, and I have not regretted it. The cost is that I now have to keep both in sync; they identify the same set of errors, redundantly.
errors array vs single error. I went back and forth on whether to always wrap errors in an array, even for single-error cases. I settled on: top-level fields describe the overall error; errors array is optional and used only for multi-error cases (validation, batch operations). A 429 Too Many Requests does not have an errors array.

What changed my mind across four iterations

The biggest single change between my second and fourth APIs was adding requestId. I originally thought it was a debug-mode-only thing. Then a customer said "I'm getting a 500 sometimes" and I realized I had no way to find their specific request in the logs. Adding requestId to every response, success and error, was a one-day change that has paid for itself ten times over in customer support time saved.

The second was adopting RFC 7807. Reading the spec made me realize most of what I was inventing already had a standard answer; just following the spec gave my errors instant familiarity to anyone who had worked with another well-designed API. The application/problem+json content type is a small thing, but it tells clients the shape of what they are about to parse.

The third was the field-level errors array. My early APIs returned a single error per validation failure, which forced clients to submit, fix, resubmit, fix, in a loop. Returning all validation errors at once collapses that loop: a form with five invalid fields needs one round trip to learn about all of them instead of five.

Error shape as engineering culture

A small confession to close on. A team's error shape is one of the clearest signals of how mature their API is. A startup's first API is usually { error: "..." } with whatever message the developer typed. A polished public API has a stable, documented, RFC-compliant error shape with request ids and field-level details. Reading a few error responses tells you more about the engineering culture than any blog post the company has written.

If you take one thing from this article, take this: design your error shape on day one, document it like any other interface, version it like any other contract, and fill it with the fields a client developer at 2am would actually want. RFC 7807 plus code, requestId, errors, and documentationUrl is a starting point that has held up across four APIs for me. The spec is short, the implementation is straightforward, and the customer-support time saved by getting this right is hours per month for the rest of the API's life. Few engineering investments pay back that consistently.
