A team I joined last year was rebuilding their public API and the first design doc opened with the line "we are picking GraphQL because REST is legacy". Three months later they were back on REST for everything that wasn't a deeply nested admin dashboard. Nothing was wrong with GraphQL. Nothing was wrong with REST. The doc had simply skipped the question that decides this every time: what shape do my real callers need?
I have shipped REST APIs, gRPC services, and GraphQL endpoints in production at three different companies. None of them is a default. Each fits a particular call shape, a particular caller, and a particular operations posture. My stance in this article is simple: pick the protocol that matches the shape of the calls you actually make, not the one with the loudest blog presence this quarter.
Same endpoint, three protocols
Here is the same fictional endpoint, "fetch the order with its line items and the customer's loyalty tier", written three ways. The differences tell you most of what you need to know.
REST, two requests typically:
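A minimal sketch of that pattern in Python. The routes `/orders/{id}` and `/customers/{id}`, the field names, and the in-memory `FAKE_SERVER` are assumptions made for illustration; a real client would issue the two GETs through an HTTP library.

```python
from typing import Any

# Hypothetical in-memory stand-in for an HTTP server: route -> JSON body.
FAKE_SERVER: dict[str, dict[str, Any]] = {
    "/orders/1001": {
        "id": 1001,
        "customer_id": 7,
        "line_items": [{"sku": "A-1", "qty": 2}, {"sku": "B-9", "qty": 1}],
    },
    "/customers/7": {"id": 7, "name": "Ada", "loyalty_tier": "gold"},
}


def http_get(path: str) -> dict[str, Any]:
    """Pretend GET; a real client would call an HTTP library here."""
    return FAKE_SERVER[path]


def fetch_order_view(order_id: int) -> dict[str, Any]:
    # Request 1: the order resource, including its line items.
    order = http_get(f"/orders/{order_id}")
    # Request 2: the customer resource, located via the order's customer_id.
    customer = http_get(f"/customers/{order['customer_id']}")
    # The client stitches the two resources together.
    return {
        "order_id": order["id"],
        "line_items": order["line_items"],
        "loyalty_tier": customer["loyalty_tier"],
    }


view = fetch_order_view(1001)
```

The second request depends on the first, so the two round trips cannot run in parallel unless the client already knows the customer ID.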
GraphQL, one request:
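A sketch of the single request, built in Python without sending it. The schema behind it (an `order` field with `lineItems` and `customer.loyaltyTier`) is hypothetical; the `query` plus `variables` JSON body and the POST to `/graphql` follow the common convention.

```python
import json

# Hypothetical schema: order(id) exposes lineItems and customer.loyaltyTier.
ORDER_QUERY = """
query OrderView($id: ID!) {
  order(id: $id) {
    id
    lineItems { sku quantity }
    customer { loyaltyTier }
  }
}
"""


def build_graphql_request(order_id: str) -> dict:
    """Return the pieces of the single POST a client would send."""
    return {
        "method": "POST",
        "path": "/graphql",
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"query": ORDER_QUERY, "variables": {"id": order_id}}),
    }


request = build_graphql_request("1001")
```

The caller names exactly the fields the screen needs; the server decides how many backends it has to touch to resolve them.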
gRPC, one typed call:
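Real gRPC code is generated from a `.proto` file, so the following is only a plain-Python stand-in for the shape of the call. The service name `OrderService`, the method `GetOrderView`, and the message types are all hypothetical, and no network traffic is involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GetOrderViewRequest:
    order_id: int


@dataclass(frozen=True)
class LineItem:
    sku: str
    quantity: int


@dataclass(frozen=True)
class OrderView:
    order_id: int
    loyalty_tier: str
    line_items: tuple[LineItem, ...] = ()


class OrderServiceStub:
    """Stand-in for a generated client stub; nothing is sent over the wire."""

    def GetOrderView(self, request: GetOrderViewRequest) -> OrderView:
        # A generated stub would serialize the request to protobuf and send
        # it over HTTP/2; here the reply is hard-coded for illustration.
        return OrderView(
            order_id=request.order_id,
            loyalty_tier="gold",
            line_items=(LineItem("A-1", 2), LineItem("B-9", 1)),
        )


stub = OrderServiceStub()
reply = stub.GetOrderView(GetOrderViewRequest(order_id=1001))
```

The point is the call site: one method, one typed request, one typed reply, with no URL or verb in sight.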
Three different conversations. REST treats the data as resources you fetch by URL and combine on the client. GraphQL treats the data as one graph and lets the caller pick the fields. gRPC treats the call as a typed function on a typed service.
What each protocol actually optimizes
REST is a discipline more than a protocol. The honest version is: HTTP verbs map to operations, URLs identify resources, status codes carry outcome, and the body carries data in whatever format you agree on (almost always JSON). Caching falls out of HTTP itself. So does idempotency on GET, PUT, and DELETE. So does the entire ecosystem of proxies, load balancers, CDNs, and browser dev tools.
What REST optimizes for is legibility and reach. Anyone with curl can call your API. Anyone with a browser can debug it. Every CDN on Earth knows how to cache GET /products/42. The cost is round trips: if your screen needs five resources, the client makes five fetches, or you stitch a custom aggregate endpoint.
GraphQL replaces "many endpoints" with "one endpoint, one schema, declarative selection". Clients send queries and get back exactly the shape they asked for. The server resolves each field, often by composing several internal services. What GraphQL optimizes for is client flexibility. Mobile clients with strict bandwidth budgets and screens that combine data from five backends are the canonical fit. The cost is that HTTP caching becomes harder (every query is a POST to /graphql by default), error handling lives partially in the response body instead of the status code, and N+1 query patterns appear unless you batch at the resolver layer with DataLoader or an equivalent. Persisted queries help with caching and allow-listing, not with N+1.
gRPC is the one I see misrepresented most often. It is not "REST with protobufs". It is a typed RPC framework on top of HTTP/2, with bidirectional streaming, generated client stubs in a dozen languages, and a tight binary wire format. What gRPC optimizes for is internal service-to-service traffic where both sides are yours, latency matters, and you want the compiler to catch shape mismatches at build time. The cost is that browsers cannot speak gRPC natively (you need gRPC-Web with a proxy) and human debugging needs the right tools because you cannot just curl it.
The three questions I run through
When a team asks me which to pick, I run through three questions in order. The answers usually decide before we get to the trade-off table.
- Who is the caller? A browser or third-party developer with no contract with you? REST. A mobile app your team ships and a backend your team owns? Either GraphQL or gRPC. An internal service inside your own cluster? gRPC.
- What is the dominant call shape? Single resource fetch, occasional list, well-defined CRUD? REST. Aggregate views that combine fields from multiple sources, varying per screen? GraphQL. Typed function calls between services with a schema you control on both ends? gRPC.
- How important is HTTP caching? Critical, because reads dominate and CDN cacheability is your scaling story? REST, hands down. Acceptable to cache at the application layer with persisted queries and Redis? GraphQL works. Mostly hot-path internal calls that bypass HTTP caching anyway? gRPC.
If those three answers point in the same direction, the decision is made. If they conflict, that is the conversation worth having in the design doc.
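The three questions can be sketched as a small lookup. The answer labels (`own_mobile_app`, `cdn_critical`, and so on) are names invented here to stand for the options in the list above; the function only reports a protocol when all three answers agree.

```python
from collections import Counter

# Each answer maps to the protocol(s) the corresponding question suggests.
CALLER = {
    "browser_or_third_party": ["REST"],
    "own_mobile_app": ["GraphQL", "gRPC"],
    "internal_service": ["gRPC"],
}
CALL_SHAPE = {
    "crud_resources": ["REST"],
    "per_screen_aggregates": ["GraphQL"],
    "typed_function_calls": ["gRPC"],
}
CACHING = {
    "cdn_critical": ["REST"],
    "application_layer_ok": ["GraphQL"],
    "bypassed_hot_path": ["gRPC"],
}


def pick_protocol(caller: str, call_shape: str, caching: str) -> str:
    """Return a protocol if all three answers agree, else flag a conflict."""
    answers = [CALLER[caller], CALL_SHAPE[call_shape], CACHING[caching]]
    votes = Counter(p for options in answers for p in options)
    unanimous = [p for p, n in votes.items() if n == len(answers)]
    if len(unanimous) == 1:
        return unanimous[0]
    return "conflict: discuss in the design doc"


mobile = pick_protocol("own_mobile_app", "per_screen_aggregates", "application_layer_ok")
mixed = pick_protocol("browser_or_third_party", "per_screen_aggregates", "cdn_critical")
```

Here `mobile` comes out as GraphQL, while `mixed` is a conflict: a public caller and CDN-critical reads pull toward REST, but the call shape pulls toward GraphQL.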
The trade-off table I keep open
| Concern | REST | GraphQL | gRPC |
|---|---|---|---|
| Caller surface | Anyone with HTTP | Mobile/SPA with strict bandwidth | Internal services |
| Wire format | JSON | JSON | Binary protobuf |
| Caching | HTTP-native, CDN friendly | Application-layer, persisted queries | None at HTTP level |
| Idempotency | Per HTTP verb | Per mutation, your discipline | Per RPC, your discipline |
| Schema | OpenAPI (optional but recommended) | Mandatory, runtime-enforced | Mandatory, compile-enforced |
| Field selection | All-or-nothing per endpoint | Caller picks | All-or-nothing per RPC |
| Streaming | SSE, WebSockets, long poll | Subscriptions (extra plumbing) | Native bidirectional |
| Tooling | curl, browser, Postman, every proxy | GraphiQL, Apollo, Relay | grpcurl, BloomRPC, generated stubs |
| Versioning | URL or header | Schema deprecation | Field tags + reserved numbers |
| Browser support | Native | Native (just HTTP POST) | Needs gRPC-Web proxy |
The row I want to highlight is caching. REST's caching story is decades of accumulated infrastructure: ETags, Cache-Control, conditional requests, CDNs that already know what to do. GraphQL trades most of this away for flexibility, because a POST to a single /graphql endpoint is opaque to ordinary HTTP caches. If your read-to-write ratio is 100:1 and you serve a global audience, that trade-off is steep.
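To make the conditional-request part concrete, here is a simplified sketch of an ETag check on a GET. The handler shape and the `max-age=60` value are assumptions, and the comparison ignores details a real server must handle, such as weak validators and comma-separated lists in `If-None-Match`.

```python
import hashlib


def make_etag(body: bytes) -> str:
    # An ETag derived from the body; the double quotes are part of the syntax.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def handle_get(
    body: bytes, request_headers: dict[str, str]
) -> tuple[int, dict[str, str], bytes]:
    """Return (status, headers, body) for a cacheable GET."""
    etag = make_etag(body)
    headers = {"ETag": etag, "Cache-Control": "public, max-age=60"}
    if request_headers.get("If-None-Match") == etag:
        # The client's cached copy is still valid: send no body.
        return 304, headers, b""
    return 200, headers, body


product = b'{"id": 42, "name": "widget"}'
status1, headers1, _ = handle_get(product, {})
status2, _, body2 = handle_get(product, {"If-None-Match": headers1["ETag"]})
```

The first call returns 200 with the body; the revalidation returns 304 with an empty body, which is the saving every CDN and browser already knows how to exploit.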
Three protocol-pick mistakes I keep seeing
The first mistake is treating GraphQL as a database. Teams build a sprawling schema, expose every internal entity, and then watch their backend get hammered by clients running queries that join four tables in a way no one anticipated. GraphQL works best with a curated schema, persisted queries that the server allow-lists, and DataLoader or equivalent batching at the resolver layer. Without those, you ship a database with a fancy query language to anonymous internet callers.
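The batching idea can be sketched in a few lines. This is not the DataLoader library itself, only a synchronous illustration of the same technique: resolvers register the keys they need, and one backend call serves all of them. `BatchLoader` and `fetch_tiers` are hypothetical names.

```python
from collections.abc import Callable


class BatchLoader:
    """Collects keys, then resolves them with one batched fetch."""

    def __init__(self, batch_fn: Callable[[list[int]], dict[int, str]]) -> None:
        self._batch_fn = batch_fn
        self._pending: list[int] = []
        self._cache: dict[int, str] = {}

    def request(self, key: int) -> None:
        # Queue the key unless it is already cached or queued.
        if key not in self._cache and key not in self._pending:
            self._pending.append(key)

    def dispatch(self) -> None:
        # One backend call for every queued key.
        if self._pending:
            self._cache.update(self._batch_fn(self._pending))
            self._pending = []

    def get(self, key: int) -> str:
        return self._cache[key]


backend_calls: list[list[int]] = []


def fetch_tiers(customer_ids: list[int]) -> dict[int, str]:
    # Hypothetical batched lookup, e.g. one SELECT ... WHERE id IN (...).
    backend_calls.append(list(customer_ids))
    return {cid: "gold" if cid % 2 else "silver" for cid in customer_ids}


loader = BatchLoader(fetch_tiers)
order_customer_ids = [7, 8, 7, 9]  # four orders, three distinct customers
for cid in order_customer_ids:
    loader.request(cid)
loader.dispatch()
tiers = [loader.get(cid) for cid in order_customer_ids]
```

Four orders produce one backend call for three distinct customers, instead of one call per order.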
The second mistake is using REST for what is plainly an internal RPC call. A microservice pinging another microservice with POST /api/v2/internal/recompute_order_total and a JSON body that is really a function-call payload is reinventing gRPC badly. The "REST" parts (verb, URL, status code) are decorative; the actual semantics are a typed function call. Just write the gRPC service. The compiler catches the mismatches you would otherwise have found at runtime, and latency is typically lower thanks to HTTP/2 multiplexing and a compact binary encoding.
The third mistake is picking gRPC for a public API where most of your callers are JavaScript in browsers. gRPC-Web exists, but it adds a proxy layer, a transcoding step, and an extra dependency in every browser app. If your callers are external developers building integrations, REST or GraphQL is going to be friendlier.
Status codes, idempotency, and the boring middle
The thing nobody puts on a marketing comparison page: half of the difference between protocols is in what they make you spell out versus what they hand you for free.
REST hands you HTTP semantics. GET is idempotent and safe (no side effects). PUT is idempotent (same request, same end state). POST is neither, which is why the idempotency key pattern exists. DELETE is idempotent. The status code carries the outcome: 200 OK for success, 201 Created for new resources, 204 No Content for successful deletes, 400 for bad input, 401 for unauthenticated, 403 for forbidden (you are who you say you are, but you cannot do this), 404 for not found, 409 for conflict (concurrent edit, duplicate key), 422 for valid JSON but invalid semantics, 429 for rate limit, 5xx for server problems. A REST client written by someone who knows HTTP gets retries, caching, and error handling almost for free.
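The idempotency key pattern mentioned for POST can be sketched as follows. `PaymentService`, `post_charge`, and the in-memory store are hypothetical; a production version would persist keys with an expiry and guard against concurrent requests carrying the same key.

```python
import uuid


class PaymentService:
    """Hypothetical POST handler that deduplicates on an idempotency key."""

    def __init__(self) -> None:
        self._responses: dict[str, tuple[int, dict]] = {}
        self.charges_created = 0

    def post_charge(self, idempotency_key: str, amount_cents: int) -> tuple[int, dict]:
        # A replayed key returns the stored response instead of charging again.
        if idempotency_key in self._responses:
            return self._responses[idempotency_key]
        self.charges_created += 1
        response = (201, {"charge_id": self.charges_created, "amount_cents": amount_cents})
        self._responses[idempotency_key] = response
        return response


service = PaymentService()
key = str(uuid.uuid4())  # the client generates one key per logical operation
first = service.post_charge(key, 1999)
retry = service.post_charge(key, 1999)  # e.g. resent after a timeout
```

The retry gets the same response as the first attempt and only one charge exists, which is what makes a blind retry of a POST safe.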
GraphQL collapses most of this. In the common JSON-over-HTTP setup, nearly every response comes back as 200 OK, including many that carry errors. Errors live in an errors array in the response body. A query that asks for a non-existent field fails validation before execution; a query that fails to resolve a field returns partial data plus an error. Idempotency is your discipline; the protocol does not help. Caching is up to your client and possibly a persisted-queries layer at the edge.
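A client therefore has to inspect the body, not the status line. A minimal sketch, assuming a hypothetical response in which the order resolved but the customer field did not:

```python
import json

# A hypothetical response body: the HTTP status was 200, yet one field failed.
RAW = json.dumps({
    "data": {"order": {"id": "1001", "customer": None}},
    "errors": [{
        "message": "loyalty service timed out",
        "path": ["order", "customer"],
    }],
})


def split_response(raw: str) -> tuple[dict, list[str]]:
    """Return usable data plus readable errors, without trusting the status code."""
    payload = json.loads(raw)
    data = payload.get("data") or {}
    problems = [
        f"{'.'.join(str(p) for p in err.get('path', []))}: {err['message']}"
        for err in payload.get("errors", [])
    ]
    return data, problems


data, problems = split_response(RAW)
```

The caller ends up with a partially populated order and a list of what went wrong, and has to decide per screen whether partial data is good enough to render.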
gRPC has its own status codes (OK, INVALID_ARGUMENT, NOT_FOUND, ALREADY_EXISTS, PERMISSION_DENIED, UNAUTHENTICATED, RESOURCE_EXHAUSTED, FAILED_PRECONDITION, ABORTED, INTERNAL, UNAVAILABLE, DEADLINE_EXCEEDED) that map well to common failure modes. Streaming and deadlines are first-class. The trade is human ergonomics: you cannot read a packet in your terminal without grpcurl, and the response is binary.
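When a gRPC service sits behind an HTTP gateway, those codes need an HTTP equivalent. The table below follows the mapping documented alongside `google.rpc.Code`; treat it as a widely used convention rather than something gRPC enforces. The helper name `http_status_for` is made up for this sketch.

```python
# gRPC status name -> HTTP status, per the google.rpc.Code documentation.
GRPC_TO_HTTP: dict[str, int] = {
    "OK": 200,
    "INVALID_ARGUMENT": 400,
    "FAILED_PRECONDITION": 400,
    "UNAUTHENTICATED": 401,
    "PERMISSION_DENIED": 403,
    "NOT_FOUND": 404,
    "ALREADY_EXISTS": 409,
    "ABORTED": 409,
    "RESOURCE_EXHAUSTED": 429,
    "INTERNAL": 500,
    "UNAVAILABLE": 503,
    "DEADLINE_EXCEEDED": 504,
}


def http_status_for(grpc_code: str) -> int:
    """Translate a gRPC status name; anything unlisted becomes a 500."""
    return GRPC_TO_HTTP.get(grpc_code, 500)


gateway_status = http_status_for("RESOURCE_EXHAUSTED")
```

Several gRPC codes fold into one HTTP status (FAILED_PRECONDITION and INVALID_ARGUMENT both become 400), so the translation loses detail in that direction.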
What I actually pick on a real project
For a public API consumed by external developers and browsers: REST, with OpenAPI specs, proper status codes, and at least one cursor-paginated list endpoint. The boring choice is the right one.
For a mobile app where bandwidth is a real concern and the screens compose data from many internal services: GraphQL, with persisted queries and DataLoader, behind a thin gateway that rejects unrecognized queries.
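The gateway's allow-list can be as simple as a map from query hash to query text. The sketch below is an assumption about how such a gateway might work, not a description of any particular product; `PersistedQueryGateway` and the 400 response for unknown hashes are illustrative choices.

```python
import hashlib


def query_hash(query: str) -> str:
    return hashlib.sha256(query.encode("utf-8")).hexdigest()


class PersistedQueryGateway:
    """Only executes queries registered at build time."""

    def __init__(self, allowed_queries: list[str]) -> None:
        self._by_hash = {query_hash(q): q for q in allowed_queries}

    def resolve(self, sha256_hash: str) -> tuple[int, str]:
        # Clients send only the hash; unknown hashes never reach a resolver.
        query = self._by_hash.get(sha256_hash)
        if query is None:
            return 400, "unrecognized query"
        return 200, query


ORDER_VIEW = "query OrderView($id: ID!) { order(id: $id) { id } }"
gateway = PersistedQueryGateway([ORDER_VIEW])
known = gateway.resolve(query_hash(ORDER_VIEW))
unknown = gateway.resolve(query_hash("query { everything }"))
```

Because the client sends a short hash instead of the query text, this also trims request size, which matters on the bandwidth-constrained clients this setup targets.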
For internal service-to-service calls where I own both ends and latency matters: gRPC, with the schema in a shared repo and CI checking backward compatibility on every PR.
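A CI check of that kind can be approximated by comparing field tags between two versions of a message. This is a deliberately conservative sketch: it models a message as a mapping of field name to tag number and ignores types, so it also flags a pure rename that keeps its tag, which is compatible on the binary wire. The function name and the example fields are hypothetical.

```python
def breaking_changes(
    old: dict[str, int], new: dict[str, int], reserved: set[int]
) -> list[str]:
    """Compare field name -> tag number maps for one message."""
    problems: list[str] = []
    for name, number in old.items():
        if name in new and new[name] != number:
            problems.append(f"{name}: tag changed {number} -> {new[name]}")
        if name not in new and number not in reserved:
            problems.append(f"{name}: removed without reserving tag {number}")
    old_numbers = set(old.values())
    for name, number in new.items():
        # A new field must not take a tag that was ever used before.
        if name not in old and (number in old_numbers or number in reserved):
            problems.append(f"{name}: reuses tag {number}")
    return problems


v1 = {"order_id": 1, "total": 2, "coupon": 3}
unsafe = breaking_changes(v1, {"order_id": 1, "total": 2, "loyalty_tier": 3}, set())
safe = breaking_changes(v1, {"order_id": 1, "total": 2, "loyalty_tier": 4}, {3})
```

The unsafe change drops `coupon` and hands its tag to a new field, so old clients would misread the data; the safe change reserves tag 3 and gives the new field a fresh number.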
For a single product with a small team and modest scale: pick one and go. The wrong choice is fixable. Indecision is not. I have seen teams spend three months in API-flavor debates and ship nothing while a competitor with a working REST API took the market.
The shape over the trend
Protocols are tools. REST, GraphQL, and gRPC have largely distinct sweet spots, and the public discourse usually hides that by treating them as competitors for the same job. They are not. They each solve a different shape of conversation between a caller and a server. Match the protocol to the shape: if your real callers fetch resources, use REST; if they compose graphs, use GraphQL; if they call typed functions inside your cluster, use gRPC. Do that and the protocol stops being interesting, which is what you want, because the interesting parts of your system are the business logic and the failure modes, not the wire format.
