System Design Article
Authentication & Authorization (OAuth2, JWT, RBAC)
Difficulty: Medium
Authentication answers 'who are you?'. Authorization answers 'what are you allowed to do?'. Most systems get both wrong in subtle ways: rolling their own crypto, treating JWTs as a session store, copying RBAC into every service, or never thinking about how to revoke a leaked credential. This lesson covers the standard building blocks: password storage with adaptive hashing, session vs token authentication, OAuth2 and OIDC flows, JWTs and their honest trade-offs, RBAC vs ABAC vs ReBAC, multi-tenant authorization at scale, machine-to-machine auth (API keys, mTLS, workload identity), and the operational concerns (key rotation, revocation, audit). The goal is to leave you able to design and defend the auth architecture for any system, from a single product to a federated multi-tenant platform.
Motivation
A new product launches. The first version stores passwords as MD5 hashes and uses a session cookie. Six months later, attackers dump the user table; every password is cracked within a day. The team migrates to bcrypt and short-lived access tokens. A year later they add a mobile app, then third-party integrations, then enterprise SSO. By year three the auth system is six different layers stitched together and nobody fully understands which one is the source of truth.
This is the standard auth journey. Each step is reactive to the previous mistake. The disciplined alternative is to design auth deliberately from day one with a small set of well-understood building blocks:
- A trusted identity provider (yours or federated) that authenticates users.
- An access token with a short lifetime that services validate without phoning home.
- A refresh token with a long lifetime that can be revoked.
- An authorization model (RBAC, ABAC, or ReBAC) that answers 'is this principal allowed to do this action on this resource?'.
- A revocation mechanism for credentials that get compromised.
- An audit log of every authentication and every authorization decision.
Most serious mistakes are not about the cryptography (the libraries are fine). They are about the design: storing too much in JWTs, no revocation path, missing tenant checks, hard-coded secrets, lazy session expiry. Senior engineers are expected to spot these in design reviews.
Why this matters: a single auth bug can compromise every user and every customer at once. There is no other system in your stack with the same blast radius.
Deep Dive
Authentication vs authorization
These are different problems with different solutions:
- Authentication (AuthN): 'who are you?'. The user proves their identity (password, second factor, SSO). Output: a verified principal.
- Authorization (AuthZ): 'what are you allowed to do?'. Given a principal, an action, and a resource, decide allow / deny. Output: a permission decision.
The two are almost always implemented as separate layers. Authentication runs once per session or token issuance; authorization runs per request.
+---------------+    identity     +---------------+    decision    +-----------+
|     AuthN     | --------------> |     AuthZ     | -------------> | Resource  |
| (login flow)  |                 | (per request) |                | (allowed) |
+---------------+                 +---------------+                +-----------+
Mixing them is a common source of bugs: 'is the user logged in?' is not the same as 'is the user allowed to read this document?'.
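A minimal sketch of the two layers, assuming a toy in-memory session table and document ownership map (`SESSIONS`, `DOCUMENT_OWNERS`, and the read/write policy are all hypothetical). It illustrates why 'not logged in' and 'not allowed' are different outcomes:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Principal:
    user_id: str
    roles: frozenset


# Hypothetical in-memory data, for illustration only.
SESSIONS = {"session-abc": Principal("alice", frozenset({"editor"}))}
DOCUMENT_OWNERS = {"doc-1": "alice", "doc-2": "bob"}


def authenticate(session_id):
    """AuthN: map a credential to a verified principal, or None."""
    return SESSIONS.get(session_id)


def authorize(principal, action, document_id):
    """AuthZ: decide allow / deny for one principal, action, and resource."""
    if action == "read":
        return True  # toy policy: any authenticated user may read
    if action == "write":
        return DOCUMENT_OWNERS.get(document_id) == principal.user_id
    return False  # deny by default


def handle_request(session_id, action, document_id):
    principal = authenticate(session_id)
    if principal is None:
        return 401  # not authenticated
    if not authorize(principal, action, document_id):
        return 403  # authenticated, but not allowed
    return 200
```

Keeping the two functions separate means the authorization check can run on every request without repeating the login flow.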
Password storage: never roll your own
If you must store passwords yourself (versus delegating to an identity provider):
- Use an adaptive password hashing function: Argon2id (preferred), bcrypt, or scrypt. Never SHA-256, MD5, or PBKDF2-with-trivial-iterations.
- Pick a cost factor that takes ~100ms on your server. Adjust upward as hardware improves.
- Store the hash with the algorithm, cost factor, and salt embedded so you can migrate algorithms later.
- Never log passwords, never email them, never print them in error messages.
The library does the work. Your job is to pick the right one and the right cost factor. Anything custom (your own salting, your own rounds, your own 'pepper') is a red flag.
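As an illustration of the points above, here is a small sketch using scrypt from Python's standard library. The cost parameters and the `$`-separated storage format are assumptions made for this example, not a standard; a production system would more likely use a maintained Argon2id or bcrypt library.

```python
import base64
import hashlib
import hmac
import secrets

# Assumed cost parameters; tune them so one hash takes roughly 100 ms on your hardware.
SCRYPT_N, SCRYPT_R, SCRYPT_P = 2**14, 8, 1


def hash_password(password):
    salt = secrets.token_bytes(16)
    digest = hashlib.scrypt(password.encode(), salt=salt,
                            n=SCRYPT_N, r=SCRYPT_R, p=SCRYPT_P, dklen=32)
    # Embed algorithm, cost parameters, and salt so the hash can be migrated later.
    return "$".join([
        "scrypt", str(SCRYPT_N), str(SCRYPT_R), str(SCRYPT_P),
        base64.b64encode(salt).decode(), base64.b64encode(digest).decode(),
    ])


def verify_password(password, stored):
    algorithm, n, r, p, salt_b64, digest_b64 = stored.split("$")
    if algorithm != "scrypt":
        return False
    salt = base64.b64decode(salt_b64)
    expected = base64.b64decode(digest_b64)
    # Recompute with the parameters stored alongside the hash, not the current defaults.
    actual = hashlib.scrypt(password.encode(), salt=salt,
                            n=int(n), r=int(r), p=int(p), dklen=len(expected))
    return hmac.compare_digest(actual, expected)
```

Because the parameters travel with each hash, raising the cost factor later only affects newly created hashes; old ones still verify and can be rehashed at the next successful login.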
For any non-trivial product, delegate authentication entirely (OIDC via an IdP: Auth0, Okta, AWS Cognito, Google, your enterprise SSO). You no longer store passwords; you store an external identifier. This eliminates an entire class of breaches.
Session vs token authentication
Two dominant patterns:
Server-side sessions (cookies + session store):
Client logs in -> server stores session_id -> sends cookie
Client sends cookie on every request -> server looks up session_id in store -> validates
Pros: easy to revoke (delete from store), small cookie, trivially supports per-request data (CSRF token, tenant context).
Cons: requires a session store reachable on every request (a Redis or DB lookup), which in practice means sticky sessions or a distributed cache.
Token authentication (JWT or opaque token):
Client logs in -> server issues signed token (JWT)
Client sends token on every request -> server verifies signature, reads claims, no DB lookup
Pros: stateless. The service can validate the token by signature alone. Scales beautifully.
Cons: revocation is hard (the token is valid until it expires), token contents are visible (signed but usually not encrypted), token size is bigger than a session id.
The modern hybrid: short-lived JWT access tokens (5-15 minutes) + long-lived opaque refresh tokens (days to weeks) stored server-side. Access tokens are stateless and fast; refresh tokens enable revocation. When the access token expires, the client uses the refresh token to get a new one. If a refresh token is compromised, delete it from the server: no further access tokens can be issued from it, and any access token already issued expires within minutes.
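The server-side half of the hybrid can be sketched as follows. `RefreshTokenStore` is a hypothetical in-memory stand-in for a database table, and the `now` parameter exists only to make the example deterministic:

```python
import hashlib
import secrets
import time


class RefreshTokenStore:
    """In-memory stand-in for a server-side refresh token table (illustrative only)."""

    def __init__(self, lifetime_seconds=30 * 24 * 3600):
        self._rows = {}  # token hash -> (user_id, expires_at)
        self._lifetime = lifetime_seconds

    @staticmethod
    def _hash(token):
        # Only the hash is stored, so a dump of the table does not leak usable tokens.
        return hashlib.sha256(token.encode()).hexdigest()

    def issue(self, user_id, now=None):
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(32)
        self._rows[self._hash(token)] = (user_id, now + self._lifetime)
        return token

    def redeem(self, token, now=None):
        """Return the user id if the token is known and unexpired, else None."""
        now = time.time() if now is None else now
        row = self._rows.get(self._hash(token))
        if row is None or row[1] <= now:
            return None
        return row[0]

    def revoke(self, token):
        self._rows.pop(self._hash(token), None)
```

A token endpoint would call `redeem` and, on success, mint a fresh short-lived access token; after `revoke`, that call returns `None` and no new access token is issued.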
JWTs: the right and wrong uses
A JWT is three base64url-encoded parts joined by dots: header.payload.signature. The signature is over header + payload using a key (asymmetric: RS256 / ES256 with public key verification, or symmetric: HS256 with shared secret).
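To make the three-part structure concrete, the sketch below builds and verifies a token by hand with the standard library. It uses HS256 only because HMAC is available without third-party packages; the helper names are made up for this example, and real services should rely on a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json


def b64url_encode(data):
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def b64url_decode(text):
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_hs256(claims, secret):
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url_encode(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, claims)
    )
    signature = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    return signing_input + "." + b64url_encode(signature)


def verify_hs256(token, secret):
    """Return the claims if the signature is valid, else None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(b64url_decode(header_b64))
        signature = b64url_decode(signature_b64)
    except ValueError:
        return None
    # Never let the token choose the algorithm; accept only the expected one.
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        return None
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode(),
                        hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        return None
    return json.loads(b64url_decode(payload_b64))
```

Note that `b64url_decode(token.split(".")[1])` recovers the payload without any key: the signature protects integrity, not confidentiality.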
What JWTs are good for:
- Stateless authentication tokens (short-lived access tokens).
- Identity assertions between systems (OIDC ID tokens).
- Signed payloads where you want self-contained verification (callback URLs with state, signed download links).
What JWTs are NOT good for:
- Session storage. Putting cart contents, profile data, or settings in a JWT means the user cannot change them mid-session and you cannot invalidate them.
- Long-lived authentication. A 30-day JWT cannot be revoked. If it leaks, the attacker has 30 days. Use short-lived access tokens with refresh tokens instead.
- Storing PII. JWTs are signed but usually not encrypted; anyone with the token can read the payload.
- Authorization decisions in isolation. Permissions change; JWTs do not. A user revoked from a project at 12:00 will still appear authorized in the JWT they got at 11:55 until it expires.
The minimum-viable JWT contains: sub (subject id), iss (issuer), aud (audience), iat, exp, and maybe a small set of standard scopes. Not the user's email, not their roles in plaintext, not their tenant configuration.
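A sketch of the checks a verifier might run on these claims once the signature has already been verified. The function name and the 30-second `leeway` for clock skew are assumptions for illustration:

```python
import time


def validate_claims(claims, issuer, audience, now=None, leeway=30):
    """Check the registered claims of an already signature-verified token."""
    now = time.time() if now is None else now
    if claims.get("iss") != issuer:
        return False
    # 'aud' may be a single string or a list of strings.
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if audience not in audiences:
        return False
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or now > exp + leeway:
        return False
    return isinstance(claims.get("sub"), str)
```

Checking `aud` matters as much as checking `exp`: without it, a token minted for one API can be replayed against another that trusts the same issuer.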
Key rotation and key management
JWT signing keys must rotate periodically (compromise, hygiene, compliance). The standard pattern:
- Use asymmetric keys (RS256 / ES256). The signing key is private; verifiers use the public key.
- Publish public keys via a JWKS endpoint (/.well-known/jwks.json).
- Each key has a kid (key id). Tokens reference the kid they were signed with.
- During rotation: introduce the new key, sign new tokens with it, keep the old key in JWKS for verification of in-flight tokens until they expire, then retire it.
Verifiers cache the JWKS with a TTL (5-60 minutes). They never hard-code keys.
For symmetric HS256: rotation is harder because both sides need the new secret. Avoid HS256 in distributed systems; use RS256 / ES256.
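The verifier side of rotation can be sketched as a small cache keyed by kid. `JwksCache` and its injected `fetch_keys` callable are hypothetical; in practice the callable would download and parse the JWKS document, and a real client would also rate-limit refetches triggered by unknown kids.

```python
import time


class JwksCache:
    """Caches a kid -> key mapping; refetches on expiry or on an unknown kid."""

    def __init__(self, fetch_keys, ttl_seconds=600, clock=time.monotonic):
        self._fetch_keys = fetch_keys  # callable returning {kid: public_key}
        self._ttl = ttl_seconds
        self._clock = clock
        self._keys = {}
        self._fetched_at = None

    def _refresh(self):
        self._keys = dict(self._fetch_keys())
        self._fetched_at = self._clock()

    def get(self, kid):
        expired = (self._fetched_at is None
                   or self._clock() - self._fetched_at >= self._ttl)
        # An unknown kid usually means a new key was published since the last fetch.
        if expired or kid not in self._keys:
            self._refresh()
        return self._keys.get(kid)  # None means: reject the token
```

With this shape, a newly introduced key is picked up on first sight, and a retired key disappears from verifiers once the TTL lapses.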
OAuth2 and OIDC: the protocol family
OAuth2 is an authorization framework: it lets one app act on behalf of a user at another app. OIDC (OpenID Connect) is a thin layer on top that adds authentication and standardized identity claims.
The four common flows:
| Flow | When |
|---|---|
| Authorization Code + PKCE | Web apps, SPAs, mobile apps. The current default for everything. |
| Client Credentials | Machine-to-machine (server to server). No user involved. |
| Device Authorization | Devices without browsers (CLIs, smart TVs, IoT). |
| Refresh Token | Used after any of the above to get fresh access tokens. |
Deprecated and dangerous: Implicit flow (replaced by Auth Code + PKCE), Resource Owner Password Credentials (the user's password leaves the IdP, unsafe).
Authorization Code + PKCE in one diagram:
[ User ] -> [ Client app ] redirects to [ IdP ]
    |
    |  code_challenge = BASE64URL(SHA256(code_verifier))
    v
[ IdP login UI ]
    |
    |  user authenticates
    v
[ IdP issues authorization code to client ]
    |
    v
[ Client app -> IdP token endpoint with code + code_verifier ]
    |
    v
[ IdP returns access_token + refresh_token + id_token ]
PKCE (Proof Key for Code Exchange) prevents an attacker who intercepts the authorization code from exchanging it; only the original client knows the code_verifier. PKCE is mandatory for SPAs and mobile and recommended for everything.
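The PKCE values in the flow above can be computed with the standard library. This is a sketch of the S256 method from RFC 7636; the function names are made up for the example, and `idp_accepts` only mimics the comparison an IdP's token endpoint performs.

```python
import base64
import hashlib
import hmac
import secrets


def make_code_verifier():
    # 32 random bytes -> 43 URL-safe characters, within the 43-128 range RFC 7636 allows.
    return secrets.token_urlsafe(32)


def code_challenge_s256(code_verifier):
    # S256: base64url(SHA-256(verifier)) with the '=' padding stripped.
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def idp_accepts(code_verifier, stored_challenge):
    """What the token endpoint does: recompute the challenge and compare."""
    return hmac.compare_digest(code_challenge_s256(code_verifier), stored_challenge)
```

The client sends only the challenge in the initial redirect and reveals the verifier at the token endpoint, so an intercepted authorization code is useless on its own.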
RBAC, ABAC, ReBAC
Authorization models, in order of increasing power:
RBAC (Role-Based Access Control): assign users to roles; assign permissions to roles. 'Admins can edit anything. Editors can edit own posts.'
user  -> role   -> permissions
alice -> admin  -> [posts:read, posts:write, posts:delete]
bob   -> editor -> [posts:read, own_posts:write]
Pros: simple, easy to audit, fits ~80% of real apps.
Cons: 'role explosion' as fine-grained needs grow ('finance-team-editor-with-budget-cap-100k'). Hard to express resource-level rules.
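The same user -> role -> permissions mapping as a minimal Python sketch (the dictionaries are illustrative data, not a recommended storage format):

```python
ROLE_PERMISSIONS = {
    "admin": {"posts:read", "posts:write", "posts:delete"},
    "editor": {"posts:read", "own_posts:write"},
}
USER_ROLES = {"alice": {"admin"}, "bob": {"editor"}}


def has_permission(user, permission):
    """True if any of the user's roles grants the permission; unknown users get nothing."""
    return any(
        permission in ROLE_PERMISSIONS.get(role, set())
        for role in USER_ROLES.get(user, set())
    )
```

Note that `own_posts:write` still needs a resource-level ownership check somewhere else, which is exactly the kind of rule plain RBAC struggles to express.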
ABAC (Attribute-Based Access Control): decisions are functions of attributes (user attributes, resource attributes, action, context). 'A doctor can read a patient's record if the patient is in their assigned care list and the record's department matches.'
allow if user.role == 'doctor'
    and resource.type == 'medical_record'
    and resource.patient_id in user.assigned_patients
    and resource.department == user.department
Pros: arbitrary expressiveness.
Cons: harder to audit, harder to optimize, easy to write rules nobody understands a year later.
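The rule above translates almost line for line into a predicate over attributes. The `User` and `Resource` classes here are hypothetical shapes chosen for the example:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    role: str
    department: str
    assigned_patients: frozenset


@dataclass(frozen=True)
class Resource:
    type: str
    patient_id: str
    department: str


def can_read(user, resource):
    """ABAC: the decision is a pure function of user and resource attributes."""
    return (
        user.role == "doctor"
        and resource.type == "medical_record"
        and resource.patient_id in user.assigned_patients
        and resource.department == user.department
    )
```

Keeping each policy a pure function makes it straightforward to unit-test, which offsets some of the auditability cost noted above.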
ReBAC (Relationship-Based Access Control): model authorization as a graph of relationships. 'Alice can edit document D if there is a path from Alice to D via owner / editor / shared-with relations.' Google Zanzibar is the canonical implementation; SpiceDB and OpenFGA are open-source clones.
document:annual_report
    owner: user:alice
    editor: group:finance_team#member
    viewer: org:acme#member
Pros: handles complex sharing well (Google Docs, GitHub, Slack), centralized model is auditable.
Cons: more infrastructure (a separate authz service), unfamiliar mental model.
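A toy version of the relationship check, assuming tuples of the form (object, relation, subject) where a subject is either a user or a userset such as `group:finance_team#member`. The membership tuples for bob and carol are invented for the example; real systems such as SpiceDB or OpenFGA add schemas, caching, and consistency guarantees that this sketch omits.

```python
TUPLES = {
    ("document:annual_report", "owner", "user:alice"),
    ("document:annual_report", "editor", "group:finance_team#member"),
    ("document:annual_report", "viewer", "org:acme#member"),
    ("group:finance_team", "member", "user:bob"),   # hypothetical membership
    ("org:acme", "member", "user:carol"),           # hypothetical membership
}


def check(obj, relation, user, seen=None):
    """True if `user` reaches `obj#relation` directly or through nested usersets."""
    seen = set() if seen is None else seen
    if (obj, relation) in seen:  # guard against cycles in the graph
        return False
    seen.add((obj, relation))
    for t_obj, t_rel, subject in TUPLES:
        if (t_obj, t_rel) != (obj, relation):
            continue
        if subject == user:
            return True
        if "#" in subject:  # userset: follow the edge, e.g. group:finance_team#member
            parent_obj, parent_rel = subject.split("#", 1)
            if check(parent_obj, parent_rel, user, seen):
                return True
    return False


def can_edit(document, user):
    return check(document, "owner", user) or check(document, "editor", user)


def can_view(document, user):
    return can_edit(document, user) or check(document, "viewer", user)
```

Here bob can edit because he is a member of the group that holds the editor relation, without any tuple naming bob on the document itself.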
The choice: small or simple app -> RBAC. App with rich resource sharing semantics (collaboration, multi-tenant SaaS with team / org structure) -> ReBAC. App with regulatory or context-heavy rules (healthcare, finance) -> ABAC, often layered on ReBAC.
Multi-tenant authorization
A multi-tenant SaaS must guarantee 'a user from tenant A can never see data from tenant B'. The most common cause of cross-tenant breach is a missing tenant filter on a query.
Defenses, in order of strength:
- Tenant id in every query at the application layer. Easy to forget; relies on developer discipline.
- Per-tenant database connection with automatic schema or DB selection by tenant id. Eliminates the missed-filter class of bug entirely (a connection can never see another tenant's tables).
- Row-level security (Postgres RLS): the DB enforces 'every query must filter by current_tenant'. The app sets the tenant in the connection; the DB rejects cross-tenant access.
- Per-tenant database (full isolation): one DB per tenant. Strongest guarantee, hardest to operate.
The interview-grade answer: 'I would use Postgres RLS to enforce tenant isolation in the database, plus tenant-aware code in the app, plus an integration test suite that tries to access tenant B as a user from tenant A and asserts denial. Defense in depth.'
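The application-layer half of this defense, plus the cross-tenant test, can be sketched with SQLite from the standard library. SQLite has no row-level security, so this shows only the tenant-aware code path; the `TenantScopedProjects` class and the sample rows are hypothetical.

```python
import sqlite3


class TenantScopedProjects:
    """All project queries go through this class, which always binds the tenant id."""

    def __init__(self, connection, tenant_id):
        self._conn = connection
        self._tenant_id = tenant_id

    def list_names(self):
        rows = self._conn.execute(
            "SELECT name FROM projects WHERE tenant_id = ? ORDER BY name",
            (self._tenant_id,),
        )
        return [name for (name,) in rows]

    def get_name(self, project_id):
        # The tenant filter is applied even when the caller supplies a primary key.
        row = self._conn.execute(
            "SELECT name FROM projects WHERE tenant_id = ? AND id = ?",
            (self._tenant_id, project_id),
        ).fetchone()
        return None if row is None else row[0]


def demo_connection():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE projects (id INTEGER PRIMARY KEY, tenant_id TEXT, name TEXT)"
    )
    conn.executemany(
        "INSERT INTO projects (id, tenant_id, name) VALUES (?, ?, ?)",
        [(1, "tenant-a", "apollo"), (2, "tenant-b", "borealis")],
    )
    return conn
```

An isolation test then authenticates as tenant A and asks for tenant B's project id, asserting that nothing comes back.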
Machine-to-machine authentication
Services need to talk to other services. Options:
- API keys: long-lived shared secrets. Easy. Awful for rotation; if leaked, every caller is compromised. Use only for low-stakes external integrations.
- OAuth Client Credentials: server obtains a short-lived access token from an IdP using client_id / client_secret. Good for external integrations.
- mTLS (mutual TLS): each service has a certificate; both sides verify. Strong identity, but PKI is operationally heavy.
- Workload identity (SPIFFE / SPIRE, Kubernetes service accounts, AWS IAM roles for service accounts): the platform issues short-lived credentials to running workloads. The standard for internal service-to-service inside Kubernetes / cloud.
The modern stack: OAuth client credentials for cross-org calls, workload identity (SPIFFE / IRSA / GKE Workload Identity) for internal calls, mTLS as a transport-layer guarantee.
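Where API keys are unavoidable, they can at least be scoped, stored hashed, and individually revocable. The sketch below assumes a hypothetical in-memory `ApiKeyRegistry` and a `key_id.secret` key format invented for this example:

```python
import hashlib
import hmac
import secrets


class ApiKeyRegistry:
    """Scoped, revocable API keys; only a hash of each secret is kept."""

    def __init__(self):
        self._keys = {}  # key id -> (sha256 hex of secret, frozenset of scopes)

    def create(self, scopes):
        key_id = secrets.token_hex(4)
        secret = secrets.token_urlsafe(32)
        digest = hashlib.sha256(secret.encode()).hexdigest()
        self._keys[key_id] = (digest, frozenset(scopes))
        return f"{key_id}.{secret}"  # shown to the caller once, never stored

    def allows(self, presented_key, required_scope):
        key_id, _, secret = presented_key.partition(".")
        record = self._keys.get(key_id)
        if record is None:
            return False
        stored_digest, scopes = record
        presented_digest = hashlib.sha256(secret.encode()).hexdigest()
        # Constant-time comparison avoids leaking how many characters matched.
        return (hmac.compare_digest(presented_digest, stored_digest)
                and required_scope in scopes)

    def revoke(self, presented_key):
        self._keys.pop(presented_key.partition(".")[0], None)
```

Because each key carries its own scopes and can be revoked on its own, a leak affects one integration rather than every caller.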
Implementation
Issuing access + refresh tokens
JavaScript (Node + jose)
import { SignJWT } from 'jose';
import crypto from 'crypto';
import { savedRefreshToken } from './db.js';

// Refresh tokens are high-entropy random values, so a fast SHA-256 digest is sufficient.
const hash = (token) => crypto.createHash('sha256').update(token).digest('hex');

async function issueTokens(user, privateKey, kid) {
  const now = Math.floor(Date.now() / 1000);
  // Short-lived signed access token that services verify without a lookup.
  const accessToken = await new SignJWT({
    sub: user.id,
    iss: 'https://auth.example.com',
    aud: 'https://api.example.com',
    scope: 'read write',
  })
    .setProtectedHeader({ alg: 'RS256', kid })
    .setIssuedAt(now)
    .setExpirationTime(now + 600) // 10 minutes
    .sign(privateKey);
  // Opaque refresh token: only its hash is persisted, so it can be revoked by deleting the row.
  const refreshToken = crypto.randomBytes(32).toString('base64url');
  await savedRefreshToken.insert({
    token_hash: hash(refreshToken),
    user_id: user.id,
    expires_at: new Date(Date.now() + 30 * 24 * 3600 * 1000), // 30 days
  });
  return { accessToken, refreshToken };
}
Python (PyJWT + DB)
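import jwt
import secrets
import time
from datetime import datetime, timedelta, timezone
from hashlib import sha256

# `db` is assumed to be a DB-API style connection or cursor defined elsewhere.
def issue_tokens(user, private_key, kid):
    now = int(time.time())
    # Short-lived signed access token that services verify without a lookup.
    access_token = jwt.encode(
        {
            'sub': user.id,
            'iss': 'https://auth.example.com',
            'aud': 'https://api.example.com',
            'scope': 'read write',
            'iat': now,
            'exp': now + 600,  # 10 minutes
        },
        private_key,
        algorithm='RS256',
        headers={'kid': kid},
    )
    # Opaque refresh token: only its hash is persisted.
    refresh_token = secrets.token_urlsafe(32)
    db.execute(
        'INSERT INTO refresh_tokens (token_hash, user_id, expires_at) VALUES (%s, %s, %s)',
        (sha256(refresh_token.encode()).hexdigest(), user.id,
         datetime.now(timezone.utc) + timedelta(days=30)),
    )
    return access_token, refresh_token
Note: refresh tokens are stored hashed, never in plaintext. Unlike passwords, they are high-entropy random values, so a fast hash such as SHA-256 is sufficient.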
Verifying a JWT in a service
JavaScript (Node middleware)
import { jwtVerify, createRemoteJWKSet } from 'jose';

// Fetches and caches the issuer's public keys; keys are selected by the token's kid.
const JWKS = createRemoteJWKSet(new URL('https://auth.example.com/.well-known/jwks.json'));

export async function authMiddleware(req, res, next) {
  const auth = req.headers.authorization;
  if (!auth || !auth.startsWith('Bearer ')) return res.status(401).end();
  try {
    const { payload } = await jwtVerify(auth.slice(7), JWKS, {
      issuer: 'https://auth.example.com',
      audience: 'https://api.example.com',
      algorithms: ['RS256'], // pin the expected signing algorithm
    });
    req.principal = { id: payload.sub, scope: payload.scope?.split(' ') ?? [] };
    next();
  } catch (err) {
    res.status(401).json({ error: 'invalid_token' });
  }
}
Python (FastAPI dependency)
import jwt
from fastapi import Header, HTTPException
from jwt import PyJWKClient

jwks_client = PyJWKClient('https://auth.example.com/.well-known/jwks.json')

def verify_token(authorization: str = Header(...)):
    if not authorization.startswith('Bearer '):
        raise HTTPException(status_code=401)
    token = authorization[7:]
    try:
        # Look up the public key that matches the token's kid header.
        signing_key = jwks_client.get_signing_key_from_jwt(token).key
        payload = jwt.decode(token, signing_key, algorithms=['RS256'],
                             audience='https://api.example.com',
                             issuer='https://auth.example.com')
        return {'id': payload['sub'], 'scope': payload.get('scope', '').split()}
    except jwt.PyJWTError:
        raise HTTPException(status_code=401)
The JWKS client caches keys with TTL; rotation works without code changes.
Postgres row-level security for tenant isolation
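-- One-time setup
ALTER TABLE projects ENABLE ROW LEVEL SECURITY;
-- Table owners bypass RLS unless it is forced; superusers and BYPASSRLS roles always bypass it.
ALTER TABLE projects FORCE ROW LEVEL SECURITY;
CREATE POLICY tenant_isolation ON projects
    USING (tenant_id = current_setting('app.tenant_id')::uuid);
-- Per-request, after auth middleware extracts tenant from token.
-- SET LOCAL lasts only for the current transaction, so it must run inside one:
BEGIN;
SET LOCAL app.tenant_id = '11111111-2222-3333-4444-555555555555';
-- Now any query like:
SELECT * FROM projects;
-- only returns rows where tenant_id matches the transaction-local setting.
COMMIT;
A missing tenant filter at the application layer then returns only the current tenant's rows, and no rows at all for another tenant's ids, instead of a cross-tenant leak. Defense in depth.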
A ReBAC check (SpiceDB-style)
# schema
definition user {}
definition group {
    relation member: user
}
definition document {
    relation owner: user
    relation editor: user
    relation viewer: user | group#member
    permission edit = owner + editor
    permission view = edit + viewer
}
# data
document:annual_report#owner@user:alice
document:annual_report#viewer@group:finance#member
# check at request time
can user:bob view document:annual_report ?
# evaluates the relation graph; returns true / false
The authorization service typically answers boolean checks within about 10 ms, even for complex graphs. The application asks 'can this user perform this action on this resource?' and trusts the answer.
When to Use
Delegate authentication to an IdP when
- You are not in the auth business (you sell something other than identity).
- You want SSO, MFA, social login, enterprise federation without building them.
- Your product is enterprise-targeted and customers expect SAML / OIDC integration.
Self-host authentication when
- The product is auth (you ARE the IdP).
- Compliance / regulatory requirements forbid third-party storage of identity.
- You have a security team and a real reason to take on the operational burden.
Use sessions when
- Single web app with same-origin requests.
- Strong revocation needs (banking, admin tools).
- Small to mid scale where session lookup latency is negligible.
Use tokens (JWT) when
- Many services (microservices), no central session store.
- Mobile + web + integration partners all sharing the same auth.
- Stateless services that scale horizontally without session affinity.
Use the modern hybrid (short JWT + refresh) when
- You want stateless verification per request AND revocation when needed.
- This is the right default for any non-trivial product.
Use RBAC when
- The number of distinct permission sets is small (single-digit roles).
- Permissions do not depend on resource attributes (just 'role can do X').
- The team is small and audit trails matter.
Use ReBAC when
- The product has rich sharing semantics (documents, projects, teams).
- Resources have multiple owners / collaborators with overlapping permissions.
- The model resembles 'X has access to Y because they are a member of Z which was shared with the parent of Y'.
Use ABAC when
- Permissions depend on context (time of day, location, request attributes).
- Regulatory rules require complex policy expressions.
- Often layered on top of RBAC or ReBAC for the dynamic checks.
Case Studies
Google Zanzibar
Google's Zanzibar paper (2019) described the centralized authorization service that powers access control across Google products (Drive, Calendar, Photos, Cloud). It introduced ReBAC at scale: a graph of relationships between users, groups, and resources, evaluated for millions of authz checks per second with strong consistency guarantees (Zookies for read-after-write). The paper triggered the wave of open-source implementations (SpiceDB from AuthZed, OpenFGA, Permify).
Lesson: at large multi-product scale, centralizing authorization beats per-service RBAC. The cost of a separate authz service is repaid in audit, consistency, and the ability to move fast on new sharing features.
Auth0 / Okta as IdP
Auth0 (now part of Okta) is the canonical example of the 'do not build auth yourself' choice. Thousands of products delegate authentication entirely to Auth0 and integrate via standard OIDC. The result: SSO, MFA, social login, enterprise federation, password breach detection, etc., all delivered in days rather than months.
Lesson: unless you are in the auth business, delegate. The library and protocol stack is mature; reinventing it is wasted engineering and increased risk.
Stripe Restricted API Keys
Stripe's API key model evolved from 'one secret key per account' to per-environment keys, restricted keys (scoped to specific resources), webhook signing secrets, and rolling keys. Public docs and changelog entries describe the migrations. The lesson visible in the API: a single high-power credential is dangerous; scoped, rotatable, revocable credentials are safer.
Lesson: API keys are not all the same. Scope them, rotate them, and design for revocation from day one.
GitHub fine-grained personal access tokens
GitHub migrated from classic PATs (account-wide power) to fine-grained PATs that are scoped to specific repositories and to specific permissions, with explicit expiry. Public docs and the rollout posts described the rationale and the customer impact. This was a deliberate move toward least-privilege at the credential level.
Lesson: long-lived broad credentials are a liability. Even at the cost of more user friction, scoped + expiring credentials prevent the catastrophic-leak failure mode.
Multi-tenant SaaS cross-tenant leakage incidents
Many engineering blog posts have described post-mortems for cross-tenant leaks in multi-tenant SaaS: a missed WHERE tenant_id = ? filter, a cache key without tenant prefix, an admin tool that forgot to scope by tenant. The remediations are consistent: defense in depth (Postgres RLS + app filters + integration tests + tenant prefix in every cache key + audit logs that flag cross-tenant queries).
Lesson: assume some developer will eventually forget the tenant filter. Architect so a missing filter produces zero results, not a cross-tenant leak.
Quick Review
- Authentication is 'who'; authorization is 'what'. Keep them as separate layers.
- Delegate authentication to an IdP unless you are in the auth business. The risk reduction is enormous.
- For tokens, use short-lived JWT access tokens + long-lived opaque refresh tokens stored server-side. Stateless verification with revocation.
- Sign JWTs with asymmetric keys (RS256 / ES256), publish via JWKS, rotate keys with a kid index, never bake keys into code.
- Authorization Code + PKCE is the right OAuth2 flow for almost everything. Implicit and ROPC are dead.
- RBAC for simple apps, ReBAC for sharing-rich products, ABAC for context-heavy rules. Often layered.
- Multi-tenant: tenant id in every query AND Postgres RLS AND tests that try to break tenant isolation. Defense in depth.
- Machine-to-machine: workload identity for internal, OAuth client credentials for cross-org, mTLS as a transport guarantee, API keys only for low-stakes integrations.
Real-World Examples
How real systems implement this in production
Zanzibar is Google's centralized authorization service, described in a 2019 paper, that powers access control across Drive, Calendar, Photos, and Cloud. It models permissions as a graph of relationships and serves millions of authz checks per second with strong consistency via 'Zookies' for read-after-write. The paper triggered the wave of open-source ReBAC implementations (SpiceDB, OpenFGA, Permify) that smaller companies use today.
Trade-off: A centralized authz service unifies policy across products and gives strong audit and consistency, but adds infrastructure complexity that small organizations cannot justify; per-service RBAC is fine until sharing semantics get rich.
Auth0 (now part of Okta) is the canonical example of 'delegate authentication to a specialist'. Thousands of products integrate via OIDC and gain SSO, MFA, social login, enterprise federation, breached-password detection, and adaptive auth without building any of it themselves. Documentation and customer stories cover the migration patterns from custom auth to delegated.
Trade-off: Delegating saves enormous engineering effort and reduces breach risk, but creates a critical external dependency (an IdP outage is your outage) and ongoing per-user costs that scale with user base.
Stripe's API key model evolved from a single account-wide secret key to per-environment keys, scoped restricted keys, webhook signing secrets, and rolling keys. Public docs and changelog entries describe each step. The trajectory is consistent: from 'one powerful credential' to 'many scoped, rotatable, revocable credentials' so a single leak is contained.
Trade-off: Scoped credentials reduce the blast radius of a leak but add operational overhead for customers who must manage more credentials and a more complex rotation story.
GitHub migrated from classic PATs (account-wide power, often long-lived) to fine-grained PATs that are scoped to specific repositories and specific permissions, with explicit expiration dates. The rollout was gradual, with public posts explaining the rationale: long-lived broad credentials are the most common cause of catastrophic leaks. The new model trades user friction for least-privilege.
Trade-off: Fine-grained tokens dramatically reduce the impact of a leaked credential but add real friction (developers must specify scopes and re-authorize when expiring), which slows some workflows.
Common Interview Questions
Questions you might be asked about this topic
Delegate authentication to an IdP (Auth0 / Okta / Cognito or roll OIDC on top of an open-source provider like Keycloak). Users authenticate via Authorization Code + PKCE; result is a short-lived access token (5-15 min, RS256-signed JWT) and a long-lived opaque refresh token stored server-side. Services validate the JWT via JWKS without phoning home. Authorization is layered: tenant isolation enforced by Postgres RLS, role checks via RBAC for simple cases, ReBAC (SpiceDB / OpenFGA) for resource-sharing semantics. Refresh tokens hashed in DB; revocation is delete from DB. Audit log of every login and every authz decision. MFA via the IdP. SSO via OIDC / SAML for enterprise tenants.
Honest answer: you do not, directly. JWTs are valid until expiry. Mitigations: 1) Keep access tokens short-lived (5-15 minutes). 2) Use long-lived refresh tokens stored server-side; revoke by deleting from DB; new access tokens fail to issue. 3) Maintain a small, fast denylist of revoked token ids (jti) checked on each request, only practical if revocations are rare. 4) Use opaque tokens instead of JWTs for cases needing immediate revocation (you accept the per-request lookup cost). The hybrid (short JWT + opaque refresh) is the standard answer for both stateless verification and effective revocation.
RBAC: users -> roles -> permissions. Simple, auditable, fits most apps. Choose for typical SaaS with a small set of distinct user types (admin, editor, viewer). ABAC: rules over user / resource / action / context attributes. Maximally expressive, harder to audit. Choose for context-heavy rules (time of day, location, regulatory). ReBAC (Zanzibar-style): graph of relationships, permissions are paths through the graph. Choose for products with rich sharing (Google Docs, GitHub, Slack) where 'X has access to Y because they are a member of Z which was shared with the parent of Y' is a natural way to think. Real systems often combine: ReBAC for the structural model, ABAC for context-dependent gates, RBAC for coarse role checks.
Defense in depth. 1) Postgres row-level security: the DB enforces every query filters by current_tenant; missing filter returns zero rows, not other tenants' data. 2) Application code passes tenant context explicitly through the request lifecycle (middleware sets it after auth). 3) Cache keys prefixed with tenant id so Redis cannot leak across tenants. 4) Per-tenant database connection pools or schemas if isolation must be stronger. 5) Integration test suite that authenticates as tenant A and tries every endpoint with tenant B's resource ids; assert 404 / 403, never 200. 6) Audit log entries flagged when a query crosses tenants. 7) For highest-stakes workloads, per-tenant database. Mention the human factor: assume some developer will forget; architect so the DB is the safety net.
Internal service-to-service: workload identity. SPIFFE / SPIRE issues short-lived SVID certificates per workload; AWS IRSA, GKE Workload Identity, and Kubernetes ServiceAccount tokens give the same shape on cloud platforms. mTLS is the transport-layer guarantee; service mesh (Istio / Linkerd) automates issuance. External callers (third-party integrations): OAuth Client Credentials flow yields short-lived access tokens with scopes; client_id / client_secret pair is rotatable. API keys only for low-stakes external integrations and only if scoped + rotatable. Avoid long-lived shared secrets between internal services; the platform should issue identity, not the application.
Interview Tips
How to discuss this topic effectively
Always separate authentication from authorization in your answer. 'AuthN runs once at login; AuthZ runs on every request' is a senior tell.
Default to delegating authentication. 'Unless we are in the identity business, I would use an IdP (Auth0, Cognito, our SSO provider) and treat user identity as an external dependency' is the right opener.
For tokens, propose the hybrid: short JWT access + long opaque refresh stored server-side. This is the right answer to 'how do you handle revocation?'.
Mention defense in depth for multi-tenant: 'Postgres RLS plus tenant filter in app code plus an integration test suite that tries cross-tenant access'. This is what staff engineers say.
When asked about authorization at scale, name Zanzibar / SpiceDB / OpenFGA. ReBAC is the current best practice for products with rich sharing semantics, and knowing it signals that you are up to date.
Common Mistakes
Pitfalls to avoid in interviews
Storing user roles or permissions in long-lived JWTs
JWTs are immutable once signed. A user revoked from a project at 12:00 still has 'project_editor' in the JWT they got at 11:55 until expiry. Either keep tokens very short-lived (minutes) or look up authorization per request from a fast store rather than encoding it in the token.
Treating JWT as a session store and putting cart contents, settings, or PII inside it
JWTs are signed but usually not encrypted; anyone with the token reads the payload. They are also immutable per token. Use them only for short-lived authentication assertions. Sessions belong in a session store; profile data belongs in a database.
Hard-coding the JWT signing key in code or environment variables with no rotation
Use asymmetric keys (RS256 / ES256), publish public keys via JWKS, identify keys with a kid header. Rotate keys on a schedule. Verifiers cache JWKS with TTL and pick up new keys automatically. A leaked symmetric secret lets anyone who holds it forge tokens, and replacing it invalidates every token in flight.
Relying on application-layer tenant filters as the only line of defense
A single missing `WHERE tenant_id = ?` is a cross-tenant data leak. Use defense in depth: Postgres row-level security so the database refuses cross-tenant queries, plus tenant-aware code, plus an integration test suite that tries to access tenant B as a user from tenant A and asserts denial.
Using long-lived API keys with full account power as the default for integrations
A leaked all-power API key is catastrophic. Issue scoped, time-bounded credentials (Stripe restricted keys, GitHub fine-grained PATs, OAuth client credentials with explicit scopes). Audit and rotate regularly. Even a developer-friction trade-off is worth it for least-privilege.
