Gary Bernhardt named the pattern "functional core, imperative shell" in a 2012 talk, but the underlying idea predates the name by decades and shows up in closely related forms under other labels: "hexagonal architecture", "ports and adapters", "pure-impure separation". I use Bernhardt's name because it is the shortest and the most descriptive, and because the two-layer split is the load-bearing property no matter which architecture book you first read it in.
My stance: this is the most consistently helpful architectural rule I know for application-level code. It is more useful than every other "layered architecture" I have tried, including DDD's tactical patterns, because it operationalizes a single decision ("is this code pure or not") instead of asking you to pick from six concentric circles. In my experience, adopting it has made the test suites I work in three to five times faster and significantly more reliable. I will explain what the rule says, what each layer's responsibilities are, where it breaks down, and the trade-off I am still negotiating with my own teams.
The rule, in one paragraph
Write your decisions as pure functions. Write your side effects as a thin wrapper around those decisions. The pure layer (the "functional core") takes inputs, returns a result, and never reads from a database, sends an email, hits an HTTP API, or calls Date.now(). The wrapping layer (the "imperative shell") does all the I/O: it reads what the pure layer needs, calls the pure layer, and writes the result. A function in the imperative shell calls into the functional core; a function in the functional core never calls back into the shell.
That is it. The whole pattern. The reason it is hard to apply is not that the rule is complicated; it is that codebases naturally drift toward mixing the two, and you have to actively keep them apart.
A worked example
Here is a piece of business logic that mixes the two layers. It comes from a billing service I have seen in two different codebases.
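A minimal sketch of what such a function tends to look like. The db, stripe, and mailer objects, every method on them, and the Subscription fields are hypothetical in-memory stand-ins chosen so the example runs; they are not a real SDK or the actual service's code.

```typescript
// Hypothetical stand-ins for the real database, Stripe client, and mailer.
// They record calls in memory so the sketch is runnable.
interface Subscription {
  userId: string;
  email: string;
  seats: number;
  pricePerSeatCents: number;
  nextBillingDate: Date;
}
interface Invoice { userId: string; amountCents: number; createdAt: Date }

const subscriptions = new Map<string, Subscription>();
const invoices: Invoice[] = [];
const stripeCharges: { userId: string; amountCents: number }[] = [];
const sentEmails: { to: string; subject: string }[] = [];

const db = {
  async findSubscription(userId: string): Promise<Subscription | undefined> {
    return subscriptions.get(userId);
  },
  async insertInvoice(invoice: Invoice): Promise<void> {
    invoices.push(invoice);
  },
};
const stripe = {
  async charge(userId: string, amountCents: number): Promise<void> {
    stripeCharges.push({ userId, amountCents }); // a network call in real life
  },
};
const mailer = {
  async send(to: string, subject: string): Promise<void> {
    sentEmails.push({ to, subject }); // an SMTP or API call in real life
  },
};

// The mixed version: I/O, clock access, and the calculation are interleaved.
async function applyMonthlyCharge(userId: string): Promise<void> {
  const sub = await db.findSubscription(userId); // I/O
  if (!sub) throw new Error(`no subscription for ${userId}`);

  const now = new Date(); // hidden input: the clock
  if (sub.nextBillingDate.getTime() > now.getTime()) return; // decision

  const amountCents = sub.seats * sub.pricePerSeatCents; // decision (the multiplication)

  await stripe.charge(userId, amountCents); // I/O
  await db.insertInvoice({ userId, amountCents, createdAt: now }); // I/O
  await mailer.send(sub.email, `Receipt for ${amountCents} cents`); // I/O
}
```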
Notice what is mixed: the database lookup is interleaved with the calculation, the calculation reaches into new Date(), and the result is written back to the database and to Stripe and to the emailer. To test the charge calculation, you would need to set up the database with a user, set up Stripe in test mode (or mock Stripe), control time, and assert on the invoice that got created. That is a lot of moving parts for what is, at its heart, a multiplication.
The split:
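A sketch of one way to split it, under the same assumptions: db, stripe, mailer, and the Subscription fields are hypothetical in-memory stand-ins, and decideMonthlyCharge is an illustrative name rather than the original service's.

```typescript
interface Subscription {
  userId: string;
  email: string;
  seats: number;
  pricePerSeatCents: number;
  nextBillingDate: Date;
}
interface Invoice { userId: string; amountCents: number; createdAt: Date }

type ChargeDecision =
  | { kind: "no-charge" }
  | { kind: "charge"; amountCents: number };

// Functional core: depends only on its arguments. The clock is an input.
function decideMonthlyCharge(sub: Subscription, now: Date): ChargeDecision {
  if (sub.nextBillingDate.getTime() > now.getTime()) {
    return { kind: "no-charge" };
  }
  // The amount is computed in cents here and nowhere else.
  return { kind: "charge", amountCents: sub.seats * sub.pricePerSeatCents };
}

// Hypothetical in-memory stand-ins for the real database, Stripe client, and mailer.
const subscriptions = new Map<string, Subscription>();
const invoices: Invoice[] = [];
const stripeCharges: { userId: string; amountCents: number }[] = [];
const sentEmails: { to: string; subject: string }[] = [];

const db = {
  async findSubscription(userId: string): Promise<Subscription | undefined> {
    return subscriptions.get(userId);
  },
  async insertInvoice(invoice: Invoice): Promise<void> {
    invoices.push(invoice);
  },
};
const stripe = {
  async charge(userId: string, amountCents: number): Promise<void> {
    stripeCharges.push({ userId, amountCents });
  },
};
const mailer = {
  async send(to: string, subject: string): Promise<void> {
    sentEmails.push({ to, subject });
  },
};

// Imperative shell: load, decide, write. No business rules live here.
async function applyMonthlyCharge(userId: string): Promise<void> {
  const sub = await db.findSubscription(userId);
  if (!sub) throw new Error(`no subscription for ${userId}`);
  const now = new Date();

  const decision = decideMonthlyCharge(sub, now);
  if (decision.kind === "no-charge") return;

  await stripe.charge(userId, decision.amountCents);
  await db.insertInvoice({ userId, amountCents: decision.amountCents, createdAt: now });
  await mailer.send(sub.email, `Receipt for ${decision.amountCents} cents`);
}

// Exercising the decision needs no database, no Stripe, and no clock control.
console.log(
  decideMonthlyCharge(
    { userId: "u1", email: "a@example.com", seats: 3, pricePerSeatCents: 1000, nextBillingDate: new Date("2024-01-01T00:00:00Z") },
    new Date("2024-02-01T00:00:00Z"),
  ),
); // logs an object with kind "charge" and amountCents 3000
```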
Four wins. First, the decision is a pure function: feed it a subscription and a date, get back a result, so tests against the decision are five lines with no setup. Second, the shell's role is mechanical: load, decide, write. It is readable at a glance because it has nothing to think about, just data movement. Third, the Stripe amount is computed in cents in one place (the decision), so a unit conversion bug shows up in one test. Fourth, the team can reason about the shell as thin orchestration and about the core as the place where the rules live.
What goes in the core, what goes in the shell
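A quick taxonomy that I have found useful when I am refactoring existing code:
- Core: calculations and decisions returned as plain values (whether to charge, how much, which case applies).
- Shell: database reads and writes, calls to HTTP APIs such as Stripe, sending email, reading the clock, and generating random numbers.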
The rule of thumb is: if the function depends on "what time is it", or "what is in the database", or "what did Stripe say", it belongs to the shell. If the function only depends on its arguments and produces a result that depends only on its arguments, it belongs to the core.
A single function that does both is the failure mode you are trying to avoid. Refactor it into a core function that takes the world as input and a shell function that gathers the world and applies the result.
Where I see the pattern break down
Three honest critiques.
First, the pattern produces longer call paths. A request handler calls a shell function, which calls a core function, which returns a decision, which the shell function then acts on. Three layers where there used to be one. For the simplest CRUD endpoint, this is overkill. I do not apply the pattern to the world's hundredth "PATCH /users/:id with three fields" endpoint; I apply it to the part of the codebase where decisions live.
Second, some operations are genuinely impure all the way down and refactoring them into core+shell adds zero value. A function that streams a file from S3 to another bucket is a pipeline of side effects. Pretending the "decide which bucket" step deserves its own pure function is performative, not useful. If extracting the core gives you a function with one or two inputs and zero non-trivial logic, you are not gaining testability; you are adding ceremony.
Third, the shell can become its own god object. If the shell is a giant applyMonthlyCharge function with twenty I/O calls and one core function call in the middle, the shell is now what needs decomposition. The pattern moves complexity into the shell; you still have to keep the shell honest. My rule of thumb: a shell function that does more than three I/O operations needs its own decomposition, usually into smaller shell functions that each have their own core function.
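A rough sketch of that decomposition, assuming a hypothetical io object with in-memory fakes and illustrative names (takePayment, sendReceipt, computeAmountCents, receiptBody) that are not from any real codebase. Each small shell function wraps one small core function, and the top-level shell only sequences them.

```typescript
// Hypothetical in-memory fakes; real code would call a database, Stripe, and a mailer.
type Sub = { userId: string; email: string; seats: number; pricePerSeatCents: number };
const ioLog: string[] = [];
const io = {
  async loadSub(userId: string): Promise<Sub> {
    ioLog.push(`load ${userId}`);
    return { userId, email: `${userId}@example.com`, seats: 2, pricePerSeatCents: 500 };
  },
  async charge(userId: string, cents: number): Promise<void> {
    ioLog.push(`charge ${userId} ${cents}`);
  },
  async saveInvoice(userId: string, cents: number): Promise<void> {
    ioLog.push(`invoice ${userId} ${cents}`);
  },
  async sendEmail(to: string, body: string): Promise<void> {
    ioLog.push(`email ${to} ${body}`);
  },
};

// Core functions: one small, pure decision each.
function computeAmountCents(sub: Sub): number {
  return sub.seats * sub.pricePerSeatCents;
}
function receiptBody(cents: number): string {
  return `You were charged ${(cents / 100).toFixed(2)}`;
}

// Small shell: take the payment (two I/O calls around one core call).
async function takePayment(sub: Sub): Promise<number> {
  const cents = computeAmountCents(sub);
  await io.charge(sub.userId, cents);
  await io.saveInvoice(sub.userId, cents);
  return cents;
}

// Small shell: notify the customer (one I/O call around one core call).
async function sendReceipt(sub: Sub, cents: number): Promise<void> {
  await io.sendEmail(sub.email, receiptBody(cents));
}

// Top-level shell: sequencing only.
async function applyMonthlyCharge(userId: string): Promise<void> {
  const sub = await io.loadSub(userId);
  const cents = await takePayment(sub);
  await sendReceipt(sub, cents);
}
```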
A subtle benefit: error modeling
A second-order benefit I did not see at first: pure functions encourage you to model errors as data. The decision returns either a { kind: 'no-charge' } or a { kind: 'charge', ... }, not a thrown exception. The shell pattern-matches on the result and decides what to do with each case.
This is much easier to reason about than exceptions thrown across layer boundaries, because the shell sees every possible outcome of the decision in the type system. When a new case is added ({ kind: 'card-declined-fast-fail' }), TypeScript's exhaustiveness checking (a switch whose default branch passes the value to something typed as never) tells the shell at compile time that it is missing a case. With exceptions, you would have to remember which exception types could be thrown and add a catch for the new one.
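A small sketch of the mechanism, with the new case already handled. The assertNever helper and the describeAction function are illustrative names, and the branches return strings only to keep the example short; a real shell would perform its I/O in each branch.

```typescript
type ChargeDecision =
  | { kind: "no-charge" }
  | { kind: "charge"; amountCents: number }
  | { kind: "card-declined-fast-fail" };

// Accepts only a value of type never, so a call to it compiles
// only when every member of the union has been handled before it.
function assertNever(value: never): never {
  throw new Error(`unhandled case: ${JSON.stringify(value)}`);
}

function describeAction(decision: ChargeDecision): string {
  switch (decision.kind) {
    case "no-charge":
      return "skip";
    case "charge":
      return `charge ${decision.amountCents} cents`;
    case "card-declined-fast-fail":
      return "stop retrying and flag the card";
    default:
      // If a kind is added to ChargeDecision without a case above,
      // decision is no longer never here and this line fails to compile.
      return assertNever(decision);
  }
}
```

Deleting the card-declined-fast-fail case from the switch turns the default branch into a compile error, which is the reminder that exceptions do not give you.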
I have found this pattern ("the core returns a discriminated union, the shell switches on it") to be the single biggest readability improvement I have made in long-running services. Once a team gets into the habit, code reviews go faster because the cases are right there in the return type.
What it is NOT
A few clarifications, because I have seen each one cause confusion:
- It is not the same as functional programming. You do not need monads, currying, or Haskell. You need pure functions in the core. TypeScript, Python, Java, and Go all let you write pure functions; none of them enforces purity, so you just have to avoid writing impure ones.
- It is not the same as MVC. The model in MVC is often impure; the controller is impure too. The functional core is a strictly tighter constraint than "the model layer".
- It is not the same as DDD. DDD's domain layer is supposed to be pure, but it allows side effects through repositories. Functional core, strictly applied, does not. This is an honest trade-off; for some teams the pure-everything rule is too strict and a hybrid is what ships.
- It does not require a hexagonal directory structure or a giant interfaces folder. It is a discipline about what calls what; the file layout follows from that, not the other way around.
The trade-off I am still negotiating
The one thing I am still working on with my teams: how strict to be about "pure means pure". A function that calls Math.random() is impure. A function that throws an exception is, technically, impure (the throw is a kind of effect). A function that mutates its input is impure. Different teams have different tolerance for which of these counts as "violating" the core.
My current line: the core can throw on programmer error (assertion violations) but never on expected business outcomes. The core can do local mutation if the function returns a fresh value (a builder-style internal helper that mutates a local accumulator and returns the final value is fine; one that mutates an input is not). The core does not call Math.random or Date.now. If randomness or time is needed, it is an input to the function.
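A sketch of what those lines look like in practice. The function names and the billing-flavoured scenarios (trial expiry, retry delay, invoice totals) are hypothetical examples, not taken from a real codebase.

```typescript
// Time is an argument: the core never reads the clock itself.
function isTrialExpired(trialEndsAt: Date, now: Date): boolean {
  return now.getTime() >= trialEndsAt.getTime();
}

// Randomness is an argument: the caller supplies a roll in [0, 1).
function retryDelayMs(attempt: number, roll: number): number {
  // Throwing is reserved for programmer error, not for business outcomes.
  if (attempt < 1) throw new Error("retryDelayMs: attempt must be >= 1");
  const base = 1000 * Math.pow(2, attempt - 1);
  // Jitter: between 50% and just under 100% of the base delay.
  return Math.round(base * (0.5 + roll / 2));
}

// Local mutation is fine: the accumulator is created here and returned as a
// fresh value, and the input array is never modified.
function totalCentsByUser(
  lines: readonly { userId: string; amountCents: number }[],
): Record<string, number> {
  const totals: Record<string, number> = {};
  for (const line of lines) {
    totals[line.userId] = (totals[line.userId] ?? 0) + line.amountCents;
  }
  return totals;
}

// The shell is the only place that touches the clock or the random number generator.
console.log(isTrialExpired(new Date("2024-01-31T00:00:00Z"), new Date()));
console.log(retryDelayMs(2, Math.random()));
```

Tests can then pin both inputs: retryDelayMs(3, 0) is always 2000, with no mocking of Math.random.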
Reasonable people disagree on those exact lines. The thing they tend to agree on is that the discipline is worth keeping, even if the boundaries are negotiated per-codebase.
Why this beats "layered architecture" for most codebases
The last thing I want to say: the version of layered architecture that puts "controller layer, service layer, repository layer, database" in concentric circles is descriptively true of many codebases and prescriptively useless. The decision "is this in the service layer or the repository layer" rarely makes the code better; it just gives you another box to argue about. The decision "is this pure or impure" is mechanical and the test it implies ("can this function run without any setup") is actionable. That is the difference. Functional core, imperative shell is not the prettiest architecture diagram, but it is the one that holds up across years of refactoring, because the rule it enforces is operational, not decorative.
