The Business Case for Psychological Safety: What the Numbers Actually Show
Gallup, McKinsey, MIT Sloan, vendor decks – which psychological safety numbers hold up, which don't, and what you can defensibly tell a CFO.

A slide came through my LinkedIn feed a few weeks ago: navy background, white text, "Psychological safety = 25% more revenue." No source. No footnote. It was doing numbers.
The stat is real, which is the frustrating part. It comes from a 2024 MIT Sloan Management Review article by Per Hugander and Amy Edmondson about SEB, a 168-year-old Nordic bank whose senior management team beat its yearly revenue targets by 25% after an intervention built around psychological safety. What the slide left out: one team, one market segment, one company, no control group. That's not a return you can promise. It's a story about what happened once.
So here's the honest business case for psychological safety. The strongest evidence is large and correlational – Gallup's meta-analysis of 49,928 work units ties top-quartile engagement to 23% higher profitability, and McKinsey's Organizational Health Index links top-quartile health to roughly triple the shareholder returns. The flashiest numbers – 25% revenue lifts, $4.3 million per manager, 230% ROI – come from single case studies and vendor claims that can't carry causal weight. Both halves of that case matter: the broad evidence is real, and the flashy numbers are weak. Most of what circulates online keeps the headline figures and drops the caveats.
I should disclose my bias up front: I want these numbers to be true. Adam White and I read more than 200 studies while designing conflict-training scenarios, including the research on whether a tabletop RPG can teach conflict skills, and psychological safety is load-bearing in nearly all of it. Which is exactly why I'm careful. A business case built on numbers that collapse under one skeptical question is worse than no business case at all – you don't just lose the argument, you lose the room.
What's the strongest evidence that psychological safety pays?
The strongest evidence comes from three places – a Gallup meta-analysis, McKinsey's organizational health data, and a growing stack of peer-reviewed effect sizes – and none of them will hand you a tidy ROI multiple.
Gallup's meta-analysis covers 263 studies across 192 organizations: 49,928 work units, roughly 1.4 million employees. Business units in the top quartile of engagement show 23% higher profitability, 18% higher sales productivity, and 14% higher overall productivity than bottom-quartile units. Separate Gallup work from 2017 credits improvements in psychological safety specifically with a 27% reduction in turnover and productivity gains of 12–14%. Two caveats, and they're real. The meta-analysis measures engagement, which is a close cousin of psychological safety rather than the thing itself. And quartile comparisons are correlational – nobody randomly assigned 25,000 work units to be disengaged.
McKinsey's Organizational Health Index draws on 2,500+ organizations and more than 7 million survey responses. Top-quartile healthy organizations deliver about three times the total shareholder returns of bottom-quartile ones, carry a 68% probability of above-median financial performance versus 31% at the bottom, and run operating margins 4–5 points above industry medians. During COVID they were 59% less likely to show signs of financial distress. It's a genuinely large longitudinal dataset. It also measures "organizational health" as a bundle – psychological safety is one ingredient, not the recipe.
Then there's the research without dollar signs, which is quietly the most trustworthy of the lot. The CIPD's 2024 evidence review aggregated more than 200 effect sizes from studies covering 30,000+ participants and found an average correlation of r = 0.53 between psychological safety and outcomes like learning behaviour, information sharing, and performance. A 2024 meta-analysis of 94 samples (N = 19,180) put the team-level innovation correlation at r = 0.44. In social science, those are big numbers. They sit on top of the finding that started the field: Amy Edmondson's 1990s hospital studies, where the better-performing teams reported more errors. They weren't making more mistakes – they felt safe enough to admit them. More than 1,000 papers since have kept confirming the pattern, including studies linking psychological safety to lower patient mortality in medicine.
And if you need dollars with confidence intervals attached, one study delivers. A BMJ Leader analysis of 138 physicians and 282 nurse anesthetists found that where psychological safety was low, 67.3% of the nurse anesthetists reported turnover intent, versus 19.7% where it was high – an odds ratio of 8.93, with a 95% confidence interval of 4.27 to 18.68. At a modeled replacement cost of $30,890 per person, raising their psychological safety to the physicians' level would save the studied group about $5.15 million a year. It's cross-sectional, and the costs are modeled rather than audited. But it shows its work, which is more than most of this genre can say.
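For readers who want to check the arithmetic, here is a minimal sketch of what those figures mean. It computes a crude odds ratio from the two rounded percentages and the number of departures the savings figure would imply at the stated replacement cost. The function names are mine, not the study's, and treating the savings as simply "departures avoided times replacement cost" is my assumption about the model.

```python
def odds(p: float) -> float:
    """Convert a probability (0 < p < 1) into odds."""
    return p / (1.0 - p)


def crude_odds_ratio(p_group: float, p_reference: float) -> float:
    """Unadjusted odds ratio between two groups' event probabilities."""
    return odds(p_group) / odds(p_reference)


def implied_departures(total_savings: float, cost_per_departure: float) -> float:
    """Departures avoided, ASSUMING savings = departures x replacement cost."""
    return total_savings / cost_per_departure


# Figures quoted in the article (rounded as published).
low_safety_intent = 0.673   # turnover intent where safety was low
high_safety_intent = 0.197  # turnover intent where safety was high

crude_or = crude_odds_ratio(low_safety_intent, high_safety_intent)
departures = implied_departures(5_150_000, 30_890)

print(f"crude odds ratio from rounded percentages: {crude_or:.2f}")
print(f"implied avoided departures per year: {departures:.0f}")
```

The crude ratio lands around 8.4, a little under the published 8.93. That gap is expected when you work from rounded percentages rather than the authors' underlying counts or model, so read the sketch as an explanation of the statistic, not a replication of it.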
Where do the famous numbers actually come from?
Four numbers do most of the circulating, and every one of them is weaker than its headline.
"Psychological safety training produced 25% more revenue." The SEB case again. A two-hour session on psychological safety and perspective-taking was credited as the turning point for one senior management team, which pooled its knowledge better and finished 25% above yearly revenue targets in a strategically important segment. Single company. Single team. No control group, and the account comes from the people who ran the intervention. None of that makes it false – it makes it a case study, which is an existence proof, not an expected return.
"Managers with high psychological safety generate $4.3 million more revenue per year." This traces to a 2022 PR Newswire data release. No published sample size, no confidence intervals, no study design. It's also exactly the kind of finding reverse causality loves: teams that are winning tend to feel safer, so high-revenue managers may score well on safety because of the revenue rather than the other way around.
"Psychological safety returns 230% per dollar invested." A compilation figure that circulates through Psychology Today and Niagara Institute roundups. Trace it back and you land on survey-based perception data – people estimating the return, not accountants measuring one. Perception surveys have their uses. Calculating ROI isn't among them.
"Disengagement costs the global economy $8.8 trillion." From Gallup's State of the Global Workplace report – an extrapolation equal to 9% of global GDP. It's fine for conveying the scale of the problem and useless for your P&L, because no company can recover its "share" of a global modeled estimate. If you want a cost figure a finance leader will actually engage with, build your own from the research on the cost of conflict avoidance at work, which offers per-employee numbers you can localize.
One more pattern worth naming: numbers that migrate from weak sources into strong ones. Google's Project Aristotle really did study 180+ teams over two years, and it really did rank psychological safety as the top factor in team effectiveness. The "19% higher productivity" and "31% more innovation" figures usually stapled to it come from secondary summaries, not from anything Google published. The prestige of the study rubs off on numbers it never produced.
Every number, with its receipts
Here's the full set in one place, ranked roughly by how much weight each can hold.
| The number | Where it comes from | Study design | What it can support |
|---|---|---|---|
| 23% higher profit, 18% higher sales productivity | Gallup meta-analysis: 263 studies, 49,928 work units | Correlational meta-analysis of engagement, a close proxy for safety | "Associated with" – the strongest broad claim available |
| r = 0.53 average effect on team outcomes | CIPD 2024 evidence review: 30,000+ participants | Aggregated peer-reviewed effect sizes, 2020–2024 | Safety correlates strongly with learning, sharing, performance |
| 3x total shareholder returns | McKinsey OHI: 2,500+ orgs, 7M+ survey responses | Longitudinal observational; safety is one ingredient of "health" | Healthy orgs outperform; safety's isolated share is unknown |
| ~$5.15M modeled annual savings | BMJ Leader clinical study, N = 420 | Cross-sectional survey with modeled turnover costs and CIs | Low safety predicts turnover intent; the dollars are modeled |
| 6% vs. 3% revenue growth | McKinsey analysis of ~1,800 firms | Observational; measures a bundle of people practices | The bundle outperforms; safety's slice isn't isolated |
| 25% above revenue target | MIT Sloan / SEB (Hugander & Edmondson, 2024) | Single case study, no control group | An existence proof, not an expected return |
| $8.8 trillion disengagement cost | Gallup, State of the Global Workplace | Global extrapolation (9% of world GDP) | Scale-setting only; not a recoverable line item |
| $4.3M more revenue per manager | PR Newswire data release (2022) | Observational claim; no method published | Nothing you'd put in a board deck |
| 230% average ROI | Psychology Today / Niagara Institute compilations | Survey-based perception estimate | Perception, not measured financials |
Why doesn't correlation settle it?
Because at least three other explanations fit the same data. Winning causes safety: teams beating their targets feel safer, so safety scores track performance without producing it. Good management causes both: the leader who builds safety also tends to build clear goals and sane processes, and those confounds never show up in a quartile comparison. And bundling: McKinsey's analysis of roughly 1,800 firms found "People-Focused" companies grew revenue at 6% a year from 2019 to 2021 versus 3% for typical performers – but those firms invest in a whole stack of human-capital practices at once. Safety is in the bundle somewhere. The bundle is what got measured.
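A toy simulation makes the second explanation concrete. In the sketch below, a hypothetical "management quality" score drives both safety and performance, and safety has no direct effect on performance at all. Every name, coefficient, and sample size is invented for illustration; nothing here is fitted to Gallup or McKinsey data.

```python
import random


def pearson_r(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient, computed from scratch."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_teams(n_teams: int = 5000, seed: int = 42):
    """Generate teams where management quality causes BOTH measures.

    Safety never appears in the performance equation, so any correlation
    between the two comes entirely from the shared confounder.
    """
    rng = random.Random(seed)
    safety, performance = [], []
    for _ in range(n_teams):
        management = rng.gauss(0, 1)                      # hidden confounder
        safety.append(management + rng.gauss(0, 1))       # management -> safety
        performance.append(management + rng.gauss(0, 1))  # management -> performance
    return safety, performance


safety, performance = simulate_teams()
r = pearson_r(safety, performance)
print(f"correlation with zero causal effect of safety: r = {r:.2f}")
```

By construction the expected correlation is 0.5, which would count as a large effect in this literature, even though safety does nothing in the simulated world. This doesn't mean the real correlations are spurious, only that a quartile comparison alone can't rule this out.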
I'm a product manager, not a methodologist, and you don't need to be either. One question does most of the work: how would this number look if the causation ran backwards? The Gallup and CIPD findings survive that question as associations – large, consistent, replicated across a thousand-plus studies. The vendor numbers mostly don't.
The direction isn't hopeless, though. A Harvard Business School analysis of 27,000+ workers surveyed before and during COVID found that a one-standard-deviation increase in safety lowered burnout scores by 0.72 points and raised willingness to stay by 0.63. And a longitudinal study of New York City public schools found that safety without accountability blunts goal focus – modest safety paired with high felt accountability still produced strong performance. That last finding matters if you're making this case internally, because it preempts the obvious objection that safety means softness. The evidence says it doesn't, as long as the expectations stay.
What can you defensibly tell a CFO?
You can claim association with confidence, causation with humility, and vendor multiples not at all.
The defensible version sounds like this. Across a meta-analysis of nearly 50,000 work units, top-quartile teams showed 23% higher profitability and 18% higher sales productivity – an association, and the most consistent one in the literature. In a peer-reviewed clinical study, low psychological safety carried a near-9x odds ratio for turnover intent, with modeled replacement costs running to seven figures. Twenty-five-plus years and over 1,000 studies keep finding that safer teams learn faster, perform better, and quit less. McKinsey's shareholder-return data belongs in the deck too, presented as what it is: evidence about organizational health broadly, with safety as one ingredient.
What you can't defensibly claim is a multiple. Not 25% revenue growth from a training – that happened once, at one bank, without a control group. Not $4.3 million per manager – nobody has published the method behind that number. Not 230% ROI – that's a perception survey wearing an ROI costume. The moment you promise a multiple, you've staked your credibility on the least reliable numbers in the room, and any CFO worth the title will find the seam.
The stronger move is to promise measurement instead. Baseline your own team's turnover, its time lost to unresolved conflict, and its safety scores before any intervention, and measure again after. Pre- and post-training surveys cost almost nothing to run, and measuring conflict resolution skills is more tractable than most leaders assume. "The external evidence is consistently positive, and we'll generate our own" beats any borrowed statistic – and it's the only claim that gets stronger after the meeting.
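A minimal sketch of that before-and-after comparison follows, assuming you collect the same three measures at baseline and at follow-up. The field names, the 1-to-5 survey scale, and every number in the example are placeholders, not data from any study cited here.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Snapshot:
    """One measurement period for a team (all fields are hypothetical)."""
    headcount: int
    departures: int                # voluntary exits during the period
    conflict_hours: list[float]    # per-respondent weekly hours lost to unresolved conflict
    safety_scores: list[float]     # per-respondent survey score, assumed 1-5 scale

    @property
    def turnover_rate(self) -> float:
        return self.departures / self.headcount


def compare(before: Snapshot, after: Snapshot) -> dict[str, float]:
    """Return follow-up minus baseline for each measure."""
    return {
        "turnover_rate_change": after.turnover_rate - before.turnover_rate,
        "conflict_hours_change": mean(after.conflict_hours) - mean(before.conflict_hours),
        "safety_score_change": mean(after.safety_scores) - mean(before.safety_scores),
    }


# Placeholder numbers, invented purely to show the shape of the output.
baseline = Snapshot(headcount=40, departures=6,
                    conflict_hours=[3.0, 2.5, 4.0, 3.5],
                    safety_scores=[3.1, 2.8, 3.4, 3.0])
follow_up = Snapshot(headcount=40, departures=4,
                     conflict_hours=[2.0, 2.5, 3.0, 2.5],
                     safety_scores=[3.6, 3.2, 3.8, 3.4])

changes = compare(baseline, follow_up)
for name, delta in changes.items():
    print(f"{name}: {delta:+.3f}")
```

A simple pre/post difference has no control group, so it carries the same caveat as the SEB case: it shows direction, not causation. Its advantage is that the numbers are yours and the method is visible.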
The honest numbers are good enough. The inflated ones are how you lose the room.