PEC.com
Budget blind spots

The hidden costs of running a platform engineering team

The visible cost is salaries plus tooling, and finance signs off on those. Hidden costs add another 15 to 25 percent on top of that and show up nowhere in the initial proposal. This is the inventory, with rough quantification for each line item.

Most platform engineering budget proposals are too optimistic because they miss seven specific line items that are real, measurable, and predictable. Adding them up honestly turns a $1.5M visible budget into a $1.7M to $1.9M total. That is not catastrophic, but it is material, and it is better to know up front than to discover it in a mid-year finance review.
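As a quick sanity check on that range, the 15 to 25 percent rule of thumb can be applied mechanically. The sketch below is illustrative only: the function name `all_in_range` and the use of whole-number percentages are assumptions made for this example, not part of any budgeting tool.

```python
def all_in_range(visible, low_pct=15, high_pct=25):
    """Return the (low, high) all-in budget for a given visible budget.

    Percentages are whole numbers so the arithmetic stays in integers.
    """
    low = visible * (100 + low_pct) // 100
    high = visible * (100 + high_pct) // 100
    return low, high


low, high = all_in_range(1_500_000)
print(f"${low:,} to ${high:,}")  # $1,725,000 to $1,875,000
```

Rounded to one decimal place, that is the $1.7M to $1.9M quoted above.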

The seven hidden line items

01

On-call premium

5-10% of loaded salary

Stipend $300-$1,500/month plus productivity drag from interrupted sleep and context switches.

02

Documentation debt

3-6 months lost per departing senior

Under-documented platforms create expensive handover gaps. Budget documentation time as a first-class line item.

03

Internal advocacy

0.5-1 FTE for teams over 10

Roadshows, lunch-and-learns, office hours, scorecard reviews. Adoption does not happen by itself.

04

Training of platform engineers

$3k-$10k per engineer per year

Cloud certifications, Kubernetes, infrastructure-as-code training. Keeps pace with upstream change.

05

Migration and legacy

6-18 months double-running cost

Moving product teams onto new golden paths is slower than anyone plans for.

06

Context-switching tax

10-25% of productive time

Platform engineers pulled into incident response lose their roadmap capacity.

07

Build-vs-buy churn

~10% of tooling spend

Decisions get reversed. Budget for sunk cost on an 18-month cycle.

Quantifying the hidden total

The seven line items do not simply stack to 50 percent or more on top of the visible budget: they overlap, and not every item applies to every team in every year. A realistic aggregate is 15 to 25 percent.

Worked example for a 10-person platform team at $2M visible salary plus $500k tooling plus $100k infra ($2.6M visible):

  • On-call premium: $100k-$260k (5-10% of salary for on-call team)
  • Documentation debt amortised: $80k-$150k (estimated value of knowledge transfer not happening)
  • Internal advocacy: 0.5 FTE = $120k-$200k
  • Training: $30k-$100k ($3k-$10k per engineer)
  • Migration double-running (if there is an active migration): $100k-$400k for the year
  • Context-switching tax: $200k-$500k (10-25% of productive time across the team)
  • Build-vs-buy churn: $50k (10% of tooling spend)

Total hidden: $680k-$1.66M on top of $2.6M visible. That is 26 to 64 percent if every item applies at full weight, which is rare. The honest rule of thumb is lower, 15 to 25 percent, because not all items apply to all teams in all years. Teams without active migrations save one of the largest chunks; teams without live on-call skip the on-call premium.
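The arithmetic behind that total, and the effect of dropping items that do not apply, can be reproduced with a short script. This is a minimal sketch: the dollar ranges are copied as listed in the worked example (in thousands), while the dictionary keys and the helper names `hidden_total` and `as_percent_of_visible` are made up for illustration.

```python
# Line items as listed in the worked example, in $k: (low, high).
HIDDEN_ITEMS_K = {
    "on_call_premium": (100, 260),
    "documentation_debt": (80, 150),
    "internal_advocacy": (120, 200),
    "training": (30, 100),
    "migration_double_running": (100, 400),
    "context_switching_tax": (200, 500),
    "build_vs_buy_churn": (50, 50),
}
VISIBLE_K = 2_600  # $2M salary + $500k tooling + $100k infra


def hidden_total(items, skip=()):
    """Sum the low and high ends, leaving out any items named in `skip`."""
    low = sum(lo for name, (lo, hi) in items.items() if name not in skip)
    high = sum(hi for name, (lo, hi) in items.items() if name not in skip)
    return low, high


def as_percent_of_visible(low, high, visible=VISIBLE_K):
    """Express a hidden-cost range as whole percentages of the visible budget."""
    return round(100 * low / visible), round(100 * high / visible)


scenarios = {
    "all items": (),
    "no active migration": ("migration_double_running",),
    "no migration, no on-call": ("migration_double_running", "on_call_premium"),
}
for label, skip in scenarios.items():
    low, high = hidden_total(HIDDEN_ITEMS_K, skip)
    pct_low, pct_high = as_percent_of_visible(low, high)
    print(f"{label}: ${low}k-${high}k ({pct_low}-{pct_high}% of visible)")
```

Under these inputs, removing the items that do not apply to a given team pulls the low end of the range from 26 percent down toward the 15 to 25 percent rule of thumb.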

Which hidden costs are investments versus running costs

Separating the two matters for how you defend them in finance reviews.

Investments (worth protecting)

  • Documentation. Cutting documentation time looks free until a senior engineer leaves and takes three to six months of productivity with them.
  • Training. The platform team that falls behind upstream (new cloud primitives, new Kubernetes APIs, new security practices) loses its leverage and eventually attrits to companies that pay for growth.
  • Internal advocacy. Adoption is the single most important platform metric, and advocacy is how adoption happens. Cutting advocacy is how platforms become shelfware.

Running costs (accept and budget)

  • On-call premium. The work happens whether or not you pay for it; failing to pay for it means morale erodes and senior engineers leave for companies that do.
  • Context-switching tax. Incident response pulling engineers off roadmap is how incident response works. Plan for it.
  • Migration double-running. Product teams switch over at their own pace. Forcing faster migration often breaks things and costs more in incident response than running old and new in parallel.

Reducible (work to eliminate)

  • Build-vs-buy churn. Decisions that get reversed 18 months later mean wasted engineering and wasted licence spend. Invest in better decision processes (explicit criteria, three-year TCO modelling, pilot runs before committing).

Presenting hidden costs to finance

The mistake most platform leaders make is not showing hidden costs to finance until they surface as variances. Show them up front. A finance team that sees a proposal with $2.6M visible + $500k hidden = $3.1M planned is happier than a finance team that sees a $2.6M proposal followed by $500k of mid-year surprises.

Concretely, include a hidden-cost line in every platform engineering budget proposal. Name it "Operational overhead" or "Platform running costs" rather than "Hidden", which reads like an admission rather than foresight. Break down the components so finance can see the logic. Update the estimate quarterly based on actuals.
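One possible way to do the quarterly update is to annualise the year-to-date actuals and compare the result with the planned figure. The sketch below assumes a simple straight-line run rate; the function name `reforecast` and the $290k actual are hypothetical, and only the $500k plan comes from the example above.

```python
def reforecast(planned_annual, actual_to_date, quarters_elapsed):
    """Annualise year-to-date actuals and return (projection, variance vs plan).

    Assumes spend is spread evenly across quarters, which is a simplification.
    """
    if not 1 <= quarters_elapsed <= 4:
        raise ValueError("quarters_elapsed must be between 1 and 4")
    projected = actual_to_date * 4 / quarters_elapsed
    return projected, projected - planned_annual


# Hypothetical mid-year check: $500k planned, $290k spent after two quarters.
projected, variance = reforecast(500_000, 290_000, quarters_elapsed=2)
print(f"Projected: ${projected:,.0f}, variance vs plan: ${variance:+,.0f}")
```

A positive variance at mid-year is the signal to revise the overhead line with finance before it surfaces as a surprise.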

Frequently asked questions

How much do hidden costs add to a platform engineering budget?
Rule of thumb: 15 to 25 percent on top of the visible budget. For a visible $1.5M platform team, plan for $1.7M to $1.9M all-in. The exact number depends on on-call burden (higher if the team takes production pager), documentation maturity (higher if you have knowledge silos), and migration workload (higher if you are moving product teams onto new golden paths). Most finance proposals miss these entirely.
What is the single biggest hidden cost usually missed?
Migration and legacy double-running. When you introduce a new golden path, product teams do not adopt it overnight. You run the old and new in parallel for six to eighteen months, which means paying for both infrastructures, both tool sets, and both knowledge bases. For a large migration this alone can exceed the nominal new-platform investment. Budget it explicitly in any platform roadmap that involves replacing existing capabilities.
Is training an actual cost or an investment?
Both. Kubernetes certifications, cloud credentials, and IaC training run $3k to $10k per platform engineer per year. This is a visible budget line that often gets cut in down years. Cutting it looks cheap but creates compounding expertise debt: the team cannot adopt new upstream capabilities, falls behind on security best practices, and has higher attrition to companies that pay for growth. Treat it as a cost of keeping the team operational, not as discretionary.
What is the on-call premium worth?
A platform team that takes production pager for shared infrastructure adds 5 to 10 percent to loaded salary cost through stipends and the productivity drag of interrupted sleep. The typical stipend is $300 to $1,500 per month per engineer on rotation. The productivity drag is the harder number: a night call costs the following day of focus. For a team of 10 with weekly rotation and one call per week on average, that is about 50 days of lost focus per year, which at loaded day rates is $30k-$50k of invisible cost.
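The lost-focus estimate can be reproduced as follows. This is a rough sketch: the 50 working weeks and the $600 to $1,000 loaded day rate are assumptions back-derived from the 50 days and $30k-$50k figures above, and `lost_focus_cost` is an illustrative name.

```python
def lost_focus_cost(calls_per_week, weeks_per_year, day_rate):
    """Return (lost_days, cost), assuming each night call costs one day of focus."""
    lost_days = calls_per_week * weeks_per_year
    return lost_days, lost_days * day_rate


# Assumed inputs: one call per week, 50 working weeks, $600-$1,000 per loaded day.
days, low = lost_focus_cost(calls_per_week=1, weeks_per_year=50, day_rate=600)
_, high = lost_focus_cost(calls_per_week=1, weeks_per_year=50, day_rate=1_000)
print(f"{days} lost days, ${low:,} to ${high:,} per year")
```

The stipend itself is not included here; it is a separate, visible payment on top of this drag.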
Which hidden costs are worth accepting versus cutting?
Documentation, training, and internal advocacy are investments: cutting them saves money short-term but costs more long-term through attrition, knowledge loss, and adoption stalls. On-call stipends are a cost of doing the work and should not be cut; the work still happens without the stipend but morale erodes. Migration double-running and context-switching tax are unavoidable consequences of how platform work happens; budget them rather than trying to eliminate them. Build-vs-buy churn is the one to actively reduce through better decision processes.