One of the prevailing DevOps mantras is “You build it, you run it!”. While I agree with the overall concept, it comes with several very important prerequisites. I’m not running anything unless those are met. Let’s take a look at what they are.

The problem

Short summary: management expects 24/7 coverage from a 5/8 team (five days a week, eight hours a day). It should not come as a surprise that this doesn’t work, for more than one reason.

On-call hours are working hours

People don’t work in their spare time, so don’t call or email them outside of working hours. Emergencies can always happen of course: that’s why overtime was invented, or time off in lieu (time-for-time). Structural on-call is not overtime; it’s planned work. On-call hours should be treated as working hours; working nights, weekends and public holidays typically means increased hourly rates as well.

People should be compensated both for hours where nothing happens and for hours where there are incidents. Being on-call without incidents is still work: those hours are claimed for work and can’t be spent freely, so they’re not free/spare time.

People need to be available on short notice, keep a laptop nearby (even when sleeping), stay sober, stay in an area with decent network coverage, plan for interruption, etc. If a company is putting claims or restrictions on free time then it’s not free time, period.

Last, but not least: on-call should be explicit in an employment contract. Number of regular work hours, number of on-call hours, and exact compensation for each. On-call hours are NOT overtime or undefined. If it’s not in your contract you’re likely being screwed.

On-call hours are planned hours

On-call or not on-call, people will still work 40 hours per week on average. And since they will be working more during off-hours (compared to a non-on-call team), this automatically implies they will be working less during regular hours. So a team needs to plan/budget for 24/7 coverage. And yes, that’s quite a bit more expensive than 5/8 coverage, since opening hours of the ‘shop’ are increased from 40 to 168 hours and include nights, weekends and public holidays.
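To put rough numbers on that, here is a minimal back-of-the-envelope sketch. It assumes every on-call hour counts as a full working hour (the premise of this post) and ignores holidays, sick leave and handover overlap, so real headcount will be higher. The function name `min_headcount` and the `people_per_shift` parameter are made up for illustration.

```python
import math

HOURS_PER_WEEK = 7 * 24   # 168: 'opening hours' for 24/7 coverage
BUSINESS_HOURS = 5 * 8    # 40: a regular 5/8 week
CONTRACT_HOURS = 40       # hours per person per week

def min_headcount(coverage_hours, contract_hours, people_per_shift=1):
    """Lower bound on people needed to staff the given weekly hours.

    Assumption: on-call hours count one-for-one as contract hours.
    """
    return math.ceil(coverage_hours * people_per_shift / contract_hours)

# Hours per week that fall outside regular business hours.
off_hours = HOURS_PER_WEEK - BUSINESS_HOURS

print(f"off-hours per week: {off_hours}")
print(f"minimum headcount: {min_headcount(HOURS_PER_WEEK, CONTRACT_HOURS)}")
```

Under these assumptions, a single person on duty around the clock already requires at least five full-time contracts, before any project work gets done.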

Rotations and shifts

That’s not the problem of individual team members though; they still have 40-hour contracts. So rotations/shifts have to be set up to create full 168-hour coverage using 40-hour contracts. Note that this is not some new, revolutionary concept. Stores, hospitals, factories and other businesses that are open 24/7 have been doing this for centuries. Doesn’t mean they always run at full capacity; they typically do during business hours and run with a minimal crew outside of business hours.
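As an illustration only, a naive round-robin rotation over 8-hour shifts could be sketched like this. The team names, the fixed 8-hour shift length and the one-person-per-shift rule are all assumptions; a real schedule also has to respect rest periods, local labour law and the fuller crew during business hours.

```python
from collections import Counter

SHIFT_HOURS = 8
SHIFTS_PER_WEEK = 7 * 24 // SHIFT_HOURS  # 21 shifts cover all 168 hours
CONTRACT_HOURS = 40

def round_robin_week(team):
    """Assign each 8-hour shift of one week to team members in turn."""
    return [team[i % len(team)] for i in range(SHIFTS_PER_WEEK)]

def hours_per_person(schedule):
    """Total scheduled hours per team member for one week."""
    return {name: n * SHIFT_HOURS for name, n in Counter(schedule).items()}

team = ["ana", "ben", "chris", "dee", "eli"]  # hypothetical five-person team
schedule = round_robin_week(team)
hours = hours_per_person(schedule)

for name in team:
    status = "ok" if hours[name] <= CONTRACT_HOURS else "over contract"
    print(f"{name}: {hours[name]}h ({status})")
```

With five people nobody exceeds a 40-hour week in this sketch; drop to four and at least one person goes over contract, which is exactly the situation a 5/8 team ends up in when 24/7 coverage is bolted on.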

Summary

‘You build it, you run it’ sounds like a simple concept (which it is), but it forces organisations to overhaul their entire way of working, including planning, logistics and compensation structures. It’s not something you can just slap on an existing team that works during regular business hours, and there’s a price to pay in terms of overhead (hello shift/rotation handovers!) and costs. Whether that’s worth it for your business I’ll leave up to you.

