The Meeting
Sitting down with your coffee in the morning, you see a new block on your calendar for the day and your heart sinks: ‘Developer Velocity Strategy’. The invite list is a discouraging mix of VPs, senior directors and architects adding up to a solid 20 invitees, and the description says little beyond ‘let’s discuss how we’re finally going to fix our velocity issues’, but hey, at least there’s a doc linked as a pre-read!
But then the doc is only two-thirds of a page, and really amounts to the statement “Driving down our PR size will dramatically increase our velocity”, a cursory explanation of why small changes are desirable, and a couple of links to recent engineering performance benchmarking reports from startups selling engineering org analytics tooling. Clicking through, you see familiar-looking metrics and breakdowns.
At the appointed hour you join the Zoom and wait out the 7 minutes of laggards for the meeting to start; 6 of you are on the Zoom, 7 are in the room, and who knows where the others who accepted are. The meeting starts over the sound of someone hitting their laptop so hard you wonder if they are digging for oil. You can just make out a word-for-word read of the pre-read, and then the floor is opened up for discussion.
You raise your Zoom hand and wait your turn, hoping the room will notice. After 10 minutes they haven’t, so you speak up: “What evidence do we have that smaller PRs will increase velocity, and what are we going to do to drive down PR size?” The response wanders around the credentials of the report’s source, why these metrics are interesting, and why this one metric out of the dozen or so in the report is meaningful, then tails off without answering either question. So you rephrase slightly: ‘What tactical efforts are we considering to push developers to make smaller changes?’ And that’s when it happens.
“Oh, we don’t need to drive anything; that which gets measured gets improved.”
You watch, dumbfounded, as three-quarters of the attendees nod, some smiling. Silence is now the way forward; groupthink has accepted the magical ability of counting to change an outcome. There is another half hour of, well, waffle, but everyone seems very happy that we’ll be printing gold any minute now, and the meeting disbands.
The Diagnosis
What the meeting has missed is that these reports typically classify teams based on measures; it’s basically clustering on characteristics, which at most implies correlation. A causal link between performance level and a practice is sometimes there, but it runs in a particular direction, and normally in the direction of the team’s ability to deliver business value (revenue or some other real-world thing). The internal document, though, implicitly assumes a causal link from a characteristic to being high performing, and then argues that by adopting the practice we will drive ourselves to that level of performance. This is, on its own, extremely unlikely; not mathematically impossible, but unlikely for sure.
If we had more, smaller changes, it’s likely we would be delivering more things by count, but by value? Maybe, maybe not; it’s honestly tough to know. Smaller changes seem more likely on a team that already delivers effectively (lead time, change failure rate, etc.), so do small changes improve delivery, or does good delivery enable small changes? In all likelihood, unknown.
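To make the distinction concrete, here is a minimal sketch, in Python with entirely made-up numbers, of the kind of analysis these reports are built on: a correlation between PR size and lead time across teams. Every team and value here is invented for illustration.

```python
# A minimal sketch of the kind of analysis behind these reports, using
# made-up numbers: median PR size and median lead time for a handful of
# hypothetical teams. Every value here is invented.
from statistics import correlation  # available in Python 3.10+

pr_size = [40, 85, 120, 300, 520, 900]   # median lines changed per PR, per team
lead_time = [6, 10, 14, 30, 48, 72]      # median hours from commit to deploy, per team

r = correlation(pr_size, lead_time)
print(f"Pearson r between PR size and lead time: {r:.2f}")

# A strong positive r says the two move together across teams. It says
# nothing about which way the arrow points: smaller PRs might shorten
# lead time, or teams that already deliver quickly might simply find it
# easier to ship small changes.
```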
But then there is the widely accepted platitude.
These broad, blanket statements have a comforting simplicity to them; they sound nice, and we want to believe them.
However, it would take you no time at all to look around the world and find absolutely horrifying examples of things whose measure or count we know precisely, and that only continue to get worse.
In short, we’ve confused correlation with causation, we’ve slipped into accepting simple-sounding things as truths, and we’re stripping away context to make the problem feel more personal and tractable.
Progress Despite Surroundings
These scenarios are pretty common, and pretty demotivating all around. That said, we normally do have the ability, within some sized domain, to work on change more systematically, and if you’re employed by a business there is an (at least) implicit responsibility to try to improve the status quo.
As always, telling folks that they are wrong and/or have misunderstood really fundamental things is not a play with a high success rate, and the skepticism that comes through in the framing of our questions takes us pretty much right up to that door.
In the room, try to get folks to agree to some targeted experiments, and nudge towards the larger context of the metrics, not just one piece. Often, if we frame this as ‘what are two or three things that would be good candidates to measure, work on, and see how things shift?’, we can get to a more fruitful discussion, and also plausibly to the conclusion that there are no good candidates and maybe the whole idea should be dropped.
For the things we own, we can also make sure we track the full context. Not only can we do this, we should. Knowing and working on our effectiveness, and on the collective tooling and processes that contribute to it, is ultimately our own responsibility.
At the end of the day, we just want to convert the simplistic statement into ‘if we theorize about a characteristic we’d like to alter, measure its current state, try some changes, and measure again, then we can improve’. This is also known as the scientific method, or software engineering, as we sometimes like to call it.
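As a sketch of what that loop might look like in practice, here is a small Python example. The metric (lead time) and all the numbers are hypothetical; the point is the shape of the experiment, not the tooling used to collect the values.

```python
# A minimal sketch of the 'measure, change, measure again' loop. The metric
# and every number here are hypothetical.
from statistics import median

def summarize(label, lead_times_hours):
    m = median(lead_times_hours)
    print(f"{label}: median lead time {m:.1f}h across {len(lead_times_hours)} changes")
    return m

# 1. Record the baseline before touching anything.
before = summarize("Before", [20, 26, 31, 44, 52, 60])

# 2. Run the intervention for a fixed window (say, a month of deliberately
#    smaller PRs), then measure the same thing in the same way.
after = summarize("After", [18, 22, 27, 35, 41, 55])

# 3. Compare, alongside the rest of the context (change failure rate, value
#    actually shipped), and only then decide whether to keep the change.
print(f"Shift in median lead time: {after - before:+.1f}h")
```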
However, if you still end up stuck in the simplistic circular arguments, and struggle to iterate even within your own sphere, it might be time to start looking for an escape hatch.