The Productivity Myth in a WFH World
As a preface to this - yes, I’m aware there was an election last week. I’m curious to do some digging in the data to see what, if any, lessons might be learned, but as of today the results are far from complete and I think learning should wait. In the meantime I have two thoughts. First, you may be interested in my piece from early October on why election forecasts are dumb and you should ignore them [Standard Errors]. I’d add a thought I didn’t cover there - if the election forecasts are based on bad polling, then the “probabilities” they report are actually nonsense from a metaphysical point of view. Second, in reading coverage of the results don’t forget the infamous XKCD take on sports [XKCD]:
Instead of election coverage, as the third wave of coronavirus gets going and we all prepare to spend more time at home I wanted to talk about our great national work from home experiment. The coronavirus has led to an explosion of remote work, and overnight a huge share of American workers have gone from working in offices to working at home - roughly 42% of workers, according to a Stanford survey [Stanford SIEPR]. Of those not working from home, more (33%) are out of work than continuing to work on-premise (26%). These WFH figures have likely dropped a bit since the peak of stay-at-home orders in April/May but remain far above the pre-Covid status quo.
The Working From Home Challenge
This has led to a bit of a challenge for America’s managers, at least the most traditional ones. There is a paired set of concerns for many managers - “are my workers being less productive at home?”, and “if my workers are home, how can I effectively supervise their level of productivity?”. Often they say the first but mean the second.
There is a commonly-cited study on the value of remote work for productivity that I’ve been seeing pop up more and more, and you should regard it as trash. See a typical citation and framing from the BBC here [link]: “the company’s staff became notably more productive by working from home four days a week”.
So there’s an immediate “external validity” concern, which is what happens when research poorly reflects the real world. Most obviously, the employees were not being sent home involuntarily in the middle of a global pandemic. The employees with children were not being told they were responsible for caring for them full-time, and by the way school is on Zoom now.
This does not mean the findings are unreasonable. The research design itself is not particularly trash [ResearchGate]; it randomly split 249 employees at a large Chinese travel agency and sent half of them home four days a week. One note of caution here is that the participants were selected from a group of volunteers - and needed a spare room to participate, which many American families may not have. The experiment even had a clear, quantifiable metric on which to measure results:
Then the two teams were tracked. It was easy to do, as these workers had repetitive, straightforward tasks that could easily be quantified, and whose productivity could easily be measured – making bookings in the system or making phone calls, for example.
This precision, in turn, is why the frequent citation of this study is trash: “repetitive, straightforward tasks that can easily be quantified” is the polar opposite of most white-collar work. Much of that highly routine work is being turned over to self-service or automated arrangements - the company studied is, after all, a travel agency, an industry that has largely been driven out of business in the US in favor of self-service solutions. It’s hard to imagine how this would translate to engineers, for example - you could look at lines of code or number of commits, but both are pretty poor measurements of value-added. It’s going to be even fuzzier when looking at, e.g., lawyers, graphic designers, or corporate executives.
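To make the lines-of-code point concrete, here’s a toy sketch (the snippets and counting rule are my own invention, not from any study): two functionally identical changes, but the naive metric scores the verbose one four times higher.

```python
# Toy example (hypothetical snippets): two functionally identical pieces of
# code, wildly different "lines of code" - the metric rewards verbosity,
# not value delivered.

verbose_fix = """\
result = []
for x in values:
    if x is not None:
        result.append(x * 2)
"""

concise_fix = "result = [x * 2 for x in values if x is not None]\n"

def loc(snippet: str) -> int:
    """Count non-blank lines - the naive 'productivity' metric."""
    return sum(1 for line in snippet.splitlines() if line.strip())

print(loc(verbose_fix), loc(concise_fix))  # prints "4 1"
```

The careful engineer who ships the one-liner looks one quarter as “productive” as the one who ships the loop, despite delivering identical behavior.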
Supervision by surveillance
This problem represents a killer issue for any desire to automate assessment of most non-routine work: how do you define the outcome you’re measuring? For people who are not producing simple, comparable, easily discretized outcomes - bookings made, for example - there is almost no conceivable alternative to subjective judgment. There are boatloads of new “AI startups” that promise to take the subjective judgment out of the equation, but if you ask me that just moves the problem back a few layers.
Enaible seems like a pretty typical example, based on this writeup in the MIT Technology Review [MIT]:
It is developing machine-learning software to measure how quickly employees complete different tasks and suggest ways to speed them up. The tool also gives each person a productivity score, which managers can use to identify those employees who are most worth retaining—and those who are not.
The issue, of course, is that this software encodes a hugely questionable subjective judgment: that speed and productivity are the same thing. This is hardly going to be the case for many professional roles - the most obvious examples are accounting and engineering, where small mistakes made in haste can lead to incredibly costly or dangerous downstream consequences. Speed- or “productivity”-based metricization will also poorly represent many fuzzier or more creative fields, particularly those such as marketing or branding, where there may be hugely asymmetric payoffs to getting your message just right.
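A toy sketch of the speed-equals-productivity fallacy (all numbers, names, and the cost assumption are hypothetical, not drawn from any real scoring product): a throughput-only score ranks a fast, sloppy worker first, while a score that nets out the downstream cost of errors flips the ranking.

```python
# Hypothetical workers with made-up throughput and error rates.
workers = {
    "fast_and_sloppy":    {"tasks_per_week": 120, "error_rate": 0.10},
    "careful_and_steady": {"tasks_per_week": 80,  "error_rate": 0.01},
}

# Assumption for illustration: each mistake costs 20 tasks' worth of rework.
COST_PER_ERROR = 20

def speed_score(w):
    """What a naive throughput-only 'productivity score' sees."""
    return w["tasks_per_week"]

def value_score(w):
    """Throughput net of the downstream cost of mistakes."""
    errors = w["tasks_per_week"] * w["error_rate"]
    return w["tasks_per_week"] - errors * COST_PER_ERROR

for name, w in workers.items():
    print(f"{name}: speed={speed_score(w)}, value={value_score(w):.0f}")
```

With these numbers the sloppy worker’s net value is actually negative (120 tasks, minus 12 errors at 20 tasks of rework each), yet the speed score would flag the careful worker as the one “not worth retaining”. The entire result hinges on the subjective choice of the error-cost parameter - which is precisely the judgment such tools claim to remove.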
The drive for (false feelings of) control
Ultimately I think the drive to capture white-collar productivity in numbers speaks to a desire for control and a lack of trust. That hunger for things like productivity scoring no doubt contributes to many of the environments that have adapted most poorly to remote work - and there are many of them, with Harvard Business Review reporting that roughly 40% of managers do not think they can effectively run their team remotely [HBR]. Furthermore, the reason seems to be related to control, or a perceived lack thereof:
Second, managers who reported lower job autonomy in their own work, close monitoring from their own boss, and a high degree of mistrust from their own boss had more negative beliefs about remote working, and greater mistrust of their workers. These findings suggest a social learning process in which managers learn how to supervise and treat their workers by observing their own managers. We believe this is because, when their own managers treat them with mistrust by closely monitoring them, they associate this behavior with being a manager and replicate it in their own leadership action. In other words, we believe they start to think that close monitoring and micromanaging is what the organization expects of them.
It does not take a rigorous statistical analysis to see that there is likely a link between this tendency, apprehension about remote work, and the desire for a machine that will supervise your workers when you cannot. It all seems to tie into the same impulse: to turn other people into mere extensions of one’s will and, ultimately, into substitutable work units. I don’t think machine learning is going to enable that level of control, but it may someday be good enough to offer managers the comforting illusion of control.