Take-Home Interview Challenges are Biased

Ian Hellström | 14 February 2020 | 8 min read

A common practice in hiring in the tech industry is the take-home interview challenge, which, despite its neutral appearance, is implicitly biased and can actually decrease diversity in an industry that is already predominantly young, white, and male.

Those who do unpaid work, that is, overwhelmingly the female half of the global population, are at a disadvantage when it comes to take-home interview challenges (THICs).

What is unpaid work? Cooking, for instance, or dropping off the kids at school, shopping, visiting grandpa in the old folks’ home, cleaning the house, driving the children to football practice or piano lessons, bringing the car to the garage, helping mother go to her doctor’s appointment, unclogging the sink, doing laundry and dishes, dealing with bureaucracy, and so on. These are the often thankless tasks that need to happen in the background, without which households and society as a whole could not function.

Even though the year is 2020, the majority of such unpaid tasks—roughly three-quarters—is still done by women. In non-white households, such tasks are done almost exclusively by women. If you do not believe these claims, please read ‘Invisible Women’ by Caroline Criado Perez. It ought to be mandatory reading for anyone in business, particularly in the data space, as we have the power to expose and redress the imbalance.

The Problem

THICs take a significant amount of time, and typically more than companies claim. In my experience, the typical THIC is listed as a task that is supposed to be completed within 4–6 hours, but it can quickly gobble up 8 hours or more. Of course companies ask how long it took each candidate, but it’s not in the candidate’s interest to mention a longer duration; you can never be quite certain how it affects people’s perceptions. There is a high risk of scope creep, too, as THICs often come with open-ended ‘bonus’ questions.

High-quality feedback on the code returned is rare, and in those cases where it is provided it often takes the form of an interrogation rather than a dialogue. At some companies, THICs are used to screen candidates, that is, before the first face-to-face contact between humans. It feels impersonal and it adds to the risk of candidates finding out at a later step that the culture, role, or remuneration is not as they expected. That’s a terrible way to waste everyone’s time.

The reason THICs exist is simple: employers need to know whether prospective employees can code. The less time spent on each candidate the better. The problem, however, is that the interview process is supposed to be a conversation: companies evaluate candidates and vice versa. But there is already information asymmetry in recruitment, since companies have access to details the candidate would typically need to have to make an early and informed decision, such as salary, benefits, health of the company, and many more. While candidates can withhold information or embellish their accomplishments, companies can request references to ensure they have a more complete picture. Prospective employees can contact company insiders for their thoughts, but that may be too risky or not yield any viable results at all. Would you talk to a random person who claims to have applied for a job at your company? Probably not.

Why are THICs a problem? Because they limit diversity.

The Assumptions

THICs have a few built-in assumptions:

  1. Candidates have the time and energy to complete multi-hour tests at home;
  2. Candidates have a private computer set up for software development;
  3. THICs are indicative of on-the-job performance.

The first assumption favours people who do not have a lot of duties outside of work and/or who have a single-minded focus on writing software. It’s perfectly fine if you love your craft so much that you spend your time off exploring new programming languages or contributing to open-source projects. For a lot of people, software development is ‘just a job’, and that ought to be enough if they do it well. If a company only wants to hire what I call 24/7 coders, they are hiring the stereotypical young, white, male programmer, and that can very easily lead to a brogrammer culture.

The 24/7 coder is also hidden in the second assumption, but, more importantly, it has a built-in bias against people from lower socioeconomic backgrounds. In families that are less well off, there may only be a single computer in the household, and it may not always be ideal for software development. In such cases the entire family uses the computer and the software developer(s) of the household may not have the luxury of hogging the keyboard for hours on end, especially since they may not have the time after all domestic tasks have been done.

To add insult to injury, according to Pew Research:

Reliance on smartphones for online access is especially common among younger adults, non-whites[,] and lower-income [people].

What that means is that non-white, low-income applicants may not even have broadband connections. Try completing a coding challenge on a phone with patchy 3G!

The corollary is that 24/7 coders from middle-class backgrounds are favoured by THICs. If completing a THIC is requisite to land almost any tech job, many women (with families) and anyone with fewer resources at their disposal may face hurdles a young, white, male developer who lives in his parents’ house never even needs to experience. Furthermore, it decreases career mobility for said groups, and switching jobs is the most common way people bump their salaries.

In summary, it’s harder for women and poor people (of all genders) to enter the technology industry, and once they are in, it may still be tough to improve their financial position by switching companies. That’s not cool, people!

The third assumption rests on a shaky foundation, as I have yet to see a conclusive study on the correlation between THICs and actual on-the-job performance. Success in any job requires social skills, the ability to be a productive member of a team, the ability to share and absorb knowledge, and, depending on the role, business acumen. No THIC I’ve come across comes close to assessing those.

Not the Solution

Online tests or competitive platforms, such as Codility, HackerRank, Kaggle, and the like, are in my view cut from the same cloth as THICs. They may be limited in time and scope, but since their main use is in screening applicants they are not conducive to opening a dialogue either. In a sense such tests suffer from the same flaw as the often-dreaded whiteboarding sessions in that they are done in a setting that is not the developer’s preferred environment. Except, of course, a whiteboarding session is done with people from the company present in the room.

Paying candidates to complete THICs is not a scalable solution either. Yes, it would indicate companies value their prospective employees’ time, but the only way that can be done sensibly is late in the process. If you, as a company, want to know if a certain candidate is capable of crafting decent code, that’s information you want to have earlier than the step before the offer. It can be used to rank equivalent candidates, but it seems a bit arbitrary to do that on the basis of a single project.

A Possible Solution

Is it important to assess coding abilities? Yes.

Is a coding challenge the only way to assess said skills? No.

Is a programming exercise on a whiteboard any better or worse? Not really.

Interviews, whether online or in person, are unavoidable. Wasting candidates’ time with homework is not. If you absolutely must force a THIC onto people, describe the hiring process on the company’s career pages, so people can decide whether it’s worth applying at all. Companies that do not want to risk reducing the number of qualified applicants that way need to ask themselves whether it is the transparency that is really the problem. Hint: it’s not. It might also help to track how many applicants choose to withdraw upon hearing about a THIC, although I doubt that is often measured.

Some coding is definitely sensible early on in the process, but after an initial face-to-face. Basic problems can easily be done live on a laptop or over the internet with a shared editor. Note that more difficult problems and the use of frameworks often imply the need for an IDE, and that can be tricky as different people may have different requirements. If the candidate is in the same room, the use of a laptop can add friction: an unfamiliar OS, a different IDE, or an unknown keyboard layout, especially with international recruits. Some people who suffer from OCD may not be comfortable enough to grab a greasy mouse or clack away on keyboards with breadcrumbs and undefined leftovers stuck underneath the space bar. Others with disabilities may not be able to type on any arbitrary laptop anyway. None of that makes these people any less qualified.

Of course it’s also possible to shove code under candidates’ noses and ask them to review it. They may not have the context, but if they have indicated familiarity with the language and frameworks used it should tell you how they assess their peers’ code. After all, a code review can also be a great opportunity to ask pertinent questions and challenge assumptions, so the candidates can show off whether they can objectively and calmly review a simple application or module without focusing on the inconsequential details, such as formatting, different coding standards, personal preferences, or styles. Most professional teams have that automated or at least documented anyway.

Another option is to have them write tests or debug a problem. Again, in person. A unit test for a black-box function is easy enough, and if the desired behaviour is described properly it should not take too much time during the interview. At least it allows candidates to ask questions, and the interviewers can notice whether any task is unclear. I have seen candidates who do not even open up a browser to search for a cryptic exception message that would have led to the answer within minutes, even though they were told they could look up stuff online. That’s a red flag.
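To give an idea of how small such an in-person exercise can be, here is a minimal sketch in Python. Everything in it is hypothetical: the function normalise_whitespace merely stands in for whatever black box the interviewers hand over, and the described behaviour (collapse any run of whitespace into a single space and trim both ends) is an assumption made up for illustration. The candidate would only write the test class.

```python
import unittest


def normalise_whitespace(text: str) -> str:
    """Hypothetical stand-in for the black box given to the candidate.

    Assumed behaviour: collapse every run of whitespace into a single
    space and remove leading and trailing whitespace.
    """
    return " ".join(text.split())


class NormaliseWhitespaceTest(unittest.TestCase):
    """The kind of tests a candidate might write from the description alone."""

    def test_collapses_runs_of_spaces(self):
        self.assertEqual(normalise_whitespace("a   b"), "a b")

    def test_strips_leading_and_trailing_whitespace(self):
        self.assertEqual(normalise_whitespace("  a b \n"), "a b")

    def test_treats_tabs_and_newlines_as_whitespace(self):
        self.assertEqual(normalise_whitespace("a\t\nb"), "a b")

    def test_empty_input_stays_empty(self):
        self.assertEqual(normalise_whitespace(""), "")
```

Saved to a file, the tests can be run with python -m unittest. The interesting part is not the code itself but the questions it invites, such as whether non-breaking spaces count as whitespace, and that conversation is something a THIC cannot offer.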

Although companies may be reluctant to deviate from a standardized process, why not offer alternatives? Candidates may not be confident enough to choose between a THIC and a video conference with a shared editor, but it can be made clear that they are equivalent and down to personal preference. On the other hand, do you really need to assess the basic coding abilities of a senior or staff engineer who has worked at well-known technology companies? Just because you had to go through a trial by fire as a junior several years ago does not mean everyone needs to be subjected to it, again and again. A conversation about projects, accomplishments, problems, solutions, implementation details, architecture, testing strategies, systems design, case studies, and so on is most of the time a sensible substitute for all kinds of bells and whistles that place the burden on the candidates rather than share it equitably between both parties. It does require experienced people in the room to assess whether the person is truly competent and not merely familiar with the material. Yes, it takes time, but the alternative is self-selection bias that eventually leads to a homogeneous rather than diverse workforce.

Why Am I Talking About This?

Diversity is a topic that needs as many allies as possible, especially since women and minorities who bring up diversity at work are often penalized for it. The more balanced the workforce is and the more it represents society’s demographics as a whole, the more products and services work for everyone. It’s about damn time we stop treating more than half the world as an edge case.