In AI-Powered Recruiting, Algorithm Design Matters — a Lot
Artificial intelligence (AI) and machine learning are all the rage in recruiting and hiring circles today. Unfortunately, much of what is on offer is overhyped, of shoddy construction, or otherwise falls short.
Even the big guys don’t always get it right: By now, the story of Amazon’s abandoned algorithm that systematically selected male candidates over women is a well-known cautionary tale.
But that doesn’t mean AI-powered recruiting is destined to fail. We just need to be sure our AI tools are programmed properly. When they are, AI can have amazing results.
Algorithm Design Has a Major Impact on Recruiting Outcomes
In the paper “Hiring as Exploration,” coauthors Danielle Li (professor, MIT Sloan School of Management), Lindsey R. Raymond (PhD candidate, MIT Sloan), and Peter Bergman (professor, Columbia University) set out to investigate three different hiring algorithm designs: one built on a static supervised learning (SL) model; one built on an updating SL model; and a third built on an upper confidence bound (UCB) model, which incorporates an exploration bonus.
The static SL model is, essentially, the standard modern hiring algorithm. It uses a historical data set of previous hires and interviewees to identify characteristics that are predictive of success in a given role. It then compares incoming applicants to these characteristics to predict candidate quality.
“In these cases, the algorithm would recommend current candidates that have characteristics that are most similar to those correlated with success in the past,” explains Li.
The updating SL operates in much the same way as the static SL, with one key difference: Whereas a static SL model uses the same data set to assess all applicants, updating SL uses a data set that is regularly updated with information about new applicants and hires.
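In rough terms, the difference between the two SL designs comes down to whether the training data is frozen or refreshed. The sketch below is a minimal illustration of that distinction, not the researchers’ implementation; the placeholder feature matrix, success labels, and the choice of a scikit-learn logistic regression are assumptions made for brevity.

```python
# Minimal sketch of static vs. updating supervised-learning (SL) scoring.
# Hypothetical data and model choice (scikit-learn logistic regression);
# not the authors' actual implementation.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Historical applicants: feature vectors (e.g., encoded major, school, work
# history) and labels marking whether past hires were judged successful.
X_hist = np.random.rand(500, 10)          # placeholder features
y_hist = np.random.randint(0, 2, 500)     # placeholder success labels

# Static SL: fit once on the historical data and never update.
static_model = LogisticRegression(max_iter=1000).fit(X_hist, y_hist)

def score_static(applicants):
    # Rank incoming applicants by predicted probability of success,
    # using only what the model learned from the original data set.
    return static_model.predict_proba(applicants)[:, 1]

def score_updating(applicants, new_X, new_y):
    # Updating SL: periodically refit on the historical data plus newly
    # observed applicants and outcomes, then score with the refreshed model.
    X = np.vstack([X_hist, new_X])
    y = np.concatenate([y_hist, new_y])
    model = LogisticRegression(max_iter=1000).fit(X, y)
    return model.predict_proba(applicants)[:, 1]
```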
The UCB model differs from the other two in that it uses an exploration bonus: a way for the algorithm to capture how much a company could learn by selecting a given candidate, based on how different that candidate is from previously selected applicants. Whereas the static and updating SL models rank candidates purely by how good they are predicted to be, the exploration-based UCB model balances that assessment against how much the company stands to learn from each candidate.
“For example, suppose that you want to hire some software engineers, and all your employees are PhDs in computer science from Stanford,” Li says. “If you hire another PhD from Stanford, you might think that they would be a good employee, but you wouldn’t learn much about their quality relative to what you already know, since you’ve already hired a ton of people like them. In contrast, suppose you look at an English major from Michigan. That person is rarer, so separately from whether you think they are good or not, you do stand to learn a lot relative to what you know by selecting them.”
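To make the exploration bonus concrete, here is a minimal sketch of a UCB-style score: predicted quality plus a bonus that grows when few similar candidates have been selected before. The grouping of applicants into “candidate types” and the particular bonus formula are illustrative assumptions, not the paper’s exact specification.

```python
# Illustrative UCB-style scoring: predicted quality + exploration bonus.
# The bonus formula and "candidate type" grouping are simplifying assumptions.
import math
from collections import Counter

# How many candidates of each type (e.g., degree + school combination)
# the firm has already selected. Counts are hypothetical.
selections_by_type = Counter({
    ("CS PhD", "Stanford"): 120,
    ("English BA", "Michigan"): 1,
})
total_selections = sum(selections_by_type.values())

def ucb_score(predicted_quality, candidate_type, c=1.0):
    """Balance predicted quality against how much the firm could learn.

    Rare candidate types have been selected few times, so their bonus
    (and hence their chance of being explored) is larger.
    """
    n_type = selections_by_type.get(candidate_type, 0)
    bonus = c * math.sqrt(math.log(total_selections + 1) / (n_type + 1))
    return predicted_quality + bonus

# The familiar profile scores mostly on predicted quality...
print(ucb_score(0.80, ("CS PhD", "Stanford")))
# ...while the rarer profile receives a much larger exploration bonus.
print(ucb_score(0.65, ("English BA", "Michigan")))
```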
Some Algorithms Are Better Than Others
The researchers applied their algorithms to job applications for consulting, financial analysis, and data science roles at a Fortune 500 firm. When it came to selecting applicants for first-round interviews for these roles, 23 percent of the applicants the UCB model selected were Black or Hispanic, compared to just 2 percent with the static SL model and 5 percent with the updating SL model.
Li says the outcome wasn’t very surprising to her.
“I think most people believe that there are lots of smart, competent people across all walks of life, but often they don’t get an opportunity because firms think it’s too hard to find them,” she says. “What we show is that by using algorithms that focus on exploration, we can do a better job of finding them.”
While the UCB model did select more Black and Hispanic candidates, it also selected fewer women than the static and updating SL models did. Thirty-nine percent of the UCB model’s selections were women, compared to 41 percent with the static SL model and 50 percent with the updating SL model. That said, all three algorithms beat out human recruiters, who selected 35 percent women.
Li says there are a few different reasons why the UCB model may have chosen comparatively fewer women. While the UCB algorithm grants exploration bonuses based on a candidate’s rarity, gender is not the only factor. In the researchers’ data, men appeared to be “more likely to have unusual majors or work histories, relative to women, so this means that sometimes they [got] higher bonuses,” Li says.
Another possible reason: Women candidates may have been of a higher quality on average, making them more likely to be hired.
“A static learning model will see this in the data and want to select a lot of women, but a UCB model will say, ‘Hold up, I know you want to select women based on what you know right now, but maybe you also want to explore and consider some people you haven’t hired before, like people who have degrees from foreign universities,’” Li says. “Those people are more likely to be men.”
“Think about the SL model as a child that wants to eat its favorite food right now,” Li adds. “The UCB is like the adult that reminds it to try new things because it might like them even better.”
Selecting the Right Tool for the Job
While the UCB model demonstrated positive results in diversifying hiring pipelines, Li cautions against assuming that means the UCB model is the way to go with every application of AI.
“One shouldn’t think about ‘AI’ or ‘big data’ as a single black box that we throw at problems in order to solve them,” Li says. “In practice, there are lots of algorithms designed to be good at different things, so people interested in using AI should at least look carefully and try to choose the right type of tool for the job.”
For example, Li notes that SL models are great at image recognition because “a cat is a cat is a cat, and the category of cat isn’t constantly evolving in a way that demands new training data.” When it comes to hiring, on the other hand, candidate pools do constantly evolve, making the SL model less suited to the task at hand.
Li also urges recruiters and hiring managers to remember that algorithms are more “knowable” than we might think.
“It’s not just that you put a worker through an algorithm and it spits out a reply,” she says. “You can actually look inside the algorithm and know what it is trying to maximize and what approach it’s taking. Taking a moment to ask some questions about how the algorithm works can go a long way toward choosing which algorithm to use — or deciding not to use one at all.”