We've just been through a round of recruiting data scientists for our UK team. Hiring is always an instructive process, and we thought this would be a good opportunity to talk about the qualities we look for when staffing our data science consulting team. We have already spoken about data preparation, and the sort of work data scientists can be expected to do. In this article, we look at the essential qualities and skill sets that businesses hiring data scientists, perhaps for the first time, should look for in potential candidates.
Good data scientists are hard to find and hire because demand for them is rising rapidly: according to McKinsey, there is a shortfall of over 140,000 in the US economy alone.
There are some basics. A degree in an applied science subject, such as computer science, applied mathematics, statistics or engineering, is a primary educational requirement. This qualification is essential and is not something that can be picked up by doing a crash course in machine learning. A solid mathematical background is crucial to actually getting data science models to work.
Some competence in coding and an understanding of how data works are also a plus. We particularly like our data scientists to be competent in Python and its data manipulation and analysis software library, Pandas, as well as having a reasonable amount of experience with SQL. Beyond that, there are some other softer skills that we look for:
Being comfortable getting lost
Data science consulting is different from other IT disciplines in that it is far less defined. "Data science" can mean five things to five different people, and there can be hundreds of different types of problems to solve. Any of these might require three different tools that the data science team may or may not have used before. So we tend to favor candidates who are comfortable using reporting and modelling tools that they have not used before, rather than those who are very skilled in a few languages or methods but dislike the ambiguity of working with new approaches.
Logical problem solvers
The range of approaches necessitates candidates who can think logically about how best to solve a problem. Data analytics problems tend not to be binary. There are usually multiple ways to approach a problem. We favor candidates who think logically about a problem and can demonstrate why they have chosen to approach it in a certain way, and why they have discounted other methodologies. We also like candidates who can defend their approach and advocate for it to their peers, even if some of them favor a different one. Rigorous debate is good!
Communicate, communicate, communicate
The traditional idea of geeky techies who won't let anyone drag them away from their beloved computer screens, and who are brilliant at writing algorithms but can't communicate with others, tends not to work for us. Staying firmly out of view may have worked in the past with more traditional IT jobs, but in 2018 a data scientist's ability to communicate across the business and with customers is vital because of how their work will impact the company and its employees. The data scientist may well want to change data handling practices across the business. Unless the data scientist can communicate the desirability and necessity of these changes persuasively and in non-technical language, they are unlikely to achieve their goals. The needs of the business and the customers it serves are also central to data science work, so a data scientist needs to be a good listener too.
Realism about the job
Data science still has some "rock star" kudos attached to it, which can lead to misconceptions about what the job entails. This image can be exacerbated by data science boot camps that can sometimes make data science appear more glamorous (and easier) than it is. In reality, a typical data science consultant spends 75% of their time preparing data and building pipelines for data validation and data cleansing. That's not to say that the job isn't fun, but we look for candidates with a dose of realism about what the job is.
An idea of ethics
Ethics are paramount in our industry. We like to compare data science to accountancy to make this point. Accountants have access to the company cash and the books and so are generally well-positioned to commit fraud, should they be minded to. In 2018, data is likewise among the most valuable assets most companies have. Its security is paramount, and data scientists build the security around that data, and the access to it, every day. We want potential hires to have an understanding of the role they play in maintaining this, and in how people access the data.
For data scientists, it's also about how they use that data, and what effect the outputs of any work they do may have on people. Weapons of Math Destruction is required reading for all of our new starters. We want our data science team to understand the social implications of what they do.
And what don't we care about in a data science consultant?
Counter-intuitively, some of the best candidates we have hired have had no experience of AWS (our default technology choice), only limited knowledge of Python, and no formal data science experience. These things are useful, but they don't, in isolation, trump the things we've outlined above. Curious people who like to try new things, and can then define and defend an approach to the problem, make the best data scientists.