Algorithms are an essential feature of data science. So much so that Airbnb, which has a sophisticated data team, recently divided its group into Analytics, Algorithms, and Inference teams. Most people have heard of algorithms but are unaware of exactly what they are or what they do, beyond, say, understanding that Google uses them to keep spammers out of the number one spot in its search results and that Facebook uses them to keep fake news out of our timelines. However, the emergence of the algorithm economy is driving uses that go far beyond Google and Facebook and that can help resolve almost any kind of business problem.
Any business that is navigating the digital world and does not yet use algorithms should harness their power to enhance its operations.
What's the business problem?
Algorithms can solve many business problems. For instance, if recruitment is a big problem in your company, and you have plenty of applicants but are finding it difficult to choose the best one, algorithms can help reduce, say, 100 candidates to a pool of five whom the company can then interview. Algorithms can also help ensure that the workforce is not too homogeneous, whether because the company wants diversity rather than, say, employing only men, or because it wants employees with a diverse range of experiences and personalities.
If fraud is a big problem, an algorithm can help by spotting, among current customers, the kind of behavior that customers engaged in fraud have shown in the past. Imagine the algorithm flagging three transactions out of a base of a few thousand. Employees can then analyze these suspicious transactions, whereas it would be economically unrealistic to have staff analyze each of the several thousand transactions. Enterprises also use algorithms to try to predict which job candidates may steal from the company itself, which matters given the importance of having loyal and honest staff.
Imagine, too, that the company's top-level executives need to make some important decisions about spending or buying over the next year. Algorithms can help those running the business by ensuring that, of all the big data the company holds that could relate to these spending decisions, it is the critical pieces of information that end up in front of the executives in reports.
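To make the fraud scenario above a little more concrete, here is a minimal sketch of one way a handful of transactions could be flagged for human review. It is an illustration only, not a description of how any particular company does it: it assumes fraud shows up as an unusually large amount, and it uses a modified z-score built on the median. The function name `flag_suspicious`, the threshold of 3.5, and the sample amounts are all hypothetical.

```python
from statistics import median


def flag_suspicious(amounts, threshold=3.5):
    """Return the positions of amounts that look unusual.

    Uses the modified z-score (based on the median absolute deviation),
    which is less distorted by the outliers themselves than a mean-based
    score. The threshold is an assumed, tunable value.
    """
    med = median(amounts)
    deviations = [abs(a - med) for a in amounts]
    mad = median(deviations)
    if mad == 0:
        # All amounts are (nearly) identical, so nothing stands out.
        return []
    return [
        i for i, dev in enumerate(deviations)
        if 0.6745 * dev / mad > threshold
    ]


# Hypothetical transaction amounts; only the flagged ones go to staff.
amounts = [20, 25, 22, 19, 24, 21, 23, 5000, 18, 26]
print(flag_suspicious(amounts))  # prints [7]
```

A real fraud model would look at far more than the amount (timing, location, customer history), but the shape is the same: the algorithm narrows thousands of records down to a few that people can afford to check.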
Recommendations are another area where algorithms can make a real difference to the business. The recommendations we all know appear when a website sees that you have read a specific article or watched a particular video and suggests other articles or videos, perhaps based on that single choice or, more likely, on your personal history of choices on the website. Recommendations are particularly useful on e-commerce sites.
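As a rough sketch of the recommendation idea above, the example below suggests items that were viewed by other people whose history overlaps with yours. This is only one simple approach (counting shared items), chosen for illustration; the function name `recommend` and the sample viewing histories are hypothetical.

```python
from collections import Counter


def recommend(histories, user, top_n=3):
    """Suggest items the user has not seen, weighted by shared history.

    histories maps each user to the set of items they have viewed.
    An unseen item scores higher when it comes from users who share
    more items with the target user.
    """
    seen = histories[user]
    scores = Counter()
    for other, items in histories.items():
        if other == user:
            continue
        overlap = len(seen & items)
        if overlap == 0:
            continue  # nothing in common, so ignore this user's tastes
        for item in items - seen:
            scores[item] += overlap
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical viewing histories.
histories = {
    "ann": {"article_a", "article_b"},
    "bob": {"article_a", "article_b", "article_c"},
    "cat": {"article_a", "article_d"},
    "dan": {"article_e"},
}
print(recommend(histories, "ann"))  # prints ['article_c', 'article_d']
```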
Algorithms are also used extensively in marketing, where they help marketers understand their customers and potential customers better. We have already blogged about Customer Lifetime Value (CLV), customer segmentation, and churn. Well-written algorithms can help segment customers, build a CLV for each customer, and make predictions about churn.
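For a sense of what building a CLV and segmenting on it might look like, here is a deliberately simplified sketch. It assumes a common back-of-the-envelope formula (average order value times orders per year times expected years as a customer, optionally scaled by a margin) rather than any specific model, and the segment cut-offs of 200 and 1000 are made-up values for illustration. The names `simple_clv` and `segment` are hypothetical.

```python
def simple_clv(avg_order_value, orders_per_year, expected_years, margin=1.0):
    """Estimate lifetime value with a simplified, assumed formula."""
    return avg_order_value * orders_per_year * expected_years * margin


def segment(clv, high=1000.0, low=200.0):
    """Place a customer in a segment using illustrative cut-offs."""
    if clv >= high:
        return "high"
    if clv >= low:
        return "medium"
    return "low"


# Hypothetical customer: 50 per order, 4 orders a year, 3 years expected.
clv = simple_clv(50.0, 4, 3)
print(clv, segment(clv))  # prints 600.0 medium
```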
How do we build an algorithm?
So you have some business problems and a great data science team. How do you go about constructing algorithms that resolve those problems? The data science team may not be the part of the business with the best insight into what the real business problems are. If so, it is essential that the people who are best placed to know collaborate with the data science team, rather than assuming the data scientists will somehow magically work it out. Once the problems are understood, the next step is to outline whichever one the team wants to fix by documenting it. Writing about the issue helps the team think through the dilemmas involved. It is much easier to propose solutions than to identify core problems and, given human nature, many people would rather be seen as the one who came up with a brilliant solution than do the harder, more tedious work of identifying real business problems that the company can actually fix. That last point is essential. There is no use in a UK company identifying Brexit as a business problem, given that the company cannot do anything about such political issues.
If the team identifies the wrong problem and the company spends time and money trying to fix it, only for the mistake to become clear three months later, the business will have wasted a lot of both. Hence our stress on the critical importance of identifying the right issues in the first place.
The second step is to see what data are available. If this isn't the first business problem the team has tackled, you should already have a good idea of what data the enterprise holds overall. Not all of it is likely to be relevant to this particular problem, however, so the next step is to establish which datasets are relevant and which are not. It is also a good idea for the team to ask whether there are data that aren't currently available but could really help resolve the business problem, and then to see whether the company can collect those data itself or buy them as third-party data from another company. Even assuming the team has already resolved data quality for the data the company stores, it is vital that any new datasets are adequately cleansed and validated. High-quality data are essential, because algorithms that have to work with poor-quality data are likely to fail. The exception, of course, is an algorithm written specifically to address data quality issues. Algorithms are great at this, and many a budding data scientist finds that the first algorithm they write is one that addresses data quality. A data cleansing algorithm is required whenever datasets contain mismatched or low-quality data. Integrating different datasets is a classic example, because the datasets often follow different standards.
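The kind of cleansing needed when integrating datasets can be sketched as follows. This is a minimal example under stated assumptions: two hypothetical customer lists that identify people by email but format their fields differently, with records that lack a usable email simply dropped. The field names, the function names `normalize_record` and `merge_customers`, and the sample records are all invented for illustration.

```python
def normalize_record(record):
    """Bring one record to a shared standard before merging."""
    return {
        # Emails compare case-insensitively here, so trim and lowercase.
        "email": record["email"].strip().lower(),
        # Collapse repeated spaces and use consistent capitalization.
        "name": " ".join(record.get("name", "").split()).title(),
    }


def merge_customers(*datasets):
    """Merge datasets, dropping unusable rows and duplicate customers."""
    merged = {}
    for dataset in datasets:
        for record in dataset:
            email = record.get("email")
            if not email or "@" not in email:
                continue  # basic validation: no usable email, skip the row
            clean = normalize_record(record)
            # Keep the first version seen of each customer.
            merged.setdefault(clean["email"], clean)
    return list(merged.values())


# Hypothetical records from two systems with different conventions.
crm = [{"email": " Ann@Example.com ", "name": "ann  lee"}]
shop = [
    {"email": "ann@example.com", "name": "Ann Lee"},
    {"email": "", "name": "unknown"},
    {"email": "bob@example.com", "name": "BOB ray"},
]
for customer in merge_customers(crm, shop):
    print(customer)
```

Without the normalization step, the two records for Ann would be treated as different customers, which is exactly the sort of mismatch that undermines any algorithm applied later.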
Once genuine business problems have been identified and all the relevant data that could shed light on them are ready, the team can apply an algorithm to the data. Algorithms aren't magic that will automatically solve the problem merely by being run over the data. However, they can provide information that helps the company resolve the issue, and they are a crucial element in the toolbox of any competent data science team.
Welcome to the algorithm economy.