One of the more exciting new fields in statistics and data analytics is that of machine learning. Even if you haven’t heard of it, you’ve almost certainly been affected by it. Companies and organisations such as Netflix, Google, Apple, General Electric and even the NBA have used machine learning to inform decision making. But what exactly is machine learning, and why is everyone making such a fuss?
Arthur Samuel defined machine learning as the “Field of study that gives computers the ability to learn without being explicitly programmed.” Effectively, a machine learning program is one which improves its performance in the task it was programmed for, without the original programmer needing to do much more than write the initial algorithm. A common use is predicting an outcome from a variety of input variables, but there are many other applications.
Machine learning tasks can be categorised into one of three groups:
- Supervised Learning
- Unsupervised Learning
- Reinforcement Learning
In supervised learning, the computer is given a set of inputs together with the desired output for each one. A good example is the email Spam filter. A set of emails (the inputs), each marked by a user as Spam or not (the label is the desired output), is given to the computer. The machine learning program then processes the data set to determine, using a particular algorithm chosen beforehand by the programmer, what the relationship is between the input and the desired output. So, the computer will learn what you think Spam is and keep improving as you label more emails. Put simply, this means fewer Nigerian 419 scams.
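To make the Spam filter idea a little more concrete, here is a minimal sketch of one possible approach, a naive Bayes classifier that learns from word counts. The emails, labels and function names are all invented for illustration, and real filters are far more sophisticated than this.

```python
import math
from collections import Counter

# Hypothetical toy data: each email (input) comes with a user-supplied label (desired output).
TRAINING_EMAILS = [
    ("win money now claim your prize", "spam"),
    ("urgent transfer of funds from a prince", "spam"),
    ("cheap prize money click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on friday with the team", "ham"),
    ("project report attached for review", "ham"),
]


def train(examples):
    """Count how often each word appears under each label."""
    word_counts = {}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.split())
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Return the label with the highest naive Bayes log-score."""
    vocabulary = set()
    for counts in word_counts.values():
        vocabulary.update(counts)
    total_emails = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        # Start from the prior: how common this label was in the training data.
        score = math.log(label_counts[label] / total_emails)
        words_in_label = sum(word_counts[label].values())
        for word in text.split():
            # Add-one smoothing so unseen words do not produce log(0).
            likelihood = (word_counts[label][word] + 1) / (words_in_label + len(vocabulary))
            score += math.log(likelihood)
        scores[label] = score
    return max(scores, key=scores.get)


word_counts, label_counts = train(TRAINING_EMAILS)
print(classify("claim your prize money", word_counts, label_counts))
print(classify("agenda for the team meeting", word_counts, label_counts))
```

The programmer only chose the algorithm; which words count as "spammy" comes entirely from the labelled examples, so adding more labelled emails changes the filter's behaviour without any change to the code.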
In unsupervised learning, the computer isn’t told what the correct output is, because the programmer doesn’t necessarily know! All they want to know is what would be a good way to group or “cluster” the data together, into groups that make sense. For example, if you have a massive database of pictures, but no labels on them, you could feed the database through an unsupervised algorithm which would deduce common characteristics among pictures, and group them accordingly.
Of course, the computer wouldn’t know what to actually call the groups of common pictures, but a human could simply give the categories names afterwards.
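As a rough illustration of clustering, the sketch below implements k-means, one common unsupervised algorithm (not necessarily what any particular photo service uses). It assumes each picture has already been boiled down to two made-up numeric features; the data, the feature choice and the naive "first k points" starting centroids are all assumptions for the example.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def k_means(points, k, max_iterations=100):
    """Group points into k clusters; returns (labels, centroids)."""
    # Naive initialisation for this sketch: use the first k points as centroids.
    centroids = list(points[:k])
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # nothing changed, so the clustering has settled
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


# Hypothetical picture features, e.g. (brightness, amount of blue), with no labels attached.
PICTURES = [(1.0, 1.0), (1.5, 2.0), (1.0, 0.5), (8.0, 8.0), (9.0, 9.5), (8.5, 9.0)]

labels, centroids = k_means(PICTURES, k=2)
print(labels)
```

The algorithm only returns group numbers such as 0 and 1. Deciding that group 0 is "night-time photos" and group 1 is "beach photos" is still left to a human.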
Reinforcement learning has been well summarised as “learning by trial and error”. This type of task is well suited to computers, which can perform repetitive tasks at immense speed. The computer is allowed to interact with an environment and is “rewarded” whenever it does something desirable (the reward is typically defined by the programmer), which encourages it to repeat that approach.
For example: imagine that you wanted a robot that could walk, but you didn’t have the time or were too lazy to spend hours pre-defining how it should walk (i.e. that it needs to put one leg forward first, then the next). What you could do is let it “learn” by itself how to walk, and give it a “reward” for moving forward. In this way, it will work out for itself how to move forward, motivated by the reward. This video is a great visual example of reinforcement learning in action.
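A walking robot is too much for a short example, but the same trial-and-error idea can be sketched with a toy version: an agent in a short corridor that can step backward or forward and is rewarded only when it reaches the far end. The sketch uses tabular Q-learning, one standard reinforcement learning algorithm; the corridor, the reward value and the parameter settings are all assumptions chosen for illustration.

```python
import random

N_POSITIONS = 5            # hypothetical corridor with positions 0..4
GOAL = N_POSITIONS - 1     # the reward is given for reaching the far end
ACTIONS = (-1, +1)         # step backward, step forward


def train(episodes=200, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn a table of action values by trial and error."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_POSITIONS) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            if rng.random() < epsilon:
                # Explore: occasionally try a random action.
                action = rng.choice(ACTIONS)
            else:
                # Exploit: take the best-known action, breaking ties at random.
                best = max(q[(state, a)] for a in ACTIONS)
                action = rng.choice([a for a in ACTIONS if q[(state, a)] == best])
            next_state = min(max(state + action, 0), GOAL)
            reward = 1.0 if next_state == GOAL else 0.0
            future = 0.0 if next_state == GOAL else max(q[(next_state, a)] for a in ACTIONS)
            # Q-learning update: nudge the estimate towards reward plus discounted future value.
            q[(state, action)] += alpha * (reward + gamma * future - q[(state, action)])
            state = next_state
    return q


def greedy_policy(q):
    """Best-known action for every position before the goal."""
    return [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)]


q_table = train()
print(greedy_policy(q_table))
```

Nobody tells the agent that "forward" is the right move. After enough attempts, the learned values favour stepping forward from every position, simply because that is what led to the reward.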
There are huge possibilities for machine learning in the field of Finance. For instance, consumer credit scores could be assigned based on the results of machine learning algorithms. Instead of a bank comparing the credit performance of its consumers based on factors that it chooses beforehand, it could simply let a machine learning algorithm loose in its massive database to decide for itself which variables are best for assigning credit scores in the future.
Or, a firm could develop a high-frequency trading algorithm which utilises machine learning to uncover relationships between different stocks and economic variables which may not have been considered beforehand.
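To give a flavour of the credit-scoring idea above, the sketch below ranks a few candidate variables by how strongly each one correlates with past defaults. This is a deliberately crude stand-in for what a real machine learning algorithm would do, and every customer record, column name and value here is invented for illustration.

```python
import math

# Hypothetical customer records; "defaulted" is 1 if the customer failed to repay.
CUSTOMERS = [
    {"income": 25, "missed_payments": 4, "shoe_size": 9, "defaulted": 1},
    {"income": 40, "missed_payments": 3, "shoe_size": 7, "defaulted": 1},
    {"income": 30, "missed_payments": 5, "shoe_size": 8, "defaulted": 1},
    {"income": 60, "missed_payments": 0, "shoe_size": 8, "defaulted": 0},
    {"income": 35, "missed_payments": 1, "shoe_size": 9, "defaulted": 0},
    {"income": 55, "missed_payments": 0, "shoe_size": 7, "defaulted": 0},
]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column carries no information
    return covariance / (spread_x * spread_y)


def rank_variables(records, target="defaulted"):
    """Order the candidate variables from most to least related to the target."""
    outcomes = [r[target] for r in records]
    candidates = [name for name in records[0] if name != target]
    strength = {
        name: abs(pearson([r[name] for r in records], outcomes))
        for name in candidates
    }
    return sorted(candidates, key=lambda name: strength[name], reverse=True)


print(rank_variables(CUSTOMERS))
```

In this made-up data the number of missed payments comes out as the most informative variable and shoe size as the least. The point is that the ranking comes from the data rather than from a list of factors the bank picked in advance.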
Finally, machine learning is not just being used for financial gain. The field of Medicine is yet another area where machine learning has value, with algorithms being used to aid diagnosis and even identify melanoma on the skin!
The point? Expect to hear a lot more about machine learning in the future.