by George Seif

Deep Learning vs Classical Machine Learning

Over the past several years, deep learning has become the go-to technique for most AI-type problems, overshadowing classical machine learning. The clear reason is that deep learning has repeatedly demonstrated superior performance on a wide variety of tasks, including speech, natural language, vision, and playing games. Yet despite that performance, classical machine learning still has a few advantages, and there are a number of specific situations where you’d be much better off using something like a linear regression or a decision tree rather than a big deep network.

In this post we’re going to compare deep learning with classical machine learning techniques. In doing so, we’ll identify the pros and cons of each and where and how each is best used.

Deep Learning > Classical Machine Learning

  • Best-in-class performance: Deep networks have achieved accuracies far beyond those of classical ML methods in many domains, including speech, natural language, vision, and playing games. In many tasks, classical ML can’t even compete. For example, the graph below shows the image classification accuracy of different methods on the ImageNet dataset; blue indicates classical ML methods and red indicates a deep Convolutional Neural Network (CNN) method. Deep learning blows classical ML out of the water here.
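[Figure: image classification accuracy of different methods on the ImageNet dataset; blue indicates classical ML methods, red indicates a deep CNN]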
  • Scales effectively with data: Deep networks scale much better with more data than classical ML algorithms. The graph below is a simple yet effective illustration of this. Often, the best advice for improving accuracy with a deep network is just to use more data! With classical ML algorithms this quick and easy fix doesn’t work nearly as well, and more complex methods are often required to improve accuracy.
[Figure: how deep networks and classical ML algorithms scale with the amount of data]
  • No need for feature engineering: Classical ML algorithms often require complex feature engineering. Usually, a deep-dive exploratory data analysis is first performed on the dataset. Dimensionality reduction might then be applied for easier processing. Finally, the best features must be carefully selected to pass to the ML algorithm. There’s no need for this when using a deep network, as one can just pass the data directly to the network and usually achieve good performance right off the bat. This eliminates the big and challenging feature engineering stage of the whole process.
  • Adaptable and transferable: Deep learning techniques can be adapted to different domains and applications far more easily than classical ML algorithms. Firstly, transfer learning has made it effective to use pre-trained deep networks for different applications within the same domain. For example, in computer vision, pre-trained image classification networks are often used as a feature extraction front-end for object detection and segmentation networks. Using these pre-trained networks as front-ends eases the full model’s training and often helps achieve higher performance in a shorter period of time. In addition, the underlying ideas and techniques of deep learning are often quite transferable across domains. For example, once one understands the underlying deep learning theory for speech recognition, learning how to apply deep networks to natural language processing isn’t too challenging, since the baseline knowledge is quite similar. With classical ML this isn’t the case at all, as both domain-specific and application-specific ML techniques and feature engineering are required to build high-performance models. The knowledge base of classical ML differs considerably between domains and applications and often requires extensive specialized study within each individual area.
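
To make the feature engineering stage mentioned in the list above a bit more concrete, here is a minimal sketch of one small piece of it: ranking raw features by how strongly they correlate with the label and keeping only the best ones. It uses only the Python standard library. The helper names (`pearson`, `select_top_features`) and the toy data are made up for illustration, and correlation ranking is just one simple selection heuristic, not the method any particular library or pipeline uses.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def select_top_features(rows, labels, k):
    """Keep the k feature columns with the largest |correlation| to the label."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, labels)), j))
    scores.sort(reverse=True)
    keep = sorted(j for _, j in scores[:k])  # column indices to keep
    reduced = [[row[j] for j in keep] for row in rows]
    return keep, reduced


# Hypothetical toy data: three raw features per sample and a binary label.
# The middle feature is constant, so it should be dropped.
rows = [
    [1.0, 5.0, 0.2],
    [2.0, 5.0, 0.1],
    [3.0, 5.0, 0.4],
    [4.0, 5.0, 0.3],
]
labels = [0, 0, 1, 1]

keep, reduced = select_top_features(rows, labels, k=2)
print("kept feature indices:", keep)
print("reduced rows:", reduced)
```

The reduced rows would then be handed to a classical model. With a deep network, the usual approach is to skip this step and feed the raw inputs in directly.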

Classical Machine Learning > Deep Learning

  • Works better on small data: To achieve high performance, deep networks require extremely large datasets. The pre-trained networks mentioned before were trained on 1.2 million images. For many applications, such large datasets are not readily available and would be expensive and time-consuming to acquire. For smaller datasets, classical ML algorithms often outperform deep networks.
  • Financially and computationally cheap: Deep networks require high-end GPUs to be trained on big data in a reasonable amount of time. These GPUs are very expensive, yet without them training deep networks to high performance would not be practically feasible. Using such GPUs effectively also requires a fast CPU, SSD storage, and plenty of fast RAM. Classical ML algorithms can be trained just fine on a decent CPU, without requiring the best of the best hardware. Because they aren’t as computationally expensive, one can also iterate faster and try out many different techniques in a shorter period of time.
  • Easier to interpret: Due to the direct feature engineering involved in classical ML, these algorithms are quite easy to interpret and understand. In addition, tuning hyper-parameters and altering model designs is more straightforward, since we have a more thorough understanding of the data and the underlying algorithms. Deep networks, on the other hand, are very much a “black box”: even now, researchers do not fully understand the “inside” of deep networks. Choosing hyper-parameters and designing the network are also quite a challenge due to the lack of a theoretical foundation.
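
As a rough illustration of the points in the list above, here is a minimal sketch of a one-variable linear regression fitted with ordinary least squares in plain Python. It trains instantly on a CPU with only five data points, and the fitted model is just two numbers that can be read off directly. The function names (`fit_line`, `predict`) and the size/price data are hypothetical, chosen only for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Apply the fitted line to a new input."""
    return slope * x + intercept


# Hypothetical toy data: apartment size (m^2) and price (in thousands).
sizes = [50.0, 60.0, 80.0, 100.0, 120.0]
prices = [152.0, 178.0, 241.0, 299.0, 360.0]

slope, intercept = fit_line(sizes, prices)

# The whole model is two numbers: the slope is the estimated price change
# per extra square metre, and the intercept is the baseline.
print(f"price ~ {slope:.3f} * size + {intercept:.3f}")
print(f"predicted price for 90 m^2: {predict(90.0, slope, intercept):.1f}")
```

A deep network fitted to the same five points would have far more parameters than data, and none of them would have such a direct reading.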


There you have it! Your comparison of classical machine learning and deep learning. I hope you enjoyed this post and learned something new and useful. If you did, feel free to give it some claps.


Decisions decisions……