Deep learning

What is deep learning and how can it be used? Find out more here.

Definition

Deep learning is a branch of machine learning based on artificial neural networks with many layers. "Deep" refers to the depth of the network, that is, the number of hidden layers. This depth allows very complex functions to be learned, because each layer extracts more abstract features from the output of the previous one. In recent years, deep learning has enabled major advances in areas that were previously hard to solve, such as image and speech recognition, by using large amounts of data and computing power to learn features automatically.

How deep learning works

In traditional (shallow) machine learning, features often had to be defined manually. In image processing, for example, edges and textures were extracted by hand-written procedures and then fed to a learning algorithm. Deep learning automates this step. A multilayer neural network learns end-to-end: every transformation from the raw input (pixels, raw audio, raw text) to the output (e.g. class labels) is learned by the network itself. Early layers learn simple patterns (e.g. edges in images), middle layers learn combinations of these (e.g. shapes, contours), and later layers learn very abstract concepts (faces, objects). This hierarchical feature formation is the heart of deep learning.
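The layer-by-layer processing described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the layer sizes, the ReLU activation, and the helper names (dense, init_layer, forward) are chosen for the example, and the weights are random rather than trained, so the intermediate values carry no learned meaning.

```python
import random


def relu(values):
    """Element-wise ReLU non-linearity."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum per output unit plus a bias."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def init_layer(n_in, n_out, rng):
    """Random (untrained) weights; one row of n_in weights per output unit."""
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(inputs, layers):
    """Pass the input through every layer and keep each intermediate result."""
    activations = [inputs]
    for index, (weights, biases) in enumerate(layers):
        z = dense(activations[-1], weights, biases)
        is_last = index == len(layers) - 1
        # Hidden layers apply a non-linearity; the output layer stays linear.
        activations.append(z if is_last else relu(z))
    return activations


rng = random.Random(0)
layer_sizes = [8, 6, 4, 2]  # hypothetical: 8 raw inputs, two hidden layers, 2 outputs
layers = [init_layer(a, b, rng) for a, b in zip(layer_sizes, layer_sizes[1:])]
raw_input = [rng.random() for _ in range(layer_sizes[0])]

activations = forward(raw_input, layers)
for depth, values in enumerate(activations):
    print(f"layer {depth}: {len(values)} values")
```

Each entry of activations is the representation at one depth of the network. In a trained image model, the early entries would correspond to simple patterns and the later ones to more abstract features.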

Deep networks need a lot of data. They are trained with backpropagation, which computes how the error at the output depends on each weight so that the weights can be adjusted step by step. The availability of large data sets (ImageNet, very large text corpora) and modern hardware (GPUs, TPUs) was a decisive enabler for deep learning.
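To make the training step concrete, here is a minimal sketch of backpropagation with gradient descent, written in plain Python. Everything specific in it is an assumption made for illustration: the XOR toy data, the single hidden layer of three tanh units, the mean squared error loss, the learning rate, and the function names. Real systems use far larger networks and rely on libraries that compute these gradients automatically.

```python
import math
import random

# Toy data set (assumption for this sketch): the XOR function.
DATA = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0), ([1.0, 0.0], 1.0), ([1.0, 1.0], 0.0)]
HIDDEN = 3  # number of hidden units, chosen arbitrarily


def init_params(rng):
    return {
        "w1": [[rng.uniform(-1.0, 1.0) for _ in range(2)] for _ in range(HIDDEN)],
        "b1": [0.0] * HIDDEN,
        "w2": [rng.uniform(-1.0, 1.0) for _ in range(HIDDEN)],
        "b2": 0.0,
    }


def forward(p, x):
    """Return the hidden activations and the scalar output for one input."""
    h = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(p["w1"], p["b1"])
    ]
    y = sum(w * hi for w, hi in zip(p["w2"], h)) + p["b2"]
    return h, y


def mean_loss(p):
    """Mean squared error over the whole toy data set."""
    return sum((forward(p, x)[1] - t) ** 2 for x, t in DATA) / len(DATA)


def train_step(p, lr):
    """One full-batch gradient descent step using backpropagation."""
    g_w1 = [[0.0, 0.0] for _ in range(HIDDEN)]
    g_b1 = [0.0] * HIDDEN
    g_w2 = [0.0] * HIDDEN
    g_b2 = 0.0
    for x, t in DATA:
        h, y = forward(p, x)
        d_y = 2.0 * (y - t) / len(DATA)  # derivative of the loss w.r.t. the output
        g_b2 += d_y
        for j in range(HIDDEN):
            g_w2[j] += d_y * h[j]
            # Chain rule: back through the output weight and the tanh unit.
            d_z = d_y * p["w2"][j] * (1.0 - h[j] ** 2)
            g_b1[j] += d_z
            for i in range(2):
                g_w1[j][i] += d_z * x[i]
    # Move every weight a small step against its gradient.
    p["b2"] -= lr * g_b2
    for j in range(HIDDEN):
        p["w2"][j] -= lr * g_w2[j]
        p["b1"][j] -= lr * g_b1[j]
        for i in range(2):
            p["w1"][j][i] -= lr * g_w1[j][i]


params = init_params(random.Random(0))
loss_before = mean_loss(params)
for _ in range(2000):
    train_step(params, lr=0.1)
loss_after = mean_loss(params)
print(f"loss before: {loss_before:.4f}, loss after: {loss_after:.4f}")
```

The gradients are accumulated over all examples before any weight changes, so each update uses the weights that produced the error. How low the loss ends up depends on the random starting weights; the sketch only shows that repeated small adjustments reduce it.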

Applications

Deep learning has achieved state-of-the-art results in numerous fields:

Computer vision: image recognition, facial identification, medical image analysis (e.g. detection of tumors in scans), autonomous driving (traffic sign recognition, environment analysis).
Language processing and translation: systems such as Google Translate have improved substantially through deep learning. Voice assistants interpret spoken input using deep networks, and chatbots generate fluent text with transformer models.
Speech synthesis: realistic-sounding voice output, including voice deepfakes.
Robotics: sensor fusion and control using deep neural networks (e.g. camera-guided control of a gripper arm).
Games and simulations: deep reinforcement learning (the combination of deep learning and reinforcement learning) has mastered complex games and helps optimize networks, traffic flows, etc.
Science: prediction of protein structures (AlphaFold), discovery of correlations in large physical or biological data sets.

Significance

Deep learning is often treated as almost synonymous with the current success story of AI. It is widely credited with ending the "AI winter" and ushering in a new era in which machines match or surpass human performance on many benchmarks. However, deep learning is not everything: it requires a lot of data and computing power, is difficult to explain, and can behave unreliably on inputs outside its training distribution. Nevertheless, the paradigm of passing data through many layers of processing to learn complex structures is currently the most successful learning method, and research is working on making it more efficient, robust and understandable.
