Transformer
What is a Transformer in the context of machine learning, and how is it used? Find out more here.
Definition
The Transformer is a neural network architecture based on the concept of self-attention. It was originally developed for natural language processing, specifically machine translation, and has since replaced RNNs and CNNs in many areas. It forms the basis of modern large language models (LLMs) such as GPT and BERT.
Special features
The core of the architecture is the self-attention mechanism, which weighs every position in a sequence against every other position. Unlike recurrent networks, a Transformer can therefore process an entire sequence in parallel and capture long-range dependencies, which makes training on large datasets highly scalable.
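To make the self-attention idea concrete, here is a minimal NumPy sketch of scaled dot-product attention for a single head. The matrix names, shapes, and random toy inputs are illustrative assumptions, not part of any particular library or of the original text.

```python
# Minimal sketch of single-head scaled dot-product self-attention (NumPy only).
# Shapes and weight matrices are illustrative assumptions.
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """X: (seq_len, d_model); Wq/Wk/Wv: (d_model, d_k) projection matrices."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv            # project inputs to queries, keys, values
    scores = Q @ K.T / np.sqrt(K.shape[-1])      # similarity of every position with every other
    weights = softmax(scores, axis=-1)           # attention weights sum to 1 per query position
    return weights @ V                           # weighted sum of the value vectors

# Toy example: 4 tokens with embedding size 8, projected to dimension 8.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
print(self_attention(X, Wq, Wk, Wv).shape)       # (4, 8)
```

Because the attention weights for all positions are computed as one matrix product, the whole sequence is processed at once rather than step by step as in an RNN.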
Applications
Transformers have revolutionized machine translation, text generation, and text classification. They are now also used in computer vision (Vision Transformers) and in multimodal models that combine text and images.
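As an illustration of how such models are used in practice, the following sketch runs a pretrained Transformer for text classification via the Hugging Face transformers library; the library choice and the printed output are assumptions for illustration, not part of the original text.

```python
# Usage sketch, assuming the Hugging Face "transformers" library is installed
# (pip install transformers); the pretrained model is downloaded on first run.
from transformers import pipeline

# Text classification with a pretrained Transformer (default English sentiment model).
classifier = pipeline("sentiment-analysis")
print(classifier("Transformers have revolutionized machine translation."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```

The same pipeline interface exposes other Transformer-based tasks mentioned above, such as translation and text generation, by changing the task name and model.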