Small Language Model

What is an SLM and what can it be used for? How does an SLM differ from an LLM? You can find the answers here.

What is an SLM?

A small language model (SLM) is an AI language model with significantly fewer parameters than large language models (LLMs), trained on smaller, more specific data sets. SLMs are often specialized for particular domains or tasks. Because of their smaller size, they require fewer computing resources and can be trained and deployed more quickly. However, their general linguistic knowledge is more limited than that of very large models.

Features and advantages

Small language models are specifically trained with curated data sources that are relevant for a particular application. For example, an SLM could be trained exclusively on legal texts to serve as a helper for lawyers. Some characteristics are:

  • Domain expertise: A "legalese" SLM knows many legal terms and phrases that a general model might not handle as precisely.

  • Efficiency: Fewer parameters mean lower memory requirements and often faster execution. SLMs can often also run on devices with limited hardware (edge devices) where an LLM would be too large.

  • Training time: Due to the smaller size and the focused data set, training and fine-tuning take less time. Updates (when new data is added) are also possible more quickly.

  • Fewer hallucinations? Since the model's knowledge is deliberately limited, it may be less inclined to hallucinate. However, it may also simply have to pass when asked something outside its area of expertise.
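The efficiency point can be made concrete with a back-of-the-envelope calculation. The sketch below assumes FP16/BF16 weights (2 bytes per parameter) and counts only the weights, ignoring activations, KV cache, and optimizer state; the parameter counts are illustrative:

```python
def model_memory_gb(num_params: int, bytes_per_param: int = 2) -> float:
    """Rough memory footprint of model weights in GiB.

    Assumes FP16/BF16 storage (2 bytes per parameter) by default.
    """
    return num_params * bytes_per_param / 1024**3

# A 500M-parameter SLM vs. a 70B-parameter LLM, weights only:
slm_gb = model_memory_gb(500_000_000)      # well under 1 GiB
llm_gb = model_memory_gb(70_000_000_000)   # on the order of 130 GiB
```

The roughly 140x gap is why an SLM can fit on a single edge device or consumer GPU while the LLM needs a multi-GPU server.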

Use cases

SLMs are used where tailor-made work is required instead of a scattergun approach. In companies, a small language model can be trained with the company's own documentation to answer employee queries (e.g. "How do I apply for leave?" based on internal HR guidelines). In medical applications, an SLM could be trained specifically on cardiology literature to assist doctors with specialist questions – but it would not necessarily be familiar with other medical fields. The general trend is to take a large pre-trained base model and, via fine-tuning or prompt engineering, derive a smaller, specialized model that can do exactly what is needed – an SLM in the broader sense.

SLM vs. LLM

Small language models are related to large language models. There is no fixed threshold for what counts as "small" – it depends on the context. In an age where billion-parameter models are common, models with a few hundred million parameters or fewer can be considered "small". Importantly, bigger is not always better. If the task is narrowly defined, a leaner model trained on a focused dataset can deliver more accurate results because it is not distracted by irrelevant general training data. In addition, SLMs are often more cost-efficient to operate and more environmentally friendly (lower power consumption). In AI strategy, many therefore rely on training large models and then distilling or fine-tuning them to obtain practical SLMs for real-world use.
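Distillation, mentioned above, trains a small student model to imitate a large teacher's output distribution rather than just the hard labels. A minimal sketch of the core loss term in plain Python (the logits and temperature are illustrative; real pipelines compute this over whole vocabularies in a tensor library and combine it with the usual cross-entropy loss):

```python
import math

def softmax(logits, temperature=1.0):
    # Scale logits by temperature, then normalize to probabilities.
    # Higher temperature softens the distribution, exposing the
    # teacher's "dark knowledge" about near-miss classes.
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the softened teacher distribution to the
    student's distribution: zero when the student matches exactly."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
```

Minimizing this loss pushes the student's probabilities toward the teacher's, which is how a compact SLM can inherit much of a large model's behavior on the target domain.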
