Small Language Model
What is an SLM and what can it be used for? How does an SLM differ from an LLM? You can find the answers here.
What is an SLM?
A small language model (SLM) is an AI language model that has significantly fewer parameters than large language models (LLMs) and has been trained on smaller, more specific data sets. SLMs are often specialized for certain domains or tasks. Due to their smaller size, they require fewer computing resources and can be trained and deployed more quickly. However, their general linguistic knowledge is narrower than that of large models.
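The resource gap can be made concrete with a rough back-of-the-envelope calculation: weight memory scales linearly with parameter count. The sketch below (parameter counts are illustrative assumptions, not figures from this article) estimates how much memory the weights alone occupy at 16-bit precision.

```python
def model_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for model weights alone.

    Assumes 16-bit (fp16/bf16) weights, i.e. 2 bytes per parameter;
    activations, optimizer state, etc. would come on top.
    """
    return n_params * bytes_per_param / 1024**3

# Illustrative sizes: a 70-billion-parameter LLM vs. a 500-million-parameter SLM.
llm_gb = model_memory_gb(70e9)   # roughly 130 GB -- multiple data-center GPUs
slm_gb = model_memory_gb(500e6)  # under 1 GB -- fits on a laptop or phone
```

The two orders of magnitude between the results are what make SLMs cheaper to train, deploy, and run.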
Features and advantages
Small language models are specifically trained with curated data sources that are relevant for a particular application. For example, an SLM could be trained exclusively on legal texts to serve as a helper for lawyers. Some characteristics are:

- significantly fewer parameters than LLMs
- training on smaller, curated, domain-specific data sets
- faster and cheaper training and deployment
- lower demands on computing resources and power
- narrower general knowledge outside the target domain
Use cases
SLMs are used where tailor-made work is required instead of a scattergun approach. In companies, a small language model can be trained with the company's own documentation to answer employee queries (e.g. "How do I apply for leave?" based on internal HR guidelines). In medical applications, an SLM could be trained specifically on cardiology literature to assist doctors with specialist questions – but it would not necessarily be familiar with other medical fields. The general trend is to take a large pre-trained base model and, via fine-tuning or prompt engineering, derive a smaller, specialized model that can do exactly what is needed – an SLM in the broader sense.
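A real SLM would generate answers from what it learned during fine-tuning; as a stdlib-only stand-in for illustration, the toy lookup below mimics the HR example by matching an employee query against a few hypothetical documentation snippets (all names and texts are invented for this sketch).

```python
import difflib

# Hypothetical internal HR snippets; a real SLM would be fine-tuned on the
# company's full documentation rather than matching against a tiny dictionary.
hr_docs = {
    "How do I apply for leave?":
        "Submit a leave request via the HR portal at least two weeks in advance.",
    "How do I submit an expense report?":
        "Upload your receipts to the expense tool by the end of the month.",
}

def answer(query: str) -> str:
    """Return the snippet whose stored question best matches the query."""
    match = difflib.get_close_matches(query, hr_docs.keys(), n=1, cutoff=0.4)
    return hr_docs[match[0]] if match else "No internal guideline found."
```

For example, `answer("How do I apply for leave?")` returns the leave-request snippet, while an unrelated query falls through to the fallback message.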
SLM vs. LLM
Small language models are defined relative to large language models. There is no fixed limit on exactly what "small" means – it depends on the context. In an age where billion-parameter models are common, models with a few hundred million parameters or fewer could be considered "small". Importantly, bigger is not always better. If the task is narrowly defined, a leaner model with a focused dataset can provide more accurate results because it is not distracted by irrelevant general training data. In addition, SLMs are often more cost-efficient to operate and more environmentally friendly (lower power consumption). In AI strategy, many therefore rely on training large models and then distilling or fine-tuning them to obtain practical SLMs for real-world use.
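Distillation, mentioned above, trains the small "student" model to imitate the large "teacher" model's output distribution rather than only the hard labels. A minimal sketch of the core loss (temperature value and logits are illustrative assumptions):

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax with temperature; higher temperature softens the distribution."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)                      # subtract max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """Cross-entropy of the student against the teacher's softened distribution.

    This is the knowledge-distillation term; minimizing it pulls the student's
    predictions toward the teacher's.
    """
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    return -sum(ti * math.log(si) for ti, si in zip(t, s))
```

By Gibbs' inequality, the loss is smallest when the student reproduces the teacher's distribution exactly, which is precisely the imitation objective.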