Will 2024 Be the Year of Small Language Models (SLMs)?

Should we be taking Microsoft's Phi-2 seriously?

Dec 14, 2023

Satya Nadella on stage at Microsoft Ignite 2023 announcing Phi-2.

Hey Everyone,

As 2023 draws to a close, we are seeing a bevy of small language models (SLMs) with supposedly good benchmarks from Mistral, Together AI, Meta, Google and even Microsoft.

Microsoft Research should not be underestimated even if these benchmark comparisons cannot be taken too seriously.

Microsoft Research has announced the release of Phi-2, a small language model (SLM) demonstrating remarkable capabilities for its size. Small language models got a lot better in just a few months of 2023, which means that in 2024 they could produce some interesting results.

The Phi-2 model has not been tested to ensure that it performs adequately for any production-level application. It appears to be more of a lead magnet for Azure AI Studio (a playground for researchers).

Phi-2 is a Transformer-based model with a next-word prediction objective, trained on 1.4T tokens from multiple passes over a mixture of synthetic and web datasets for NLP and coding. Training took 14 days on 96 A100 GPUs. Phi-2 is a base model: it has not undergone alignment through reinforcement learning from human feedback (RLHF), nor has it been instruction fine-tuned.
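To put those training figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above (1.4T tokens, 14 days, 96 A100 GPUs) and assumes, purely for illustration, that all 96 GPUs ran continuously for the full 14 days. The function names are hypothetical helpers, not anything from Microsoft.

```python
# Back-of-the-envelope training budget for Phi-2.
# Inputs are the figures quoted in the post; the assumption that every GPU
# was busy for the entire 14 days is ours and is only an approximation.

def gpu_hours(days: float, num_gpus: int) -> float:
    """Total GPU-hours if all GPUs run for the whole period."""
    return days * 24 * num_gpus


def tokens_per_gpu_second(total_tokens: float, days: float, num_gpus: int) -> float:
    """Average training tokens processed per GPU per second."""
    return total_tokens / (gpu_hours(days, num_gpus) * 3600)


if __name__ == "__main__":
    hours = gpu_hours(days=14, num_gpus=96)
    rate = tokens_per_gpu_second(total_tokens=1.4e12, days=14, num_gpus=96)
    print(f"GPU-hours: {hours:,.0f}")
    print(f"Tokens per GPU-second: {rate:,.0f}")
```

Under that assumption the run works out to 32,256 GPU-hours, which is small by the standards of frontier models and part of why SLMs are interesting.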

To the best of my knowledge, the lead for Phi-2 is Mojan Javaheripi. She works on "Physics of Large Language Models" and is broadly interested in enhancing LLMs through new data sources, training regimens, and model architectures.

Mojan Javaheripi - Microsoft Research

Phi-2 was released on December 12th, 2023.

Small language models will evolve at an unprecedented pace in 2024, and I see Apple becoming a potential winner as well.

The previous papers, for reference (courtesy of Sebastian Raschka, PhD):

- Phi-1 (Textbooks Are All You Need, https://arxiv.org/abs/2306.11644)
- Phi-1.5 (Textbooks Are All You Need II, https://arxiv.org/abs/2309.05463)

Some of the mentions of Phi-2 on X (formerly Twitter) are worth reading this week.

Continue reading this post for free, courtesy of Michael Spencer.