Will 2024 be the Year of Small Language Models (SLMs)?
Should we be taking Microsoft's Phi-2 seriously?
Hey Everyone,
Nearing the end of 2023, we are seeing a bevy of small language models (SLMs) with supposedly good benchmarks from Mistral, Together AI, Meta, Google, and even Microsoft.
Microsoft Research should not be underestimated even if these benchmark comparisons cannot be taken too seriously.
Microsoft Research has announced the release of Phi-2, a small language model (SLM) demonstrating remarkable capabilities for its size. Small language models got a lot better in just a few months of 2023, which means that in 2024 they could produce some interesting results.
The Phi-2 model has not been tested to ensure that it performs adequately for any production-level application. It appears to be more like a lead magnet for Azure AI Studio (a playground for researchers).
Phi-2 is a Transformer-based model with a next-word prediction objective, trained on 1.4T tokens from multiple passes over a mixture of synthetic and web datasets for NLP and coding. Training Phi-2 took 14 days on 96 A100 GPUs. Phi-2 is a base model: it has not undergone alignment through reinforcement learning from human feedback (RLHF), nor has it been instruction fine-tuned.
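To put those training figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted above (1.4T tokens, 14 days, 96 A100 GPUs). The function names are made up for illustration, and the throughput figure assumes every GPU was busy for the full 14 days, which is almost certainly optimistic.

```python
# Back-of-the-envelope arithmetic from the figures quoted in the post.
# Assumption: all 96 GPUs ran continuously for the full 14 days.

TRAINING_DAYS = 14
NUM_GPUS = 96
TOKENS_SEEN = 1.4e12  # 1.4T tokens, counting repeated passes


def gpu_hours(days: int, gpus: int) -> int:
    """Total GPU-hours for a run of `days` days on `gpus` GPUs."""
    return days * 24 * gpus


def tokens_per_gpu_second(tokens: float, days: int, gpus: int) -> float:
    """Average tokens processed per GPU per second over the whole run."""
    total_gpu_seconds = days * 24 * 3600 * gpus
    return tokens / total_gpu_seconds


if __name__ == "__main__":
    hours = gpu_hours(TRAINING_DAYS, NUM_GPUS)
    rate = tokens_per_gpu_second(TOKENS_SEEN, TRAINING_DAYS, NUM_GPUS)
    print(f"GPU-hours: {hours}")  # 14 * 24 * 96 = 32256
    print(f"Average tokens per GPU-second: {rate:.0f}")
```

Roughly 32,000 A100-hours is a small budget by frontier-model standards, which is part of why small language models are getting so much attention.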
To the best of my knowledge, the lead for Phi-2 is Mojan Javaheripi. She works on "Physics of Large Language Models" and is broadly interested in enhancing LLMs through new data sources, training regimens, and model architectures.
Mojan Javaheripi - Microsoft Research
Phi-2 was released on December 12th, 2023.
Small language models will evolve at an unprecedented pace in 2024, and I see Apple becoming a potential winner as well.
The previous papers, for reference:
- Phi-1 (Textbooks Are All You Need, https://arxiv.org/abs/2306.11644)
- Phi-1.5 (Textbooks Are All You Need II, https://arxiv.org/abs/2309.05463)
Some of the mentions of Phi-2 on X (formerly Twitter) are worth reading this week.