AI Supremacy

Guides 🦮

Run Language Models on Your Computer with LM-Studio

A practical guide to running local models and picking the right one for speed or accuracy.

Michael Spencer and Benjamin Marie
Feb 12, 2026
Made with Gemini / Nano Banana.

Good Morning,

You can do so much with AI, and the building and DIY side keeps getting more nuanced and more powerful. Open-source AI is putting a new array of capabilities within reach, even at the local and individual level.

I’m a huge fan of Benjamin Marie and have wanted to share his work for a long time. Today, we finally have the chance. Ben is an independent AI researcher (LLMs, NLP) with two really useful blogs, and I have huge respect for his work (don’t let the funny names fool you; these are serious resources):

The Kaitchup – AI on a Budget šŸ…

Hands-on AI tutorials and news on adapting language models to your tasks and hardware in a DIY setting, using the most recent techniques and models.

The Kaitchup – AI on a Budget
Weekly tutorials and news on adapting large language models (LLMs) to your tasks and hardware using the most recent techniques and models. The Kaitchup proposes a collection of 170+ AI notebooks regularly updated.
By Benjamin Marie

The Kaitchup publishes invaluable weekly tutorials with info that’s hard to find elsewhere.

  • As a paid subscriber to The Kaitchup, you also get access to all the AI notebooks (160+), hands-on tutorials, and more in-depth analyses of recently published scientific papers.


Read The Salt šŸ§‚

Reviews and in-depth analysis of bleeding-edge AI research and how-tos. The Salt is a newsletter for readers who are curious about the science behind AI. If you want to stay informed about recent progress in AI without reading much, The Salt is for you! I do my best to offer articles that are interesting to a wide variety of readers.

The Salt - Curated AI
The Salt offers weekly reviews and in-depth analyses of the latest AI papers. If you want to stay informed of recent progress in AI without reading much, The Salt is for you!
By Benjamin Marie

Benjamin’s technical and practical knowledge is invaluable however deep down the DIY rabbit hole you want to go with models. His writing is not overly technical, even though it covers technical topics, which makes it useful for a wide range of readers experimenting locally with models, on their own or in small teams.

Selected Works

I asked him for a basic beginner’s tutorial on how to run LLMs locally (something I sometimes get questions about). He brings so much practical know-how and insight into the latest models that, for me, he is an authority: when a new model comes out, his opinion reflects both hands-on experience and familiarity with the latest scientific papers.

  1. Qwen3-VL: DeepStack Fusion, Interleaved-MRoPE, and a Native 256K Interleaved Context Window.

  2. Did the Model See the Benchmark During Training? Detecting LLM Contamination

  3. Making LLMs Think Longer: Context, State, and Post-Training Tricks

Benjamin Marie is an independent researcher focused on hands-on AI and the tooling around modern language models. He helps people and companies cut costs by adapting models to their specific tasks and hardware. While my own work doesn’t often touch on machine learning professionals, more and more individuals and small teams are experimenting with these open-source models locally, and I hope you learn something from this guide.

So I’m very proud to be able to bring you a guide like this:

Run Language Models on Your Computer with LM-Studio

A practical guide to running local models and picking the right one for speed or accuracy.

Image generated with ChatGPT

Running large language models (LLMs) locally used to mean wrestling with the GPU’s software layer (like CUDA), scattered model formats, and a lot of trial-and-error. Today, it’s surprisingly approachable. With tools like Ollama or LM Studio, you can download a model, load it in a few clicks, and start chatting on your own machine, without sending prompts to a cloud service.

This article walks through the practical path from ā€œinstalling the appā€ to ā€œrunning my first local model,ā€ and then zooms out to the part that really matters: what determines whether a model runs smoothly (or not) on your hardware. Along the way, we’ll cover installing LM Studio, the simple memory math behind model sizes, how to pick trustworthy GGUF builds and compression levels, how to sanity-check model output, and why ā€œthinkingā€ models can be dramatically better on hard prompts while also being noticeably slower.
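To give a flavor of the memory math the guide covers, here is a minimal back-of-the-envelope sketch (my own illustration, not code from the guide): memory for a local model is roughly parameter count times bytes per weight, where a quantized GGUF build at 4 bits per weight needs about a quarter of the memory of a 16-bit original. The overhead factor is an assumption standing in for the KV cache and runtime buffers, which vary with context length.

```python
def estimate_model_memory_gb(num_params_billions: float,
                             bits_per_weight: float,
                             overhead: float = 1.2) -> float:
    """Rough memory estimate (in GiB) for running a local LLM.

    Assumes weights dominate memory use; `overhead` is a crude
    placeholder for the KV cache and runtime buffers.
    """
    bytes_per_weight = bits_per_weight / 8
    weight_gib = num_params_billions * 1e9 * bytes_per_weight / 1024**3
    return round(weight_gib * overhead, 1)

# An 8B model in 16-bit precision vs. a 4-bit quantized build:
print(estimate_model_memory_gb(8, 16))  # ~17.9 GiB
print(estimate_model_memory_gb(8, 4))   # ~4.5 GiB
```

This is why an 8B model that won’t fit on a 16 GB machine at full precision can run comfortably as a 4-bit GGUF; the guide goes into how to choose among the compression levels.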

The goal is not to turn you into an engineer. It’s to give you enough intuition to choose models confidently, understand what an application like LM Studio is telling you, and avoid the most common ā€œwhy is this slow / why is this wrongā€ surprises.

A guest post by
Benjamin Marie
Research scientist in NLP/AI.
Ā© 2026 Michael Spencer Ā· Privacy āˆ™ Terms āˆ™ Collection notice