Article·simonwillison.net

llm · machine-learning · content-creation · deployment · open-source · qwen3.6-35b-a3b · claude-opus-4.7

Qwen3.6-35B-A3B on my laptop drew me a better pelican than Claude Opus 4.7

Discover how local AI models can outperform cloud services on specific tasks. Learn to deploy and run a model such as Qwen on your own laptop, cutting API costs and keeping your data private for creative or specialized applications.

intermediate · 30 min · 6 steps

The play

Identify Your Local AI Task
Determine a specific AI task (e.g., text generation, image generation, summarization) where you currently use a cloud model or could benefit from local processing. Think about tasks where privacy, latency, or cost are critical.
Research Suitable Local Models
Search for open-source or locally deployable models that excel at your identified task. Look for models sized or quantized to run on consumer hardware, such as variants of Qwen, Llama, or Stable Diffusion.
Choose a Local Inference Platform
Select a platform or framework for running local models. Popular choices include Ollama and LM Studio (both for LLMs), or lower-level libraries such as `llama.cpp` (LLMs) and `diffusers` (image models), depending on the model type.
Install Platform and Download Model
Set up your chosen local runner (e.g., install Ollama). Then, download a target model (e.g., a Qwen variant) to your local machine using the platform's commands or direct download.
Run and Evaluate Performance
Execute your chosen AI task with the local model. Compare its output quality, generation speed, and resource usage against cloud-based alternatives (if you have them) or against your expectations for the task.
Leverage Local AI Benefits
Based on your evaluation, integrate the local model into your workflow or applications. Capitalize on benefits like enhanced data privacy, reduced latency, and lower operational costs compared to continuous cloud API calls.
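
The evaluation step can be made a little more concrete with a timing helper. What follows is a minimal sketch, not a benchmark harness: `time_run` is a hypothetical helper name (not part of Ollama or any other tool), and it measures wall-clock time only, at one-second resolution.

```shell
#!/bin/sh

# time_run (hypothetical helper): run a command, discard its output,
# and print the elapsed wall-clock time in whole seconds.
time_run() {
  local start end
  start=$(date +%s)
  "$@" > /dev/null 2>&1
  end=$(date +%s)
  echo $(( end - start ))
}

# Example use, assuming Ollama is installed and the model is already pulled:
#   time_run ollama run qwen:7b "Describe a pelican in flight."
```

Timing the same prompt against a cloud model's command-line client in the same way gives a rough side-by-side figure for speed. Output quality still has to be judged by reading the results.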

Starter code

# Install Ollama (if not already installed)
# curl -fsSL https://ollama.com/install.sh | sh

# Pull a Qwen model (e.g., qwen:7b or qwen:14b; Ollama tags are lowercase)
ollama pull qwen:7b

# Run the model and ask it for a creative description
ollama run qwen:7b "Describe a pelican in flight over a stormy sea, in a poetic style, focusing on its wingspan and grace."
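
For scripted use, the same kind of prompt can be sent to Ollama's local HTTP API instead of the interactive CLI. This is a sketch under stated assumptions: it presumes the Ollama server is listening on its default address, `http://localhost:11434`, and that `qwen:7b` has already been pulled. `build_payload` and `ask_local` are hypothetical helper names, and the JSON escaping covers only backslashes and double quotes, so prompts should be single-line.

```shell
#!/bin/sh

# build_payload (hypothetical helper): print a JSON request body for
# Ollama's /api/generate endpoint. Only backslashes and double quotes
# in the prompt are escaped; newlines and control characters are not.
build_payload() {
  local model prompt
  model="$1"
  prompt=$(printf '%s' "$2" | sed 's/\\/\\\\/g; s/"/\\"/g')
  printf '{"model":"%s","prompt":"%s","stream":false}' "$model" "$prompt"
}

# ask_local (hypothetical helper): POST the payload to a local Ollama server.
ask_local() {
  curl -s http://localhost:11434/api/generate -d "$(build_payload "$1" "$2")"
}

# Example, assuming the server is running and the model is pulled:
#   ask_local qwen:7b "Describe a pelican in flight over a stormy sea."
```

With `"stream": false` the server replies with a single JSON object whose `response` field holds the generated text. If `jq` happens to be installed, piping the result through `jq -r .response` prints just that text.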

Source

Article·simonwillison.net