Article·simonwillison.net
llm · prompt-engineering · evaluation · fine-tuning · research · api-design · deployment
Changes in the system prompt between Claude Opus 4.6 and 4.7
System prompt changes between LLM versions (e.g., Claude Opus 4.6 to 4.7) significantly impact model behavior and application performance. Developers must re-evaluate and adapt prompt engineering strategies to maintain consistency and optimize outputs.
intermediate · 1 hour · 6 steps
The play
- Acknowledge System Prompt Evolution: Understand that the underlying system prompts defining an LLM's persona and constraints can change between model versions, and that even subtle edits can shift responses significantly.
- Proactively Identify Changes: Before migrating or updating LLM versions, review release notes or run comparative tests to identify any modifications to the model's default or implied system prompt.
- Re-evaluate Prompt Engineering: Thoroughly re-evaluate your existing prompt engineering strategies, including RAG pipelines and fine-tuning approaches, because performance benchmarks from a previous version may no longer apply.
- Adjust for New Behaviors: Adjust your prompts, application logic, or data pipelines to account for new model behaviors, unexpected outputs, or altered instruction adherence introduced by system prompt changes.
- Implement Robust Testing: Establish evaluation frameworks and continuous testing that check whether model behavior stays consistent and predictable across LLM versions, so the application remains stable.
- Adopt Continuous Adaptation: Treat prompt engineering as an ongoing adaptation process rather than a one-time setup, and put your prompts under version control and continuous monitoring.
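The "Proactively Identify Changes" step can be sketched with Python's standard `difflib`, assuming you have saved the two system prompt versions as plain text. The prompt strings below are placeholders for illustration, not the real published prompts, and `diff_prompts` is a hypothetical helper name.

```python
import difflib


def diff_prompts(old: str, new: str, old_label: str = "old", new_label: str = "new") -> list[str]:
    """Return unified-diff lines describing how `new` differs from `old`."""
    return list(
        difflib.unified_diff(
            old.splitlines(),
            new.splitlines(),
            fromfile=old_label,
            tofile=new_label,
            lineterm="",
        )
    )


# Placeholder text for illustration only (not taken from any real system prompt).
OLD_PROMPT = "You are a helpful assistant.\nAnswer questions directly."
NEW_PROMPT = "You are a helpful assistant.\nAsk for clarification when a request is ambiguous."

if __name__ == "__main__":
    for line in diff_prompts(OLD_PROMPT, NEW_PROMPT):
        print(line)
```

Lines starting with `-` were dropped in the newer prompt and lines starting with `+` were added, which gives a quick list of behaviors worth re-testing.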
Starter code
```python
# Illustrative sketch: both prompts below are made up for demonstration and are
# NOT the system prompts published for Claude Opus 4.6 or 4.7.
from anthropic import Anthropic

# Stand-in for the system prompt paired with an older model version.
SYSTEM_PROMPT_V_OLD = """You are a helpful, concise AI assistant. Answer questions directly.
"""

# Stand-in for a newer version; note the shifts in persona, constraints, and output style.
SYSTEM_PROMPT_V_NEW = """You are an expert AI assistant, skilled in providing detailed and structured explanations.
Always ask for clarification if the user's request is ambiguous. Prioritize accuracy.
"""

# Model ID kept from the original example; replace it with the version you are testing.
MODEL = "claude-3-opus-20240229"


def ask(client: Anthropic, system_prompt: str, question: str) -> str:
    """Send one question with an explicit system prompt and return the reply text."""
    response = client.messages.create(
        model=MODEL,
        max_tokens=1024,
        system=system_prompt,
        messages=[{"role": "user", "content": question}],
    )
    # response.content is a list of content blocks; the first block holds the text here.
    return response.content[0].text


if __name__ == "__main__":
    client = Anthropic()  # reads ANTHROPIC_API_KEY from the environment
    # Action: compare the two outputs to see how a change in system prompt shifts behavior.
    for label, prompt in [("old", SYSTEM_PROMPT_V_OLD), ("new", SYSTEM_PROMPT_V_NEW)]:
        print(f"--- {label} ---")
        print(ask(client, prompt, "Explain quantum entanglement."))
```
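The "Implement Robust Testing" step could look something like the sketch below: a small regression harness that runs the same cases against two model callables and reports which checks pass on the old version but fail on the new one. `EvalCase`, `run_suite`, `regressions`, and the stub models are all hypothetical names for illustration. In a real setup each callable would wrap an API call to a specific model version.

```python
from dataclasses import dataclass
from typing import Callable

ModelFn = Callable[[str], str]  # takes a user prompt, returns the model's text


@dataclass(frozen=True)
class EvalCase:
    name: str
    prompt: str
    check: Callable[[str], bool]  # returns True when the output is acceptable


def run_suite(model: ModelFn, cases: list[EvalCase]) -> dict[str, bool]:
    """Run every case through `model` and record whether its check passed."""
    return {case.name: case.check(model(case.prompt)) for case in cases}


def regressions(old: dict[str, bool], new: dict[str, bool]) -> list[str]:
    """Names of cases that passed on the old version but fail on the new one."""
    return sorted(name for name, passed in old.items() if passed and not new.get(name, False))


# Stub models standing in for two LLM versions (assumed behavior, no API calls).
def stub_old_model(prompt: str) -> str:
    return "Paris."


def stub_new_model(prompt: str) -> str:
    return "Could you clarify which country you mean?"


CASES = [
    EvalCase("mentions_paris", "What is the capital of France?", lambda out: "Paris" in out),
    EvalCase("under_100_chars", "What is the capital of France?", lambda out: len(out) < 100),
]

if __name__ == "__main__":
    old_results = run_suite(stub_old_model, CASES)
    new_results = run_suite(stub_new_model, CASES)
    print(regressions(old_results, new_results))
```

Keeping the checks as plain functions makes it easy to store the suite in version control next to the prompts and rerun it whenever the model version changes.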