tutorial • intermediate
Extended thinking
Feb 2025 • Thinking
Use Claude's extended thinking for transparent step-by-step reasoning with budget management.
cookbook
This cookbook demonstrates Claude 3.7 Sonnet's extended thinking feature, which provides transparent step-by-step reasoning with budget management. Extended thinking enables Claude to show its internal reasoning process through thinking content blocks before delivering final answers. The guide covers setup, basic examples, streaming with thinking, token counting, redacted thinking, and error handling with practical Python code examples.
Key Points
- Enable extended thinking by setting `thinking: {"type": "enabled", "budget_tokens": 2000}` in the API request.
- With extended thinking enabled, the response contains `thinking` content blocks that show Claude's step-by-step reasoning, followed by a `text` block with the final answer.
- The `print_thinking_response()` helper walks `response.content` and prints each block according to its type: `thinking`, `redacted_thinking`, or `text`.
- To stream with extended thinking, use `client.messages.stream()` and handle `content_block_delta` events; `thinking_delta` and `text_delta` payloads can be printed as they arrive.
- Token counting helps manage the context window: `client.messages.count_tokens()` returns the input token count for a message list before you send the request.
- Thinking blocks may include a signature used for verification; confirm that `block.signature` is present before reading it.
- Redacted thinking blocks are identified by `block.type == "redacted_thinking"` and expose a `data` field rather than readable thinking text, so handle them separately from regular thinking blocks.
- The `budget_tokens` parameter caps the tokens allocated for reasoning; a larger budget allows deeper reasoning at the cost of higher latency.
- The puzzle example (the "missing dollar" hotel riddle) shows how extended thinking helps identify logical errors by exposing the complete reasoning chain rather than only the conclusion.
- Handle edge cases where thinking is redacted or unavailable: do not assume every response contains a readable thinking block.
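The budget trade-off in the points above can be checked before a request is sent. The following is a minimal sketch, not part of the cookbook: `build_thinking_config` and `MIN_THINKING_BUDGET` are hypothetical names, and the two rules it enforces (a minimum budget of 1,024 tokens, and a budget smaller than `max_tokens` for ordinary requests) are assumptions based on the API documentation that should be verified against the current docs.

```python
# Assumed minimum thinking budget; confirm against the current API documentation.
MIN_THINKING_BUDGET = 1024


def build_thinking_config(budget_tokens: int, max_tokens: int) -> dict:
    """Return a `thinking` request parameter after basic sanity checks.

    Hypothetical helper: the limits checked here are assumptions, not
    values taken from the cookbook.
    """
    if budget_tokens < MIN_THINKING_BUDGET:
        raise ValueError(
            f"budget_tokens={budget_tokens} is below the assumed minimum "
            f"of {MIN_THINKING_BUDGET}"
        )
    if budget_tokens >= max_tokens:
        raise ValueError(
            f"budget_tokens={budget_tokens} must be smaller than "
            f"max_tokens={max_tokens} so the final answer has room"
        )
    return {"type": "enabled", "budget_tokens": budget_tokens}


def answer_headroom(budget_tokens: int, max_tokens: int) -> int:
    """Tokens left for the final answer if the whole budget is spent on thinking."""
    return max_tokens - budget_tokens


# Same values as the cookbook examples: 2000 thinking tokens out of 4000.
config = build_thinking_config(budget_tokens=2000, max_tokens=4000)
headroom = answer_headroom(budget_tokens=2000, max_tokens=4000)
```

The returned dictionary can be passed as the `thinking` argument of a request. Raising early keeps a misconfigured budget from surfacing later as an API error.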
Artifacts (3)
extended_thinking_setup.py (Python script)
import anthropic

# Initialize the client (the SDK reads ANTHROPIC_API_KEY from the environment by default)
client = anthropic.Anthropic()


# Helper function to print thinking responses
def print_thinking_response(response):
    """Pretty print a message response with thinking blocks."""
    print("\n==== FULL RESPONSE ====")
    for block in response.content:
        if block.type == "thinking":
            print("\n🧠 THINKING BLOCK:")
            # Show truncated thinking for readability
            print((block.thinking[:500] + "...") if len(block.thinking) > 500 else block.thinking)
            print(f"\n[Signature available: {bool(getattr(block, 'signature', None))}]")
            if hasattr(block, 'signature') and block.signature:
                print(f"[Signature (first 50 chars): {block.signature[:50]}...]")
        elif block.type == "redacted_thinking":
            print("\n🔒 REDACTED THINKING BLOCK:")
            print(f"[Data length: {len(block.data) if hasattr(block, 'data') else 'N/A'}]")
        elif block.type == "text":
            print("\n✓ FINAL ANSWER:")
            print(block.text)
    print("\n==== END RESPONSE ====")


# Helper function to count tokens
def count_tokens(messages):
    """Count tokens for a given message list."""
    result = client.messages.count_tokens(
        model="claude-sonnet-4-6",
        messages=messages
    )
    return result.input_tokens

basic_thinking_example.py (Python script)
def basic_thinking_example():
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=4000,
        thinking={
            "type": "enabled",
            "budget_tokens": 2000
        },
        messages=[{
            "role": "user",
            "content": "Solve this puzzle: Three people check into a hotel. They pay $30 to the manager. The manager finds out that the room only costs $25 so he gives $5 to the bellboy to return to the three people. The bellboy, however, decides to keep $2 and gives $1 back to each person. Now, each person paid $10 and got back $1, so they paid $9 each, totaling $27. The bellboy kept $2, which makes $29. Where is the missing $1?"
        }]
    )
    print_thinking_response(response)


basic_thinking_example()

streaming_thinking_example.py (Python script)
def streaming_with_thinking():
    with client.messages.stream(
        model="claude-sonnet-4-6",
        max_tokens=4000,
        thinking={
            "type": "enabled",
            "budget_tokens": 2000
        },
        messages=[{
            "role": "user",
            "content": "Solve this puzzle: Three people check into a hotel. They pay $30 to the manager. The manager finds out that the room only costs $25 so he gives $5 to the bellboy to return to the three people. The bellboy, however, decides to keep $2 and gives $1 back to each person. Now, each person paid $10 and got back $1, so they paid $9 each, totaling $27. The bellboy kept $2, which makes $29. Where is the missing $1?"
        }]
    ) as stream:
        current_block_type = None
        current_content = ""
        for event in stream:
            if event.type == "content_block_start":
                # A new content block begins; remember its type and reset the buffer
                current_block_type = event.content_block.type
                print(f"\n--- Starting {current_block_type} block ---")
                current_content = ""
            elif event.type == "content_block_delta":
                # Print reasoning and answer text as they arrive
                if event.delta.type == "thinking_delta":
                    print(event.delta.thinking, end="", flush=True)
                    current_content += event.delta.thinking
                elif event.delta.type == "text_delta":
                    print(event.delta.text, end="", flush=True)
                    current_content += event.delta.text
            elif event.type == "content_block_stop":
                if current_block_type == "thinking":
                    print(f"\n[Completed thinking block, {len(current_content)} characters]")
                elif current_block_type == "redacted_thinking":
                    print("\n[Redacted thinking block]")
                print(f"--- Finished {current_block_type} block ---\n")
                current_block_type = None
            elif event.type == "message_stop":
                print("\n--- Message complete ---")


streaming_with_thinking()
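The event-handling logic in the streaming example can be exercised without an API call by feeding it simulated events. The sketch below is an illustration under stated assumptions: `collect_blocks` and `fake_event` are hypothetical helpers, and the `SimpleNamespace` objects only mimic the attributes the example reads (`type`, `content_block.type`, `delta.type`, `delta.thinking`, `delta.text`); they are not the SDK's real event classes.

```python
from types import SimpleNamespace


def collect_blocks(events):
    """Group streamed deltas into one record per content block.

    Hypothetical helper that mirrors the start/delta/stop handling of the
    streaming example, but returns the content instead of printing it.
    """
    blocks = []
    current = None
    for event in events:
        if event.type == "content_block_start":
            current = {"type": event.content_block.type, "content": ""}
        elif event.type == "content_block_delta" and current is not None:
            delta = event.delta
            if delta.type == "thinking_delta":
                current["content"] += delta.thinking
            elif delta.type == "text_delta":
                current["content"] += delta.text
            # Any other delta type is ignored in this sketch.
        elif event.type == "content_block_stop" and current is not None:
            blocks.append(current)
            current = None
    return blocks


def fake_event(type_, **fields):
    """Build a stand-in event exposing only the attributes used above."""
    return SimpleNamespace(type=type_, **fields)


# Simulated stream: one thinking block, one redacted block, one text block.
simulated_events = [
    fake_event("content_block_start", content_block=SimpleNamespace(type="thinking")),
    fake_event("content_block_delta",
               delta=SimpleNamespace(type="thinking_delta", thinking="The $27 already ")),
    fake_event("content_block_delta",
               delta=SimpleNamespace(type="thinking_delta", thinking="includes the $2.")),
    fake_event("content_block_stop"),
    fake_event("content_block_start", content_block=SimpleNamespace(type="redacted_thinking")),
    fake_event("content_block_stop"),
    fake_event("content_block_start", content_block=SimpleNamespace(type="text")),
    fake_event("content_block_delta",
               delta=SimpleNamespace(type="text_delta", text="No dollar is missing.")),
    fake_event("content_block_stop"),
    fake_event("message_stop"),
]

collected = collect_blocks(simulated_events)
for block in collected:
    print(f"{block['type']}: {block['content']!r}")
```

Returning records rather than printing makes the handling easy to unit test, and the redacted block comes back with empty content, which is the case downstream code needs to tolerate.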