tutorial • intermediate

Extended thinking

Feb 2025 • Thinking

Use Claude's extended thinking for transparent step-by-step reasoning with budget management.

March 8, 2026 • cookbook

This cookbook demonstrates Claude 3.7 Sonnet's extended thinking feature, which provides transparent step-by-step reasoning with budget management. Extended thinking enables Claude to show its internal reasoning process through thinking content blocks before delivering final answers. The guide covers setup, basic examples, streaming with thinking, token counting, redacted thinking, and error handling with practical Python code examples.

Key Points

• Enable extended thinking by setting `thinking: {"type": "enabled", "budget_tokens": 2000}` in the API request.
• Extended thinking produces `thinking` content blocks that show Claude's step-by-step reasoning before the final response.
• Use the `print_thinking_response()` helper function to parse and display the three content block types: `thinking`, `redacted_thinking`, and `text`.
• Stream extended thinking with `client.messages.stream()` and handle `content_block_delta` events (`thinking_delta` and `text_delta`) to display reasoning in real time.
• Token counting helps manage the context window; use `client.messages.count_tokens()` to get the input token count before making an API call.
• Thinking blocks may include a signature used for verification; check the `block.signature` attribute when it is available.
• Redacted thinking blocks appear in certain contexts; detect them with `block.type == "redacted_thinking"` and handle them separately from regular thinking blocks.
• The `budget_tokens` parameter sets the maximum number of tokens allocated for reasoning; a larger budget allows deeper reasoning at the cost of higher response latency.
• The puzzle example shows how extended thinking helps identify logical errors by exposing the complete reasoning chain rather than only the conclusion.
• Implement error handling for edge cases where thinking is redacted or unavailable in a specific deployment context.
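The budget guidance above can be turned into a small pre-flight check. The sketch below is a hypothetical helper (`make_thinking_config` is not part of the SDK). It assumes a minimum budget of 1,024 tokens and that `budget_tokens` must be smaller than `max_tokens`; both limits should be verified against the current API documentation.

```python
# Hypothetical helper, not part of the anthropic SDK.
# Assumption: the API requires budget_tokens >= 1024 and budget_tokens < max_tokens.
MIN_BUDGET_TOKENS = 1024


def make_thinking_config(budget_tokens: int, max_tokens: int) -> dict:
    """Build the `thinking` request parameter after basic sanity checks."""
    if budget_tokens < MIN_BUDGET_TOKENS:
        raise ValueError(
            f"budget_tokens must be at least {MIN_BUDGET_TOKENS}, got {budget_tokens}"
        )
    if budget_tokens >= max_tokens:
        raise ValueError(
            "budget_tokens must be smaller than max_tokens so room remains for the answer"
        )
    return {"type": "enabled", "budget_tokens": budget_tokens}


# Same values as the examples in this cookbook.
config = make_thinking_config(budget_tokens=2000, max_tokens=4000)
```

The returned dictionary could then be passed as `thinking=config` to `client.messages.create()`.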




Concepts

Prompt Injection Defense • Skills & Tools • Tool Use

Artifacts (3)

extended_thinking_setup.py (python script)

import anthropic
import os

# Initialize the client
client = anthropic.Anthropic()

# Helper function to print thinking responses
def print_thinking_response(response):
    """Pretty print a message response with thinking blocks."""
    print("\n ==== FULL RESPONSE ====")
    for block in response.content:
        if block.type == "thinking":
            print("\n 🧠 THINKING BLOCK:")
            # Show at most the first 500 characters of thinking for readability
            print((block.thinking[:500] + "...") if len(block.thinking) > 500 else block.thinking)
            print(f"\n [Signature available: {bool(getattr(block, 'signature', None))} ]")
            if hasattr(block, 'signature') and block.signature:
                print(f"[Signature (first 50 chars): {block.signature[:50]} ...]")
        elif block.type == "redacted_thinking":
            print("\n 🔒 REDACTED THINKING BLOCK:")
            print(f"[Data length: {len(block.data) if hasattr(block, 'data') else 'N/A'} ]")
        elif block.type == "text":
            print("\n ✓ FINAL ANSWER:")
            print(block.text)
    print("\n ==== END RESPONSE ====")

# Helper function to count tokens
def count_tokens(messages):
    """Count tokens for a given message list."""
    result = client.messages.count_tokens(
        model="claude-sonnet-4-6",
        messages=messages
    )
    return result.input_tokens
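When the blocks need to be consumed by code rather than printed, the same type checks used in `print_thinking_response()` can group them instead. The following is a minimal sketch; `split_blocks` and the stand-in objects are hypothetical and only mimic the attributes the helper above reads, so it runs without an API call.

```python
from types import SimpleNamespace


def split_blocks(content):
    """Group content blocks by type: thinking text, redacted count, answer text."""
    result = {"thinking": [], "redacted_count": 0, "text": []}
    for block in content:
        if block.type == "thinking":
            result["thinking"].append(block.thinking)
        elif block.type == "redacted_thinking":
            # Redacted blocks expose no readable reasoning, so only count them.
            result["redacted_count"] += 1
        elif block.type == "text":
            result["text"].append(block.text)
    return result


# Stand-in objects for illustration only (not real API output).
fake_content = [
    SimpleNamespace(type="thinking", thinking="Add the amounts...", signature="abc"),
    SimpleNamespace(type="redacted_thinking", data="opaque"),
    SimpleNamespace(type="text", text="There is no missing dollar."),
]
summary = split_blocks(fake_content)
print(summary["text"][0])
```

With a real response, the call would be `split_blocks(response.content)`.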

basic_thinking_example.py (python script)

def basic_thinking_example():
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=4000,
        thinking={
            "type": "enabled",
            "budget_tokens": 2000
        },
        messages=[{
            "role": "user",
            "content": "Solve this puzzle: Three people check into a hotel. They pay $30 to the manager. The manager finds out that the room only costs $25 so he gives $5 to the bellboy to return to the three people. The bellboy, however, decides to keep $2 and gives $1 back to each person. Now, each person paid $10 and got back $1, so they paid $9 each, totaling $27. The bellboy kept $2, which makes $29. Where is the missing $1?"
        }]
    )
    print_thinking_response(response)

basic_thinking_example()
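For reference, the accounting behind the puzzle can be checked directly. The sketch below is an illustration of the arithmetic using the amounts from the prompt, not output produced by Claude; the variable names are chosen here for clarity.

```python
# Amounts in dollars, taken from the puzzle statement.
paid_up_front = 30
room_cost = 25
bellboy_kept = 2
refund_per_guest = 1
guests = 3

refunded = refund_per_guest * guests            # 3
net_paid_by_guests = paid_up_front - refunded   # 27

# The $27 the guests paid already contains the bellboy's $2,
# so adding the $2 to $27 double-counts it.
assert net_paid_by_guests == room_cost + bellboy_kept

# All $30 is accounted for: room + bellboy + refunds.
assert paid_up_front == room_cost + bellboy_kept + refunded

print(net_paid_by_guests, room_cost + bellboy_kept + refunded)  # prints: 27 30
```

The flaw in the puzzle is the step that adds the bellboy's $2 to the $27; the correct comparison is $27 + $3 refunded = $30.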

streaming_thinking_example.py (python script)

def streaming_with_thinking():
    with client.messages.stream(
        model="claude-sonnet-4-6",
        max_tokens=4000,
        thinking={
            "type": "enabled",
            "budget_tokens": 2000
        },
        messages=[{
            "role": "user",
            "content": "Solve this puzzle: Three people check into a hotel. They pay $30 to the manager. The manager finds out that the room only costs $25 so he gives $5 to the bellboy to return to the three people. The bellboy, however, decides to keep $2 and gives $1 back to each person. Now, each person paid $10 and got back $1, so they paid $9 each, totaling $27. The bellboy kept $2, which makes $29. Where is the missing $1?"
        }]
    ) as stream:
        current_block_type = None
        current_content = ""
        for event in stream:
            if event.type == "content_block_start":
                current_block_type = event.content_block.type
                print(f"\n --- Starting {current_block_type} block ---")
                current_content = ""
            elif event.type == "content_block_delta":
                if event.delta.type == "thinking_delta":
                    print(event.delta.thinking, end="", flush=True)
                    current_content += event.delta.thinking
                elif event.delta.type == "text_delta":
                    print(event.delta.text, end="", flush=True)
                    current_content += event.delta.text
            elif event.type == "content_block_stop":
                if current_block_type == "thinking":
                    print(f"\n [Completed thinking block, {len(current_content)} characters]")
                elif current_block_type == "redacted_thinking":
                    print("\n [Redacted thinking block]")
                print(f"--- Finished {current_block_type} block --- \n")
                current_block_type = None
            elif event.type == "message_stop":
                print("\n --- Message complete ---")

streaming_with_thinking()
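The event handling in the streaming example can also be separated from printing so it is easier to test. The sketch below is a hypothetical `collect_stream` function that accumulates deltas into one string per content block. It relies only on the event attributes already used above and is exercised with stand-in events rather than a live stream.

```python
from types import SimpleNamespace


def collect_stream(events):
    """Accumulate streamed deltas into (block_type, text) pairs, one per block."""
    blocks = []
    current_type, parts = None, []
    for event in events:
        if event.type == "content_block_start":
            current_type, parts = event.content_block.type, []
        elif event.type == "content_block_delta":
            if event.delta.type == "thinking_delta":
                parts.append(event.delta.thinking)
            elif event.delta.type == "text_delta":
                parts.append(event.delta.text)
            # Any other delta type is ignored in this sketch.
        elif event.type == "content_block_stop":
            blocks.append((current_type, "".join(parts)))
            current_type, parts = None, []
    return blocks


def _ev(type_, **fields):
    """Build a stand-in event object for illustration (not a real SDK event)."""
    return SimpleNamespace(type=type_, **fields)


fake_events = [
    _ev("content_block_start", content_block=SimpleNamespace(type="thinking")),
    _ev("content_block_delta", delta=SimpleNamespace(type="thinking_delta", thinking="27 = 25 ")),
    _ev("content_block_delta", delta=SimpleNamespace(type="thinking_delta", thinking="+ 2")),
    _ev("content_block_stop"),
    _ev("content_block_start", content_block=SimpleNamespace(type="text")),
    _ev("content_block_delta", delta=SimpleNamespace(type="text_delta", text="No dollar is missing.")),
    _ev("content_block_stop"),
    _ev("message_stop"),
]
blocks = collect_stream(fake_events)
```

Assuming the live stream yields events with the same attributes as in `streaming_with_thinking()`, the function could be called as `collect_stream(stream)` inside the `with client.messages.stream(...) as stream:` block.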