
How do I use AWS Bedrock with the SDK?

The OpenHands SDK supports AWS Bedrock through LiteLLM. Since LiteLLM requires boto3 for Bedrock requests, you need to install it alongside the SDK.

Step 1: Install boto3

Install the SDK with boto3:
# Using pip
pip install openhands-sdk boto3

# Using uv
uv pip install openhands-sdk boto3

# Or when installing as a CLI tool
uv tool install openhands --with boto3

Step 2: Configure Authentication

You have two authentication options:

Option A: API Key Authentication (Recommended)

Use the AWS_BEARER_TOKEN_BEDROCK environment variable:
export AWS_BEARER_TOKEN_BEDROCK="your-bedrock-api-key"
Option B: AWS Credentials

Use traditional AWS credentials:
export AWS_ACCESS_KEY_ID="your-access-key"
export AWS_SECRET_ACCESS_KEY="your-secret-key"
export AWS_REGION_NAME="us-west-2"
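If you prefer to configure credentials from Python rather than the shell, setting the same environment variables with os.environ before using the LLM should work as well, since LiteLLM reads them from the environment (a minimal sketch; the placeholder values are yours to replace):
import os

# Option A: Bedrock API key
os.environ["AWS_BEARER_TOKEN_BEDROCK"] = "your-bedrock-api-key"

# Option B: traditional AWS credentials
# os.environ["AWS_ACCESS_KEY_ID"] = "your-access-key"
# os.environ["AWS_SECRET_ACCESS_KEY"] = "your-secret-key"
# os.environ["AWS_REGION_NAME"] = "us-west-2"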

Step 3: Configure the Model

Use the bedrock/ prefix for your model name:
from openhands.sdk import LLM, Agent

llm = LLM(
    model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
    # api_key is read from AWS_BEARER_TOKEN_BEDROCK automatically
)
For cross-region inference profiles, include the region prefix:
llm = LLM(
    model="bedrock/us.anthropic.claude-3-5-sonnet-20240620-v1:0",  # US region
    # or: model="bedrock/apac.anthropic.claude-sonnet-4-20250514-v1:0",  # APAC region
)
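Putting the steps together, a minimal sketch that checks boto3 is installed and a credential is present before constructing the Bedrock-backed LLM might look like this (the checks, variable names, and error messages are illustrative, not part of the SDK):
import importlib.util
import os

from openhands.sdk import LLM

# Step 1: LiteLLM needs boto3 for Bedrock requests
if importlib.util.find_spec("boto3") is None:
    raise RuntimeError("boto3 is not installed; run `pip install boto3`")

# Step 2: either a Bedrock API key or traditional AWS credentials must be set
has_credentials = "AWS_BEARER_TOKEN_BEDROCK" in os.environ or (
    "AWS_ACCESS_KEY_ID" in os.environ and "AWS_SECRET_ACCESS_KEY" in os.environ
)
if not has_credentials:
    raise RuntimeError("No Bedrock credentials found in the environment")

# Step 3: use the bedrock/ prefix for the model name
llm = LLM(model="bedrock/anthropic.claude-3-sonnet-20240229-v1:0")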
For more details on Bedrock configuration options, see the LiteLLM Bedrock documentation.

Does the agent SDK support parallel tool calling?

Yes, the OpenHands SDK supports parallel tool calling by default. The SDK automatically handles parallel tool calls when the underlying LLM (like Claude or GPT-4) returns multiple tool calls in a single response. This allows agents to execute multiple independent actions before the next LLM call.
When the LLM generates multiple tool calls in parallel, the SDK groups them using a shared llm_response_id:
ActionEvent(llm_response_id="abc123", thought="Let me check...", tool_call=tool1)
ActionEvent(llm_response_id="abc123", thought=[], tool_call=tool2)
# Combined into: Message(role="assistant", content="Let me check...", tool_calls=[tool1, tool2])
Multiple ActionEvents with the same llm_response_id are grouped together and combined into a single LLM message with multiple tool_calls; only the first event’s thought/reasoning is included. The key pieces of the implementation are:
  • prepare_llm_messages in utils.py, which groups ActionEvents by llm_response_id when converting events to LLM messages
  • the agent step method, where actions are created with a shared llm_response_id
  • the ActionEvent class, which includes the llm_response_id field
For more details, see the Events Architecture for a deep dive into the event system and parallel function calling, the Tool System for understanding how tools work with the agent, and the Agent Architecture for how agents process and execute actions.
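As an illustration of the grouping idea (a simplified sketch, not the SDK's actual implementation — the stand-in classes and group_actions helper are hypothetical), events sharing an llm_response_id collapse into one assistant message like this:
from dataclasses import dataclass, field
from itertools import groupby

# Simplified stand-ins for the SDK's ActionEvent and Message (illustrative only)
@dataclass
class ActionEvent:
    llm_response_id: str
    thought: str
    tool_call: str

@dataclass
class Message:
    role: str
    content: str
    tool_calls: list = field(default_factory=list)

def group_actions(events):
    """Combine consecutive ActionEvents that share an llm_response_id into one message."""
    messages = []
    for _, group in groupby(events, key=lambda e: e.llm_response_id):
        group = list(group)
        messages.append(
            Message(
                role="assistant",
                content=group[0].thought,  # only the first event's thought is kept
                tool_calls=[e.tool_call for e in group],
            )
        )
    return messages

events = [
    ActionEvent("abc123", "Let me check...", "tool1"),
    ActionEvent("abc123", "", "tool2"),
]
print(group_actions(events))
# -> [Message(role='assistant', content='Let me check...', tool_calls=['tool1', 'tool2'])]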

Does the agent SDK support image content?

Yes, the OpenHands SDK fully supports image content for vision-capable LLMs. The SDK supports both HTTP/HTTPS URLs and base64-encoded images through the ImageContent class.

Check Vision Support

Before sending images, verify your LLM supports vision:
from openhands.sdk import LLM
from pydantic import SecretStr

llm = LLM(
    model="anthropic/claude-sonnet-4-5-20250929",
    api_key=SecretStr("your-api-key"),
    usage_id="my-agent"
)

# Check if vision is active
assert llm.vision_is_active(), "Model does not support vision"

Using HTTP URLs

from openhands.sdk import ImageContent, Message, TextContent

message = Message(
    role="user",
    content=[
        TextContent(text="What do you see in this image?"),
        ImageContent(image_urls=["https://example.com/image.png"]),
    ],
)

Using Base64 Images

Base64 images are supported using data URLs:
import base64
from openhands.sdk import ImageContent, Message, TextContent

# Read and encode an image file
with open("my_image.png", "rb") as f:
    image_base64 = base64.b64encode(f.read()).decode("utf-8")

# Create message with base64 image
message = Message(
    role="user",
    content=[
        TextContent(text="Describe this image"),
        ImageContent(image_urls=[f"data:image/png;base64,{image_base64}"]),
    ],
)

Supported Image Formats

The data URL format is: data:<mime_type>;base64,<base64_encoded_data>

Supported MIME types:
  • image/png
  • image/jpeg
  • image/gif
  • image/webp
  • image/bmp
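If you build data URLs often, a small helper along these lines can wrap the encoding; image_file_to_data_url is a hypothetical convenience function, not part of the SDK:
import base64
import mimetypes

from openhands.sdk import ImageContent

SUPPORTED_MIME_TYPES = {"image/png", "image/jpeg", "image/gif", "image/webp", "image/bmp"}

def image_file_to_data_url(path: str) -> str:
    """Encode a local image file as a base64 data URL."""
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type not in SUPPORTED_MIME_TYPES:
        raise ValueError(f"Unsupported image type: {mime_type}")
    with open(path, "rb") as f:
        encoded = base64.b64encode(f.read()).decode("utf-8")
    return f"data:{mime_type};base64,{encoded}"

# Usage
content = ImageContent(image_urls=[image_file_to_data_url("my_image.png")])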

Built-in Image Support

Several SDK tools automatically handle images:
  • FileEditorTool: When viewing image files (.png, .jpg, .jpeg, .gif, .webp, .bmp), they’re automatically converted to base64 and sent to the LLM
  • BrowserUseTool: Screenshots are captured and sent as base64 images
  • MCP Tools: Image content from MCP tool results is automatically converted to base64 data URLs

Disabling Vision

To disable vision for cost reduction (even on vision-capable models):
llm = LLM(
    model="anthropic/claude-sonnet-4-5-20250929",
    api_key=SecretStr("your-api-key"),
    usage_id="my-agent",
    disable_vision=True,  # Images will be filtered out
)
For a complete example, see the image input example in the SDK repository.

More questions?

If you have additional questions: