Providers Module

Directory: src/selectools/providers/
Files: base.py, openai_provider.py, anthropic_provider.py, gemini_provider.py, ollama_provider.py, fallback.py

Table of Contents

  1. Overview
  2. Provider Protocol
  3. Provider Implementations
  4. Message Formatting
  5. Native Tool Calling
  6. Cost Calculation
  7. Implementation Details
  8. Best Practices
  9. Adding a New Provider
  10. FallbackProvider

Overview

Providers are adapters that translate between selectools' unified interface and specific LLM APIs. They handle:

  • API authentication and configuration
  • Message format conversion
  • Role mapping
  • Image encoding (for vision models)
  • Streaming implementation
  • Usage statistics extraction
  • Error handling

Design Goal

Provider Agnosticism: Switch LLM backends with one line of code, no refactoring required.
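
For example, switching the backend is a one-line change. The sketch below is illustrative only: the top-level Agent import path is assumed, and the tools placeholder follows the convention used elsewhere on this page.

from selectools import Agent
from selectools.providers import OpenAIProvider, AnthropicProvider
from selectools.types import Message, Role

# Only the provider argument changes; the agent code stays the same.
provider = OpenAIProvider()
# provider = AnthropicProvider()   # swap backends here

agent = Agent(tools=[...], provider=provider)
response = agent.run([Message(role=Role.USER, content="Hello")])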


Provider Protocol

Interface Definition

from typing import Protocol, runtime_checkable, List, Optional, Union, AsyncGenerator
from ..types import Message, ToolCall
from ..tools import Tool
from ..usage import UsageStats

@runtime_checkable
class Provider(Protocol):
    """Interface every provider adapter must satisfy."""

    name: str                    # Provider identifier
    supports_streaming: bool     # Can stream responses
    supports_async: bool = False # Has async methods

    def complete(
        self,
        *,
        model: str,
        system_prompt: str,
        messages: List[Message],
        tools: Optional[List[Tool]] = None,  # Native tool calling
        temperature: float = 0.0,
        max_tokens: int = 1000,
        timeout: float | None = None,
    ) -> tuple[Message, UsageStats]:
        """Return assistant Message (with optional tool_calls) and usage stats.

        Note: Message.content may be None when the LLM responds with only
        tool_calls. The agent normalizes None content to "" internally.
        """
        ...

    def stream(self, *, model, system_prompt, messages, **kwargs):
        """Yield assistant text chunks (no usage stats)."""
        ...

    async def acomplete(
        self,
        *,
        model: str,
        system_prompt: str,
        messages: List[Message],
        tools: Optional[List[Tool]] = None,
        temperature: float = 0.0,
        max_tokens: int = 1000,
        timeout: float | None = None,
    ) -> tuple[Message, UsageStats]:
        """Async version of complete()."""
        ...

    async def astream(
        self,
        *,
        model: str,
        system_prompt: str,
        messages: List[Message],
        tools: Optional[List[Tool]] = None,
        temperature: float = 0.0,
        max_tokens: int = 1000,
        timeout: float | None = None,
    ) -> AsyncGenerator[Union[str, ToolCall], None]:
        """Async streaming with native tool call support.

        Yields:
            str: Text content deltas
            ToolCall: Complete tool call objects when ready
        """
        ...

Key Requirements

  1. Sync Methods: complete() and stream() must be implemented
  2. Return Types: complete() returns (Message, UsageStats) — Message may contain tool_calls
  3. Streaming: stream() yields strings; astream() yields Union[str, ToolCall]
  4. Native Tool Calling: Pass tools parameter for provider-native function calling
  5. Async: Recommended for performance; acomplete() and astream()

Provider Implementations

All provider classes are importable directly from the selectools.providers package namespace:

from selectools.providers import OpenAIProvider, AnthropicProvider, GeminiProvider, OllamaProvider

OpenAI Provider

from selectools.providers import OpenAIProvider
from selectools.models import OpenAI

provider = OpenAIProvider(
    api_key="sk-...",  # Or set OPENAI_API_KEY env var
    default_model=OpenAI.GPT_4O.id
)

# Features:
# - Streaming support
# - Async support (acomplete/astream)
# - Vision support (image_path in messages)
# - Full usage stats
# - Native tool calling (function calling API)
# - Auto max_tokens → max_completion_tokens for GPT-5/4.1/o-series

API: OpenAI Chat Completions API

Token Parameter Handling: Newer OpenAI models (GPT-5.x, GPT-4.1, o-series, codex) reject the legacy max_tokens parameter and require max_completion_tokens. The provider auto-detects the model family and sends the correct parameter — no user action needed.
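
The check itself is just a model-name inspection. The helper below sketches the idea; _uses_max_completion_tokens is the name used in the implementation snippets further down, but the prefix list here is illustrative rather than the library's exact rule.

def _uses_max_completion_tokens(model: str) -> bool:
    """Return True for model families that require max_completion_tokens."""
    # Illustrative prefix list; the library's actual detection may differ.
    name = model.lower()
    return name.startswith(("gpt-5", "gpt-4.1", "o1", "o3", "o4", "codex"))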

Anthropic Provider

from selectools.providers import AnthropicProvider
from selectools.models import Anthropic

provider = AnthropicProvider(
    api_key="sk-ant-...",  # Or set ANTHROPIC_API_KEY
    default_model=Anthropic.SONNET_4_5.id
)

# Features:
# - Streaming support
# - Async support
# - Vision support (model-dependent)
# - Full usage stats
# - Native tool calling (function calling API)

API: Anthropic Messages API

Gemini Provider

from selectools.providers import GeminiProvider
from selectools.models import Gemini

provider = GeminiProvider(
    api_key="...",  # Or set GEMINI_API_KEY or GOOGLE_API_KEY
    default_model=Gemini.FLASH_2_0.id
)

# Features:
# - Streaming support
# - Async support
# - Vision support (model-dependent)
# - Free embeddings
# - Native tool calling (function calling API)

API: Google Generative AI

Ollama Provider

from selectools.providers import OllamaProvider
from selectools.models import Ollama

provider = OllamaProvider(
    host="http://localhost:11434",  # Default
    default_model=Ollama.LLAMA_3_2.id
)

# Features:
# - Local execution (privacy-first)
# - Zero cost
# - Streaming support
# - No API key required

API: Ollama REST API

Implementation note: OpenAIProvider and OllamaProvider both inherit from _OpenAICompatibleBase (Template Method pattern), sharing message formatting, response parsing, and streaming logic. Only pricing, error messages, and token parameter naming differ between them.
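
Roughly, the shared base follows this shape. The skeleton below is illustrative only; the hook names are assumptions, not the library's actual internals.

class _OpenAICompatibleBase:
    """Illustrative Template Method sketch: shared flow, provider-specific hooks."""

    def complete(self, *, model, system_prompt, messages, temperature=0.0,
                 max_tokens=1000, timeout=None):
        # _format_messages, _client, and _parse_response are the shared pieces;
        # in the real code they live in the base or the concrete provider.
        formatted = self._format_messages(system_prompt, messages)
        response = self._client.chat.completions.create(
            model=model,
            messages=formatted,
            temperature=temperature,
            timeout=timeout,
            **{self._token_param_name(model): max_tokens},
        )
        return self._parse_response(response, model)

    # Hooks overridden by OpenAIProvider and OllamaProvider:
    def _token_param_name(self, model):
        raise NotImplementedError   # max_tokens vs max_completion_tokens

    def _calculate_cost(self, model, prompt_tokens, completion_tokens):
        raise NotImplementedError   # real pricing vs always 0.0 for local models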

Local Provider (Testing)

from selectools.providers.stubs import LocalProvider

provider = LocalProvider()

# Features:
# - No network calls
# - No API costs
# - Returns user's last message
# - Perfect for testing
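
A quick smoke test with the stub (a minimal sketch; the model string is a placeholder, and the call assumes LocalProvider follows the Provider protocol signature shown above):

from selectools.providers.stubs import LocalProvider
from selectools.types import Message, Role

provider = LocalProvider()
reply, stats = provider.complete(
    model="local",
    system_prompt="You are helpful.",
    messages=[Message(role=Role.USER, content="ping")],
)

# The stub echoes the user's last message and incurs no cost.
print(reply.content)    # "ping"
print(stats.cost_usd)   # 0.0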

Message Formatting

Unified Message Format

from selectools.types import Message, Role

Message(role=Role.USER, content="Hello")
Message(role=Role.ASSISTANT, content="Hi there!")
Message(role=Role.TOOL, content="Result", tool_name="search")
Message(role=Role.USER, content="What's in this image?", image_path="./photo.jpg")

Provider-Specific Formatting

OpenAI Format

def _format_messages(self, system_prompt: str, messages: List[Message]):
    payload = [{"role": "system", "content": system_prompt}]

    for message in messages:
        role = message.role.value

        # Map TOOL role to ASSISTANT (OpenAI doesn't have TOOL role)
        if role == Role.TOOL.value:
            role = Role.ASSISTANT.value

        payload.append({
            "role": role,
            "content": self._format_content(message),
        })

    return payload

def _format_content(self, message: Message):
    if message.image_base64:
        # Vision: multimodal content
        return [
            {"type": "text", "text": message.content},
            {
                "type": "image_url",
                "image_url": {"url": f"data:image/jpeg;base64,{message.image_base64}"},
            },
        ]
    return message.content
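
For example, a user message carrying an image becomes a multimodal payload entry like the following (illustrative; it assumes the provider has already read the file and set image_base64 on the message):

from selectools.types import Message, Role

msg = Message(role=Role.USER, content="What's in this image?", image_path="./photo.jpg")

# Formatted payload entry produced by _format_content:
expected = {
    "role": "user",
    "content": [
        {"type": "text", "text": "What's in this image?"},
        {"type": "image_url",
         "image_url": {"url": "data:image/jpeg;base64,<encoded bytes>"}},
    ],
}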

Anthropic Format

def _format_messages(self, messages: List[Message]):
    formatted = []

    for message in messages:
        role = message.role.value

        # Anthropic uses "user" and "assistant" only
        if role == Role.TOOL.value:
            role = "assistant"

        formatted.append({
            "role": role,
            "content": message.content
        })

    return formatted

# System prompt is separate parameter
client.messages.create(
    model=model,
    system=system_prompt,  # Not in messages array
    messages=formatted
)

Gemini Format

def _format_messages(self, system_prompt: str, messages: List[Message]):
    # Gemini combines system and conversation
    formatted = [{"role": "user", "parts": [system_prompt]}]

    for message in messages:
        role = "user" if message.role == Role.USER else "model"

        formatted.append({
            "role": role,
            "parts": [message.content]
        })

    return formatted

Native Tool Calling

Overview

All providers support native function calling APIs, which provide structured tool calls directly in the response instead of requiring text parsing.

How It Works

  1. Agent passes tools parameter to complete()/acomplete()
  2. Provider converts tool schemas to provider-native format
  3. LLM returns structured tool calls in Message.tool_calls
  4. Agent detects tool_calls and executes them directly, with no regex parsing needed (see the sketch below)
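
A minimal sketch of that round trip from the provider's point of view (weather_tool is a hypothetical Tool object; how tools are defined is covered in the Tools module):

from selectools.providers import OpenAIProvider
from selectools.types import Message, Role

provider = OpenAIProvider()

# weather_tool is hypothetical and stands in for a real Tool definition.
reply, stats = provider.complete(
    model="gpt-4o-mini",
    system_prompt="Use tools when helpful.",
    messages=[Message(role=Role.USER, content="Weather in Paris?")],
    tools=[weather_tool],
)

if reply.tool_calls:                        # structured calls, no text parsing
    for call in reply.tool_calls:
        print(call.tool_name, call.parameters)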

Provider Formats

OpenAI

# Tools converted to OpenAI function format
tools=[{"type": "function", "function": {"name": "...", "parameters": {...}}}]

# Response contains tool_calls
response.choices[0].message.tool_calls  # List of tool call objects

Anthropic

# Tools converted to Anthropic tool format
tools=[{"name": "...", "description": "...", "input_schema": {...}}]

# Response contains tool_use content blocks
response.content  # May contain ToolUse blocks with name and input

Gemini

# Tools converted to Gemini function declarations
tools=[Tool(function_declarations=[...])]

# Response candidates contain function calls
response.candidates[0].content.parts  # May contain function_call parts

Fallback

If a provider doesn't support native tool calling (e.g., Ollama), or if native calls are not present in the response, the agent falls back to regex-based parsing via ToolCallParser.


Cost Calculation

Usage Stats Extraction

Each provider extracts token counts from API responses:

OpenAI

response = client.chat.completions.create(...)

usage_stats = UsageStats(
    prompt_tokens=response.usage.prompt_tokens,
    completion_tokens=response.usage.completion_tokens,
    total_tokens=response.usage.total_tokens,
    cost_usd=calculate_cost(model, response.usage.prompt_tokens, response.usage.completion_tokens),
    model=model,
    provider="openai"
)

Anthropic

response = client.messages.create(...)

usage_stats = UsageStats(
    prompt_tokens=response.usage.input_tokens,
    completion_tokens=response.usage.output_tokens,
    total_tokens=response.usage.input_tokens + response.usage.output_tokens,
    cost_usd=calculate_cost(model, response.usage.input_tokens, response.usage.output_tokens),
    model=model,
    provider="anthropic"
)

Gemini

response = model.generate_content(...)

usage_stats = UsageStats(
    prompt_tokens=response.usage_metadata.prompt_token_count,
    completion_tokens=response.usage_metadata.candidates_token_count,
    total_tokens=response.usage_metadata.total_token_count,
    cost_usd=calculate_cost(model, response.usage_metadata.prompt_token_count, response.usage_metadata.candidates_token_count),
    model=model,
    provider="gemini"
)

Cost Calculation

from selectools.pricing import calculate_cost

cost = calculate_cost(
    model="gpt-4o",
    prompt_tokens=1000,
    completion_tokens=500
)

# Looks up pricing from models registry:
# OpenAI.GPT_4O: prompt_cost=2.50, completion_cost=10.00 per 1M tokens
# Cost = (1000/1M * 2.50) + (500/1M * 10.00) = $0.0025 + $0.005 = $0.0075

Implementation Details

OpenAI Provider

class OpenAIProvider(Provider):
    name = "openai"
    supports_streaming = True
    supports_async = True

    def __init__(self, api_key: str | None = None, default_model: str = "gpt-5-mini"):
        from openai import OpenAI, AsyncOpenAI

        self.api_key = api_key or os.getenv("OPENAI_API_KEY")
        if not self.api_key:
            raise ProviderConfigurationError(...)

        self._client = OpenAI(api_key=self.api_key)
        self._async_client = AsyncOpenAI(api_key=self.api_key)
        self.default_model = default_model

    def complete(self, *, model, system_prompt, messages, temperature, max_tokens, timeout):
        formatted = self._format_messages(system_prompt, messages)
        model_name = model or self.default_model

        # Auto-detect max_tokens vs max_completion_tokens per model family
        token_key = (
            "max_completion_tokens"
            if _uses_max_completion_tokens(model_name)
            else "max_tokens"
        )
        args = {
            "model": model_name,
            "messages": formatted,
            "temperature": temperature,
            token_key: max_tokens,
            "timeout": timeout,
        }

        response = self._client.chat.completions.create(**args)

        content = response.choices[0].message.content
        usage_stats = self._extract_usage(response, model_name)

        return Message(role=Role.ASSISTANT, content=content or ""), usage_stats

    def stream(self, *, model, system_prompt, messages, temperature, max_tokens, timeout):
        formatted = self._format_messages(system_prompt, messages)
        model_name = model or self.default_model

        token_key = (
            "max_completion_tokens"
            if _uses_max_completion_tokens(model_name)
            else "max_tokens"
        )
        args = {
            "model": model_name,
            "messages": formatted,
            "temperature": temperature,
            token_key: max_tokens,
            "stream": True,
            "timeout": timeout,
        }

        response = self._client.chat.completions.create(**args)

        for chunk in response:
            delta = chunk.choices[0].delta
            if delta and delta.content:
                yield delta.content

Async Streaming (astream)

All providers implement astream() for E2E streaming with native tool support:

async def astream(self, *, model, system_prompt, messages, tools=None, ...):
    """Yield text deltas and ToolCall objects."""
    # Stream response from provider (awaiting the call returns the async stream)
    stream = await self._async_client.chat.completions.create(stream=True, ...)
    async for chunk in stream:
        delta = chunk.choices[0].delta

        # Yield text deltas
        if delta.content:
            yield delta.content

        # Accumulate tool call deltas
        if delta.tool_calls:
            # ... accumulate until complete ...
            yield ToolCall(tool_name=name, parameters=args, id=tc_id)

The agent's astream() method consumes these and:

  • Yields StreamChunk objects for text
  • Executes tool calls when received
  • Continues the agent loop until completion
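
A rough consumer-side sketch (the agent-side details here, in particular the astream() argument shape and the StreamChunk.text attribute, are assumptions for illustration rather than part of this module):

import asyncio

from selectools import Agent
from selectools.providers import OpenAIProvider
from selectools.types import Message, Role

async def main():
    agent = Agent(tools=[...], provider=OpenAIProvider())
    # Assumed: astream() takes the message list and yields StreamChunk objects
    # whose .text carries each delta; tool calls are executed inside the loop.
    async for chunk in agent.astream([Message(role=Role.USER, content="Hi")]):
        print(chunk.text, end="", flush=True)

asyncio.run(main())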

Error Handling

def complete(self, ...):
    try:
        response = self._client.chat.completions.create(...)
        return content, usage_stats
    except Exception as exc:
        raise ProviderError(f"OpenAI completion failed: {exc}") from exc

Async Implementation

async def acomplete(self, *, model, system_prompt, messages, temperature, max_tokens, timeout):
    formatted = self._format_messages(system_prompt, messages)
    model_name = model or self.default_model

    token_key = (
        "max_completion_tokens"
        if _uses_max_completion_tokens(model_name)
        else "max_tokens"
    )
    args = {
        "model": model_name,
        "messages": formatted,
        "temperature": temperature,
        token_key: max_tokens,
        "timeout": timeout,
    }

    response = await self._async_client.chat.completions.create(**args)

    content = response.choices[0].message.content
    usage_stats = self._extract_usage(response, model_name)

    return Message(role=Role.ASSISTANT, content=content or ""), usage_stats

Best Practices

1. Set API Keys via Environment

export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."
export GEMINI_API_KEY="..."
# No need to pass api_key
provider = OpenAIProvider()

2. Use Model Constants

from selectools.models import OpenAI, Anthropic, Gemini

# ✅ Good - Type-safe, autocomplete
provider = OpenAIProvider(default_model=OpenAI.GPT_4O_MINI.id)

# ❌ Bad - Prone to typos
provider = OpenAIProvider(default_model="gpt-4o-mini")

3. Handle Provider Errors

from selectools.providers.base import ProviderError

try:
    response, stats = provider.complete(...)
except ProviderError as e:
    logger.error(f"Provider failed: {e}")
    # Fallback logic

4. Test with Local Provider

from selectools.providers.stubs import LocalProvider

# Development/testing
if os.getenv("ENV") == "test":
    provider = LocalProvider()
else:
    provider = OpenAIProvider()

Adding a New Provider

Steps

  1. Create provider file in src/selectools/providers/
  2. Implement Provider protocol
  3. Handle message formatting
  4. Extract usage stats
  5. Add to exports in __init__.py

Template

from ..types import Message
from ..usage import UsageStats
from ..pricing import calculate_cost
from .base import Provider, ProviderError

class MyProvider(Provider):
    name = "my_provider"
    supports_streaming = True
    supports_async = False

    def __init__(self, api_key: str, default_model: str = "default-model"):
        self.api_key = api_key
        self.default_model = default_model
        # Initialize client

    def complete(self, *, model, system_prompt, messages, temperature, max_tokens, timeout):
        # Format messages
        formatted = self._format_messages(system_prompt, messages)

        try:
            # Call API
            response = self.client.complete(...)

            # Extract content
            content = response.text

            # Extract usage
            usage_stats = UsageStats(
                prompt_tokens=response.prompt_tokens,
                completion_tokens=response.completion_tokens,
                total_tokens=response.total_tokens,
                cost_usd=calculate_cost(model, ...),
                model=model,
                provider=self.name
            )

            return Message(role=Role.ASSISTANT, content=content), usage_stats

        except Exception as exc:
            raise ProviderError(f"{self.name} failed: {exc}") from exc

    def stream(self, ...):
        # Stream implementation
        for chunk in response:
            yield chunk.text

    def _format_messages(self, system_prompt, messages):
        # Convert to provider's format
        pass

Testing

def test_openai_provider():
    provider = OpenAIProvider(api_key="test-key", default_model="gpt-4o-mini")

    messages = [Message(role=Role.USER, content="Hello")]

    response, stats = provider.complete(
        model="gpt-4o-mini",
        system_prompt="You are helpful",
        messages=messages,
        temperature=0.0,
        max_tokens=100
    )

    assert isinstance(response, Message)
    assert stats.total_tokens > 0
    assert stats.cost_usd >= 0

def test_provider_switching():
    # Same agent code works with any provider
    for provider in [OpenAIProvider(), AnthropicProvider(), GeminiProvider()]:
        agent = Agent(tools=[...], provider=provider)
        response = agent.run([Message(role=Role.USER, content="Test")])
        assert response.content

FallbackProvider

Overview

FallbackProvider wraps multiple providers in priority order with automatic failover and circuit breaker protection. If the primary provider fails, the next one is tried automatically.

Usage

from selectools import FallbackProvider, OpenAIProvider, AnthropicProvider
from selectools.providers.stubs import LocalProvider

provider = FallbackProvider([
    OpenAIProvider(default_model="gpt-4o-mini"),
    AnthropicProvider(default_model="claude-haiku"),
    LocalProvider(),
])

agent = Agent(tools=[...], provider=provider)

Circuit Breaker

After consecutive failures, a provider is temporarily skipped:

provider = FallbackProvider(
    providers=[openai, anthropic, local],
    max_failures=3,          # Skip after 3 consecutive failures
    cooldown_seconds=60,     # Skip for 60 seconds
    on_fallback=lambda name, error: log.warning(f"Skipping {name}: {error}"),
)

Failure Conditions

The provider falls through to the next on:

  • Timeout errors
  • HTTP 5xx (server errors)
  • HTTP 429 (rate limits)
  • Connection errors

Protocol Support

FallbackProvider implements the full Provider protocol:

  • complete() — sync completion
  • acomplete() — async completion
  • stream() — sync streaming
  • astream() — async streaming

Properties

  • provider.supports_streaming is True if any child provider supports streaming
  • provider.supports_async is True if any child provider supports async
  • provider.name is "fallback"

Further Reading


Next Steps: Learn about usage tracking in the Usage Module.