Provider adapters

Chapter 1 built an adapter from scratch for a hypothetical non-standard API, then introduced the CompletionsAdapter for OpenAI-compatible providers. This chapter goes deeper on the CompletionsProvider pattern that most real providers use.

Two paths to an adapter

Path 1: Implement ModelAdapter/ModelSession/ModelTurn directly
  └── For non-standard APIs (custom REST, gRPC, WebSocket)
  └── Full control, full responsibility
  └── ~200-500 lines of translation code

Path 2: Implement CompletionsProvider (via agentkit-adapter-completions)
  └── For OpenAI-compatible chat completions APIs
  └── ~50-100 lines: config + hooks
  └── Transcript conversion, tool serialization, streaming, error handling — all handled

Most providers speak the OpenAI chat completions format (or a close variant). For these, CompletionsProvider is the right choice: the generic adapter behind it supplies the roughly 1000 lines of translation code that every completions-compatible adapter needs, so the provider only contributes config and hooks.

The CompletionsProvider trait

pub trait CompletionsProvider: Send + Sync + Clone {
    // Request parameters; serialized and merged into the request body.
    type Config: Serialize + Clone + Send + Sync;

    fn provider_name(&self) -> &str;
    fn endpoint_url(&self) -> &str;
    fn config(&self) -> &Self::Config;

    // Hooks: the defaults pass everything through unchanged.

    // Add auth headers or other custom headers to the outgoing request.
    fn preprocess_request(&self, builder: RequestBuilder) -> RequestBuilder { builder }
    // Map normalized cache requests onto the provider's request body.
    fn apply_prompt_cache(&self, _body: &mut Map<String, Value>, _request: &TurnRequest) -> Result<(), LoopError> { Ok(()) }
    // Check for API errors reported inside 200 responses.
    fn preprocess_response(&self, _status: StatusCode, _body: &str) -> Result<(), LoopError> { Ok(()) }
    // Extract provider-specific fields from the raw response.
    fn postprocess_response(&self, _usage: &mut Option<Usage>, _metadata: &mut MetadataMap, _raw: &Value) {}
}

The trait has three required methods (name, URL, config) and four optional hooks. Here’s what each hook is for:

Request lifecycle with hooks:

  TurnRequest
       │
       ▼
  Build JSON body (transcript → messages, tools → tools array)
  Merge Config fields into body
       │
       ├── preprocess_request(builder) ← add auth headers, custom headers
       │
       ├── apply_prompt_cache(body, request) ← map normalized cache requests
       │
       ▼
  HTTP POST to endpoint_url()
       │
       ▼
  Read response
       │
       ├── preprocess_response(status, body) ← check for API errors in 200 responses
       │
       ▼
  Parse into ModelTurnEvents
       │
       ├── postprocess_response(usage, metadata, raw) ← extract provider-specific fields
       │
       ▼
  Return events to loop
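The hook order in the diagram can be sketched with a std-only stand-in. This is a minimal illustration, not the crate's code: MiniProvider, NoAuth, WithAuth, run_turn, the header list, the u16 status, the String error type, and the example.invalid URL are all assumptions made for the sketch. The real trait uses RequestBuilder, StatusCode, and LoopError, and the HTTP call is replaced here by a canned response.

```rust
/// Hypothetical, simplified stand-in for CompletionsProvider (std only).
trait MiniProvider {
    fn endpoint_url(&self) -> &str;

    // Hooks: the defaults pass everything through unchanged.
    fn preprocess_request(&self, headers: Vec<(String, String)>) -> Vec<(String, String)> {
        headers
    }
    fn preprocess_response(&self, _status: u16, _body: &str) -> Result<(), String> {
        Ok(())
    }
}

/// A provider that needs no hooks at all.
struct NoAuth;

impl MiniProvider for NoAuth {
    fn endpoint_url(&self) -> &str {
        "https://example.invalid/v1/chat/completions"
    }
}

/// A provider that adds a bearer token and rejects error payloads sent with HTTP 200.
struct WithAuth {
    api_key: String,
}

impl MiniProvider for WithAuth {
    fn endpoint_url(&self) -> &str {
        "https://example.invalid/v1/chat/completions"
    }

    fn preprocess_request(&self, mut headers: Vec<(String, String)>) -> Vec<(String, String)> {
        headers.push(("Authorization".to_string(), format!("Bearer {}", self.api_key)));
        headers
    }

    fn preprocess_response(&self, status: u16, body: &str) -> Result<(), String> {
        // Crude substring check for the sketch; a real hook would parse the JSON.
        if status == 200 && body.contains("\"error\"") {
            return Err(format!("provider error in 200 response: {}", body));
        }
        Ok(())
    }
}

/// Hypothetical driver: runs the hooks in the order shown in the diagram.
/// `canned` is the (status, body) pair a real HTTP POST would have returned.
fn run_turn<P: MiniProvider>(p: &P, canned: (u16, &str)) -> Result<Vec<(String, String)>, String> {
    let base = vec![("Content-Type".to_string(), "application/json".to_string())];
    let headers = p.preprocess_request(base);
    let _url = p.endpoint_url(); // the real adapter would POST to this URL
    p.preprocess_response(canned.0, canned.1)?;
    Ok(headers)
}
```

Calling run_turn(&NoAuth, (200, "{}")) returns only the default header, while a WithAuth provider adds Authorization and turns a 200 response whose body contains an "error" key into an Err.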

What CompletionsAdapter handles

The generic CompletionsAdapter<P> handles all the common work:

| Concern | Implementation |
|---|---|
| Vec<Item> → messages[] | Maps all ItemKind and Part variants |
| Vec<ToolSpec> → tools[] | Converts name, description, JSON Schema |
| Multimodal content encoding | Images as image_url, audio as input_audio |
| P::Config → request body | Serialize and merge fields |
| SSE stream parsing | Chunk reassembly, delta emission |
| Tool call accumulation | Collect streaming JSON fragments into complete calls |
| finish_reason → FinishReason | Map provider strings to enum variants |
| usage → Usage | Map token counts and cost |
| Cancellation | Race HTTP future against TurnCancellation |
| Error status codes | Convert 4xx/5xx into LoopError |
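Tool call accumulation is the least obvious row in that table, so here is a minimal sketch of the idea. It assumes the streaming format common to OpenAI-style APIs, where each fragment carries an index plus an optional id, an optional name, and a piece of the arguments string. ToolCallDelta, ToolCall, and ToolCallAccumulator are hypothetical names for this sketch, not the adapter's internal types.

```rust
use std::collections::BTreeMap;

/// Hypothetical shape of one streamed tool-call fragment.
struct ToolCallDelta {
    index: usize,
    id: Option<String>,
    name: Option<String>,
    arguments: String,
}

/// A tool call reassembled from its fragments.
#[derive(Debug, Default, PartialEq)]
struct ToolCall {
    id: String,
    name: String,
    arguments: String,
}

/// Collects fragments keyed by their index in the stream.
#[derive(Default)]
struct ToolCallAccumulator {
    calls: BTreeMap<usize, ToolCall>,
}

impl ToolCallAccumulator {
    /// Merges one fragment into the call with the same index.
    fn push(&mut self, delta: ToolCallDelta) {
        let call = self.calls.entry(delta.index).or_default();
        if let Some(id) = delta.id {
            call.id = id;
        }
        if let Some(name) = delta.name {
            call.name = name;
        }
        // Argument JSON arrives in arbitrary slices; concatenate in arrival order.
        call.arguments.push_str(&delta.arguments);
    }

    /// Returns the reassembled calls in index order.
    fn finish(self) -> Vec<ToolCall> {
        self.calls.into_values().collect()
    }
}
```

Fragments for different calls may interleave, which is why the sketch keys on the index rather than on arrival order. Only after finish() is the arguments string expected to be parseable JSON.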

The Config associated type

The Config type is where providers differ most. Each provider has different parameter names and supported options:

| Provider | max_tokens field | Extra fields |
|---|---|---|
| OpenAI | max_completion_tokens | frequency_penalty, presence_penalty |
| Ollama | num_predict | top_k |
| Mistral | max_tokens | |
| Groq | max_completion_tokens | |
| vLLM | max_tokens | |

By making Config an associated type with Serialize, each provider declares exactly the fields it supports with their correct names. The adapter serializes the struct and merges it into the request body — no field name mapping needed.
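To make the "no field name mapping" point concrete, here is a std-only sketch. It is an assumption-heavy stand-in: the RequestFields trait plays the role of serde's Serialize so the example needs no external crates, the body is a flat map of strings rather than a JSON object, and OpenAiLike, OllamaLike, and merge_config are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for `Serialize`: a config lists the body fields it sets.
trait RequestFields {
    fn fields(&self) -> Vec<(&'static str, String)>;
}

/// Each config struct uses its provider's own field names.
struct OpenAiLike {
    max_completion_tokens: Option<u32>,
    presence_penalty: Option<f32>,
}

struct OllamaLike {
    num_predict: Option<u32>,
    top_k: Option<u32>,
}

impl RequestFields for OpenAiLike {
    fn fields(&self) -> Vec<(&'static str, String)> {
        let mut out = Vec::new();
        // Unset options are skipped, in the spirit of skip_serializing_if.
        if let Some(v) = self.max_completion_tokens {
            out.push(("max_completion_tokens", v.to_string()));
        }
        if let Some(v) = self.presence_penalty {
            out.push(("presence_penalty", v.to_string()));
        }
        out
    }
}

impl RequestFields for OllamaLike {
    fn fields(&self) -> Vec<(&'static str, String)> {
        let mut out = Vec::new();
        if let Some(v) = self.num_predict {
            out.push(("num_predict", v.to_string()));
        }
        if let Some(v) = self.top_k {
            out.push(("top_k", v.to_string()));
        }
        out
    }
}

/// The generic side: merge whatever the config declares into the body.
/// It never needs to know which provider it is talking to.
fn merge_config<C: RequestFields>(body: &mut BTreeMap<String, String>, config: &C) {
    for (key, value) in config.fields() {
        body.insert(key.to_string(), value);
    }
}
```

Merging an OllamaLike with num_predict set adds a num_predict entry and nothing else. The generic code contains no table that translates a canonical max_tokens into each provider's spelling.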

Building a provider: the pattern

Every provider crate follows the same structure:

agentkit-provider-{name}/
  src/lib.rs
    ├── {Name}Config         // User-facing config (new, with_temperature, from_env, etc.)
    ├── {Name}RequestConfig  // Serializable request fields (#[serde(skip_serializing_if)])
    ├── {Name}Provider       // CompletionsProvider impl
    └── {Name}Adapter        // Newtype over CompletionsAdapter<{Name}Provider>
                             // Implements ModelAdapter by delegation
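The last entry in that layout, a newtype that implements the adapter trait by delegation, might look roughly like this. The sketch is std-only and every name in it (MiniAdapter, Named, GenericAdapter, ExampleProvider, ExampleAdapter, describe) is a hypothetical stand-in; the real ModelAdapter trait has a different surface.

```rust
/// Hypothetical, minimal stand-in for the ModelAdapter trait.
trait MiniAdapter {
    fn describe(&self) -> String;
}

/// Hypothetical stand-in for the provider trait.
trait Named {
    fn provider_name(&self) -> &str;
}

/// Stand-in for the generic CompletionsAdapter<P>.
struct GenericAdapter<P> {
    provider: P,
}

impl<P: Named> MiniAdapter for GenericAdapter<P> {
    fn describe(&self) -> String {
        format!("completions adapter for {}", self.provider.provider_name())
    }
}

struct ExampleProvider;

impl Named for ExampleProvider {
    fn provider_name(&self) -> &str {
        "example"
    }
}

/// The newtype a provider crate would export, hiding the generic parameter.
struct ExampleAdapter(GenericAdapter<ExampleProvider>);

impl ExampleAdapter {
    fn new() -> Self {
        ExampleAdapter(GenericAdapter { provider: ExampleProvider })
    }
}

impl MiniAdapter for ExampleAdapter {
    // Pure delegation: the newtype adds a friendly type name, not behavior.
    fn describe(&self) -> String {
        self.0.describe()
    }
}
```

The benefit of the newtype is that users write a concrete type such as ExampleAdapter in signatures instead of spelling out the generic adapter and its provider parameter.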

The user-facing API:

let adapter = OllamaAdapter::new(
    OllamaConfig::new("llama3.1:8b")
        .with_temperature(0.0)
        .with_num_predict(4096),
)?;

let agent = Agent::builder()
    .model(adapter)
    .build()?;

Available providers

agentkit ships six provider crates:

| Crate | Auth | Hooks used |
|---|---|---|
| agentkit-provider-openrouter | Bearer + headers | auth, cache mapping, error check, cost |
| agentkit-provider-openai | Bearer | auth, cache mapping |
| agentkit-provider-ollama | None | None |
| agentkit-provider-vllm | Optional Bearer | preprocess_request (optional auth) |
| agentkit-provider-groq | Bearer | preprocess_request (auth) |
| agentkit-provider-mistral | Bearer | preprocess_request (auth) |

Ollama is the simplest — no auth, no hooks. OpenRouter is the most complex — it uses auth headers, prompt-cache mapping, 200-with-error handling, and response enrichment.

When to implement ModelAdapter directly

Use the raw traits when:

  • The provider doesn’t speak the OpenAI chat completions format
  • The provider uses WebSocket or gRPC instead of HTTP
  • The provider has server-side session state
  • You need streaming behavior that SSE doesn’t support

For WebSocket-based providers:

  • start_session opens the connection
  • begin_turn sends a continuation frame (not the full transcript)
  • next_event reads from the live connection
  • Drop handles session cleanup
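The four responsibilities listed for WebSocket-based providers can be sketched with an in-memory stand-in for the connection. Nothing here is the real trait surface: FakeConnection, StatefulSession, the "continue:" frame format, and the one-reply-per-turn behavior are assumptions chosen to keep the example std-only and synchronous.

```rust
use std::collections::VecDeque;

/// Hypothetical in-memory stand-in for a live WebSocket connection.
#[derive(Default)]
struct FakeConnection {
    sent: Vec<String>,
    inbox: VecDeque<String>,
}

/// Hypothetical session for a provider that keeps the transcript server-side.
struct StatefulSession {
    conn: FakeConnection,
    turns: usize,
}

impl StatefulSession {
    /// start_session: open the connection once and keep it for the session.
    fn start() -> Self {
        StatefulSession { conn: FakeConnection::default(), turns: 0 }
    }

    /// begin_turn: send only the new input, not the full transcript.
    fn begin_turn(&mut self, new_input: &str) {
        self.turns += 1;
        self.conn.sent.push(format!("continue:{}:{}", self.turns, new_input));
        // The fake server answers each turn with a single event.
        self.conn.inbox.push_back(format!("reply to turn {}", self.turns));
    }

    /// next_event: read from the live connection; None once the turn is drained.
    fn next_event(&mut self) -> Option<String> {
        self.conn.inbox.pop_front()
    }
}

impl Drop for StatefulSession {
    fn drop(&mut self) {
        // A real session would send a close frame here; the fake just discards
        // anything still queued.
        self.conn.inbox.clear();
    }
}
```

The contrast with the completions path is in begin_turn: each frame carries only the new input, because the server already holds the earlier turns.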

Testing adapters

Whether you use CompletionsProvider or implement the raw traits, the normalization contract is the same. Test these guarantees:

  1. Text completion → correct Delta sequence ending with CommitPart and Finished
  2. Tool calls → ToolCallPart with valid IDs and parseable JSON input
  3. Multiple tool calls → one ToolCall event per call
  4. Token limit → FinishReason::MaxTokens
  5. Cancellation → clean LoopError::Cancelled
  6. Usage → non-zero, plausible token counts

For CompletionsProvider implementations, you mostly need to test the hooks — the generic adapter handles everything else. Mock the HTTP layer with a test server that returns known SSE responses.
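As a sketch of how the first guarantee might be asserted, the helper below checks an event sequence for the shape "one or more deltas, then CommitPart, then Finished". The Event enum is a reduced, hypothetical version of the real turn events, and check_text_completion is an illustrative name, not part of agentkit.

```rust
/// Hypothetical, reduced event type; the real events carry more data.
#[derive(Debug, Clone, PartialEq)]
enum Event {
    Delta(String),
    CommitPart,
    Finished,
}

/// Checks guarantee 1 and returns the concatenated text on success.
fn check_text_completion(events: &[Event]) -> Result<String, String> {
    let n = events.len();
    if n < 3 {
        return Err("expected at least Delta, CommitPart, Finished".to_string());
    }
    if events[n - 1] != Event::Finished {
        return Err("last event must be Finished".to_string());
    }
    if events[n - 2] != Event::CommitPart {
        return Err("CommitPart must come directly before Finished".to_string());
    }
    let mut text = String::new();
    for event in &events[..n - 2] {
        match event {
            Event::Delta(chunk) => text.push_str(chunk),
            other => return Err(format!("unexpected event before CommitPart: {:?}", other)),
        }
    }
    Ok(text)
}
```

In an adapter test, the events would come from driving the adapter against the mock SSE server, and the returned text would be compared with the text encoded in the canned response.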

Crates: agentkit-adapter-completions (the generic adapter) and agentkit-provider-* (provider-specific implementations).