# CachingConfig

Configuration for prompt caching to reduce costs and latency.
CachingConfig controls Anthropic prompt caching across direct Anthropic, Azure Anthropic, and compatible Claude models on AWS Bedrock. Appam maps the same high-level setting onto the provider-specific caching mechanism underneath.
## Struct Definition

```rust
pub struct CachingConfig {
    pub enabled: bool,
    pub ttl: CacheTTL,
}

pub enum CacheTTL {
    FiveMinutes, // "5m" - default, refreshed on use
    OneHour,     // "1h" - 2x write cost
}
```

## Fields
### `enabled`

Whether prompt caching is active.

### `ttl`

Cache time-to-live duration:

- `CacheTTL::FiveMinutes` -- 5-minute cache (default). Refreshed on each use, so active conversations maintain the cache automatically.
- `CacheTTL::OneHour` -- 1-hour cache. Higher write cost (2x base input token price) but longer retention.
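The comments on the enum variants note the wire values `"5m"` and `"1h"`. A minimal sketch of how such a variant-to-wire-string mapping might look; the `as_str` helper here is hypothetical and not part of Appam's documented API:

```rust
// Hypothetical sketch: mapping CacheTTL variants to the wire strings
// ("5m" / "1h") noted in the enum comments. Illustrative only.
#[derive(Clone, Copy, PartialEq, Debug)]
pub enum CacheTTL {
    FiveMinutes,
    OneHour,
}

impl CacheTTL {
    /// Wire value that would be sent in the request's cache settings.
    fn as_str(self) -> &'static str {
        match self {
            CacheTTL::FiveMinutes => "5m",
            CacheTTL::OneHour => "1h",
        }
    }
}

fn main() {
    assert_eq!(CacheTTL::FiveMinutes.as_str(), "5m");
    assert_eq!(CacheTTL::OneHour.as_str(), "1h");
}
```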
## Default

```rust
impl Default for CachingConfig {
    fn default() -> Self {
        Self {
            enabled: true,
            ttl: CacheTTL::FiveMinutes,
        }
    }
}
```

## Provider Support
- Anthropic -- Supported. Appam maps `CachingConfig` to Anthropic's top-level `cache_control` field.
- Azure Anthropic -- Supported, with the same request semantics as direct Anthropic.
- Bedrock -- Supported for compatible Claude models. Appam injects block-level `cache_control` checkpoints into supported Anthropic fields, because Bedrock uses explicit cache checkpoints rather than Anthropic's top-level helper.
## Cache Behavior

- Prefix Matching: Automatic prefix lookup (up to ~20 blocks)
- Direct Anthropic / Azure Anthropic: Anthropic applies the top-level marker to the last cacheable block in the request
- Bedrock: Appam injects block-level checkpoints at the end of the supported `system`, `messages`, and `tools` sections present in the request
- Refresh: 5-minute caches refresh on each hit, effectively staying alive for the duration of an active conversation
## Pricing
| Operation | Cost |
|---|---|
| Cache writes (5m) | 1.25x base input token price |
| Cache writes (1h) | 2x base input token price |
| Cache reads | 0.1x base input token price |
| Cache misses | Regular input token price |
Cache usage is tracked in `AggregatedUsage`:

- `cache_creation_input_tokens` -- Tokens used to create cache entries
- `cache_read_input_tokens` -- Tokens read from cache (where cost savings come from)
## Builder Usage

```rust
use appam::prelude::*;
use appam::llm::anthropic::CachingConfig;

let agent = AgentBuilder::new("my-agent")
    .model("claude-sonnet-4-5")
    .caching(CachingConfig::default())
    .system_prompt("You are a helpful assistant with cached context.")
    .build()?;
```

## TOML Configuration
```toml
[anthropic.caching]
enabled = true
ttl = "5m"
```

## Source
Defined in `src/llm/anthropic/config.rs`.