Caching Strategies

Reduce API costs and latency by caching LLM responses.

Built-in PromptCache

PromptC includes a simple in-memory cache:

import { PromptCache, createCache } from "@mzhub/promptc";

// Create a cache
const cache = new PromptCache({
  maxSize: 1000,         // Max entries
  ttlMs: 60 * 60 * 1000  // 1 hour TTL
});

// Or use the factory (equivalent to the constructor above)
const cacheViaFactory = createCache({
  maxSize: 1000,
  ttlMs: 3600000  // 1 hour
});

Cache API

Method                   Description
get(key, input)          Get cached value or undefined
set(key, input, value)   Store value in cache
has(key, input)          Check if entry exists
clear()                  Clear all entries
size                     Number of cached entries
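
A quick tour of the remaining methods, as a sketch assuming the constructor options shown above:

import { PromptCache } from "@mzhub/promptc";

const cache = new PromptCache({ maxSize: 100 });

cache.set("summarizer", { text: "hello" }, { summary: "hi" });

cache.has("summarizer", { text: "hello" });  // true
cache.size;                                  // 1

cache.clear();
cache.size;                                  // 0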

Basic Usage

import { PromptCache } from "@mzhub/promptc";

const cache = new PromptCache({ maxSize: 500 });

async function extractWithCache(text) {
  // Check cache first (compare with undefined so falsy cached values still count as hits)
  const cached = cache.get("extractor", { text });
  if (cached !== undefined) {
    console.log("Cache hit!");
    return cached;
  }
  
  // Call LLM
  const result = await extractor.run({ text });
  
  // Cache the result
  cache.set("extractor", { text }, result);
  
  return result;
}
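
Calling it twice with the same text makes only one LLM call:

// First call hits the LLM; the second returns from the cache.
await extractWithCache("Annual report 2024...");
await extractWithCache("Annual report 2024...");  // logs "Cache hit!"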

Wrapper Pattern

Create a reusable caching wrapper:

function withCache(program, cache, cacheKey) {
  return async (input) => {
    const cached = cache.get(cacheKey, input);
    if (cached !== undefined) return cached;
    
    const result = await program.run(input);
    cache.set(cacheKey, input, result);
    return result;
  };
}

// Usage
const cache = new PromptCache({ maxSize: 1000 });

const cachedExtractor = withCache(extractor, cache, "extractor");
const cachedSummarizer = withCache(summarizer, cache, "summarizer");

// These will be cached
await cachedExtractor({ text: "..." });
await cachedSummarizer({ text: "..." });
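
One refinement worth considering: if two callers request the same uncached input concurrently, both miss and both call the LLM. A hypothetical coalescing wrapper (not part of promptc) shares the in-flight promise instead:

function withCoalescedCache(program, cache, cacheKey) {
  const inflight = new Map();  // serialized input -> pending promise

  return async (input) => {
    const cached = cache.get(cacheKey, input);
    if (cached !== undefined) return cached;

    const key = JSON.stringify(input);
    if (inflight.has(key)) return inflight.get(key);  // join the in-flight call

    const promise = program
      .run(input)
      .then((result) => {
        cache.set(cacheKey, input, result);
        return result;
      })
      .finally(() => inflight.delete(key));

    inflight.set(key, promise);
    return promise;
  };
}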

TTL (Time To Live)

Automatically expire cached entries:

const cache = new PromptCache({
  maxSize: 1000,
  ttlMs: 5 * 60 * 1000  // 5 minutes
});

// Entry expires after 5 minutes
cache.set("key", { input: "..." }, result);

// After 5 minutes, returns undefined
cache.get("key", { input: "..." });  // undefined
Memory Considerations
In-memory caching is fast but lost on restart. For persistent caching, use Redis or a database.

Redis Caching

For production, use Redis for persistent caching:

import Redis from "ioredis";
import { createHash } from "crypto";

const redis = new Redis(process.env.REDIS_URL);

// Hash the input so arbitrarily large payloads map to a fixed-size key.
// Note: JSON.stringify is sensitive to object key order; use a stable
// stringify if the same logical input can arrive with different key orders.
function hashInput(input) {
  return createHash("sha256")
    .update(JSON.stringify(input))
    .digest("hex");
}

async function extractWithRedis(text) {
  const cacheKey = `prompt:extractor:${hashInput({ text })}`;
  
  // Check Redis
  const cached = await redis.get(cacheKey);
  if (cached) {
    return JSON.parse(cached);
  }
  
  // Call LLM
  const result = await extractor.run({ text });
  
  // Cache with 1 hour TTL
  await redis.setex(cacheKey, 3600, JSON.stringify(result));
  
  return result;
}
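
The withCache wrapper above assumes a synchronous get. To reuse the same shape with Redis, an async-aware variant plus a small adapter works; this is a sketch, and the adapter names are hypothetical:

function withAsyncCache(program, cache, cacheKey) {
  return async (input) => {
    const cached = await cache.get(cacheKey, input);
    if (cached !== undefined) return cached;

    const result = await program.run(input);
    await cache.set(cacheKey, input, result);
    return result;
  };
}

// Adapter exposing Redis through the same get/set interface
const redisCache = {
  async get(key, input) {
    const raw = await redis.get(`prompt:${key}:${hashInput(input)}`);
    return raw ? JSON.parse(raw) : undefined;
  },
  async set(key, input, value) {
    await redis.setex(`prompt:${key}:${hashInput(input)}`, 3600, JSON.stringify(value));
  }
};

const extractorViaRedis = withAsyncCache(extractor, redisCache, "extractor");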

Cache Strategies

Strategy      Use Case                  Implementation
Exact match   Identical inputs          Hash the full input
Semantic      Similar inputs            Embed + nearest neighbor
Time-based    Fresh data needed         TTL expiration
Hybrid        Balance freshness/cost    TTL + exact match
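
The semantic row deserves elaboration. A minimal sketch, assuming a hypothetical embed(text) function (e.g. your provider's embeddings endpoint; promptc does not ship one):

function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const entries = [];  // { vector, value } pairs

async function semanticGet(text, threshold = 0.95) {
  const vector = await embed(text);  // hypothetical embedding call
  let best, bestScore = -1;
  for (const entry of entries) {
    const score = cosineSimilarity(vector, entry.vector);
    if (score > bestScore) {
      best = entry;
      bestScore = score;
    }
  }
  // Only treat near-identical inputs as hits
  return bestScore >= threshold ? best.value : undefined;
}

async function semanticSet(text, value) {
  entries.push({ vector: await embed(text), value });
}

A linear scan is fine for small caches; beyond a few thousand entries, use a vector index instead.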

Cache Invalidation

// Clear all cached extractors
async function invalidateExtractorCache() {
  // For in-memory cache
  cache.clear();

  // For Redis (KEYS blocks the server; see the SCAN variant below)
  const keys = await redis.keys("prompt:extractor:*");
  if (keys.length > 0) {
    await redis.del(...keys);
  }
}
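
KEYS walks the entire keyspace in one blocking call. For production datasets, ioredis's scanStream iterates incrementally:

async function invalidateByPattern(pattern) {
  const stream = redis.scanStream({ match: pattern, count: 100 });
  for await (const keys of stream) {
    if (keys.length > 0) await redis.del(...keys);
  }
}

// Usage: await invalidateByPattern("prompt:extractor:*");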

// Invalidate when model changes
async function upgradeModel() {
  await invalidateExtractorCache();
  // Update provider with new model
}
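
An alternative to bulk deletion is to embed a model or prompt version in the cache key, so a new version simply misses the old entries and they age out via TTL. A sketch (the version constant is an assumption, not a promptc convention):

const MODEL_VERSION = "v2";  // bump whenever the model or prompt changes

function versionedKey(program, input) {
  return `prompt:${program}:${MODEL_VERSION}:${hashInput(input)}`;
}

// New versions never hit stale entries; old keys expire via their TTL,
// so no explicit deletion is required.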