Caching Strategies

Reduce API costs and latency by caching LLM responses.

Built-in PromptCache

PromptC includes a simple in-memory cache:

import { PromptCache, createCache } from "@mzhub/promptc";

// Create a cache
const cache = new PromptCache({
  maxSize: 1000,         // Max entries
  ttlMs: 60 * 60 * 1000  // 1 hour TTL
});

// Or use the factory (equivalent to the constructor above)
const cacheViaFactory = createCache({
  maxSize: 1000,
  ttlMs: 3600000  // 1 hour
});

Cache API

Method                   Description
get(key, input)          Get cached value or undefined
set(key, input, value)   Store value in cache
has(key, input)          Check if entry exists
clear()                  Clear all entries
size                     Number of cached entries
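
A quick tour of the remaining methods, as a sketch assuming the constructor options shown above:

import { PromptCache } from "@mzhub/promptc";

const cache = new PromptCache({ maxSize: 100 });

cache.set("summarizer", { text: "hello" }, { summary: "hi" });

cache.has("summarizer", { text: "hello" });  // true
cache.size;                                  // 1

cache.clear();
cache.size;                                  // 0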

Basic Usage

import { PromptCache } from "@mzhub/promptc";

const cache = new PromptCache({ maxSize: 500 });

async function extractWithCache(text) {
  // Check cache first (compare with undefined so falsy cached values still count as hits)
  const cached = cache.get("extractor", { text });
  if (cached !== undefined) {
    console.log("Cache hit!");
    return cached;
  }
  
  // Call LLM
  const result = await extractor.run({ text });
  
  // Cache the result
  cache.set("extractor", { text }, result);
  
  return result;
}
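
Calling it twice with the same text makes only one LLM call:

// First call hits the LLM; the second returns from the cache.
await extractWithCache("Annual report 2024...");
await extractWithCache("Annual report 2024...");  // logs "Cache hit!"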

Wrapper Pattern

Create a reusable caching wrapper:

function withCache(program, cache, cacheKey) {
  return async (input) => {
    const cached = cache.get(cacheKey, input);
    if (cached !== undefined) return cached;
    
    const result = await program.run(input);
    cache.set(cacheKey, input, result);
    return result;
  };
}

// Usage
const cache = new PromptCache({ maxSize: 1000 });

const cachedExtractor = withCache(extractor, cache, "extractor");
const cachedSummarizer = withCache(summarizer, cache, "summarizer");

// These will be cached
await cachedExtractor({ text: "..." });
await cachedSummarizer({ text: "..." });
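
One refinement worth considering: if two callers request the same uncached input concurrently, both miss and both call the LLM. A hypothetical coalescing wrapper (not part of promptc) shares the in-flight promise instead:

function withCoalescedCache(program, cache, cacheKey) {
  const inflight = new Map();  // serialized input -> pending promise

  return async (input) => {
    const cached = cache.get(cacheKey, input);
    if (cached !== undefined) return cached;

    const key = JSON.stringify(input);
    if (inflight.has(key)) return inflight.get(key);  // join the in-flight call

    const promise = program
      .run(input)
      .then((result) => {
        cache.set(cacheKey, input, result);
        return result;
      })
      .finally(() => inflight.delete(key));

    inflight.set(key, promise);
    return promise;
  };
}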

TTL (Time To Live)

Automatically expire cached entries:

const cache = new PromptCache({
  maxSize: 1000,
  ttlMs: 5 * 60 * 1000  // 5 minutes
});

// Entry expires after 5 minutes
cache.set("key", { input: "..." }, result);

// After 5 minutes, returns undefined
cache.get("key", { input: "..." });  // undefined
Memory Considerations
In-memory caching is fast but lost on restart. For persistent caching, use Redis or a database.

Redis Caching

For production, use Redis for persistent caching:

import Redis from "ioredis";
import { createHash } from "crypto";

const redis = new Redis(process.env.REDIS_URL);

// Hash the input so arbitrarily large payloads map to a fixed-size key.
// Note: JSON.stringify is sensitive to object key order; use a stable
// stringify if the same logical input can arrive with different key orders.
function hashInput(input) {
  return createHash("sha256")
    .update(JSON.stringify(input))
    .digest("hex");
}

async function extractWithRedis(text) {
  const cacheKey = `prompt:extractor:${hashInput({ text })}`;
  
  // Check Redis
  const cached = await redis.get(cacheKey);
  if (cached) {
    return JSON.parse(cached);
  }
  
  // Call LLM
  const result = await extractor.run({ text });
  
  // Cache with 1 hour TTL
  await redis.setex(cacheKey, 3600, JSON.stringify(result));
  
  return result;
}
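
The withCache wrapper above assumes a synchronous get. To reuse the same shape with Redis, an async-aware variant plus a small adapter works; this is a sketch, and the adapter names are hypothetical:

function withAsyncCache(program, cache, cacheKey) {
  return async (input) => {
    const cached = await cache.get(cacheKey, input);
    if (cached !== undefined) return cached;

    const result = await program.run(input);
    await cache.set(cacheKey, input, result);
    return result;
  };
}

// Adapter exposing Redis through the same get/set interface
const redisCache = {
  async get(key, input) {
    const raw = await redis.get(`prompt:${key}:${hashInput(input)}`);
    return raw ? JSON.parse(raw) : undefined;
  },
  async set(key, input, value) {
    await redis.setex(`prompt:${key}:${hashInput(input)}`, 3600, JSON.stringify(value));
  }
};

const extractorViaRedis = withAsyncCache(extractor, redisCache, "extractor");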

Cache Strategies

Strategy      Use Case                  Implementation
Exact match   Identical inputs          Hash the full input
Semantic      Similar inputs            Embed + nearest neighbor
Time-based    Fresh data needed         TTL expiration
Hybrid        Balance freshness/cost    TTL + exact match
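
The semantic row deserves elaboration. A minimal sketch, assuming a hypothetical embed(text) function (e.g. your provider's embeddings endpoint; promptc does not ship one):

function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

const entries = [];  // { vector, value } pairs

async function semanticGet(text, threshold = 0.95) {
  const vector = await embed(text);  // hypothetical embedding call
  let best, bestScore = -1;
  for (const entry of entries) {
    const score = cosineSimilarity(vector, entry.vector);
    if (score > bestScore) {
      best = entry;
      bestScore = score;
    }
  }
  // Only treat near-identical inputs as hits
  return bestScore >= threshold ? best.value : undefined;
}

async function semanticSet(text, value) {
  entries.push({ vector: await embed(text), value });
}

A linear scan is fine for small caches; beyond a few thousand entries, use a vector index instead.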

Cache Invalidation

// Clear all cached extractors
async function invalidateExtractorCache() {
  // For in-memory cache
  cache.clear();

  // For Redis (KEYS blocks the server; see the SCAN variant below)
  const keys = await redis.keys("prompt:extractor:*");
  if (keys.length > 0) {
    await redis.del(...keys);
  }
}
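
KEYS walks the entire keyspace in one blocking call. For production datasets, ioredis's scanStream iterates incrementally:

async function invalidateByPattern(pattern) {
  const stream = redis.scanStream({ match: pattern, count: 100 });
  for await (const keys of stream) {
    if (keys.length > 0) await redis.del(...keys);
  }
}

// Usage: await invalidateByPattern("prompt:extractor:*");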

// Invalidate when model changes
async function upgradeModel() {
  await invalidateExtractorCache();
  // Update provider with new model
}
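
An alternative to bulk deletion is to embed a model or prompt version in the cache key, so a new version simply misses the old entries and they age out via TTL. A sketch (the version constant is an assumption, not a promptc convention):

const MODEL_VERSION = "v2";  // bump whenever the model or prompt changes

function versionedKey(program, input) {
  return `prompt:${program}:${MODEL_VERSION}:${hashInput(input)}`;
}

// New versions never hit stale entries; old keys expire via their TTL,
// so no explicit deletion is required.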