Name Extractor Example

This example shows the complete workflow: define a schema, compile with BootstrapFewShot, save the result, and use it in production.

What You'll Learn

  • How to define a schema for extracting names from text
  • How to prepare training data
  • How to compile with progress tracking
  • How to save and reuse the compiled program

1. Define the Schema

First, we define what our program should do: take text as input and output an array of names found in the text.

import {
  defineSchema,
  ChainOfThought,
  BootstrapFewShot,
  exactMatch,
  createProvider,
  createCompiledProgram,
  z,
} from "@mzhub/promptc";

// Define the schema
const NameExtractor = defineSchema({
  description: "Extract proper names of people from text",
  inputs: { text: z.string() },
  outputs: { names: z.array(z.string()) },
});

2. Create Provider & Program

Create an LLM provider, then pass it together with the schema to a ChainOfThought program, which prompts the model to reason step by step before answering.

const provider = createProvider("openai", {
  apiKey: process.env.OPENAI_API_KEY,
});

const program = new ChainOfThought(NameExtractor, provider);

3. Prepare Training Data

Provide examples of correct input/output pairs. The compiler will use these to find the best few-shot examples.

const trainset = [
  {
    input: { text: "Bill Gates founded Microsoft with Paul Allen." },
    output: { names: ["Bill Gates", "Paul Allen"] },
  },
  {
    input: { text: "Elon Musk runs Tesla and SpaceX." },
    output: { names: ["Elon Musk"] },
  },
  {
    input: { text: "Jeff Bezos started Amazon in his garage." },
    output: { names: ["Jeff Bezos"] },
  },
  {
    input: { text: "Satya Nadella is the CEO of Microsoft." },
    output: { names: ["Satya Nadella"] },
  },
  {
    input: { text: "Tim Cook leads Apple after Steve Jobs." },
    output: { names: ["Tim Cook", "Steve Jobs"] },
  },
  {
    input: { text: "Mark Zuckerberg created Facebook." },
    output: { names: ["Mark Zuckerberg"] },
  },
  {
    input: { text: "Sundar Pichai runs Google and Alphabet." },
    output: { names: ["Sundar Pichai"] },
  },
];

Training Data Quality
Include edge cases such as texts with multiple names, a single name, or no names at all. A more diverse training set usually generalizes better to unseen inputs.
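
One way to check the advice above is to count how many examples fall into each case before compiling. The following is a minimal sketch, not part of the library: `summarizeCoverage` is a hypothetical helper, and the third sample entry (a text with no names) is an invented edge case that does not appear in the trainset above.

```typescript
// Shape of one training example, mirroring the trainset above.
type Example = { input: { text: string }; output: { names: string[] } };

// Hypothetical helper: count examples by how many names they contain.
function summarizeCoverage(examples: Example[]) {
  const summary = { none: 0, single: 0, multiple: 0 };
  for (const { output } of examples) {
    const n = output.names.length;
    if (n === 0) summary.none += 1;
    else if (n === 1) summary.single += 1;
    else summary.multiple += 1;
  }
  return summary;
}

const sample: Example[] = [
  {
    input: { text: "Bill Gates founded Microsoft with Paul Allen." },
    output: { names: ["Bill Gates", "Paul Allen"] },
  },
  {
    input: { text: "Elon Musk runs Tesla and SpaceX." },
    output: { names: ["Elon Musk"] },
  },
  // Invented edge case (not in the original trainset): no person names.
  {
    input: { text: "The company reported record revenue this quarter." },
    output: { names: [] },
  },
];

const coverage = summarizeCoverage(sample);
console.log(coverage);

// Warn about any bucket that has no examples at all.
const missing = Object.entries(coverage)
  .filter(([, count]) => count === 0)
  .map(([bucket]) => bucket);
if (missing.length > 0) {
  console.warn(`No examples for: ${missing.join(", ")}`);
}
```

Run against the seven-example trainset above, this check would report an empty `none` bucket, which is a hint to add at least one text without names.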

4. Compile with Progress Tracking

Run the compiler with an onProgress callback to monitor the optimization process.

const compiler = new BootstrapFewShot(exactMatch());

const result = await compiler.compile(program, trainset, {
  candidates: 10,
  concurrency: 3,
  onProgress: ({ candidatesEvaluated, totalCandidates, currentBestScore }) => {
    console.log(
      `Progress: ${candidatesEvaluated}/${totalCandidates} | Best: ${(
        currentBestScore * 100
      ).toFixed(1)}%`
    );
  },
});

console.log("Compilation complete!");
console.log(`Best score: ${(result.meta.score * 100).toFixed(1)}%`);
console.log(`Tokens used: ${result.meta.tokenUsage.totalTokens}`);
console.log(`API calls: ${result.meta.tokenUsage.calls}`);
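
The score printed above depends on how strictly predictions are compared with the expected output. This page does not specify how exactMatch() compares arrays, so the sketch below is only an illustration of what a strict comparison might imply: `strictNamesMatch` and `averageScore` are hypothetical helpers, not the library's implementation, and the assumption that the score is a per-example average is mine.

```typescript
// Hypothetical helper, NOT the library's exactMatch:
// a strict, order-sensitive comparison that returns 1 or 0.
function strictNamesMatch(predicted: string[], expected: string[]): number {
  if (predicted.length !== expected.length) return 0;
  return predicted.every((name, i) => name === expected[i]) ? 1 : 0;
}

// Average the per-example scores into a value between 0 and 1
// (assumed to resemble the score that is multiplied by 100 above).
function averageScore(
  pairs: Array<{ predicted: string[]; expected: string[] }>
): number {
  if (pairs.length === 0) return 0;
  const total = pairs.reduce(
    (sum, p) => sum + strictNamesMatch(p.predicted, p.expected),
    0
  );
  return total / pairs.length;
}

const score = averageScore([
  // Identical arrays: counts as a match.
  { predicted: ["Bill Gates", "Paul Allen"], expected: ["Bill Gates", "Paul Allen"] },
  // Same names, different order: no match under a strict comparison.
  { predicted: ["Paul Allen", "Bill Gates"], expected: ["Bill Gates", "Paul Allen"] },
  // Identical single name: counts as a match.
  { predicted: ["Elon Musk"], expected: ["Elon Musk"] },
  // Partial name: no match.
  { predicted: ["Musk"], expected: ["Elon Musk"] },
]);

console.log(`Score: ${(score * 100).toFixed(1)}%`); // prints "Score: 50.0%"
```

If a metric behaves this strictly, it helps to keep the order of names in the training outputs consistent, for example order of appearance in the text.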

5. Save & Use in Production

Save the compiled configuration as JSON, then load it in production for fast, optimized inference.

import { writeFileSync } from "fs";

// Save the compiled config
writeFileSync("name-extractor.json", JSON.stringify(result, null, 2));
console.log("Saved to name-extractor.json");

// Create a production-ready compiled program
const compiled = createCompiledProgram(program, result);

// Test it
const testResult = await compiled.run({
  text: "Jensen Huang founded NVIDIA.",
});
console.log("Names:", testResult.result.names);
// Example output (model-dependent): Names: [ 'Jensen Huang' ]

Version Control
Commit the JSON file to version control. This makes results reproducible and lets you roll back to a previous prompt version if needed.
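
The snippet above reuses the in-memory result; in a separate production process the saved file has to be read back first. The following is a minimal sketch using only Node's built-in modules. `loadCompiledConfig` and the `SavedResult` type are hypothetical, the temporary stand-in file exists only so the sketch runs on its own, and whether createCompiledProgram accepts the parsed JSON unchanged is an assumption to verify against the library.

```typescript
import { mkdtempSync, readFileSync, writeFileSync } from "fs";
import { tmpdir } from "os";
import { join } from "path";

// Minimal view of the saved result: only the field this sketch checks.
type SavedResult = { meta: { score: number } } & Record<string, unknown>;

// Hypothetical helper: read a saved compilation result and sanity-check it.
function loadCompiledConfig(path: string): SavedResult {
  const parsed = JSON.parse(readFileSync(path, "utf8"));
  if (typeof parsed?.meta?.score !== "number") {
    throw new Error(
      `${path} does not look like a compiled result (missing meta.score)`
    );
  }
  return parsed as SavedResult;
}

// Stand-in file so the sketch is runnable; real code would read
// the committed name-extractor.json instead.
const dir = mkdtempSync(join(tmpdir(), "promptc-"));
const file = join(dir, "name-extractor.json");
writeFileSync(file, JSON.stringify({ meta: { score: 0.9 } }, null, 2));

const saved = loadCompiledConfig(file);
console.log(`Loaded config, score: ${(saved.meta.score * 100).toFixed(1)}%`);

// In production, hand the loaded result to the library
// (assuming the parsed JSON is accepted as-is):
// const compiled = createCompiledProgram(program, saved);
```

Failing fast on a malformed file keeps a bad commit or a truncated write from surfacing later as a confusing inference error.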

Full Example

name-extractor.ts
import { writeFileSync } from "fs";
import {
  defineSchema,
  ChainOfThought,
  BootstrapFewShot,
  exactMatch,
  createProvider,
  createCompiledProgram,
  z,
} from "@mzhub/promptc";

// 1. Define the schema
const NameExtractor = defineSchema({
  description: "Extract proper names of people from text",
  inputs: { text: z.string() },
  outputs: { names: z.array(z.string()) },
});

// 2. Create provider and program
const provider = createProvider("openai", {
  apiKey: process.env.OPENAI_API_KEY,
});
const program = new ChainOfThought(NameExtractor, provider);

// 3. Training data
const trainset = [
  { input: { text: "Bill Gates founded Microsoft with Paul Allen." }, output: { names: ["Bill Gates", "Paul Allen"] } },
  { input: { text: "Elon Musk runs Tesla and SpaceX." }, output: { names: ["Elon Musk"] } },
  { input: { text: "Jeff Bezos started Amazon in his garage." }, output: { names: ["Jeff Bezos"] } },
  { input: { text: "Satya Nadella is the CEO of Microsoft." }, output: { names: ["Satya Nadella"] } },
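  { input: { text: "Tim Cook leads Apple after Steve Jobs." }, output: { names: ["Tim Cook", "Steve Jobs"] } },
  { input: { text: "Mark Zuckerberg created Facebook." }, output: { names: ["Mark Zuckerberg"] } },
  { input: { text: "Sundar Pichai runs Google and Alphabet." }, output: { names: ["Sundar Pichai"] } },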
];

async function main() {
  // 4. Compile
  const compiler = new BootstrapFewShot(exactMatch());
  const result = await compiler.compile(program, trainset, {
    candidates: 10,
    concurrency: 3,
    onProgress: ({ candidatesEvaluated, totalCandidates, currentBestScore }) => {
      console.log(`Progress: ${candidatesEvaluated}/${totalCandidates} | Best: ${(currentBestScore * 100).toFixed(1)}%`);
    },
  });

  console.log(`\nBest score: ${(result.meta.score * 100).toFixed(1)}%`);

  // 5. Save and use
  writeFileSync("name-extractor.json", JSON.stringify(result, null, 2));
  
  const compiled = createCompiledProgram(program, result);
  const output = await compiled.run({ text: "Jensen Huang founded NVIDIA." });
  console.log("Names:", output.result.names);
}

main().catch(console.error);