QA System Example

Build a question-answering system that shows its reasoning before answering. This example uses Chain of Thought to elicit step-by-step thinking.

What You'll Learn

  • How to build context-aware QA with multiple inputs
  • How Chain of Thought exposes the model's reasoning
  • How to use partialMatch for flexible evaluation
  • How to output confidence scores

1. Define the Schema

Our QA system takes a context paragraph and a question, then outputs an answer with a confidence score.

import {
  defineSchema,
  ChainOfThought,
  BootstrapFewShot,
  partialMatch,
  createProvider,
  z,
} from "@mzhub/promptc";

const QASchema = defineSchema({
  description: "Answer questions based on provided context. Think step by step.",
  inputs: {
    context: z.string().describe("The text containing information"),
    question: z.string().describe("The question to answer"),
  },
  outputs: {
    answer: z.string().describe("The answer extracted from context"),
    confidence: z.number().min(0).max(1).describe("Confidence in the answer"),
  },
});

Field Descriptions
Adding .describe() to your Zod fields helps the LLM understand what each field represents, leading to better outputs.
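
Descriptions also pay off as a schema grows. Below is a hypothetical variant of the schema (the supportingText field is illustrative only and is not used elsewhere in this example) that asks the model to quote its evidence:

const QASchemaWithEvidence = defineSchema({
  description: "Answer questions based on provided context. Think step by step.",
  inputs: {
    context: z.string().describe("The text containing information"),
    question: z.string().describe("The question to answer"),
  },
  outputs: {
    answer: z.string().describe("The answer extracted from context"),
    confidence: z.number().min(0).max(1).describe("Confidence in the answer"),
    // Illustrative extra field: the sentence the answer was drawn from
    supportingText: z.string().describe("The context sentence that supports the answer"),
  },
});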

2. Create the Program

ChainOfThought adds a reasoning step, which is especially useful for QA tasks that require understanding context.

const provider = createProvider("openai", {
  apiKey: process.env.OPENAI_API_KEY,
});

const qaProgram = new ChainOfThought(QASchema, provider);
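
Since the key is read from the environment, a small guard placed before createProvider fails fast when it is missing. This is plain TypeScript with no library API involved:

if (!process.env.OPENAI_API_KEY) {
  throw new Error("OPENAI_API_KEY is not set; export it before running this example.");
}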

3. Run a Single Inference

Before compiling, let's run a single inference to see the reasoning trace:

const result = await qaProgram.run({
  context: "Python was created by Guido van Rossum and first released in 1991. It emphasizes code readability.",
  question: "Who created Python?",
});

console.log("Question: Who created Python?");
console.log("Reasoning:", result.trace.reasoning);
console.log("Answer:", result.result.answer);
console.log("Confidence:", `${(result.result.confidence * 100).toFixed(0)}%`);

Example output:

Question: Who created Python?
Reasoning: The context states that "Python was created by Guido van Rossum". 
           This directly answers the question about who created Python.
Answer: Guido van Rossum
Confidence: 95%
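
A run can also fail for mundane reasons: a network error, or model output that doesn't parse into the schema. How such failures surface is library-dependent, so treat this as a sketch that assumes run() rejects its promise on error:

try {
  const { result, trace } = await qaProgram.run({
    context: "Python was created by Guido van Rossum and first released in 1991.",
    question: "Who created Python?",
  });
  console.log(trace.reasoning, result.answer);
} catch (err) {
  // Assumption: run() rejects on provider or schema-validation errors.
  console.error("QA run failed:", err);
}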

4. Prepare Training Data

Create examples with varying contexts and questions:

const trainset = [
  {
    input: {
      context: "The Eiffel Tower was completed in 1889. It stands 330 meters tall and is located in Paris, France.",
      question: "When was the Eiffel Tower completed?",
    },
    output: { answer: "1889", confidence: 0.95 },
  },
  {
    input: {
      context: "Apple Inc. was founded by Steve Jobs, Steve Wozniak, and Ronald Wayne in 1976.",
      question: "Who founded Apple?",
    },
    output: {
      answer: "Steve Jobs, Steve Wozniak, and Ronald Wayne",
      confidence: 0.95,
    },
  },
  {
    input: {
      context: "The speed of light is approximately 299,792 kilometers per second in a vacuum.",
      question: "What is the speed of light?",
    },
    output: {
      answer: "approximately 299,792 kilometers per second",
      confidence: 0.9,
    },
  },
];
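
Inline literals keep the tutorial short, but a larger trainset usually lives in a file. A minimal loader, assuming a local qa-trainset.jsonl with one { input, output } object per line (the filename and format are this example's choice, not a library convention):

import { readFileSync } from "node:fs";

const fileTrainset = readFileSync("qa-trainset.jsonl", "utf8")
  .split("\n")
  .filter((line) => line.trim() !== "")
  .map((line) => JSON.parse(line) as (typeof trainset)[number]);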

5. Compile for Better Accuracy

Use partialMatch for QA: answers don't need to match exactly, and similar answers should still score well.

const compiler = new BootstrapFewShot(partialMatch());

const compiled = await compiler.compile(qaProgram, trainset, {
  candidates: 5,
  concurrency: 2,
});

console.log(`Compilation score: ${(compiled.meta.score * 100).toFixed(1)}%`);

Choosing an Evaluator
  • exactMatch: Strict equality (best for classification)
  • partialMatch: Substring matching (good for QA)
  • arrayOverlap: Jaccard similarity for arrays
  • llmJudge: Use another LLM to evaluate (most flexible)
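
Swapping evaluators only changes the argument passed to BootstrapFewShot. The sketch below assumes exactMatch and llmJudge are exported from @mzhub/promptc alongside partialMatch, and that llmJudge accepts a provider for the judging model; verify both against the API reference before relying on them:

import { exactMatch, llmJudge } from "@mzhub/promptc";

// Strict equality: suited to classification-style outputs
const strictCompiler = new BootstrapFewShot(exactMatch());

// LLM-as-judge: the provider argument here is an assumption
const judgedCompiler = new BootstrapFewShot(llmJudge(provider));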

Full Example

qa-system.ts
import {
  defineSchema,
  ChainOfThought,
  BootstrapFewShot,
  partialMatch,
  createProvider,
  z,
} from "@mzhub/promptc";

const QASchema = defineSchema({
  description: "Answer questions based on provided context. Think step by step.",
  inputs: {
    context: z.string().describe("The text containing information"),
    question: z.string().describe("The question to answer"),
  },
  outputs: {
    answer: z.string().describe("The answer extracted from context"),
    confidence: z.number().min(0).max(1).describe("Confidence in the answer"),
  },
});

const provider = createProvider("openai", { apiKey: process.env.OPENAI_API_KEY });
const qaProgram = new ChainOfThought(QASchema, provider);

const trainset = [
  {
    input: { context: "The Eiffel Tower was completed in 1889...", question: "When was the Eiffel Tower completed?" },
    output: { answer: "1889", confidence: 0.95 },
  },
  // ... more examples
];

async function main() {
  // Run single inference to see reasoning
  const result = await qaProgram.run({
    context: "Python was created by Guido van Rossum and first released in 1991.",
    question: "Who created Python?",
  });

  console.log("Reasoning:", result.trace.reasoning);
  console.log("Answer:", result.result.answer);

  // Compile for better accuracy
  const compiler = new BootstrapFewShot(partialMatch());
  const compiled = await compiler.compile(qaProgram, trainset, {
    candidates: 5,
    concurrency: 2,
  });

  console.log(`Score: ${(compiled.meta.score * 100).toFixed(1)}%`);
}

main().catch(console.error);
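
One way to run the file, assuming the tsx runner (any TypeScript runner such as ts-node works the same way):

OPENAI_API_KEY=... npx tsx qa-system.ts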