# QA System Example
Build a question-answering system that shows its reasoning before answering. Uses Chain of Thought for step-by-step thinking.
## What You'll Learn
- How to build context-aware QA with multiple inputs
- How Chain of Thought exposes the model's reasoning
- How to use `partialMatch` for flexible evaluation
- How to output confidence scores
## 1. Define the Schema
Our QA system takes a context paragraph and a question, then outputs an answer with a confidence score.
```ts
import {
defineSchema,
ChainOfThought,
BootstrapFewShot,
partialMatch,
createProvider,
z,
} from "@mzhub/promptc";
const QASchema = defineSchema({
description: "Answer questions based on provided context. Think step by step.",
inputs: {
context: z.string().describe("The text containing information"),
question: z.string().describe("The question to answer"),
},
outputs: {
answer: z.string().describe("The answer extracted from context"),
confidence: z.number().min(0).max(1).describe("Confidence in the answer"),
},
});
```

> **Field Descriptions:** Adding `.describe()` to your Zod fields helps the LLM understand what each field represents, leading to better outputs.
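If you want to check what a field's description resolves to, plain Zod exposes it as metadata. A minimal sketch using only Zod itself (no promptc APIs assumed):

```ts
import { z } from "zod";

// .describe() attaches human-readable metadata to a field.
const answer = z.string().describe("The answer extracted from context");

// Zod exposes that text via the .description property; schema-driven
// prompt builders can read it when rendering field instructions.
console.log(answer.description); // "The answer extracted from context"
```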
## 2. Create the Program

`ChainOfThought` adds a reasoning step, which is especially useful for QA tasks that require understanding the context.
```ts
const provider = createProvider("openai", {
apiKey: process.env.OPENAI_API_KEY,
});
const qaProgram = new ChainOfThought(QASchema, provider);
```

## 3. Run a Single Inference
Before compiling, let's run a single inference to see the reasoning trace:
```ts
const result = await qaProgram.run({
context: "Python was created by Guido van Rossum and first released in 1991. It emphasizes code readability.",
question: "Who created Python?",
});
console.log("Question: Who created Python?");
console.log("Reasoning:", result.trace.reasoning);
console.log("Answer:", result.result.answer);
console.log("Confidence:", `${(result.result.confidence * 100).toFixed(0)}%`);Example output:
Question: Who created Python?
Reasoning: The context states that "Python was created by Guido van Rossum".
This directly answers the question about who created Python.
Answer: Guido van Rossum
Confidence: 95%
```

## 4. Prepare Training Data
Create examples with varying contexts and questions:
```ts
const trainset = [
{
input: {
context: "The Eiffel Tower was completed in 1889. It stands 330 meters tall and is located in Paris, France.",
question: "When was the Eiffel Tower completed?",
},
output: { answer: "1889", confidence: 0.95 },
},
{
input: {
context: "Apple Inc. was founded by Steve Jobs, Steve Wozniak, and Ronald Wayne in 1976.",
question: "Who founded Apple?",
},
output: {
answer: "Steve Jobs, Steve Wozniak, and Ronald Wayne",
confidence: 0.95,
},
},
{
input: {
context: "The speed of light is approximately 299,792 kilometers per second in a vacuum.",
question: "What is the speed of light?",
},
output: {
answer: "approximately 299,792 kilometers per second",
confidence: 0.9,
},
},
];
```

## 5. Compile for Better Accuracy
Use `partialMatch` for QA, since answers don't need to match exactly: similar answers should still score well.
```ts
const compiler = new BootstrapFewShot(partialMatch());
const compiled = await compiler.compile(qaProgram, trainset, {
candidates: 5,
concurrency: 2,
});
console.log(`Compilation score: ${(compiled.meta.score * 100).toFixed(1)}%`);
```

> **Choosing an Evaluator**
>
> - `exactMatch`: strict equality (best for classification)
> - `partialMatch`: substring matching (good for QA)
> - `arrayOverlap`: Jaccard similarity for arrays
> - `llmJudge`: uses another LLM to evaluate (most flexible)
>
> For an intuition of how these metrics differ, see the sketch below.
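The actual scoring logic lives inside `@mzhub/promptc`, but rough standalone approximations of the string and array metrics might look like the following. The function names and scoring details here are illustrative, not the library's implementation:

```ts
// Illustrative approximations only; the real evaluators in
// @mzhub/promptc may normalize and score differently.

// exactMatch-style scoring: 1 when the strings match exactly, else 0.
function exactMatchScore(expected: string, actual: string): number {
  return expected.trim() === actual.trim() ? 1 : 0;
}

// partialMatch-style scoring: full credit when one normalized answer
// contains the other, so "Guido van Rossum" still scores against
// "Guido van Rossum created Python".
function partialMatchScore(expected: string, actual: string): number {
  const e = expected.trim().toLowerCase();
  const a = actual.trim().toLowerCase();
  return e === a || e.includes(a) || a.includes(e) ? 1 : 0;
}

// arrayOverlap-style scoring: Jaccard similarity, |intersection| / |union|.
function arrayOverlapScore(expected: string[], actual: string[]): number {
  const e = new Set(expected);
  const a = new Set(actual);
  const intersection = [...e].filter((x) => a.has(x)).length;
  const union = new Set([...e, ...a]).size;
  return union === 0 ? 1 : intersection / union;
}

console.log(exactMatchScore("1889", "1889")); // 1
console.log(partialMatchScore("Guido van Rossum", "Guido van Rossum created Python")); // 1
console.log(arrayOverlapScore(["a", "b"], ["b", "c"])); // 0.333...
```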
## Full Example

**qa-system.ts**
```ts
import {
defineSchema,
ChainOfThought,
BootstrapFewShot,
partialMatch,
createProvider,
z,
} from "@mzhub/promptc";
const QASchema = defineSchema({
description: "Answer questions based on provided context. Think step by step.",
inputs: {
context: z.string().describe("The text containing information"),
question: z.string().describe("The question to answer"),
},
outputs: {
answer: z.string().describe("The answer extracted from context"),
confidence: z.number().min(0).max(1).describe("Confidence in the answer"),
},
});
const provider = createProvider("openai", { apiKey: process.env.OPENAI_API_KEY });
const qaProgram = new ChainOfThought(QASchema, provider);
const trainset = [
{
input: { context: "The Eiffel Tower was completed in 1889...", question: "When was the Eiffel Tower completed?" },
output: { answer: "1889", confidence: 0.95 },
},
// ... more examples
];
async function main() {
// Run single inference to see reasoning
const result = await qaProgram.run({
context: "Python was created by Guido van Rossum and first released in 1991.",
question: "Who created Python?",
});
console.log("Reasoning:", result.trace.reasoning);
console.log("Answer:", result.result.answer);
// Compile for better accuracy
const compiler = new BootstrapFewShot(partialMatch());
const compiled = await compiler.compile(qaProgram, trainset, {
candidates: 5,
concurrency: 2,
});
console.log(`Score: ${(compiled.meta.score * 100).toFixed(1)}%`);
}
main().catch(console.error);
```

Next: Multi-Provider Example →