Lesson 5 · 5 min
Testing Tools in Isolation
Run tools end-to-end without invoking Claude.
Tools are ordinary functions, so unit-test them with realistic inputs and assert on their outputs. Then test the *schema* with a small Claude call that only has to produce a tool argument, and assert that the argument validates. Separating the concerns means each failure points at the right layer.
Production scenario
Real-world example: A send_email tool in CI
The team's email tool gets two layers of tests:
// 1. Pure-function unit test against a mocked SES client.
test("send_email assembles MIME and calls SES once", async () => {
  const ses = mockSes();
  await sendEmail({ to: "x@example.com", subject: "hi", body: "hello" }, ses);
  expect(ses.send).toHaveBeenCalledTimes(1);
});

// 2. Schema probe: a tiny Claude call producing a tool argument.
test("Claude produces a valid send_email argument", async () => {
  const arg = await claudeProbe("Email Bob saying we'll be late.");
  expect(SendEmailArgs.safeParse(arg).success).toBe(true);
});

Layer 1 tests the function. Layer 2 tests the schema's communicability. They fail for different reasons and point you at different fixes.
Why this matters: tool tests catch implementation bugs. Schema tests catch documentation bugs, such as a field name or description that Claude misreads. You want both.
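The lesson does not show how the schema probe is built, so the following is only a minimal sketch of one possible shape. Every name in it (`MessagesClient`, `sendEmailTool`, `validateSendEmailArgs`, `probeToolArgument`) is hypothetical, and the hand-written validator stands in for whatever schema library the team actually uses. The request fields follow the Messages API tool-use format, with `tool_choice` forcing a call to the one tool under test. The client is passed in as a parameter so the probe logic itself can be exercised with a stub.

```typescript
// Simplified, assumed view of a Messages API response: only the block types
// this probe cares about. The real response type has more variants.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown };

// Hypothetical narrow interface for the client, so a stub can replace it.
interface MessagesClient {
  create(request: {
    model: string;
    max_tokens: number;
    tools: { name: string; description: string; input_schema: object }[];
    tool_choice: { type: "tool"; name: string };
    messages: { role: "user"; content: string }[];
  }): Promise<{ content: ContentBlock[] }>;
}

// The schema under test: this is the "documentation" Claude reads.
const sendEmailTool = {
  name: "send_email",
  description: "Send a plain-text email to a single recipient.",
  input_schema: {
    type: "object",
    properties: {
      to: { type: "string", description: "Recipient email address" },
      subject: { type: "string", description: "Subject line" },
      body: { type: "string", description: "Plain-text message body" },
    },
    required: ["to", "subject", "body"],
  },
};

interface SendEmailInput {
  to: string;
  subject: string;
  body: string;
}

type ValidationResult =
  | { success: true; data: SendEmailInput }
  | { success: false; error: string };

// Hand-written stand-in for a schema library's safeParse-style check.
function validateSendEmailArgs(value: unknown): ValidationResult {
  if (typeof value !== "object" || value === null) {
    return { success: false, error: "argument is not an object" };
  }
  const v = value as Record<string, unknown>;
  const { to, subject, body } = v;
  if (typeof to !== "string") return { success: false, error: "to must be a string" };
  if (typeof subject !== "string") return { success: false, error: "subject must be a string" };
  if (typeof body !== "string") return { success: false, error: "body must be a string" };
  return { success: true, data: { to, subject, body } };
}

// Ask the model for exactly one tool call and return its raw argument.
async function probeToolArgument(
  client: MessagesClient,
  model: string,
  prompt: string,
): Promise<unknown> {
  const response = await client.create({
    model,
    max_tokens: 256,
    tools: [sendEmailTool],
    tool_choice: { type: "tool", name: sendEmailTool.name },
    messages: [{ role: "user", content: prompt }],
  });
  const block = response.content.find((b) => b.type === "tool_use");
  if (!block || block.type !== "tool_use") {
    throw new Error("response contained no tool_use block");
  }
  return block.input;
}
```

In a real probe, the `messages` object of an Anthropic SDK client would presumably fill the `client` role, possibly through a thin adapter, since the interface above is deliberately narrower than the SDK's own types. Keeping the validator separate from the probe means a failing test shows whether the model produced no tool call at all or produced one with a malformed argument.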
Knowledge points in this lesson
- Unit-test tool functions in isolation
- Schema correctness is a separate check
- Probe schema with a small Claude call
- Separate layers for clearer failure diagnosis
- Mock external services in tool tests
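The first and last points above can be illustrated without a mocking library. This is a rough sketch under stated assumptions: `MailTransport`, `buildMessage`, and `createFakeTransport` are hypothetical names, the message format is heavily simplified, and the `sendEmail` here is an illustrative stand-in, not the team's actual tool. The idea is that the tool depends on a small interface, and the test supplies a hand-rolled fake that records what it was asked to send.

```typescript
interface EmailArgs {
  to: string;
  subject: string;
  body: string;
}

// Hypothetical port the tool depends on. A production adapter would wrap
// the real email service; tests substitute a fake.
interface MailTransport {
  send(rawMessage: string): Promise<void>;
}

// Assemble a simplified plain-text MIME message (headers, blank line, body).
function buildMessage(args: EmailArgs): string {
  return [
    `To: ${args.to}`,
    `Subject: ${args.subject}`,
    "MIME-Version: 1.0",
    'Content-Type: text/plain; charset="utf-8"',
    "",
    args.body,
  ].join("\r\n");
}

// The tool itself: an ordinary function with its dependency injected.
async function sendEmail(args: EmailArgs, transport: MailTransport): Promise<void> {
  await transport.send(buildMessage(args));
}

// Hand-rolled fake that records every message instead of sending it.
function createFakeTransport(): { transport: MailTransport; sent: string[] } {
  const sent: string[] = [];
  const transport: MailTransport = {
    async send(rawMessage: string): Promise<void> {
      sent.push(rawMessage);
    },
  };
  return { transport, sent };
}
```

A test can then call `sendEmail` with the fake transport and assert on the `sent` array, both that exactly one message was recorded and that its headers and body match the input. No network access or Claude call is involved, so a failure here can only come from the tool's own code.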
Quick check
Tool Design & MCP · Select one
Which is the BEST way to test a tool independently of an LLM?
