Add ToolArgValidationMiddleware — a new agent middleware that validates LLM-generated tool-call arguments #36700

@Serjbory

Checked other resources

  • This is a feature request, not a bug report or usage question.
  • I added a clear and descriptive title that summarizes the feature request.
  • I used the GitHub search to find a similar feature request and didn't find it.
  • I checked the LangChain documentation and API reference to see if this feature already exists.
  • This is not related to the langchain-community package.

Package (Required)

  • langchain
  • langchain-openai
  • langchain-anthropic
  • langchain-classic
  • langchain-core
  • langchain-model-profiles
  • langchain-tests
  • langchain-text-splitters
  • langchain-chroma
  • langchain-deepseek
  • langchain-exa
  • langchain-fireworks
  • langchain-groq
  • langchain-huggingface
  • langchain-mistralai
  • langchain-nomic
  • langchain-ollama
  • langchain-openrouter
  • langchain-perplexity
  • langchain-qdrant
  • langchain-xai
  • Other / not sure / general

Feature Description

Add a ToolArgValidationMiddleware to the LangChain agent middleware system that validates LLM-generated tool-call arguments against each tool's schema before tool execution.

The middleware intercepts model responses via wrap_model_call / awrap_model_call and checks every tool call's arguments against the tool's declared schema. It supports two validation paths:

  • Pydantic-based tools (created with @tool or having a BaseModel args_schema): validated using BaseModel.model_validate
  • MCP / dict-schema tools (where args_schema is a raw JSON Schema dict): validated using jsonschema (soft dependency, Draft7Validator by default, configurable)
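The two validation paths could be dispatched roughly as follows. This is a minimal sketch, not the proposed implementation: `SearchArgs`, `SEARCH_JSON_SCHEMA`, `_path`, and `validate_args` are hypothetical names made up for illustration, and it assumes Pydantic v2 and the `jsonschema` package are installed.

```python
from typing import Any

from pydantic import BaseModel, ValidationError


class SearchArgs(BaseModel):
    """Hypothetical Pydantic args schema for an illustrative `search` tool."""

    query: str
    limit: int = 10


# Hypothetical raw JSON Schema, as an MCP tool might declare it.
SEARCH_JSON_SCHEMA = {
    "type": "object",
    "properties": {"query": {"type": "string"}, "limit": {"type": "integer"}},
    "required": ["query"],
}


def _path(parts) -> str:
    """Render an error location such as ('filters', 0) as 'filters.0'."""
    return ".".join(str(p) for p in parts) or "<root>"


def validate_args(schema: Any, args: dict) -> list:
    """Return human-readable error strings; an empty list means valid."""
    if isinstance(schema, type) and issubclass(schema, BaseModel):
        try:
            schema.model_validate(args)
        except ValidationError as exc:
            return [f"{_path(e['loc'])}: {e['msg']}" for e in exc.errors()]
        return []
    if isinstance(schema, dict):
        # Soft dependency: imported only once a dict schema shows up.
        from jsonschema import Draft7Validator

        errors = Draft7Validator(schema).iter_errors(args)
        return [f"{_path(e.absolute_path)}: {e.message}" for e in errors]
    return []  # Unrecognized schema type: nothing to validate against.
```

Calling `validate_args(SearchArgs, {"limit": "many"})` or `validate_args(SEARCH_JSON_SCHEMA, {"limit": "many"})` reports both the missing `query` and the non-integer `limit`, which is the kind of text the error ToolMessages would carry back to the model.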

When validation fails, the middleware appends error ToolMessages describing what went wrong and re-invokes the model so it can self-correct. This retry loop runs entirely inside the model node — only the final valid AIMessage ever enters the graph state.

Configurable parameters:

  • max_retries (default 2): number of validation-retry cycles per model invocation
  • strip_empty_values (default True): recursively remove keys whose values are None, {}, or [] before validation (a common LLM hallucination pattern)
  • json_schema_validator_class (default None, which selects Draft7Validator): overrides the JSON Schema validator class used for dict-based schemas
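To make the strip_empty_values behavior concrete, here is a minimal sketch of a recursive strip. The function names are hypothetical. It assumes that only dict keys are dropped (list elements are kept but cleaned recursively) and that a nested dict left empty after stripping is removed as well; the actual implementation may differ on both points.

```python
from typing import Any


def _is_empty(value: Any) -> bool:
    """True only for None, {} and []; 0, False and "" are real values."""
    return value is None or value == {} or value == []


def strip_empty_values(value: Any) -> Any:
    """Recursively drop dict keys whose (cleaned) value is None, {} or []."""
    if isinstance(value, dict):
        cleaned = {}
        for key, item in value.items():
            item = strip_empty_values(item)  # clean children first
            if not _is_empty(item):
                cleaned[key] = item
        return cleaned
    if isinstance(value, list):
        # Assumption: list elements are preserved, only cleaned in place.
        return [strip_empty_values(item) for item in value]
    return value


args = {"query": "cats", "filters": {"tags": [], "lang": None}, "limit": 0}
print(strip_empty_values(args))  # {'query': 'cats', 'limit': 0}
```

Here `filters` disappears because both of its keys are empty, while `limit` survives because 0 is a legitimate value.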

Use Case

LLMs frequently generate tool calls with malformed arguments — missing required fields, wrong types, hallucinated empty values, or extra keys. Without validation, these bad arguments reach the tool node and cause runtime errors or silent data corruption that is hard to debug.

This is especially problematic with:

  • MCP tools that have complex JSON Schemas the LLM hasn't been fine-tuned on
  • Agentic loops where a single bad tool call can derail a multi-step task
  • Human-in-the-loop workflows where invalid arguments would be presented to users for approval before anyone notices they're malformed

Currently, users must implement their own validation logic or accept tool-level failures and hope the agent recovers. A built-in middleware provides a standard, reusable solution that validates and retries at the model boundary, before tool execution or HITL review happens. This ordering matters: arguments that are obviously incorrect, and that could have been detected and fixed by validating against the tool schema, should never be presented to a user for approval.

Proposed Solution

Implement ToolArgValidationMiddleware as a new middleware in langchain.agents.middleware, following the same patterns as existing middlewares (ModelRetryMiddleware, ToolRetryMiddleware, ToolCallLimitMiddleware):

  1. Subclass AgentMiddleware with wrap_model_call and awrap_model_call
  2. Lazily resolve tool schemas from request.tools on first call (cached thereafter)
  3. After each model response, validate every tool_call in the AIMessage against its schema
  4. On failure, build ToolMessages with human-readable error descriptions, append them to the request, and re-invoke the model
  5. After max_retries is exhausted, pass through the last response as-is (fail open)
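Steps 3 to 5 can be sketched without any LangChain types. Everything in this block is a hypothetical stand-in: `ToolCall` and `Response` mimic an AIMessage with tool calls, plain dicts mimic ToolMessages, and `model` is any callable. In the real middleware this logic would live inside wrap_model_call and operate on the request and handler it receives.

```python
from dataclasses import dataclass, field


@dataclass
class ToolCall:
    id: str
    name: str
    args: dict


@dataclass
class Response:
    tool_calls: list = field(default_factory=list)


def call_with_validation(model, messages, validate, max_retries=2):
    """Invoke `model`, re-invoking it while tool-call args fail `validate`.

    `validate(tool_call)` returns a list of error strings (empty if valid).
    """
    messages = list(messages)  # work on a copy; caller state is untouched
    response = model(messages)
    for _ in range(max_retries):
        errors = {}
        for call in response.tool_calls:
            problems = validate(call)
            if problems:
                errors[call.id] = problems
        if not errors:
            return response
        # The invalid response and its error messages stay local to the loop.
        messages.append(response)
        for call_id, problems in errors.items():
            messages.append({
                "role": "tool",
                "tool_call_id": call_id,
                "content": "Invalid arguments: " + "; ".join(problems),
            })
        response = model(messages)
    return response  # retries exhausted: fail open with the last response
```

With max_retries=2 the model is invoked at most three times. The response from the final attempt is returned without another check, which is the fail-open behavior from step 5.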

Key design decisions:

  • jsonschema is a soft dependency — only imported when dict-schema tools are present, so it doesn't affect users who only use Pydantic tools
  • Schema resolution is lazy and cached — no overhead on subsequent calls
  • The middleware strips empty values before validation by default, since LLMs commonly hallucinate None/{}/[] for optional fields
  • Unknown tools (not in the schema map) pass through without validation
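The lazy, cached schema resolution and the pass-through for unknown tools might look like the sketch below. `FakeTool` and `SchemaResolver` are hypothetical names for illustration; the real middleware would read the tools from request.tools.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class FakeTool:
    """Hypothetical stand-in for a tool exposing `name` and `args_schema`."""

    name: str
    args_schema: Any


class SchemaResolver:
    """Builds the tool-name -> schema map on first use, then reuses it."""

    def __init__(self) -> None:
        self._schemas: Optional[dict] = None
        self.resolve_count = 0  # only here to make the caching observable

    def get(self, tools: list, tool_name: str) -> Any:
        if self._schemas is None:  # first call: resolve once
            self._schemas = {t.name: t.args_schema for t in tools}
            self.resolve_count += 1
        # None signals "unknown tool": the caller skips validation.
        return self._schemas.get(tool_name)


tools = [FakeTool("search", {"type": "object"})]
resolver = SchemaResolver()
print(resolver.get(tools, "search"))   # {'type': 'object'}
print(resolver.get(tools, "missing"))  # None
```

One consequence of caching on the first call is that the sketch assumes the tool set does not change between model invocations; a middleware that allows dynamic tools would need to invalidate the cache.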

Alternatives Considered

  • Tool-level validation: Let each tool validate its own input and raise errors. This works but the error surfaces after the tool node, requiring a full agent loop iteration to retry. The middleware catches errors earlier and retries within the model node.
  • Pydantic-only validation: Skip jsonschema support and only validate BaseModel schemas. This would miss MCP tools entirely, which are increasingly common.
  • Strict mode / structured output: Some providers support constrained decoding. This doesn't cover all providers or all schema types and isn't available for MCP tools with dict schemas.
  • Post-processing in a custom node: Users could add a validation node after the model node. This requires graph modification and doesn't benefit from the middleware's retry-within-model-call pattern.

Additional Context

  • Follows existing middleware conventions in langchain.agents.middleware — same base class, same wrap_model_call/awrap_model_call pattern, keyword-only __init__ parameters
  • 57 unit tests covering: initialization, strip logic, Pydantic validation, JSON Schema validation, error formatting, sync/async retry loops, batch mixed valid/invalid tool calls, schema resolution caching, and end-to-end tests with create_agent + FakeToolCallingModel
  • All checks pass: make format, make lint (ruff + mypy strict), make test
  • Related prior art: ModelRetryMiddleware handles model-level retries on exceptions; this middleware handles argument-level validation retries on schema violations

Metadata

Assignees

No one assigned

    Labels

    external, feature request, langchain
