
feat(examples): add simple-sampling server and client #2476

Open

trentisiete wants to merge 1 commit into modelcontextprotocol:main from trentisiete:feat/simple-sampling-example

Conversation

@trentisiete

Adds an end-to-end sampling example to examples/, addressing #1205.

Two new workspace members:

  • examples/servers/simple-sampling/ — minimal server with one tool (write_story) that calls session.create_message(...) populating every advisory field of CreateMessageRequestParams: model preferences with hints + cost/speed/intelligence priorities, system prompt, temperature, stop sequences, include_context, metadata.
  • examples/clients/simple-sampling-client/ — wires sampling_callback to a real LLM via any OpenAI-compatible endpoint. Default is Groq (free tier). Switching to OpenAI / OpenRouter / Ollama (/v1) / vLLM is 1-3 env vars; no extra dependencies.
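To make "every advisory field" concrete, here is an illustrative `sampling/createMessage` params object with each advisory field set. Field names follow the MCP sampling spec's wire format; the values (model name, prompt, metadata) echo the example but are otherwise made up.

```python
# Illustrative sampling/createMessage params with every advisory field
# populated. Field names follow the MCP sampling spec; concrete values
# are placeholders in the spirit of the write_story example.
params = {
    "messages": [
        {
            "role": "user",
            "content": {"type": "text", "text": "Write a story about a lighthouse keeper."},
        }
    ],
    "modelPreferences": {
        "hints": [{"name": "llama-3.1-8b"}],
        "costPriority": 0.8,
        "speedPriority": 0.5,
        "intelligencePriority": 0.3,
    },
    "systemPrompt": "You are a concise storyteller.",
    "temperature": 0.8,
    "maxTokens": 200,
    "stopSequences": ["THE END"],
    "includeContext": "thisServer",
    "metadata": {"example": "simple-sampling"},
}
```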

Context

There's already #1436 from @yarnabrina addressing this issue. Their PR has been waiting on review for months and they've explicitly given the green light for an independent PR (see the recent comment thread on #1205). This one tries to land the same goal while incorporating the feedback their PR received and never got a chance to resolve:

  • No dependency on the openai SDK; uses httpx against the OpenAI-compatible /chat/completions schema.
  • LLMClient abstraction matching the simple-chatbot pattern that was requested in their review — the sampling callback knows nothing about httpx, and the LLM wrapper knows nothing about MCP.
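The decoupling described above can be sketched as a pure mapping step: the sampling side only builds a provider-agnostic chat payload, and the HTTP wrapper only ships it. Names below are illustrative, not the PR's actual identifiers.

```python
# Sketch of the callback/LLM-wrapper split: this function knows nothing
# about httpx or MCP types, it only maps sampling inputs onto the
# OpenAI-compatible /chat/completions schema. Identifiers are illustrative.
def build_chat_payload(messages, system_prompt, model, max_tokens, temperature, stop):
    chat = []
    if system_prompt:
        chat.append({"role": "system", "content": system_prompt})
    for m in messages:
        content = m["content"]
        if content.get("type") == "text":
            chat.append({"role": m["role"], "content": content["text"]})
        else:
            # Non-text blocks become explicit placeholders (see Design notes).
            chat.append({"role": m["role"], "content": f"[{content.get('type')} content omitted]"})
    return {
        "model": model,
        "messages": chat,
        "max_tokens": max_tokens,
        "temperature": temperature,
        "stop": stop,
    }
```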

Design notes

  • Model selection: first usable ModelHint.name is treated as a soft override, falling back to LLM_MODEL. The numeric priorities (cost/speed/intelligence) get logged but not used for routing — picking a model from those would require a provider-specific catalog, which is out of scope for an example.
  • LLM failures return types.ErrorData(code=INTERNAL_ERROR, ...) rather than raising, so the server gets a readable error instead of a transport-level one.
  • Non-text content blocks (image/audio) become [<type> content omitted] placeholders rather than being silently dropped. A production client would forward or explicitly reject them.
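The model-selection rule from the first design note reduces to a few lines. A minimal sketch, with illustrative names rather than the PR's identifiers:

```python
# First hint with a usable name wins as a soft override; otherwise fall
# back to the configured default (LLM_MODEL in the real client).
def pick_model(hints, default_model):
    for hint in hints or []:
        name = hint.get("name")
        if name:
            return name
    return default_model
```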

Verification

An end-to-end check runs against a local ThreadingHTTPServer stub of /chat/completions that captures the request body and returns a canned story. It asserts that:

  • The selected model is the first hint (llama-3.1-8b).
  • max_tokens=200, temperature=0.8, stop=["THE END"], the system prompt, the user message, and the metadata round-trip correctly.
  • The canned response surfaces in the client stdout.

The verification harness itself is not part of the PR. Repo checks pass:

  • uv run ruff format --check: clean
  • uv run ruff check: 0 issues
  • uv run pyright: 0 errors, 0 warnings, 0 informations

Try it

export LLM_API_KEY=<your_key>
uv sync --all-packages
uv run mcp-simple-sampling-client --topic "a lighthouse keeper"

Full env var table in examples/clients/simple-sampling-client/README.md.
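As an example of the 1-3 env var provider switch, pointing the client at a local Ollama instance might look like this (exact defaults live in the README; the key value is a placeholder since Ollama ignores it):

```shell
# Illustrative provider switch using the env vars named above.
export LLM_API_BASE_URL=http://localhost:11434/v1   # Ollama's OpenAI-compatible endpoint
export LLM_MODEL=llama3.1
export LLM_API_KEY=ollama                           # placeholder; Ollama does not check it
uv run mcp-simple-sampling-client --topic "a lighthouse keeper"
```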

Addresses modelcontextprotocol#1205 by adding two workspace members that demonstrate an
end-to-end sampling handshake with a real LLM.

examples/servers/simple-sampling exposes a write_story tool whose
handler issues sampling/createMessage with every advisory field set
(modelPreferences hints + priorities, systemPrompt, temperature,
maxTokens, stopSequences, includeContext, metadata).

examples/clients/simple-sampling-client wires a sampling_callback onto
ClientSession. It maps SamplingMessage into an OpenAI-style chat
payload, treats the first ModelHint as a soft override, logs the
numeric priorities and the includeContext hook so multi-server clients
can see where to inject context, and surfaces provider failures as
ErrorData rather than raising.

The client speaks the OpenAI-compatible /chat/completions schema via
httpx, so it runs against OpenAI, Groq, OpenRouter, Ollama, vLLM, or
any other gateway that honours the contract. Provider is picked via
LLM_API_KEY / LLM_API_BASE_URL / LLM_MODEL env vars to avoid pinning a
provider-specific SDK.
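The "ErrorData rather than raising" behaviour can be sketched without the SDK. `SamplingError` below is a hypothetical stand-in for `mcp.types.ErrorData` with `code=INTERNAL_ERROR`, and `call_provider` stands in for the real LLM call:

```python
# Sketch of surfacing provider failures as error data instead of raising.
# SamplingError is a hypothetical stand-in for mcp.types.ErrorData.
from dataclasses import dataclass

INTERNAL_ERROR = -32603  # JSON-RPC internal error code

@dataclass
class SamplingError:
    code: int
    message: str

def handle_sampling(call_provider):
    try:
        text = call_provider()
    except Exception as exc:
        # Hand the server a readable error instead of letting the
        # exception escape as a transport-level failure.
        return SamplingError(code=INTERNAL_ERROR, message=f"LLM call failed: {exc}")
    return {"role": "assistant", "content": {"type": "text", "text": text}}
```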