Model Context Protocol (MCP)

LLMRing provides comprehensive MCP (Model Context Protocol) support for standardized tool orchestration and resource management. MCP enables LLMs to interact with external tools, resources, and data sources through a unified protocol.

Key Features

  • MCP Chat Client: Interactive terminal application with persistent conversation history
  • Enhanced LLM: Automatic tool discovery and execution integrated into chat flows
  • Lockfile Management: Conversational configuration via MCP for intelligent alias setup
  • Multiple Transports: HTTP, WebSocket, and stdio connections to MCP servers
  • Streaming Support: Tool calls work seamlessly with streaming responses
  • Custom Servers: Build your own MCP servers to expose tools to LLMs
  • Persistence: Optional server-backed storage (requires llmring-server)

Choosing Your MCP Interface

LLMRing provides two ways to work with MCP:

Feature            | MCP Chat Client                                                            | Enhanced LLM
Best For           | Interactive terminal sessions, configuration                              | Programmatic integration, applications
Interface          | Command-line chat application                                              | Python API
History            | Automatic persistent history in ~/.llmring/mcp_chat/                       | Custom management needed
Session Management | Built-in session saving/loading                                            | Manual implementation
Tool Discovery     | Automatic with /tools command                                              | Automatic via API
Streaming          | Real-time terminal output                                                  | AsyncIterator for custom handling
Use Cases          | Lockfile configuration, interactive exploration, CLI tools, quick testing | Production applications, automated workflows, custom integrations, batch processing

Quick Decision Guide:

  • Use Chat Client for interactive configuration, exploration, or testing
  • Use Enhanced LLM for MCP capabilities in your Python applications

Quick Start

MCP Chat Client (Interactive)

# Conversational lockfile configuration (built-in)
llmring lock chat

# Connect to custom MCP server
llmring mcp chat --server "stdio://python -m your_mcp_server"

# HTTP server
llmring mcp chat --server "http://localhost:8080"

# WebSocket server
llmring mcp chat --server "ws://localhost:8080"

Command Line Options:

llmring mcp chat [OPTIONS]

Options:
  --server TEXT       MCP server URL (stdio://, http://, ws://)
  --model TEXT        LLM model alias to use (default: advisor)
  --no-telemetry      Disable telemetry
  --debug             Enable debug logging

Enhanced LLM (Programmatic)

from llmring.mcp.client.enhanced_llm import create_enhanced_llm

# Create enhanced LLM with MCP tools
llm = await create_enhanced_llm(
    model="balanced",
    mcp_server_path="stdio://python -m my_mcp_server"
)

# Chat with automatic tool execution
messages = [{"role": "user", "content": "Help me with my files"}]
response = await llm.chat(messages)

print(response.content)
if response.tool_calls:
    print(f"Used tools: {[call['function']['name'] for call in response.tool_calls]}")

Built-in Chat Commands

Command            | Description
/help              | Display all available commands
/history           | Show current conversation history
/sessions          | List all saved chat sessions
/load <session_id> | Load and resume a previous session
/clear             | Clear the current conversation
/model <alias>     | Switch to a different model
/tools             | List available MCP tools from the server
/exit or /quit     | Exit the chat client

Conversational Lockfile Configuration

The most powerful feature of MCP in LLMRing is conversational lockfile management:

llmring lock chat

This starts an interactive session where you can:

  • Describe requirements in natural language
  • Get recommendations based on current registry
  • Understand cost implications and tradeoffs
  • Configure aliases with fallback models
  • Set up environment-specific profiles

Example conversation:

You: I need a configuration for a coding assistant that prioritizes accuracy
Assistant: I'll help you configure an accurate coding assistant. Based on the registry,
I recommend using Claude 3.5 Sonnet as the primary model with GPT-4o as fallback.

[Calling tool: add_alias]
Added alias 'coder' with models: anthropic:claude-3-5-sonnet, openai:gpt-4o

This configuration prioritizes accuracy while providing fallback for availability.
Monthly cost estimate: ~$50-100 for moderate usage.

You: Add a cheaper option for simple tasks
Assistant: I'll add a cost-effective alias for simpler coding tasks.

[Calling tool: add_alias]
Added alias 'coder-fast' with model: openai:gpt-4o-mini

This model is 10x cheaper and perfect for simple completions, syntax fixes, and
basic code generation.

Persistent History

All conversations are automatically saved in ~/.llmring/mcp_chat/:

~/.llmring/mcp_chat/
├── command_history.txt              # Terminal command history
├── conversation_<session_id>.json   # Individual conversations
└── sessions.json                    # Session metadata and index

Each session includes:

  • Unique session ID and timestamp
  • Complete message history
  • Tool calls and their results
  • Model used for each response
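
Because everything is stored as plain JSON, saved sessions can be inspected or post-processed with standard tooling. The sketch below is illustrative only: it assumes the directory layout shown above, and the exact JSON schema of the files may differ between versions.

import json
from pathlib import Path

# Assumed layout: ~/.llmring/mcp_chat/ as shown above.
chat_dir = Path.home() / ".llmring" / "mcp_chat"

# Session metadata and index (structure is an assumption).
sessions_index = json.loads((chat_dir / "sessions.json").read_text())
print(f"Found {len(sessions_index)} saved sessions")

# Walk individual conversation files and report message counts.
for conv_file in sorted(chat_dir.glob("conversation_*.json")):
    conversation = json.loads(conv_file.read_text())
    messages = conversation.get("messages", []) if isinstance(conversation, dict) else conversation
    print(f"{conv_file.name}: {len(messages)} messages")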

Connecting to MCP Servers

Stdio Servers (Local Processes)

Most common for development and local tools:

# Python MCP server
llmring mcp chat --server "stdio://python -m mypackage.mcp_server"

# Node.js MCP server
llmring mcp chat --server "stdio://node my-mcp-server.js"

# Any executable
llmring mcp chat --server "stdio:///usr/local/bin/my-mcp-tool"

HTTP Servers

For MCP servers exposed over HTTP:

# Local development
llmring mcp chat --server "http://localhost:8080"

# Remote server
llmring mcp chat --server "https://api.example.com/mcp"

WebSocket Servers

For real-time, bidirectional communication:

# WebSocket connection
llmring mcp chat --server "ws://localhost:8080"

# Secure WebSocket
llmring mcp chat --server "wss://mcp.example.com"

Creating Custom MCP Servers

Build your own MCP servers to expose tools to LLMs:

Simple Python Example

#!/usr/bin/env python3
"""my_mcp_server.py - Custom MCP server example"""

import asyncio
from llmring.mcp.server import MCPServer
from llmring.mcp.server.transport.stdio import StdioTransport

# Create server
server = MCPServer(
    name="My Custom Tools",
    version="1.0.0"
)

# Register tools
@server.function_registry.register(
    name="get_weather",
    description="Get weather for a location"
)
def get_weather(location: str) -> dict:
    return {
        "location": location,
        "temperature": 72,
        "conditions": "sunny"
    }

@server.function_registry.register(
    name="calculate",
    description="Perform calculations"
)
def calculate(expression: str) -> dict:
    try:
        # WARNING: eval executes arbitrary code; restrict or sandbox
        # expression evaluation before exposing this to untrusted input.
        result = eval(expression)
        return {"result": result}
    except Exception as e:
        return {"error": str(e)}

# Run server
async def main():
    transport = StdioTransport()
    await server.run(transport)

if __name__ == "__main__":
    asyncio.run(main())

Connect to your server:

llmring mcp chat --server "stdio://python my_mcp_server.py"

Enhanced LLM with MCP

For programmatic usage with automatic tool execution:

Basic Usage

from llmring.mcp.client.enhanced_llm import create_enhanced_llm

# Create enhanced LLM
llm = await create_enhanced_llm(
    model="balanced",
    mcp_server_path="stdio://python -m my_mcp_server"
)

# Chat with automatic tool execution
messages = [{"role": "user", "content": "What's the weather in NYC?"}]
response = await llm.chat(messages)

# Tools are called automatically and results integrated
print(response.content)  # "The weather in NYC is 72°F and sunny."

Streaming with Tools

# Streaming works seamlessly with tool calls
messages = [{"role": "user", "content": "Analyze this file and summarize it"}]

async for chunk in await llm.chat_stream(messages):
    if chunk.type == "content":
        print(chunk.content, end="", flush=True)
    elif chunk.type == "tool_call":
        print(f"\n[Calling tool: {chunk.tool_call.name}]")
    elif chunk.type == "tool_result":
        print(f"\n[Tool result received]")

Direct MCP Client Usage

from llmring.mcp.client.mcp_client import MCPClient

# Connect to MCP server
client = MCPClient("http://localhost:8000")
await client.initialize()

# List available tools
tools = await client.list_tools()
for tool in tools:
    print(f"- {tool['name']}: {tool['description']}")

# Execute a tool directly
result = await client.call_tool(
    "read_file",
    {"path": "/path/to/file.txt"}
)
print(result)

# Clean up
await client.close()

Best Practices

Server Design Philosophy

When creating MCP servers:

  1. Data-Focused Tools: Design tools to provide data and perform actions, not make decisions
  2. LLM in Driver’s Seat: Let the LLM decide how to use tools based on user intent
  3. Clear Tool Names: Use descriptive, action-oriented names
  4. Comprehensive Schemas: Provide detailed parameter descriptions (see the sketch after this list)
  5. Error Handling: Return informative error messages
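
To make points 3-5 concrete, here is a rough sketch of a tool registered with a descriptive name, a detailed description of its parameters, and informative error returns. It reuses the function_registry.register API from the example above; the tool itself (search_documents) is hypothetical.

from llmring.mcp.server import MCPServer

server = MCPServer(name="Document Tools", version="1.0.0")

@server.function_registry.register(
    name="search_documents",  # clear, action-oriented name
    description=(
        "Search indexed documents for a query string. "
        "Parameters: query (str, required) - the text to search for; "
        "limit (int, optional, default 10) - maximum number of results."
    ),
)
def search_documents(query: str, limit: int = 10) -> dict:
    """Return matching document titles, or an informative error."""
    if not query or not query.strip():
        return {"error": "query must be a non-empty string"}
    if not 1 <= limit <= 100:
        return {"error": "limit must be between 1 and 100"}
    # Hypothetical result payload, for illustration only.
    return {"query": query, "limit": limit, "results": []}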

Security Considerations

  1. Validate Input: Always validate tool parameters (see the sketch after this list)
  2. Limit Scope: Tools should have minimal necessary permissions
  3. Secure Transport: Use HTTPS/WSS in production
  4. Authentication: Implement proper auth for production servers
  5. Audit Logging: Log tool usage for security monitoring
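
As an illustration of points 1 and 2, a file-reading tool can validate its path parameter and constrain access to a single directory tree before touching the filesystem. This is a minimal sketch (Python 3.9+ for Path.is_relative_to); the allowed root is an assumption to adapt to your deployment.

from pathlib import Path

# Limit scope: the tool can only read inside this directory tree.
ALLOWED_ROOT = Path("/srv/shared-data").resolve()

def read_file(path: str) -> dict:
    """Read a text file, rejecting paths that escape the allowed root."""
    try:
        resolved = (ALLOWED_ROOT / path).resolve()
        # Validate input: block path traversal (e.g. "../../etc/passwd").
        if not resolved.is_relative_to(ALLOWED_ROOT):
            return {"error": "path is outside the allowed directory"}
        if not resolved.is_file():
            return {"error": f"no such file: {path}"}
        return {"path": str(resolved), "content": resolved.read_text()}
    except OSError as e:
        return {"error": str(e)}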

Performance Tips

  1. Choose Appropriate Models: Match model capability to task complexity
  2. Cache Results: Implement caching in MCP servers for expensive operations (see the sketch after this list)
  3. Streaming Responses: Use streaming for long-running operations
  4. Batch Operations: Design tools to handle batch requests when possible
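
For example (tip 2), an expensive lookup inside a tool can be memoized so repeated tool calls with the same arguments are served from memory. A simple in-process cache is sketched below; the underlying lookup is a stand-in.

import time
from functools import lru_cache

@lru_cache(maxsize=256)
def _fetch_exchange_rate(currency: str) -> float:
    # Stand-in for an expensive operation (network call, database query, ...).
    time.sleep(1.0)
    return 1.0

def get_exchange_rate(currency: str) -> dict:
    """Tool body: repeated calls with the same currency hit the cache."""
    return {"currency": currency, "rate": _fetch_exchange_rate(currency)}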

Troubleshooting

Server won’t start:

  • Check the server path is correct
  • Ensure proper permissions
  • Verify Python/Node.js environment is activated

Tools not appearing:

  • Run /tools to refresh tool list
  • Check server logs for registration errors
  • Verify tool schemas are valid

History not saving:

  • Check permissions on ~/.llmring/mcp_chat/
  • Ensure enough disk space
  • Look for errors in debug mode (--debug)

Connection errors:

  • Verify server is running (a quick connectivity check is sketched below)
  • Check firewall/network settings
  • Ensure correct protocol (stdio/http/ws)
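
A quick way to confirm an HTTP server is reachable is to run the direct client shown earlier against it; adjust the URL to match your server.

import asyncio
from llmring.mcp.client.mcp_client import MCPClient

async def check_server(url: str = "http://localhost:8000") -> None:
    client = MCPClient(url)
    try:
        await client.initialize()
        tools = await client.list_tools()
        print(f"Connected: {len(tools)} tools available")
        await client.close()
    except Exception as e:
        print(f"Connection failed: {e}")

asyncio.run(check_server())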

Further Resources