Model Context Protocol (MCP)

LLMRing provides comprehensive MCP (Model Context Protocol) support for standardized tool orchestration and resource management. MCP enables LLMs to interact with external tools, resources, and data sources through a unified protocol.

Key Features

  • MCP Chat Client: Interactive terminal application with persistent conversation history
  • Enhanced LLM: Automatic tool discovery and execution integrated into chat flows
  • Lockfile Management: Conversational configuration via MCP for intelligent alias setup
  • Multiple Transports: HTTP, WebSocket, and stdio connections to MCP servers
  • Streaming Support: Tool calls work seamlessly with streaming responses
  • Custom Servers: Build your own MCP servers to expose tools to LLMs
  • Persistence: Optional server-backed storage (requires llmring-server)

Choosing Your MCP Interface

LLMRing provides two ways to work with MCP:

Feature            | MCP Chat Client                                                            | Enhanced LLM
Best For           | Interactive terminal sessions, configuration                              | Programmatic integration, applications
Interface          | Command-line chat application                                              | Python API
History            | Automatic persistent history in ~/.llmring/mcp_chat/                       | Custom management needed
Session Management | Built-in session saving/loading                                            | Manual implementation
Tool Discovery     | Automatic with /tools command                                              | Automatic via API
Streaming          | Real-time terminal output                                                  | AsyncIterator for custom handling
Use Cases          | Lockfile configuration, interactive exploration, CLI tools, quick testing | Production applications, automated workflows, custom integrations, batch processing

Quick Decision Guide:

  • Use Chat Client for interactive configuration, exploration, or testing
  • Use Enhanced LLM for MCP capabilities in your Python applications

Quick Start

MCP Chat Client (Interactive)

# Conversational lockfile configuration (built-in)
llmring lock chat

# Connect to custom MCP server
llmring mcp chat --server "stdio://python -m your_mcp_server"

# HTTP server
llmring mcp chat --server "http://localhost:8080"

# WebSocket server
llmring mcp chat --server "ws://localhost:8080"

Command Line Options:

llmring mcp chat [OPTIONS]

Options:
  --server TEXT       MCP server URL (stdio://, http://, ws://)
  --model TEXT        LLM model alias to use (default: advisor)
  --no-telemetry      Disable telemetry
  --debug             Enable debug logging

Enhanced LLM (Programmatic)

from llmring.mcp.client.enhanced_llm import create_enhanced_llm

# Create enhanced LLM with MCP tools
llm = await create_enhanced_llm(
    model="balanced",
    mcp_server_path="stdio://python -m my_mcp_server"
)

# Chat with automatic tool execution
messages = [{"role": "user", "content": "Help me with my files"}]
response = await llm.chat(messages)

print(response.content)
if response.tool_calls:
    print(f"Used tools: {[call['function']['name'] for call in response.tool_calls]}")

Built-in Chat Commands

Command            | Description
/help              | Display all available commands
/history           | Show current conversation history
/sessions          | List all saved chat sessions
/load <session_id> | Load and resume a previous session
/clear             | Clear the current conversation
/model <alias>     | Switch to a different model
/tools             | List available MCP tools from the server
/exit or /quit     | Exit the chat client

Conversational Lockfile Configuration

The most powerful feature of MCP in LLMRing is conversational lockfile management:

llmring lock chat

This starts an interactive session where you can:

  • Describe requirements in natural language
  • Get recommendations based on current registry
  • Understand cost implications and tradeoffs
  • Configure aliases with fallback models
  • Set up environment-specific profiles

Example conversation:

You: I need a configuration for a coding assistant that prioritizes accuracy
Assistant: I'll help you configure an accurate coding assistant. Based on the registry,
I recommend using Claude 3.5 Sonnet as the primary model with GPT-4o as fallback.

[Calling tool: add_alias]
Added alias 'coder' with models: anthropic:claude-3-5-sonnet, openai:gpt-4o

This configuration prioritizes accuracy while providing fallback for availability.
Monthly cost estimate: ~$50-100 for moderate usage.

You: Add a cheaper option for simple tasks
Assistant: I'll add a cost-effective alias for simpler coding tasks.

[Calling tool: add_alias]
Added alias 'coder-fast' with model: openai:gpt-4o-mini

This model is 10x cheaper and perfect for simple completions, syntax fixes, and
basic code generation.

Persistent History

All conversations are automatically saved in ~/.llmring/mcp_chat/:

~/.llmring/mcp_chat/
├── command_history.txt              # Terminal command history
├── conversation_<session_id>.json   # Individual conversations
└── sessions.json                    # Session metadata and index

Each session includes:

  • Unique session ID and timestamp
  • Complete message history
  • Tool calls and their results
  • Model used for each response
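
Because everything is stored as plain JSON, saved sessions can be inspected or post-processed with standard tooling. The sketch below is illustrative only: it assumes the directory layout shown above, and the exact JSON schema of the files may differ between versions.

import json
from pathlib import Path

# Assumed layout: ~/.llmring/mcp_chat/ as shown above.
chat_dir = Path.home() / ".llmring" / "mcp_chat"

# Session metadata and index (structure is an assumption).
sessions_index = json.loads((chat_dir / "sessions.json").read_text())
print(f"Found {len(sessions_index)} saved sessions")

# Walk individual conversation files and report message counts.
for conv_file in sorted(chat_dir.glob("conversation_*.json")):
    conversation = json.loads(conv_file.read_text())
    messages = conversation.get("messages", []) if isinstance(conversation, dict) else conversation
    print(f"{conv_file.name}: {len(messages)} messages")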

Connecting to MCP Servers

Stdio Servers (Local Processes)

Most common for development and local tools:

# Python MCP server
llmring mcp chat --server "stdio://python -m mypackage.mcp_server"

# Node.js MCP server
llmring mcp chat --server "stdio://node my-mcp-server.js"

# Any executable
llmring mcp chat --server "stdio:///usr/local/bin/my-mcp-tool"

HTTP Servers

For MCP servers exposed over HTTP:

# Local development
llmring mcp chat --server "http://localhost:8080"

# Remote server
llmring mcp chat --server "https://api.example.com/mcp"

WebSocket Servers

For real-time, bidirectional communication:

# WebSocket connection
llmring mcp chat --server "ws://localhost:8080"

# Secure WebSocket
llmring mcp chat --server "wss://mcp.example.com"

Creating Custom MCP Servers

Build your own MCP servers to expose tools to LLMs:

Simple Python Example

#!/usr/bin/env python3
"""my_mcp_server.py - Custom MCP server example"""

import asyncio
from llmring.mcp.server import MCPServer
from llmring.mcp.server.transport.stdio import StdioTransport

# Create server
server = MCPServer(
    name="My Custom Tools",
    version="1.0.0"
)

# Register tools
@server.function_registry.register(
    name="get_weather",
    description="Get weather for a location"
)
def get_weather(location: str) -> dict:
    return {
        "location": location,
        "temperature": 72,
        "conditions": "sunny"
    }

@server.function_registry.register(
    name="calculate",
    description="Perform calculations"
)
def calculate(expression: str) -> dict:
    try:
        # WARNING: eval executes arbitrary code; restrict or sandbox
        # expression evaluation before exposing this to untrusted input.
        result = eval(expression)
        return {"result": result}
    except Exception as e:
        return {"error": str(e)}

# Run server
async def main():
    transport = StdioTransport()
    await server.run(transport)

if __name__ == "__main__":
    asyncio.run(main())

Connect to your server:

llmring mcp chat --server "stdio://python my_mcp_server.py"

Enhanced LLM with MCP

For programmatic usage with automatic tool execution:

Basic Usage

from llmring.mcp.client.enhanced_llm import create_enhanced_llm

# Create enhanced LLM
llm = await create_enhanced_llm(
    model="balanced",
    mcp_server_path="stdio://python -m my_mcp_server"
)

# Chat with automatic tool execution
messages = [{"role": "user", "content": "What's the weather in NYC?"}]
response = await llm.chat(messages)

# Tools are called automatically and results integrated
print(response.content)  # "The weather in NYC is 72°F and sunny."

Streaming with Tools

# Streaming works seamlessly with tool calls
messages = [{"role": "user", "content": "Analyze this file and summarize it"}]

async for chunk in await llm.chat_stream(messages):
    if chunk.type == "content":
        print(chunk.content, end="", flush=True)
    elif chunk.type == "tool_call":
        print(f"\n[Calling tool: {chunk.tool_call.name}]")
    elif chunk.type == "tool_result":
        print(f"\n[Tool result received]")

Direct MCP Client Usage

from llmring.mcp.client.mcp_client import MCPClient

# Connect to MCP server
client = MCPClient("http://localhost:8000")
await client.initialize()

# List available tools
tools = await client.list_tools()
for tool in tools:
    print(f"- {tool['name']}: {tool['description']}")

# Execute a tool directly
result = await client.call_tool(
    "read_file",
    {"path": "/path/to/file.txt"}
)
print(result)

# Clean up
await client.close()

Best Practices

Server Design Philosophy

When creating MCP servers:

  1. Data-Focused Tools: Design tools to provide data and perform actions, not make decisions
  2. LLM in Driver’s Seat: Let the LLM decide how to use tools based on user intent
  3. Clear Tool Names: Use descriptive, action-oriented names
  4. Comprehensive Schemas: Provide detailed parameter descriptions (see the sketch after this list)
  5. Error Handling: Return informative error messages
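
To make points 3-5 concrete, here is a rough sketch of a tool registered with a descriptive name, a detailed description of its parameters, and informative error returns. It reuses the function_registry.register API from the example above; the tool itself (search_documents) is hypothetical.

from llmring.mcp.server import MCPServer

server = MCPServer(name="Document Tools", version="1.0.0")

@server.function_registry.register(
    name="search_documents",  # clear, action-oriented name
    description=(
        "Search indexed documents for a query string. "
        "Parameters: query (str, required) - the text to search for; "
        "limit (int, optional, default 10) - maximum number of results."
    ),
)
def search_documents(query: str, limit: int = 10) -> dict:
    """Return matching document titles, or an informative error."""
    if not query or not query.strip():
        return {"error": "query must be a non-empty string"}
    if not 1 <= limit <= 100:
        return {"error": "limit must be between 1 and 100"}
    # Hypothetical result payload, for illustration only.
    return {"query": query, "limit": limit, "results": []}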

Security Considerations

  1. Validate Input: Always validate tool parameters (see the sketch after this list)
  2. Limit Scope: Tools should have minimal necessary permissions
  3. Secure Transport: Use HTTPS/WSS in production
  4. Authentication: Implement proper auth for production servers
  5. Audit Logging: Log tool usage for security monitoring
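
As an illustration of points 1 and 2, a file-reading tool can validate its path parameter and constrain access to a single directory tree before touching the filesystem. This is a minimal sketch (Python 3.9+ for Path.is_relative_to); the allowed root is an assumption to adapt to your deployment.

from pathlib import Path

# Limit scope: the tool can only read inside this directory tree.
ALLOWED_ROOT = Path("/srv/shared-data").resolve()

def read_file(path: str) -> dict:
    """Read a text file, rejecting paths that escape the allowed root."""
    try:
        resolved = (ALLOWED_ROOT / path).resolve()
        # Validate input: block path traversal (e.g. "../../etc/passwd").
        if not resolved.is_relative_to(ALLOWED_ROOT):
            return {"error": "path is outside the allowed directory"}
        if not resolved.is_file():
            return {"error": f"no such file: {path}"}
        return {"path": str(resolved), "content": resolved.read_text()}
    except OSError as e:
        return {"error": str(e)}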

Performance Tips

  1. Choose Appropriate Models: Match model capability to task complexity
  2. Cache Results: Implement caching in MCP servers for expensive operations (see the sketch after this list)
  3. Streaming Responses: Use streaming for long-running operations
  4. Batch Operations: Design tools to handle batch requests when possible
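
For example (tip 2), an expensive lookup inside a tool can be memoized so repeated tool calls with the same arguments are served from memory. A simple in-process cache is sketched below; the underlying lookup is a stand-in.

import time
from functools import lru_cache

@lru_cache(maxsize=256)
def _fetch_exchange_rate(currency: str) -> float:
    # Stand-in for an expensive operation (network call, database query, ...).
    time.sleep(1.0)
    return 1.0

def get_exchange_rate(currency: str) -> dict:
    """Tool body: repeated calls with the same currency hit the cache."""
    return {"currency": currency, "rate": _fetch_exchange_rate(currency)}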

Troubleshooting

Server won’t start:

  • Check the server path is correct
  • Ensure proper permissions
  • Verify Python/Node.js environment is activated

Tools not appearing:

  • Run /tools to refresh tool list
  • Check server logs for registration errors
  • Verify tool schemas are valid

History not saving:

  • Check permissions on ~/.llmring/mcp_chat/
  • Ensure enough disk space
  • Look for errors in debug mode (--debug)

Connection errors:

  • Verify server is running (a quick connectivity check is sketched below)
  • Check firewall/network settings
  • Ensure correct protocol (stdio/http/ws)
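
A quick way to confirm an HTTP server is reachable is to run the direct client shown earlier against it; adjust the URL to match your server.

import asyncio
from llmring.mcp.client.mcp_client import MCPClient

async def check_server(url: str = "http://localhost:8000") -> None:
    client = MCPClient(url)
    try:
        await client.initialize()
        tools = await client.list_tools()
        print(f"Connected: {len(tools)} tools available")
        await client.close()
    except Exception as e:
        print(f"Connection failed: {e}")

asyncio.run(check_server())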

Further Resources