LLMRing Python Library

llms.txt

GitHub: https://github.com/juanre/llmring

A Python library providing a unified interface to OpenAI, Anthropic, Google, and Ollama. Configuration lives in a version-controlled llmring.lock file (local to each codebase), and models are accessed via aliases.

Modes of Operation

  1. Lockfile-Only: Works completely standalone with just your llmring.lock. No backend required, no logging, no MCP persistence.
  2. With Server: Connect to self-hosted llmring-server for receipts, usage tracking, and MCP persistence.

Installation

uv add llmring

Quick Start

llmring lock init
llmring lock chat  # For intelligent conversational configuration
llmring bind summarizer anthropic:claude-3-haiku
llmring aliases

from llmring import LLMRing, LLMRequest, Message

ring = LLMRing()
request = LLMRequest(
  messages=[Message(role="user", content="Summarize this text")],
  model="summarizer"
)
response = await ring.chat(request)  # run inside an async function (e.g. via asyncio.run)

Lockfile

  • Authoritative config; commit to VCS
  • Optional profiles for different environments: dev, staging, prod
  • Pinned registry versions per provider
# Registry version pinning (optional)
[registry_versions]
openai = 142
anthropic = 89

# Default bindings
[[bindings]]
alias = "summarizer"
models = ["anthropic:claude-3-haiku-20240307"]

[[bindings]]
alias = "balanced"
models = ["anthropic:claude-3-5-sonnet", "openai:gpt-4o"]

CLI

llmring lock init [--force]
llmring lock chat                      # conversational configuration
llmring bind <alias> <provider:model> [--profile <name>]
llmring aliases [--profile <name>]
llmring lock validate
llmring lock bump-registry
llmring list [--provider <name>]
llmring info <provider:model> [--json]
llmring stats|export                   # requires server
llmring mcp chat [--server URL]        # MCP interactive chat
llmring mcp servers list               # list MCP servers
llmring mcp tools                      # list MCP tools

CLI Output

llmring --help
usage: cli.py [-h]
              {lock,bind,aliases,list,chat,info,providers,push,pull,stats,export,register}
              ...

LLMRing - Unified LLM Service CLI

positional arguments:
  {lock,bind,aliases,list,chat,info,providers,push,pull,stats,export,register}
                        Commands
    lock                Lockfile management
    bind                Bind an alias to a model
    aliases             List aliases from lockfile
    list                List available models
    chat                Send a chat message
    info                Show model information
    providers           List configured providers
    push                Push lockfile aliases to server (X-Project-Key required)
    pull                Pull aliases from server into lockfile (X-Project-Key required)
    stats               Show usage statistics
    export              Export receipts to file
    register            Register with LLMRing server (for SaaS features)

options:
  -h, --help            show this help message and exit
llmring providers
Configured Providers:
----------------------------------------
✓ openai      OPENAI_API_KEY
✓ anthropic   ANTHROPIC_API_KEY
✗ google      GOOGLE_API_KEY or GEMINI_API_KEY
✓ ollama      (not required)
llmring list
Available Models:
----------------------------------------

ANTHROPIC:
  - claude-3-7-sonnet-20250219
  - claude-3-7-sonnet
  - claude-3-5-sonnet-20241022-v2
  - claude-3-5-sonnet-20241022
  - claude-3-5-sonnet-20240620
  - claude-3-5-sonnet
  - claude-3-5-haiku-20241022
  - claude-3-5-haiku
  - claude-3-opus-20240229
  - claude-3-sonnet-20240229
  - claude-3-haiku-20240307

OPENAI:
  - gpt-4o
  - gpt-4o-mini
  - gpt-4o-2024-08-06
  - gpt-4-turbo
  - gpt-4
  - gpt-3.5-turbo
  - o1
  - o1-mini

OLLAMA:
  - llama3.3:latest
  - llama3.3
  - llama3.2
  - llama3.1
  - llama3
  - mistral
  - mixtral
  - codellama
  - phi3
  - gemma2
  - gemma
  - qwen2.5
  - qwen

Lockfile workflow

llmring lock init
✅ Created lockfile: /Users/juanre/prj/llmring-all/llmring.ai/dist/docs-run/llmring.lock

Default bindings:
  long_context → openai:gpt-4-turbo-preview
  low_cost → openai:gpt-3.5-turbo
  json_mode → openai:gpt-4-turbo-preview
  fast → openai:gpt-3.5-turbo
  deep → anthropic:claude-3-opus-20240229
  balanced → anthropic:claude-3-sonnet-20240229
  pdf_reader → anthropic:claude-3-sonnet-20240229
  local → ollama:llama3.3:latest
llmring lock chat

Starts an interactive conversational configuration session for intelligent lockfile management.

llmring aliases
Aliases in profile 'default':
----------------------------------------
long_context → openai:gpt-4-turbo-preview
low_cost → openai:gpt-3.5-turbo
json_mode → openai:gpt-4-turbo-preview
fast → openai:gpt-3.5-turbo
deep → anthropic:claude-3-opus-20240229
balanced → anthropic:claude-3-sonnet-20240229
pdf_reader → anthropic:claude-3-sonnet-20240229
local → ollama:llama3.3:latest
llmring bind summarizer anthropic:claude-3-haiku
✅ Bound 'summarizer' → 'anthropic:claude-3-haiku' in profile 'default'
llmring aliases
Aliases in profile 'default':
----------------------------------------
long_context → openai:gpt-4-turbo-preview
low_cost → openai:gpt-3.5-turbo
json_mode → openai:gpt-4-turbo-preview
fast → openai:gpt-3.5-turbo
deep → anthropic:claude-3-opus-20240229
balanced → anthropic:claude-3-sonnet-20240229
pdf_reader → anthropic:claude-3-sonnet-20240229
local → ollama:llama3.3:latest
summarizer → anthropic:claude-3-haiku

Registry Integration

  • Fetches model capabilities/pricing from https://llmring.github.io/registry/
  • Models keyed as provider:model
  • Fields include max_input_tokens, max_output_tokens, dollars_per_million_tokens_*, and capability flags such as supports_parallel_tool_calls.

Receipts

  • Local: library can calculate costs and create unsigned receipt objects
  • Canonical: signed by server using Ed25519 over RFC 8785 JCS

Environment

export OPENAI_API_KEY=...
export ANTHROPIC_API_KEY=...
export GEMINI_API_KEY=...         # or GOOGLE_API_KEY=... or GOOGLE_GEMINI_API_KEY=...
export LLMRING_PROFILE=prod

Security

  • Lockfile contains no secrets
  • API keys via environment only

API Reference

Schemas

Message

{
  "role": "system | user | assistant | tool",
  "content": "string or structured content",
  "tool_calls": [ { "id": "...", "type": "...", "function": { "name": "...", "arguments": { } } } ],
  "tool_call_id": "optional",
  "timestamp": "ISO-8601 optional"
}

LLMRequest

{
  "messages": [ Message ],
  "model": "provider:model or alias",
  "temperature": 0.0,
  "max_tokens": 1024,
  "response_format": { },
  "tools": [ { } ],
  "tool_choice": "auto | none | any | { function: name }",
  "cache": { },
  "metadata": { },
  "json_response": true,
  "stream": false,
  "extra_params": { }
}
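
For example, a tool-enabled request can be built directly from these fields. A sketch, assuming an OpenAI-style function-tool dict; the get_weather tool below is illustrative, not part of the library:

from llmring import LLMRequest, Message

# Illustrative request using the fields above; the tool definition is hypothetical.
request = LLMRequest(
    model="balanced",  # lockfile alias or "provider:model"
    messages=[Message(role="user", content="What is the weather in Paris?")],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool
            "description": "Look up current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }],
    tool_choice="auto",
    temperature=0.0,
    max_tokens=1024,
)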

LLMResponse

{
  "content": "string",
  "model": "provider:model",
  "parsed": { },
  "usage": {
    "prompt_tokens": 0,
    "completion_tokens": 0,
    "total_tokens": 0,
    "cost": 0.000123,               // if registry pricing available
    "cost_breakdown": { "input": 0.0, "output": 0.0 }
  },
  "finish_reason": "stop | length | tool_calls | ...",
  "tool_calls": [ { } ]
}

StreamChunk

{
  "delta": "partial text",
  "model": "provider:model",
  "finish_reason": null,
  "usage": null,
  "tool_calls": [ ]
}
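
A minimal consumption sketch, assuming that setting stream=True on the request makes chat() yield an async iterator of StreamChunk objects, mirroring the provider-level signature shown under "Providers":

# Sketch only: the streaming return type of LLMRing.chat is an assumption here.
request = LLMRequest(
    model="fast",
    messages=[Message(role="user", content="Write a haiku about rings")],
    stream=True,
)
stream = await ring.chat(request)
async for chunk in stream:
    if chunk.delta:
        print(chunk.delta, end="", flush=True)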

Class: LLMRing

Constructor:

LLMRing(origin: str = "llmring", registry_url: str | None = None, lockfile_path: str | None = None)

Methods:

  • async chat(request: LLMRequest, profile: str | None = None) -> LLMResponse

    • Resolves the alias via the lockfile, routes the call to the provider, enriches usage.cost when registry pricing is available, and records a local unsigned receipt if a lockfile is present.
  • async chat_with_alias(alias_or_model: str, messages: list, temperature: float | None = None, max_tokens: int | None = None, profile: str | None = None, **kwargs) -> LLMResponse

  • resolve_alias(alias_or_model: str, profile: str | None = None) -> str

  • bind_alias(alias: str, model: str, profile: str | None = None) -> None

  • unbind_alias(alias: str, profile: str | None = None) -> None

  • list_aliases(profile: str | None = None) -> dict[str, str]

  • init_lockfile(force: bool = False) -> None

  • get_available_models() -> dict[str, list[str]]

  • get_model_info(model: str) -> dict

  • async get_enhanced_model_info(model: str) -> dict

  • async validate_context_limit(request: LLMRequest) -> str | None

  • async calculate_cost(response: LLMResponse) -> dict | None

  • async close() -> None
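
A short sketch of these methods in use; whether chat_with_alias accepts Message objects (rather than plain dicts) for messages is an assumption:

import asyncio
from llmring import LLMRing, Message

async def main():
    ring = LLMRing()

    # Alias management backed by the lockfile
    ring.bind_alias("summarizer", "anthropic:claude-3-haiku")
    print(ring.resolve_alias("summarizer"))   # "anthropic:claude-3-haiku"
    print(ring.list_aliases())                # {"summarizer": "anthropic:claude-3-haiku", ...}

    # Convenience chat by alias
    response = await ring.chat_with_alias(
        "summarizer",
        messages=[Message(role="user", content="Summarize: LLMRing gives one API for many LLMs.")],
        max_tokens=200,
    )
    print(response.content)

    # Cost lookup from registry pricing, when available
    print(await ring.calculate_cost(response))

    await ring.close()

asyncio.run(main())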

Lockfile API

Classes:

  • AliasBinding { alias, provider, model, constraints? } with model_ref property.
  • ProfileConfig { name, bindings[], registry_versions{} }
    • set_binding(alias, model_ref, constraints?)
    • remove_binding(alias) -> bool
    • get_binding(alias) -> AliasBinding | None
  • Lockfile { version, created_at, updated_at, default_profile, profiles{} }
    • @classmethod create_default() -> Lockfile
    • save(path: Path | None = None) -> None
    • @classmethod load(path: Path | None = None) -> Lockfile
    • @classmethod find_lockfile(start_path: Path | None = None) -> Path | None
    • calculate_digest() -> str
    • get_profile(name: str | None = None) -> ProfileConfig
    • set_binding(alias, model_ref, profile: str | None = None, constraints: dict | None = None)
    • resolve_alias(alias, profile: str | None = None) -> str | None
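
A sketch of typical lockfile manipulation using the classes above; the llmring.lockfile import path is an assumption:

from pathlib import Path
from llmring.lockfile import Lockfile  # import path is an assumption

# Find the nearest lockfile, load it (or fall back to defaults), add a binding, save.
path = Lockfile.find_lockfile(Path.cwd())
lockfile = Lockfile.load(path) if path else Lockfile.create_default()

lockfile.set_binding("summarizer", "anthropic:claude-3-haiku", profile="prod")
print(lockfile.resolve_alias("summarizer", profile="prod"))  # "anthropic:claude-3-haiku"
print(lockfile.calculate_digest())

lockfile.save(path)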

Registry Client

  • RegistryModel fields: provider, model_name, display_name, description?, max_input_tokens?, max_output_tokens?, dollars_per_million_tokens_input?, dollars_per_million_tokens_output?, supports_vision, supports_function_calling, supports_json_mode, supports_parallel_tool_calls, is_active, added_date?, deprecated_date?.

  • RegistryClient(registry_url: str | None = None, cache_dir: Path | None = None)

    • async fetch_current_models(provider: str) -> list[RegistryModel]
    • async fetch_version(provider: str, version: int) -> RegistryVersion
    • async get_current_version(provider: str) -> int
    • async check_drift(provider: str, pinned_version: int) -> dict
    • async validate_model(provider: str, model_name: str) -> bool
    • clear_cache() -> None
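
A sketch of the registry client in use; the llmring.registry import path is an assumption:

from llmring.registry import RegistryClient  # import path is an assumption

async def inspect_registry():
    client = RegistryClient()

    # Compare the live registry version against a pinned one
    current = await client.get_current_version("anthropic")
    drift = await client.check_drift("anthropic", pinned_version=89)
    print(current, drift)

    # List active models with their pricing and capability flags
    for m in await client.fetch_current_models("anthropic"):
        if m.is_active:
            print(m.model_name, m.dollars_per_million_tokens_input, m.supports_function_calling)

    # Validate a specific model name
    print(await client.validate_model("anthropic", "claude-3-haiku-20240307"))

    client.clear_cache()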

Receipts

  • Receipt fields: receipt_id, timestamp, alias, profile, lock_digest, provider, model, prompt_tokens, completion_tokens, total_tokens, input_cost, output_cost, total_cost, signature?.
  • ReceiptSigner: generate_keypair(), load_private_key(), load_public_key(), sign_receipt(receipt) -> str, verify_receipt(receipt, public_key) -> bool, export_private_key(), export_public_key().
  • ReceiptGenerator: generate_receipt(...) -> Receipt, calculate_costs(provider, model, prompt_tokens, completion_tokens, model_info?) -> dict.
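
A sketch of local receipt generation and signing; only the method names come from the API above, while the import path, the generate_receipt arguments, and the keypair return shape are assumptions:

from llmring.receipts import ReceiptGenerator, ReceiptSigner  # import path is an assumption

signer = ReceiptSigner()
private_key, public_key = signer.generate_keypair()  # return shape is an assumption

generator = ReceiptGenerator()
costs = generator.calculate_costs(
    provider="anthropic",
    model="claude-3-haiku-20240307",
    prompt_tokens=1200,
    completion_tokens=300,
)
print(costs)  # dict with input/output/total cost fields

# Arguments below are illustrative; see generate_receipt(...) above.
receipt = generator.generate_receipt(
    alias="summarizer",
    provider="anthropic",
    model="claude-3-haiku-20240307",
    prompt_tokens=1200,
    completion_tokens=300,
)
signature = signer.sign_receipt(receipt)   # Ed25519 over RFC 8785 JCS
print(signer.verify_receipt(receipt, public_key))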

Providers

All providers implement BaseLLMProvider:

async def chat(messages, model, temperature=None, max_tokens=None, response_format=None, tools=None, tool_choice=None, json_response=None, cache=None, stream=False, extra_params=None) -> LLMResponse | AsyncIterator[StreamChunk]
async def validate_model(model: str) -> bool
async def get_supported_models() -> list[str]
def get_default_model() -> str

Structured Output (Unified)

LLMRing provides a single interface for JSON Schema across providers:

request = LLMRequest(
  model="balanced",
  messages=[Message(role="user", content="Generate a person")],
  response_format={
    "type": "json_schema",
    "json_schema": {"name": "person", "schema": {"type": "object", "properties": {"name": {"type": "string"}}}},
    "strict": True
  }
)
response = await ring.chat(request)
print(response.parsed)

OpenAI uses native JSON Schema; Anthropic/Gemini use native tools/functions under the hood; Ollama uses best‑effort JSON with one repair attempt.