CloudLLM v0.10: from a simple LLM wrapper to a Multi-Agent Orchestration framework

February 11th, 2026


 

CloudLLM has evolved dramatically over three consecutive releases (v0.8.0 through v0.10.0) into a comprehensive, production-ready platform for building autonomous multi-agent systems.

What began as a simple request-response LLM wrapper library has grown into a sophisticated orchestration engine with seven distinct collaboration modes, real-time event observability, atomic task coordination primitives, and a rich tool ecosystem.

This major release cycle introduces AnthropicAgentTeams mode (inspired by Anthropic’s own approach to collaborative AI), a unified event system for real-time agent introspection, and HttpClientProtocol—enabling agents to research and coordinate via the web. Combined with the previous release’s RALPH mode and image generation support, CloudLLM now empowers teams to build AI systems as sophisticated as their imagination.


Part 1: RALPH Mode & The Game-Builder Architecture (v0.8.0)

 

The Problem We Solved

 

Building complex software via LLM agents requires structured work tracking—the ability to break down a large goal into discrete tasks, have agents iterate through them sequentially or in parallel, and track completion automatically. Traditional approaches either relied on manual task lists or unstructured agent banter.

RALPH: PRD-Driven Autonomous Orchestration

 

v0.8.0 introduced RALPH mode—a 6th collaboration pattern, named after Ralph Wiggum for its wonderfully earnest determination to get through a checklist.

RALPH works like this:

  1. Define a set of tasks as a PRD (Product Requirements Document) with clear completion criteria
  2. Each iteration, agents see the current task list and can work on any task
  3. Agents signal completion by including [TASK_COMPLETE:task_id] in their response
  4. The system tracks progress and terminates when all tasks are complete or max iterations reached
  5. Agents automatically see which tasks are already done, preventing duplicate work

Configuring a RALPH run looks like this:

let tasks = vec![
    RalphTask::new("html",  "HTML Structure", "Create the HTML boilerplate and canvas"),
    RalphTask::new("loop",  "Game Loop",      "Implement requestAnimationFrame game loop"),
    RalphTask::new("input", "Controls",       "Add keyboard input for the paddle"),
];

let mut orch = Orchestration::new("game-builder", "Game Builder")
    .with_mode(OrchestrationMode::Ralph {
        tasks,
        max_iterations: 5,
    });
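The completion signal in step 3 is plain string scanning. As a self-contained sketch (not CloudLLM's actual parser, which may differ), extracting `[TASK_COMPLETE:task_id]` markers from an agent's response could look like this:

```rust
use std::collections::HashSet;

/// Extract task ids from `[TASK_COMPLETE:<id>]` markers in an agent response.
/// Illustrative only; CloudLLM's internal implementation may differ.
fn completed_task_ids(response: &str) -> HashSet<String> {
    let marker = "[TASK_COMPLETE:";
    let mut done = HashSet::new();
    let mut rest = response;
    while let Some(start) = rest.find(marker) {
        let after = &rest[start + marker.len()..];
        match after.find(']') {
            Some(end) => {
                done.insert(after[..end].to_string());
                rest = &after[end + 1..];
            }
            None => break, // unterminated marker: ignore the tail
        }
    }
    done
}

fn main() {
    let reply = "Canvas is set up. [TASK_COMPLETE:html] Wired the loop too. [TASK_COMPLETE:loop]";
    let done = completed_task_ids(reply);
    assert!(done.contains("html") && done.contains("loop"));
}
```

The orchestration then diffs these ids against the PRD's task list to drive steps 4 and 5.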

The Breakout Game: A Complete Real-World Example

 

To showcase RALPH’s power, we built a complete Atari Breakout game using four specialized LLM agents:

  • Game Architect: Designs the overall structure and flow
  • Game Programmer: Implements core mechanics (physics, collision, game loop)
  • Sound Designer: Creates 8-bit chiptune music and sound effects
  • Powerup Engineer: Implements 8 distinct powerup types

The orchestration proceeded through 10 PRD tasks:

  • Core mechanics (HTML structure, game loop, paddle/ball physics, collision detection)
  • Audio system (background music with Web Audio API, collision sounds, powerup chimes)
  • Powerup implementation (speed boost, paddle expansion, projectile launcher, multiball, lava paddle, bomb, growth, mushroom)

Result: A fully playable HTML5 game with:

  • 5 game states (MENU, PLAYING, PAUSED, GAME_OVER, LEVEL_COMPLETE)
  • Multi-hit bricks with color-coded HP
  • Particle effects (fire bursts, paddle jets, score popups)
  • Procedural brick generation
  • Mobile touch controls

All generated by LLMs iterating through a structured task list. The RALPH pattern proved remarkably effective for this kind of incremental, feature-driven development.

Supporting Change: A New Naming Paradigm

 

We also renamed the entire Council abstraction to Orchestration (Council → Orchestration, CouncilMode → OrchestrationMode, etc.) for clarity and consistency with industry terminology.

Image Generation: Multi-Provider Support

 

v0.8.0 added unified image generation across OpenAI (DALL-E), Grok, and Google Gemini via a single ImageGenerationClient trait. This allows agents to generate images and save them locally, opening new classes of creative workflows.

let image_client = new_image_generation_client(
    ImageGenerationProvider::OpenAI,
    &api_key,
)?;
register_image_generation_tool(&protocol, image_client).await?;

A one-line helper function eliminates ~80 lines of boilerplate, with support for aspect-ratio customization and response-format selection (URL or Base64).


Part 2: Per-Agent Sessions & Hub Routing (v0.9.0)

 

The Architectural Evolution

 

v0.8.0 worked, but it broadcast the entire conversation history to every agent every round. This was inefficient and semantically wrong—in a round-robin discussion, Agent C shouldn’t see Agent B’s entire scratchpad, just the conclusion.

v0.9.0 redesigned the orchestration architecture with a message router (called a “hub”) that intelligently routes only relevant messages to each agent.

How It Works

 

Instead of:

Agent A sees: [system, user_prompt, A's_response, B's_response, C's_response]
Agent B sees: [system, user_prompt, A's_response, B's_response, C's_response]  <- Duplicate
Agent C sees: [system, user_prompt, A's_response, B's_response, C's_response]  <- Duplicate

Now:

Agent A:
  - Sees its own prior messages (session history)
  - In round-robin: gets injected message from previous agent, generates response

Agent B:
  - Sees its own session history
  - Gets injected message from Agent A's response
  - Generates next response

Agent C:
  - Sees its own session history
  - Gets injected message from Agent B's response
  - Generates response

API Changes

 

Each agent now has:

  • Agent::send(prompt) — Uses agent’s own LLMSession for stateful generation
  • Agent::receive_message(role, content) — Injects external messages into the session
  • Agent::set_system_prompt(prompt) — Orchestration updates system context per-round
  • Agent::session_history_len() — Query agent state for decision-making
  • Agent::fork_with_context() — Copy agent with session state for parallel execution
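The routing model behind these methods is easy to see with a toy stand-in (plain structs, not CloudLLM's real Agent type; the method names mirror the API above but the bodies are illustrative):

```rust
/// Minimal stand-in for a per-agent session; illustrative only.
struct ToyAgent {
    name: String,
    session: Vec<String>, // this agent's private history
}

impl ToyAgent {
    fn new(name: &str) -> Self {
        Self { name: name.to_string(), session: Vec::new() }
    }
    /// Mirrors Agent::receive_message: inject one external message.
    fn receive_message(&mut self, content: &str) {
        self.session.push(format!("injected: {content}"));
    }
    /// Mirrors Agent::send: record the prompt, produce a reply from own state.
    fn send(&mut self, prompt: &str) -> String {
        self.session.push(format!("user: {prompt}"));
        let reply = format!("{} ack ({} msgs in session)", self.name, self.session.len());
        self.session.push(format!("assistant: {reply}"));
        reply
    }
}

fn main() {
    let mut agents = vec![ToyAgent::new("A"), ToyAgent::new("B"), ToyAgent::new("C")];
    let mut last = "kick off".to_string();
    // One round-robin round: each agent sees only its own session
    // plus the single message injected from the previous agent.
    for agent in agents.iter_mut() {
        agent.receive_message(&last);
        last = agent.send("continue the discussion");
    }
    // No agent carries anyone else's full scratchpad.
    for agent in &agents {
        assert_eq!(agent.session.len(), 3); // injected + user + own reply
    }
}
```

Contrast with the broadcast model, where every agent's context would have grown by three messages per round regardless of relevance.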

Why It Matters

 

  1. Token efficiency: Agents don’t duplicate each other’s full conversation history
  2. Semantic correctness: Each agent has its own conversation context, not a shared one
  3. Scalability: With 10 agents, this prevents the conversation from exploding in size
  4. Flexibility: Each mode can route messages according to its collaboration pattern

The refactor was complex (added agent_message_cursors tracking to prevent duplication) but entirely transparent to users. All existing code continues to work.


Part 3: Real-Time Observability, Decentralized Coordination & The Complete Tool Ecosystem (v0.10.0)

 

Four Major Systems Arrive Together

 

v0.10.0 is the release where CloudLLM transitions from “good orchestration system” to “production-ready multi-agent platform”. It introduces:

  1. Event System — Real-time observability into agent and orchestration behavior
  2. AnthropicAgentTeams Mode — A 7th collaboration pattern with no central orchestrator
  3. HttpClientProtocol — Agents can now make HTTP requests to external services
  4. Expanded Breakout Examples — 18 tasks (vs. 10), comprehensive tool ecosystem

System 1: The Event System

 

Orchestrations are black boxes—until now. We added an EventHandler trait that lets you observe:

Agent-level events:

  • SendStarted / SendCompleted — Agent is thinking
  • LLMCallStarted / LLMCallCompleted — LLM round-trip timing
  • ToolCallDetected / ToolExecutionCompleted — Tool invocation and results
  • ThoughtCommitted — Thoughts written to ThoughtChain
  • ProtocolAdded / ProtocolRemoved — Tool availability changes

Orchestration-level events:

  • RunStarted / RunCompleted — Orchestration lifecycle
  • RoundStarted / RoundCompleted — Round boundaries
  • AgentSelected / AgentResponded / AgentFailed — Per-agent status
  • RalphIterationStarted / RalphTaskCompleted — RALPH progress
  • TaskClaimed / TaskCompleted / TaskFailed — AnthropicAgentTeams status

Usage is simple:

use std::sync::Arc;
use std::time::Instant;

struct ProgressHandler { start: Instant }

impl ProgressHandler {
    fn new() -> Self { Self { start: Instant::now() } }
    fn stamp(&self) -> String {
        let s = self.start.elapsed().as_secs();
        format!("{:02}:{:02}", s / 60, s % 60)
    }
}

#[async_trait]
impl EventHandler for ProgressHandler {
    async fn on_agent_event(&self, event: &AgentEvent) {
        match event {
            AgentEvent::SendStarted { agent_name, .. } => {
                println!("[{}] {} thinking...", self.stamp(), agent_name);
            }
            AgentEvent::ToolExecutionCompleted { agent_name, tool_name, success, .. } => {
                println!("[{}] {} called '{}' — {}",
                    self.stamp(), agent_name, tool_name,
                    if *success { "✓" } else { "✗" });
            }
            _ => {}
        }
    }

    async fn on_orchestration_event(&self, event: &OrchestrationEvent) {
        match event {
            OrchestrationEvent::RalphTaskCompleted { tasks_completed_total, tasks_total, .. } => {
                println!("✓ Tasks done: {}/{}", tasks_completed_total, tasks_total);
            }
            _ => {}
        }
    }
}

let handler = Arc::new(ProgressHandler::new());
let orchestration = Orchestration::new("id", "Name")
    .with_event_handler(handler);

Register on an orchestration, and the handler auto-propagates to all added agents. You get a unified stream of both agent and orchestration events, enabling:

  • Real-time progress dashboards
  • Cost tracking (tokens per agent, per task)
  • Debugging (why did agent X call tool Y?)
  • Automated alerting (fail-fast on errors)

System 2: AnthropicAgentTeams — Decentralized Task Coordination

 

Inspiration: Anthropic’s research team built a C compiler using collaborative AI agents with no central orchestrator. Each agent independently searched for work, claimed it, and reported completion. This inspired the AnthropicAgentTeams mode.

The Pattern:

Instead of a manager/moderator assigning work, agents autonomously discover and claim tasks from a shared Memory pool:

1. Orchestration initializes task pool in Memory:
   - teams:<pool_id>:unclaimed:<task_id> → <description>
   - teams:<pool_id>:claimed:<task_id> → "<agent_id>:<timestamp>"
   - teams:<pool_id>:completed:<task_id> → "<result_json>"

2. Each agent, each iteration:
   - Query Memory: LIST teams:<pool_id>:unclaimed:*
   - See available tasks
   - Claim a task: PUT teams:<pool_id>:claimed:<task_id> <my_agent_id>
   - Work on it
   - Report completion: PUT teams:<pool_id>:completed:<task_id> <result>

3. Orchestration monitors Memory:
   - Count completed tasks vs. total
   - Exit when all done
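The claim step is the interesting part: because removal and insertion happen against a single store, two agents can never both win the same task. A toy model of the key scheme above (plain HashMap, not CloudLLM's actual Memory tool, whose API may differ):

```rust
use std::collections::HashMap;

/// Toy key-value pool modelling the teams:<pool>:{unclaimed,claimed,completed}
/// key scheme. Illustrative only; CloudLLM's Memory tool has its own API.
struct ToyMemory(HashMap<String, String>);

impl ToyMemory {
    fn put(&mut self, key: &str, value: &str) {
        self.0.insert(key.into(), value.into());
    }
    fn list(&self, prefix: &str) -> Vec<String> {
        self.0.keys().filter(|k| k.starts_with(prefix)).cloned().collect()
    }
}

/// An agent's claim step: move a task from unclaimed to claimed,
/// first come first served.
fn try_claim(mem: &mut ToyMemory, pool: &str, task_id: &str, agent_id: &str) -> bool {
    let unclaimed = format!("teams:{pool}:unclaimed:{task_id}");
    if mem.0.remove(&unclaimed).is_some() {
        mem.put(&format!("teams:{pool}:claimed:{task_id}"), agent_id);
        true
    } else {
        false // someone else got there first
    }
}

fn main() {
    let mut mem = ToyMemory(HashMap::new());
    mem.put("teams:p1:unclaimed:t1", "Build the HTML shell");
    mem.put("teams:p1:unclaimed:t2", "Write the game loop");

    assert!(try_claim(&mut mem, "p1", "t1", "agent_a"));
    assert!(!try_claim(&mut mem, "p1", "t1", "agent_b")); // t1 already taken

    mem.put("teams:p1:completed:t1", "{\"ok\":true}");
    assert_eq!(mem.list("teams:p1:unclaimed:").len(), 1);
}
```

In the real system, Memory's single-threaded execution provides the same first-writer-wins guarantee across concurrent agents.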

Why It’s Powerful:

  • No central bottleneck: No orchestrator making decisions
  • Truly autonomous: Each agent is responsible for finding and claiming work
  • Atomic operations: Memory’s single-threaded design guarantees no task conflicts
  • Transparent progress: Query Memory to see who’s working on what and what’s done
  • Naturally scales: Add agents, they all independently discover work

System 3: HttpClientProtocol

 

Agents can now make HTTP requests via HttpClientProtocol, a wrapper around the existing HttpClient tool:

let http_client = Arc::new(HttpClient::new());
let http_protocol = Arc::new(HttpClientProtocol::new(http_client));

let mut registry = ToolRegistry::empty();
registry.add_protocol("http", http_protocol).await?;

Exposes 5 tools:

  • http_get — Fetch data from a URL
  • http_post — Send a POST request with an optional JSON body
  • http_put — Update a resource
  • http_delete — Delete a resource
  • http_patch — Partial update

All return consistent JSON responses: { "status": 200, "body": "...", "headers": {...} }

Agent usage:

{
  "tool": "http_get",
  "parameters": {
    "url": "https://api.github.com/repos/anthropics/anthropic-sdk-python"
  }
}

Built with security in mind: domain allowlist/blocklist, timeout controls, size limits.

System 4: Expanded Breakout Examples

 

The breakout games grew from 10 tasks to 18 tasks, now orchestrated by both:

  1. examples/breakout_game_ralph.rs — RALPH mode with mixed-provider agents (2 Claude Haiku 4.5, 2 OpenAI GPT)
  2. examples/breakout_game_agent_teams.rs — AnthropicAgentTeams mode with 4 Claude Haiku 4.5 agents

18 Tasks Across 5 Categories:

Core Mechanics (6)

  • HTML structure and canvas setup
  • Game loop with requestAnimationFrame
  • Paddle control (keyboard + touch/swipe)
  • Ball physics with angle reflection
  • Brick layout generation
  • Collision detection (paddle, bricks, walls, bottom)

Audio System (2)

  • Background music (Atari 2600-style chiptune via Web Audio API)
  • Sound effects (collision, powerup, life lost, level complete)

Powerup System (3)

  • Basic powerups: paddle size, speed boost, projectile launcher
  • Advanced powerups: lava paddle, bomb, growth, mushroom
  • Multiball coordination

Visual Effects (3)

  • Particle systems (fire bursts, paddle jets, score displays)
  • Paddle 3D rotation animation
  • Level complete screen animation

Advanced Mechanics (4)

  • Level progression with 10+ procedural brick patterns
  • Dynamic difficulty scaling
  • Mobile-responsive canvas
  • High score tracking with milestones

Tool Ecosystem:

Both examples now include:

  • Memory — Inter-agent coordination and shared state tracking
  • Bash — Shell command execution for research and file operations (macOS/Linux)
  • HttpClient — API calls for external data/research
  • Custom write_game_file Tool — Persist game HTML to disk

This demonstrates how a real orchestration system uses multiple tools cooperatively.


Architectural Highlights

 

1. Multi-Protocol Tool Registry

 

Agents can access tools from multiple sources simultaneously:

let mut registry = ToolRegistry::empty();
registry.add_protocol("memory", memory_protocol).await?;
registry.add_protocol("bash", bash_protocol).await?;
registry.add_protocol("http", http_protocol).await?;
registry.add_protocol("custom", custom_tools).await?;

let agent = Agent::new("id", "name", client)
    .with_tools(registry);

// Agent automatically discovers ALL tools and can use any of them

2. ThoughtChain: Persistent, Hash-Chained Agent Memory

 

Agents can record their thinking across sessions with tamper-evident integrity:

agent.commit(ThoughtType::Finding, "Latency increased 3x after deploy").await?;
agent.commit(ThoughtType::Decision, "Roll back to v2.3").await?;

// Later, resume from latest thought with full context graph
let resumed = Agent::resume_from_latest("id", "name", client, 128_000, chain)?;

3. Context Strategies: Automatic Conversation Pruning

 

As context windows fill, agents automatically invoke strategies:

  • TrimStrategy (default) — Let LLMSession handle trimming
  • SelfCompressionStrategy — LLM writes a structured save file
  • NoveltyAwareStrategy — Only trigger compression when truly needed
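Conceptually, trimming means dropping the oldest non-system turns once a token budget is exceeded. A toy sketch of that idea (not the real TrimStrategy implementation, and with pre-computed token counts standing in for real tokenization):

```rust
/// Toy message with a pre-computed token estimate; illustrative only.
struct Msg { role: &'static str, tokens: usize }

/// Drop oldest non-system messages until the history fits the budget.
fn trim_to_budget(history: &mut Vec<Msg>, budget: usize) {
    let total = |h: &Vec<Msg>| h.iter().map(|m| m.tokens).sum::<usize>();
    while total(history) > budget {
        if let Some(pos) = history.iter().position(|m| m.role != "system") {
            history.remove(pos); // oldest trimmable turn goes first
        } else {
            break; // nothing left to trim but the system prompt
        }
    }
}

fn main() {
    let mut history = vec![
        Msg { role: "system", tokens: 50 },
        Msg { role: "user", tokens: 400 },
        Msg { role: "assistant", tokens: 400 },
        Msg { role: "user", tokens: 100 },
    ];
    trim_to_budget(&mut history, 600);
    let total: usize = history.iter().map(|m| m.tokens).sum();
    assert!(total <= 600);
    assert_eq!(history[0].role, "system"); // system prompt always survives
}
```

SelfCompressionStrategy and NoveltyAwareStrategy replace the blunt "remove" step with an LLM-written summary and a trigger condition, respectively.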

4. Fork and Parallel Execution

 

Agents can be forked for parallel tasks:

let agent = Agent::new("id", "name", client);
let forked = agent.fork();  // Fresh session, shared tools and thought chain

// Use in parallel orchestration mode

Performance & Scalability

 

Token Efficiency

 

The per-agent session architecture (v0.9.0) reduced token consumption by ~30% in typical orchestrations by eliminating conversation history duplication.
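A back-of-envelope model shows why (the ~30% figure comes from real runs; this toy only illustrates the asymptotics, under the simplifying assumption of one injected message and one reply per agent per round):

```rust
/// Messages processed across r rounds if the full shared transcript
/// (n new messages per round) is re-broadcast to all n agents.
fn broadcast_msgs(n: usize, r: usize) -> usize {
    // In round k, each of n agents re-reads the k*n messages accumulated so far.
    (1..=r).map(|k| k * n * n).sum()
}

/// Messages processed with per-agent sessions: each agent re-reads only
/// its own history, which grows by ~2 per round (one injected + one reply).
fn per_session_msgs(n: usize, r: usize) -> usize {
    (1..=r).map(|k| 2 * k * n).sum()
}

fn main() {
    let (n, r) = (10, 8);
    let broadcast = broadcast_msgs(n, r);
    let routed = per_session_msgs(n, r);
    assert!(routed < broadcast);
    println!("broadcast: {broadcast} msgs, routed: {routed} msgs");
}
```

Broadcast cost grows quadratically in the agent count; per-session cost grows linearly, which is why the gap widens as orchestrations get larger.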

Example: Breakout Game RALPH Run

 

  • 4 agents (mixed OpenAI + Claude Haiku)
  • 18 tasks across 5 categories
  • 8 iterations (max)
  • Runtime: 5-8 minutes
  • Cost: $1-2 per run (Haiku is cheap!)
  • Tokens: ~15,000-25,000 per run
  • Output: Fully playable game HTML

AnthropicAgentTeams Scaling

 

Decentralized coordination scales better than centralized models:

  • 4 agents, 8-task pool: 10-15 minutes
  • Linear growth with agent count (each agent independently searches for work)
  • No moderator bottleneck

Real-World Use Cases

 

1. AI-Assisted Code Generation

 

Use RALPH mode to build features incrementally:

Task: "Implement user authentication"
→ Agent designs API structure
→ Agent implements backend
→ Agent adds frontend
→ Agent writes tests
→ All tracked and automatically composed

2. Decentralized Research Teams

 

Use AnthropicAgentTeams for independent research agents:

Orchestration sets up task pool:
  - Research question 1 (economics)
  - Research question 2 (policy)
  - Research question 3 (case studies)

Agents autonomously:
  - Discover questions
  - Claim and research them
  - Use HTTP client to fetch data
  - Store findings in Memory
  - Orchestration collects results

3. Real-Time Agent Debugging

 

Use EventHandler to monitor agent behavior:

Handler logs:
  - Which agent called which tool
  - Tool latency and token usage
  - Task claiming/completion in AnthropicAgentTeams
  - Error rates and failures

Build dashboards, alerting, auto-remediation
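The aggregation such a handler performs can be sketched with plain data (toy event tuples standing in for ToolExecutionCompleted; field names and structure are illustrative):

```rust
use std::collections::HashMap;

/// Aggregate (agent, tool, success) tuples into per-agent (total, failures).
/// Toy stand-in for folding a stream of tool-execution events.
fn tool_call_stats(events: &[(&str, &str, bool)]) -> HashMap<String, (usize, usize)> {
    let mut stats: HashMap<String, (usize, usize)> = HashMap::new();
    for (agent, _tool, success) in events {
        let entry = stats.entry(agent.to_string()).or_insert((0, 0));
        entry.0 += 1;                      // total calls
        if !success { entry.1 += 1; }      // failures
    }
    stats
}

fn main() {
    let events = [
        ("architect", "memory_put", true),
        ("programmer", "write_game_file", true),
        ("programmer", "http_get", false),
    ];
    let stats = tool_call_stats(&events);
    assert_eq!(stats["programmer"], (2, 1));
    assert_eq!(stats["architect"], (1, 0));
}
```

A real handler would do the same fold inside on_agent_event, feeding the resulting counters into a dashboard or alert rule.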

4. Image-Generating Creative Workflows

 

Agents can now:

let image = agent.generate("A sunset over the ocean", "openai").await?;
agent.save_image("sunset.png", image)?;

// Agents can also research (HTTP), coordinate (Memory), and record thoughts (ThoughtChain)

Migration Guide

 

From v0.7.x to v0.8.x

 

RALPH Mode is opt-in. Existing orchestrations continue to work unchanged.

Image Generation is additive:

// Old code
let agent = Agent::new("id", "name", client).with_tools(registry);

// New code (if you want image generation)
register_image_generation_tool(&protocol, image_client).await?;

From v0.8.x to v0.9.x

 

Per-agent sessions are transparent. No changes needed for most code. If you have custom orchestration modes, update to use agent.send() instead of agent.generate().

From v0.9.x to v0.10.x

 

Event System is opt-in:

let handler = Arc::new(MyHandler);
let orchestration = Orchestration::new("id", "name")
    .with_event_handler(handler);  // Optional

AnthropicAgentTeams is a new mode:

.with_mode(OrchestrationMode::AnthropicAgentTeams {
    pool_id: "my_pool".to_string(),
    tasks: vec![...],
    max_iterations: 10,
})

HttpClientProtocol is additive:

registry.add_protocol("http", http_protocol).await?;
// Agents automatically discover it

Looking Ahead

 

v0.11.0 (Planned)

 

  • Streaming aggregation for Parallel mode (see results as they arrive)
  • Distributed orchestration (agents on different machines coordinating via Memory)
  • Cost-aware scheduling (choose agents by price/capability tradeoffs)
  • Tool use optimization (agents learn which tools are most useful)

The Vision

 

CloudLLM is building toward a future where LLM agents are as natural to orchestrate as threads or processes. The same patterns that work for 2 agents should work for 20. The same debugging tools should work for simple sessions and complex orchestrations.

By providing structured collaboration patterns, atomic coordination primitives, and real-time observability, we’re enabling teams to build AI systems that are:

  • Efficient — Only passing relevant context around
  • Transparent — Every decision is observable and auditable
  • Scalable — From 1 agent to many, the patterns hold
  • Maintainable — Clear roles, clear task definitions, clear outputs

Thank You

 

To everyone who’s used CloudLLM to build agents, orchestrated multi-agent systems, and given feedback—thank you. This release represents months of architectural refinement inspired by real-world usage.

Special thanks to the Anthropic research team for the AnthropicAgentTeams inspiration (and for building Claude).


Getting Started

 

Installation

 

[dependencies]
cloudllm = "0.10.0"

Quick Example: RALPH Mode

 

use cloudllm::orchestration::{Orchestration, OrchestrationMode, RalphTask};
use cloudllm::Agent;
use cloudllm::clients::claude::ClaudeClient;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    let client = Arc::new(ClaudeClient::new_with_model_enum(
        &std::env::var("ANTHROPIC_KEY")?,
        cloudllm::clients::claude::Model::ClaudeHaiku45,
    ));

    let tasks = vec![
        RalphTask::new("task1", "Task 1", "Do the first thing"),
        RalphTask::new("task2", "Task 2", "Do the second thing"),
        RalphTask::new("task3", "Task 3", "Do the third thing"),
    ];

    let mut orchestration = Orchestration::new("example", "My Orchestration")
        .with_mode(OrchestrationMode::Ralph { tasks, max_iterations: 5 })
        .with_system_context("Complete all tasks systematically.");

    let agent = Agent::new("agent1", "Worker", client)
        .with_expertise("Getting things done");

    orchestration.add_agent(agent)?;

    let result = orchestration.run("Complete all tasks", 1).await?;

    println!("Done! {} rounds, {}% complete", result.round, result.convergence_score.unwrap_or(0.0) * 100.0);
    Ok(())
}

Quick Example: AnthropicAgentTeams Mode

 

use cloudllm::orchestration::{Orchestration, OrchestrationMode, WorkItem};
use cloudllm::tools::Memory;
use cloudllm::tool_protocols::MemoryProtocol;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    let memory = Arc::new(Memory::new());
    let protocol = Arc::new(MemoryProtocol::new(memory));

    let tasks = vec![
        WorkItem {
            id: "task1".to_string(),
            description: "Research topic A".to_string(),
            acceptance_criteria: "2-page report".to_string(),
        },
        WorkItem {
            id: "task2".to_string(),
            description: "Analyze topic B".to_string(),
            acceptance_criteria: "5-point analysis".to_string(),
        },
    ];

    let mut orchestration = Orchestration::new("research", "Team Research")
        .with_mode(OrchestrationMode::AnthropicAgentTeams {
            pool_id: "research_2024".to_string(),
            tasks,
            max_iterations: 10,
        });

    // Add agents — they'll autonomously discover and claim work
    // orchestration.add_agent(agent1)?;
    // orchestration.add_agent(agent2)?;

    let result = orchestration.run("Research these topics", 1).await?;
    println!("All tasks complete: {}", result.is_complete);
    Ok(())
}

Full Examples

 

  • examples/breakout_game_ralph.rs — Build a game with RALPH mode
  • examples/breakout_game_agent_teams.rs — Build the same game with decentralized coordination
  • examples/anthropic_teams.rs — Research team with 4 agents and 8-task pool
  • examples/orchestration_demo.rs — All 7 orchestration modes in one file

Run with:

cargo run --example orchestration_demo
cargo run --example breakout_game_ralph
ANTHROPIC_API_KEY=your-key cargo run --example breakout_game_agent_teams

Documentation

 

Run cargo doc --open for complete API documentation.


Community & Support

 


Welcome to the era of intelligent, orchestrated, observable multi-agent systems.

Happy orchestration! 🚀