CloudLLM v0.10: from a simple LLM wrapper to a Multi-Agent Orchestration framework

February 11th, 2026


 

CloudLLM has evolved dramatically over three consecutive releases (v0.8.0 through v0.10.0) into a comprehensive, production-ready platform for building autonomous multi-agent systems.

What began as a simple request-response LLM wrapper library has grown into a sophisticated orchestration engine with seven distinct collaboration modes, real-time event observability, atomic task coordination primitives, and a rich tool ecosystem.

This major release cycle introduces AnthropicAgentTeams mode (inspired by Anthropic’s own approach to collaborative AI), a unified event system for real-time agent introspection, and HttpClientProtocol—enabling agents to research and coordinate via the web. Combined with the previous release’s RALPH mode and image generation support, CloudLLM now empowers teams to build AI systems as sophisticated as their imagination.


Part 1: RALPH Mode & The Game-Builder Architecture (v0.8.0)

 

The Problem We Solved

 

Building complex software via LLM agents requires structured work tracking—the ability to break down a large goal into discrete tasks, have agents iterate through them sequentially or in parallel, and track completion automatically. Traditional approaches either relied on manual task lists or unstructured agent banter.

RALPH: PRD-Driven Autonomous Orchestration

 

v0.8.0 introduced RALPH mode—a 6th collaboration pattern, named after Ralph Wiggum for its wonderfully earnest determination to get through a checklist.

RALPH works like this:

  1. Define a set of tasks as a PRD (Product Requirements Document) with clear completion criteria
  2. Each iteration, agents see the current task list and can work on any task
  3. Agents signal completion by including [TASK_COMPLETE:task_id] in their response
  4. The system tracks progress and terminates when all tasks are complete or max iterations reached
  5. Agents automatically see which tasks are already done, preventing duplicate work

Configuring a RALPH run looks like this:

let tasks = vec![
    RalphTask::new("html",  "HTML Structure", "Create the HTML boilerplate and canvas"),
    RalphTask::new("loop",  "Game Loop",      "Implement requestAnimationFrame game loop"),
    RalphTask::new("input", "Controls",       "Add keyboard input for the paddle"),
];

let mut orch = Orchestration::new("game-builder", "Game Builder")
    .with_mode(OrchestrationMode::Ralph {
        tasks,
        max_iterations: 5,
    });
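The completion signal in step 3 is plain string scanning. As a self-contained sketch (not CloudLLM's actual parser, which may differ), extracting `[TASK_COMPLETE:task_id]` markers from an agent's response could look like this:

```rust
use std::collections::HashSet;

/// Extract task ids from `[TASK_COMPLETE:<id>]` markers in an agent response.
/// Illustrative only; CloudLLM's internal implementation may differ.
fn completed_task_ids(response: &str) -> HashSet<String> {
    let marker = "[TASK_COMPLETE:";
    let mut done = HashSet::new();
    let mut rest = response;
    while let Some(start) = rest.find(marker) {
        let after = &rest[start + marker.len()..];
        match after.find(']') {
            Some(end) => {
                done.insert(after[..end].to_string());
                rest = &after[end + 1..];
            }
            None => break, // unterminated marker: ignore the tail
        }
    }
    done
}

fn main() {
    let reply = "Canvas is set up. [TASK_COMPLETE:html] Wired the loop too. [TASK_COMPLETE:loop]";
    let done = completed_task_ids(reply);
    assert!(done.contains("html") && done.contains("loop"));
}
```

The orchestration then diffs these ids against the PRD's task list to drive steps 4 and 5.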

The Breakout Game: A Complete Real-World Example

 

To showcase RALPH’s power, we built a complete Atari Breakout game using four specialized LLM agents:

  • Game Architect: Designs the overall structure and flow
  • Game Programmer: Implements core mechanics (physics, collision, game loop)
  • Sound Designer: Creates 8-bit chiptune music and sound effects
  • Powerup Engineer: Implements 8 distinct powerup types

The orchestration proceeded through 10 PRD tasks:

  • Core mechanics (HTML structure, game loop, paddle/ball physics, collision detection)
  • Audio system (background music with Web Audio API, collision sounds, powerup chimes)
  • Powerup implementation (speed boost, paddle expansion, projectile launcher, multiball, lava paddle, bomb, growth, mushroom)

Result: A fully playable HTML5 game with:

  • 5 game states (MENU, PLAYING, PAUSED, GAME_OVER, LEVEL_COMPLETE)
  • Multi-hit bricks with color-coded HP
  • Particle effects (fire bursts, paddle jets, score popups)
  • Procedural brick generation
  • Mobile touch controls

All generated by LLMs iterating through a structured task list. The RALPH pattern proved remarkably effective for this kind of incremental, feature-driven development.

Supporting Change: A New Naming Paradigm

 

We also renamed the entire Council abstraction to Orchestration (Council → Orchestration, CouncilMode → OrchestrationMode, etc.) for clarity and consistency with industry terminology.

Image Generation: Multi-Provider Support

 

v0.8.0 added unified image generation across OpenAI (DALL-E), Grok, and Google Gemini via a single ImageGenerationClient trait. This allows agents to generate images and save them locally, opening new classes of creative workflows.

let image_client = new_image_generation_client(
    ImageGenerationProvider::OpenAI,
    &api_key,
)?;
register_image_generation_tool(&protocol, image_client).await?;

A one-line helper function eliminates ~80 lines of boilerplate, with support for aspect-ratio customization and response-format selection (URL or Base64).


Part 2: Per-Agent Sessions & Hub Routing (v0.9.0)

 

The Architectural Evolution

 

v0.8.0 worked, but it broadcast the entire conversation history to every agent every round. This was inefficient and semantically wrong—in a round-robin discussion, Agent C shouldn’t see Agent B’s entire scratchpad, just the conclusion.

v0.9.0 redesigned the orchestration architecture with a message router (called a “hub”) that intelligently routes only relevant messages to each agent.

How It Works

 

Instead of:

Agent A sees: [system, user_prompt, A's_response, B's_response, C's_response]
Agent B sees: [system, user_prompt, A's_response, B's_response, C's_response]  <- Duplicate
Agent C sees: [system, user_prompt, A's_response, B's_response, C's_response]  <- Duplicate

Now:

Agent A:
  - Sees its own prior messages (session history)
  - In round-robin: gets injected message from previous agent, generates response

Agent B:
  - Sees its own session history
  - Gets injected message from Agent A's response
  - Generates next response

Agent C:
  - Sees its own session history
  - Gets injected message from Agent B's response
  - Generates response

API Changes

 

Each agent now has:

  • Agent::send(prompt) — Uses agent’s own LLMSession for stateful generation
  • Agent::receive_message(role, content) — Injects external messages into the session
  • Agent::set_system_prompt(prompt) — Orchestration updates system context per-round
  • Agent::session_history_len() — Query agent state for decision-making
  • Agent::fork_with_context() — Copy agent with session state for parallel execution
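The routing model behind these methods is easy to see with a toy stand-in (plain structs, not CloudLLM's real Agent type; the method names mirror the API above but the bodies are illustrative):

```rust
/// Minimal stand-in for a per-agent session; illustrative only.
struct ToyAgent {
    name: String,
    session: Vec<String>, // this agent's private history
}

impl ToyAgent {
    fn new(name: &str) -> Self {
        Self { name: name.to_string(), session: Vec::new() }
    }
    /// Mirrors Agent::receive_message: inject one external message.
    fn receive_message(&mut self, content: &str) {
        self.session.push(format!("injected: {content}"));
    }
    /// Mirrors Agent::send: record the prompt, produce a reply from own state.
    fn send(&mut self, prompt: &str) -> String {
        self.session.push(format!("user: {prompt}"));
        let reply = format!("{} ack ({} msgs in session)", self.name, self.session.len());
        self.session.push(format!("assistant: {reply}"));
        reply
    }
}

fn main() {
    let mut agents = vec![ToyAgent::new("A"), ToyAgent::new("B"), ToyAgent::new("C")];
    let mut last = "kick off".to_string();
    // One round-robin round: each agent sees only its own session
    // plus the single message injected from the previous agent.
    for agent in agents.iter_mut() {
        agent.receive_message(&last);
        last = agent.send("continue the discussion");
    }
    // No agent carries anyone else's full scratchpad.
    for agent in &agents {
        assert_eq!(agent.session.len(), 3); // injected + user + own reply
    }
}
```

Contrast with the broadcast model, where every agent's context would have grown by three messages per round regardless of relevance.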

Why It Matters

 

  1. Token efficiency: Agents don’t duplicate each other’s full conversation history
  2. Semantic correctness: Each agent has its own conversation context, not a shared one
  3. Scalability: With 10 agents, this prevents the conversation from exploding in size
  4. Flexibility: Each mode can route messages according to its collaboration pattern

The refactor was complex (added agent_message_cursors tracking to prevent duplication) but entirely transparent to users. All existing code continues to work.


Part 3: Real-Time Observability, Decentralized Coordination & The Complete Tool Ecosystem (v0.10.0)

 

Four Major Systems Arrive Together

 

v0.10.0 is the release where CloudLLM transitions from “good orchestration system” to “production-ready multi-agent platform”. It introduces:

  1. Event System — Real-time observability into agent and orchestration behavior
  2. AnthropicAgentTeams Mode — A 7th collaboration pattern with no central orchestrator
  3. HttpClientProtocol — Agents can now make HTTP requests to external services
  4. Expanded Breakout Examples — 18 tasks (vs. 10), comprehensive tool ecosystem

System 1: The Event System

 

Orchestrations are black boxes—until now. We added an EventHandler trait that lets you observe:

Agent-level events:

  • SendStarted / SendCompleted — Agent is thinking
  • LLMCallStarted / LLMCallCompleted — LLM round-trip timing
  • ToolCallDetected / ToolExecutionCompleted — Tool invocation and results
  • ThoughtCommitted — Thoughts written to ThoughtChain
  • ProtocolAdded / ProtocolRemoved — Tool availability changes

Orchestration-level events:

  • RunStarted / RunCompleted — Orchestration lifecycle
  • RoundStarted / RoundCompleted — Round boundaries
  • AgentSelected / AgentResponded / AgentFailed — Per-agent status
  • RalphIterationStarted / RalphTaskCompleted — RALPH progress
  • TaskClaimed / TaskCompleted / TaskFailed — AnthropicAgentTeams status

Usage is simple:

use std::sync::Arc;
use std::time::Instant;

struct ProgressHandler { start: Instant }

impl ProgressHandler {
    fn new() -> Self { Self { start: Instant::now() } }
    fn stamp(&self) -> String {
        let s = self.start.elapsed().as_secs();
        format!("{:02}:{:02}", s / 60, s % 60)
    }
}

#[async_trait]
impl EventHandler for ProgressHandler {
    async fn on_agent_event(&self, event: &AgentEvent) {
        match event {
            AgentEvent::SendStarted { agent_name, .. } => {
                println!("[{}] {} thinking...", self.stamp(), agent_name);
            }
            AgentEvent::ToolExecutionCompleted { agent_name, tool_name, success, .. } => {
                println!("[{}] {} called '{}' — {}",
                    self.stamp(), agent_name, tool_name,
                    if *success { "✓" } else { "✗" });
            }
            _ => {}
        }
    }

    async fn on_orchestration_event(&self, event: &OrchestrationEvent) {
        match event {
            OrchestrationEvent::RalphTaskCompleted { tasks_completed_total, tasks_total, .. } => {
                println!("✓ Tasks done: {}/{}", tasks_completed_total, tasks_total);
            }
            _ => {}
        }
    }
}

let handler = Arc::new(ProgressHandler::new());
let orchestration = Orchestration::new("id", "Name")
    .with_event_handler(handler);

Register on an orchestration, and the handler auto-propagates to all added agents. You get a unified stream of both agent and orchestration events, enabling:

  • Real-time progress dashboards
  • Cost tracking (tokens per agent, per task)
  • Debugging (why did agent X call tool Y?)
  • Automated alerting (fail-fast on errors)

System 2: AnthropicAgentTeams — Decentralized Task Coordination

 

Inspiration: Anthropic’s research team built a C compiler using collaborative AI agents with no central orchestrator. Each agent independently searched for work, claimed it, and reported completion. This inspired the AnthropicAgentTeams mode.

The Pattern:

Instead of a manager/moderator assigning work, agents autonomously discover and claim tasks from a shared Memory pool:

1. Orchestration initializes task pool in Memory:
   - teams:<pool_id>:unclaimed:<task_id> → <description>
   - teams:<pool_id>:claimed:<task_id> → "<agent_id>:<timestamp>"
   - teams:<pool_id>:completed:<task_id> → "<result_json>"

2. Each agent, each iteration:
   - Query Memory: LIST teams:<pool_id>:unclaimed:*
   - See available tasks
   - Claim a task: PUT teams:<pool_id>:claimed:<task_id> <my_agent_id>
   - Work on it
   - Report completion: PUT teams:<pool_id>:completed:<task_id> <result>

3. Orchestration monitors Memory:
   - Count completed tasks vs. total
   - Exit when all done
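The claim step is the interesting part: because removal and insertion happen against a single store, two agents can never both win the same task. A toy model of the key scheme above (plain HashMap, not CloudLLM's actual Memory tool, whose API may differ):

```rust
use std::collections::HashMap;

/// Toy key-value pool modelling the teams:<pool>:{unclaimed,claimed,completed}
/// key scheme. Illustrative only; CloudLLM's Memory tool has its own API.
struct ToyMemory(HashMap<String, String>);

impl ToyMemory {
    fn put(&mut self, key: &str, value: &str) {
        self.0.insert(key.into(), value.into());
    }
    fn list(&self, prefix: &str) -> Vec<String> {
        self.0.keys().filter(|k| k.starts_with(prefix)).cloned().collect()
    }
}

/// An agent's claim step: move a task from unclaimed to claimed,
/// first come first served.
fn try_claim(mem: &mut ToyMemory, pool: &str, task_id: &str, agent_id: &str) -> bool {
    let unclaimed = format!("teams:{pool}:unclaimed:{task_id}");
    if mem.0.remove(&unclaimed).is_some() {
        mem.put(&format!("teams:{pool}:claimed:{task_id}"), agent_id);
        true
    } else {
        false // someone else got there first
    }
}

fn main() {
    let mut mem = ToyMemory(HashMap::new());
    mem.put("teams:p1:unclaimed:t1", "Build the HTML shell");
    mem.put("teams:p1:unclaimed:t2", "Write the game loop");

    assert!(try_claim(&mut mem, "p1", "t1", "agent_a"));
    assert!(!try_claim(&mut mem, "p1", "t1", "agent_b")); // t1 already taken

    mem.put("teams:p1:completed:t1", "{\"ok\":true}");
    assert_eq!(mem.list("teams:p1:unclaimed:").len(), 1);
}
```

In the real system, Memory's single-threaded execution provides the same first-writer-wins guarantee across concurrent agents.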

Why It’s Powerful:

  • No central bottleneck: No orchestrator making decisions
  • Truly autonomous: Each agent is responsible for finding and claiming work
  • Atomic operations: Memory’s single-threaded design guarantees no task conflicts
  • Transparent progress: Query Memory to see who’s working on what and what’s done
  • Naturally scales: Add agents, they all independently discover work

System 3: HttpClientProtocol

 

Agents can now make HTTP requests via HttpClientProtocol, a wrapper around the existing HttpClient tool:

let http_client = Arc::new(HttpClient::new());
let http_protocol = Arc::new(HttpClientProtocol::new(http_client));

let mut registry = ToolRegistry::empty();
registry.add_protocol("http", http_protocol).await?;

Exposes 5 tools:

  • http_get — Fetch data from a URL
  • http_post — Send a POST request with an optional JSON body
  • http_put — Update a resource
  • http_delete — Delete a resource
  • http_patch — Partial update

All return consistent JSON responses: { "status": 200, "body": "...", "headers": {...} }

Agent usage:

{
  "tool": "http_get",
  "parameters": {
    "url": "https://api.github.com/repos/anthropics/anthropic-sdk-python"
  }
}

Built with security in mind: domain allowlist/blocklist, timeout controls, size limits.

System 4: Expanded Breakout Examples

 

The breakout games grew from 10 tasks to 18 tasks, now orchestrated by both:

  1. examples/breakout_game_ralph.rs — RALPH mode with mixed-provider agents (2 Claude Haiku 4.5, 2 OpenAI GPT)
  2. examples/breakout_game_agent_teams.rs — AnthropicAgentTeams mode with 4 Claude Haiku 4.5 agents

18 Tasks Across 5 Categories:

Core Mechanics (6)

  • HTML structure and canvas setup
  • Game loop with requestAnimationFrame
  • Paddle control (keyboard + touch/swipe)
  • Ball physics with angle reflection
  • Brick layout generation
  • Collision detection (paddle, bricks, walls, bottom)

Audio System (2)

  • Background music (Atari 2600-style chiptune via Web Audio API)
  • Sound effects (collision, powerup, life lost, level complete)

Powerup System (3)

  • Basic powerups: paddle size, speed boost, projectile launcher
  • Advanced powerups: lava paddle, bomb, growth, mushroom
  • Multiball coordination

Visual Effects (3)

  • Particle systems (fire bursts, paddle jets, score displays)
  • Paddle 3D rotation animation
  • Level complete screen animation

Advanced Mechanics (4)

  • Level progression with 10+ procedural brick patterns
  • Dynamic difficulty scaling
  • Mobile-responsive canvas
  • High score tracking with milestones

Tool Ecosystem:

Both examples now include:

  • Memory — Inter-agent coordination and shared state tracking
  • Bash — Shell command execution for research and file operations (macOS/Linux)
  • HttpClient — API calls for external data/research
  • Custom write_game_file Tool — Persist game HTML to disk

This demonstrates how a real orchestration system uses multiple tools cooperatively.


Architectural Highlights

 

1. Multi-Protocol Tool Registry

 

Agents can access tools from multiple sources simultaneously:

let mut registry = ToolRegistry::empty();
registry.add_protocol("memory", memory_protocol).await?;
registry.add_protocol("bash", bash_protocol).await?;
registry.add_protocol("http", http_protocol).await?;
registry.add_protocol("custom", custom_tools).await?;

let agent = Agent::new("id", "name", client)
    .with_tools(registry);

// Agent automatically discovers ALL tools and can use any of them

2. ThoughtChain: Persistent, Hash-Chained Agent Memory

 

Agents can record their thinking across sessions with tamper-evident integrity:

agent.commit(ThoughtType::Finding, "Latency increased 3x after deploy").await?;
agent.commit(ThoughtType::Decision, "Roll back to v2.3").await?;

// Later, resume from latest thought with full context graph
let resumed = Agent::resume_from_latest("id", "name", client, 128_000, chain)?;

3. Context Strategies: Automatic Conversation Pruning

 

As context windows fill, agents automatically invoke strategies:

  • TrimStrategy (default) — Let LLMSession handle trimming
  • SelfCompressionStrategy — LLM writes a structured save file
  • NoveltyAwareStrategy — Only trigger compression when truly needed
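Conceptually, trimming means dropping the oldest non-system turns once a token budget is exceeded. A toy sketch of that idea (not the real TrimStrategy implementation, and with pre-computed token counts standing in for real tokenization):

```rust
/// Toy message with a pre-computed token estimate; illustrative only.
struct Msg { role: &'static str, tokens: usize }

/// Drop oldest non-system messages until the history fits the budget.
fn trim_to_budget(history: &mut Vec<Msg>, budget: usize) {
    let total = |h: &Vec<Msg>| h.iter().map(|m| m.tokens).sum::<usize>();
    while total(history) > budget {
        if let Some(pos) = history.iter().position(|m| m.role != "system") {
            history.remove(pos); // oldest trimmable turn goes first
        } else {
            break; // nothing left to trim but the system prompt
        }
    }
}

fn main() {
    let mut history = vec![
        Msg { role: "system", tokens: 50 },
        Msg { role: "user", tokens: 400 },
        Msg { role: "assistant", tokens: 400 },
        Msg { role: "user", tokens: 100 },
    ];
    trim_to_budget(&mut history, 600);
    let total: usize = history.iter().map(|m| m.tokens).sum();
    assert!(total <= 600);
    assert_eq!(history[0].role, "system"); // system prompt always survives
}
```

SelfCompressionStrategy and NoveltyAwareStrategy replace the blunt "remove" step with an LLM-written summary and a trigger condition, respectively.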

4. Fork and Parallel Execution

 

Agents can be forked for parallel tasks:

let agent = Agent::new("id", "name", client);
let forked = agent.fork();  // Fresh session, shared tools and thought chain

// Use in parallel orchestration mode

Performance & Scalability

 

Token Efficiency

 

The per-agent session architecture (v0.9.0) reduced token consumption by ~30% in typical orchestrations by eliminating conversation history duplication.
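A back-of-envelope model shows why (the ~30% figure comes from real runs; this toy only illustrates the asymptotics, under the simplifying assumption of one injected message and one reply per agent per round):

```rust
/// Messages processed across r rounds if the full shared transcript
/// (n new messages per round) is re-broadcast to all n agents.
fn broadcast_msgs(n: usize, r: usize) -> usize {
    // In round k, each of n agents re-reads the k*n messages accumulated so far.
    (1..=r).map(|k| k * n * n).sum()
}

/// Messages processed with per-agent sessions: each agent re-reads only
/// its own history, which grows by ~2 per round (one injected + one reply).
fn per_session_msgs(n: usize, r: usize) -> usize {
    (1..=r).map(|k| 2 * k * n).sum()
}

fn main() {
    let (n, r) = (10, 8);
    let broadcast = broadcast_msgs(n, r);
    let routed = per_session_msgs(n, r);
    assert!(routed < broadcast);
    println!("broadcast: {broadcast} msgs, routed: {routed} msgs");
}
```

Broadcast cost grows quadratically in the agent count; per-session cost grows linearly, which is why the gap widens as orchestrations get larger.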

Example: Breakout Game RALPH Run

 

  • 4 agents (mixed OpenAI + Claude Haiku)
  • 18 tasks across 5 categories
  • 8 iterations (max)
  • Runtime: 5-8 minutes
  • Cost: $1-2 per run (Haiku is cheap!)
  • Tokens: ~15,000-25,000 per run
  • Output: Fully playable game HTML

AnthropicAgentTeams Scaling

 

Decentralized coordination scales better than centralized models:

  • 4 agents, 8-task pool: 10-15 minutes
  • Linear growth with agent count (each agent independently searches for work)
  • No moderator bottleneck

Real-World Use Cases

 

1. AI-Assisted Code Generation

 

Use RALPH mode to build features incrementally:

Task: "Implement user authentication"
→ Agent designs API structure
→ Agent implements backend
→ Agent adds frontend
→ Agent writes tests
→ All tracked and automatically composed

2. Decentralized Research Teams

 

Use AnthropicAgentTeams for independent research agents:

Orchestration sets up task pool:
  - Research question 1 (economics)
  - Research question 2 (policy)
  - Research question 3 (case studies)

Agents autonomously:
  - Discover questions
  - Claim and research them
  - Use HTTP client to fetch data
  - Store findings in Memory
  - Orchestration collects results

3. Real-Time Agent Debugging

 

Use EventHandler to monitor agent behavior:

Handler logs:
  - Which agent called which tool
  - Tool latency and token usage
  - Task claiming/completion in AnthropicAgentTeams
  - Error rates and failures

Build dashboards, alerting, auto-remediation
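The aggregation such a handler performs can be sketched with plain data (toy event tuples standing in for ToolExecutionCompleted; field names and structure are illustrative):

```rust
use std::collections::HashMap;

/// Aggregate (agent, tool, success) tuples into per-agent (total, failures).
/// Toy stand-in for folding a stream of tool-execution events.
fn tool_call_stats(events: &[(&str, &str, bool)]) -> HashMap<String, (usize, usize)> {
    let mut stats: HashMap<String, (usize, usize)> = HashMap::new();
    for (agent, _tool, success) in events {
        let entry = stats.entry(agent.to_string()).or_insert((0, 0));
        entry.0 += 1;                      // total calls
        if !success { entry.1 += 1; }      // failures
    }
    stats
}

fn main() {
    let events = [
        ("architect", "memory_put", true),
        ("programmer", "write_game_file", true),
        ("programmer", "http_get", false),
    ];
    let stats = tool_call_stats(&events);
    assert_eq!(stats["programmer"], (2, 1));
    assert_eq!(stats["architect"], (1, 0));
}
```

A real handler would do the same fold inside on_agent_event, feeding the resulting counters into a dashboard or alert rule.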

4. Image-Generating Creative Workflows

 

Agents can now:

let image = agent.generate("A sunset over the ocean", "openai").await?;
agent.save_image("sunset.png", image)?;

// Agents can also research (HTTP), coordinate (Memory), and record thoughts (ThoughtChain)

Migration Guide

 

From v0.7.x to v0.8.x

 

RALPH Mode is opt-in. Existing orchestrations continue to work unchanged.

Image Generation is additive:

// Old code
let agent = Agent::new("id", "name", client).with_tools(registry);

// New code (if you want image generation)
register_image_generation_tool(&protocol, image_client).await?;

From v0.8.x to v0.9.x

 

Per-agent sessions are transparent. No changes needed for most code. If you have custom orchestration modes, update to use agent.send() instead of agent.generate().

From v0.9.x to v0.10.x

 

Event System is opt-in:

let handler = Arc::new(MyHandler);
let orchestration = Orchestration::new("id", "name")
    .with_event_handler(handler);  // Optional

AnthropicAgentTeams is a new mode:

.with_mode(OrchestrationMode::AnthropicAgentTeams {
    pool_id: "my_pool".to_string(),
    tasks: vec![...],
    max_iterations: 10,
})

HttpClientProtocol is additive:

registry.add_protocol("http", http_protocol).await?;
// Agents automatically discover it

Looking Ahead

 

v0.11.0 (Planned)

 

  • Streaming aggregation for Parallel mode (see results as they arrive)
  • Distributed orchestration (agents on different machines coordinating via Memory)
  • Cost-aware scheduling (choose agents by price/capability tradeoffs)
  • Tool use optimization (agents learn which tools are most useful)

The Vision

 

CloudLLM is building toward a future where LLM agents are as natural to orchestrate as threads or processes. The same patterns that work for 2 agents should work for 20. The same debugging tools should work for simple sessions and complex orchestrations.

By providing structured collaboration patterns, atomic coordination primitives, and real-time observability, we’re enabling teams to build AI systems that are:

  • Efficient — Only passing relevant context around
  • Transparent — Every decision is observable and auditable
  • Scalable — From 1 agent to many, the patterns hold
  • Maintainable — Clear roles, clear task definitions, clear outputs

Thank You

 

To everyone who’s used CloudLLM to build agents, orchestrated multi-agent systems, and given feedback—thank you. This release represents months of architectural refinement inspired by real-world usage.

Special thanks to the Anthropic research team for the AnthropicAgentTeams inspiration (and for building Claude).


Getting Started

 

Installation

 

[dependencies]
cloudllm = "0.10.0"

Quick Example: RALPH Mode

 

use cloudllm::orchestration::{Orchestration, OrchestrationMode, RalphTask};
use cloudllm::Agent;
use cloudllm::clients::claude::ClaudeClient;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    let client = Arc::new(ClaudeClient::new_with_model_enum(
        &std::env::var("ANTHROPIC_KEY")?,
        cloudllm::clients::claude::Model::ClaudeHaiku45,
    ));

    let tasks = vec![
        RalphTask::new("task1", "Task 1", "Do the first thing"),
        RalphTask::new("task2", "Task 2", "Do the second thing"),
        RalphTask::new("task3", "Task 3", "Do the third thing"),
    ];

    let mut orchestration = Orchestration::new("example", "My Orchestration")
        .with_mode(OrchestrationMode::Ralph { tasks, max_iterations: 5 })
        .with_system_context("Complete all tasks systematically.");

    let agent = Agent::new("agent1", "Worker", client)
        .with_expertise("Getting things done");

    orchestration.add_agent(agent)?;

    let result = orchestration.run("Complete all tasks", 1).await?;

    println!("Done! {} rounds, {}% complete", result.round, result.convergence_score.unwrap_or(0.0) * 100.0);
    Ok(())
}

Quick Example: AnthropicAgentTeams Mode

 

use cloudllm::orchestration::{Orchestration, OrchestrationMode, WorkItem};
use cloudllm::tools::Memory;
use cloudllm::tool_protocols::MemoryProtocol;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    let memory = Arc::new(Memory::new());
    let protocol = Arc::new(MemoryProtocol::new(memory));

    let tasks = vec![
        WorkItem {
            id: "task1".to_string(),
            description: "Research topic A".to_string(),
            acceptance_criteria: "2-page report".to_string(),
        },
        WorkItem {
            id: "task2".to_string(),
            description: "Analyze topic B".to_string(),
            acceptance_criteria: "5-point analysis".to_string(),
        },
    ];

    let mut orchestration = Orchestration::new("research", "Team Research")
        .with_mode(OrchestrationMode::AnthropicAgentTeams {
            pool_id: "research_2024".to_string(),
            tasks,
            max_iterations: 10,
        });

    // Add agents — they'll autonomously discover and claim work
    // orchestration.add_agent(agent1)?;
    // orchestration.add_agent(agent2)?;

    let result = orchestration.run("Research these topics", 1).await?;
    println!("All tasks complete: {}", result.is_complete);
    Ok(())
}

Full Examples

 

  • examples/breakout_game_ralph.rs — Build a game with RALPH mode
  • examples/breakout_game_agent_teams.rs — Build the same game with decentralized coordination
  • examples/anthropic_teams.rs — Research team with 4 agents and 8-task pool
  • examples/orchestration_demo.rs — All 7 orchestration modes in one file

Run with:

cargo run --example orchestration_demo
cargo run --example breakout_game_ralph
ANTHROPIC_API_KEY=your-key cargo run --example breakout_game_agent_teams

Documentation

 

Run cargo doc --open for complete API documentation.


Community & Support

 


Welcome to the era of intelligent, orchestrated, observable multi-agent systems.

Happy orchestration! 🚀