February 11th, 2026
CloudLLM has evolved dramatically over three consecutive releases (v0.8.0 through v0.10.0) into a comprehensive, production-ready platform for building autonomous multi-agent systems.
What began as a simple request-response wrapper around LLM APIs has grown into a sophisticated orchestration engine with seven distinct collaboration modes, real-time event observability, atomic task coordination primitives, and a rich tool ecosystem.
The latest release, v0.10.0, introduces AnthropicAgentTeams mode (inspired by Anthropic’s own approach to collaborative AI), a unified event system for real-time agent introspection, and HttpClientProtocol—enabling agents to research and coordinate via the web. Combined with RALPH mode and image generation support from the earlier releases in this cycle, CloudLLM now empowers teams to build AI systems as sophisticated as their imagination.
Part 1: RALPH Mode & The Game-Builder Architecture (v0.8.0)
The Problem We Solved
Building complex software via LLM agents requires structured work tracking—the ability to break down a large goal into discrete tasks, have agents iterate through them sequentially or in parallel, and track completion automatically. Traditional approaches either relied on manually maintained task lists or devolved into unstructured agent banter.
RALPH: PRD-Driven Autonomous Orchestration
v0.8.0 introduced RALPH mode—a 6th collaboration pattern, named after Ralph Wiggum for its wonderfully earnest determination to get through a checklist.
RALPH works like this:
- Define a set of tasks as a PRD (Product Requirements Document) with clear completion criteria
- Each iteration, agents see the current task list and can work on any task
- Agents signal completion by including [TASK_COMPLETE:task_id] in their response
- The system tracks progress and terminates when all tasks are complete or max iterations are reached
- Agents automatically see which tasks are already done, preventing duplicate work
// Define the PRD as a list of tasks, each with an ID, a title, and completion criteria
let tasks = vec![
    RalphTask::new("html", "HTML Structure", "Create the HTML boilerplate and canvas"),
    RalphTask::new("loop", "Game Loop", "Implement requestAnimationFrame game loop"),
    RalphTask::new("input", "Controls", "Add keyboard input for the paddle"),
];

// Let agents iterate through the checklist for at most 5 iterations
let mut orch = Orchestration::new("game-builder", "Game Builder")
    .with_mode(OrchestrationMode::Ralph {
        tasks,
        max_iterations: 5,
    });
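A task is considered done as soon as an agent's reply contains the marker. An illustrative reply for the "loop" task might end with:
    Game loop implemented with requestAnimationFrame and delta-time updates. [TASK_COMPLETE:loop]
The orchestration detects the marker, records the completion, and shows the updated checklist to every agent on the next iteration.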
The Breakout Game: A Complete Real-World Example
To showcase RALPH’s power, we built a complete Atari Breakout game using four specialized LLM agents:
- Game Architect: Designs the overall structure and flow
- Game Programmer: Implements core mechanics (physics, collision, game loop)
- Sound Designer: Creates 8-bit chiptune music and sound effects
- Powerup Engineer: Implements 8 distinct powerup types
The orchestration proceeded through 10 PRD tasks:
- Core mechanics (HTML structure, game loop, paddle/ball physics, collision detection)
- Audio system (background music with Web Audio API, collision sounds, powerup chimes)
- Powerup implementation (speed boost, paddle expansion, projectile launcher, multiball, lava paddle, bomb, growth, mushroom)
Result: A fully playable HTML5 game with:
- 5 game states (MENU, PLAYING, PAUSED, GAME_OVER, LEVEL_COMPLETE)
- Multi-hit bricks with color-coded HP
- Particle effects (fire bursts, paddle jets, score popups)
- Procedural brick generation
- Mobile touch controls
All generated by LLMs iterating through a structured task list. The RALPH pattern proved remarkably effective for this kind of incremental, feature-driven development.
Supporting: A New Naming Paradigm
We also renamed the entire Council abstraction to Orchestration (Council → Orchestration, CouncilMode → OrchestrationMode, etc.) for clarity and consistency with industry terminology.
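In practice the rename is mechanical. The pre-0.8.0 line below is illustrative; only the type names changed for code shaped like this:
// Before (pre-0.8.0), illustrative:
// let council = Council::new("game-builder", "Game Builder");

// From v0.8.0 on:
let orchestration = Orchestration::new("game-builder", "Game Builder");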
Image Generation: Multi-Provider Support
v0.8.0 added unified image generation across OpenAI (DALL-E), Grok, and Google Gemini via a single ImageGenerationClient trait. This allows agents to generate images and save them locally, opening new classes of creative workflows.
// Create an image-generation client for the chosen provider
let image_client = new_image_generation_client(
    ImageGenerationProvider::OpenAI,
    &api_key,
)?;

// Register it as a tool so agents can generate and save images
register_image_generation_tool(&protocol, image_client).await?;
A single helper call replaces roughly 80 lines of boilerplate, and the tool supports aspect ratio customization and response format selection (URL or Base64).
Part 2: Per-Agent Sessions & Hub Routing (v0.9.0)
The Architectural Evolution
v0.8.0 worked, but it broadcast the entire conversation history to every agent every round. This was inefficient and semantically wrong—in a round-robin discussion, Agent C shouldn’t see Agent B’s entire scratchpad, just the conclusion.
v0.9.0 redesigned the orchestration architecture with a message router (called a “hub”) that intelligently routes only relevant messages to each agent.
How It Works
Instead of:
Agent A sees: [system, user_prompt, A's_response, B's_response, C's_response]
Agent B sees: [system, user_prompt, A's_response, B's_response, C's_response] <- Duplicate
Agent C sees: [system, user_prompt, A's_response, B's_response, C's_response] <- Duplicate
Now:
Agent A:
- Sees its own prior messages (session history)
- In round-robin: gets injected message from previous agent, generates response
Agent B:
- Sees its own session history
- Gets injected message from Agent A's response
- Generates next response
Agent C:
- Sees its own session history
- Gets injected message from Agent B's response
- Generates response
API Changes
Each agent now has:
- Agent::send(prompt) — Uses agent’s own LLMSession for stateful generation
- Agent::receive_message(role, content) — Injects external messages into the session
- Agent::set_system_prompt(prompt) — Orchestration updates system context per-round
- Agent::session_history_len() — Query agent state for decision-making
- Agent::fork_with_context() — Copy agent with session state for parallel execution
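Together, these calls are all a mode needs to implement its own routing. Here is a rough sketch of one round-robin round; it assumes send returns the agent's reply as a String and receive_message takes a plain role string, so check the API docs for the exact signatures:

// Illustrative hub loop: only the previous agent's conclusion is passed forward.
// The real Orchestration also tracks message cursors and emits events.
async fn round_robin_round(
    agents: &mut [Agent],
    topic: &str,
) -> Result<Vec<String>, Box<dyn std::error::Error + Send + Sync>> {
    let mut previous = topic.to_string();
    let mut replies = Vec::new();
    for agent in agents.iter_mut() {
        // Inject just the relevant message into this agent's private session...
        agent.receive_message("user", &previous);
        // ...and let it generate against its own history.
        let reply = agent.send("Respond to the message above.").await?;
        previous = reply.clone();
        replies.push(reply);
    }
    Ok(replies)
}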
Why It Matters
- Token efficiency: Agents don’t duplicate each other’s full conversation history
- Semantic correctness: Each agent has its own conversation context, not a shared one
- Scalability: With 10 agents, this prevents the conversation from exploding in size
- Flexibility: Each mode can route messages according to its collaboration pattern
The refactor was complex (added agent_message_cursors tracking to prevent duplication) but entirely transparent to users. All existing code continues to work.
Part 3: Real-Time Observability, Decentralized Coordination & The Complete Tool Ecosystem (v0.10.0)
Four Major Systems Arrive Together
v0.10.0 is the release where CloudLLM transitions from “good orchestration system” to “production-ready multi-agent platform”. It introduces:
- Event System — Real-time observability into agent and orchestration behavior
- AnthropicAgentTeams Mode — A 7th collaboration pattern with no central orchestrator
- HttpClientProtocol — Agents can now make HTTP requests to external services
- Expanded Breakout Examples — 18 tasks (vs. 10), comprehensive tool ecosystem
System 1: The Event System
Orchestrations are black boxes—until now. We added an EventHandler trait that lets you observe:
Agent-level events:
- SendStarted / SendCompleted — Agent is thinking
- LLMCallStarted / LLMCallCompleted — LLM round-trip timing
- ToolCallDetected / ToolExecutionCompleted — Tool invocation and results
- ThoughtCommitted — Thoughts written to ThoughtChain
- ProtocolAdded / ProtocolRemoved — Tool availability changes
Orchestration-level events:
- RunStarted / RunCompleted — Orchestration lifecycle
- RoundStarted / RoundCompleted — Round boundaries
- AgentSelected / AgentResponded / AgentFailed — Per-agent status
- RalphIterationStarted / RalphTaskCompleted — RALPH progress
- TaskClaimed / TaskCompleted / TaskFailed — AnthropicAgentTeams status
Usage is simple:
struct ProgressHandler { start: Instant }

impl ProgressHandler {
    fn new() -> Self { Self { start: Instant::now() } }
    fn elapsed(&self) -> std::time::Duration { self.start.elapsed() }
}

#[async_trait]
impl EventHandler for ProgressHandler {
    async fn on_agent_event(&self, event: &AgentEvent) {
        match event {
            AgentEvent::SendStarted { agent_name, .. } => {
                let secs = self.elapsed().as_secs();
                println!("[{:02}:{:02}] {} thinking...", secs / 60, secs % 60, agent_name);
            }
            AgentEvent::ToolExecutionCompleted { agent_name, tool_name, success, .. } => {
                let secs = self.elapsed().as_secs();
                println!("[{:02}:{:02}] {} called '{}' — {}",
                    secs / 60, secs % 60, agent_name, tool_name,
                    if *success { "✓" } else { "✗" });
            }
            _ => {}
        }
    }

    async fn on_orchestration_event(&self, event: &OrchestrationEvent) {
        match event {
            OrchestrationEvent::RalphTaskCompleted { tasks_completed_total, tasks_total, .. } => {
                println!("✓ Tasks done: {}/{}", tasks_completed_total, tasks_total);
            }
            _ => {}
        }
    }
}
let handler = Arc::new(ProgressHandler::new());
let orchestration = Orchestration::new("id", "Name")
.with_event_handler(handler);
Register the handler on an orchestration and it automatically propagates to every agent you add. You get a unified stream of both agent and orchestration events, enabling (a small example follows this list):
- Real-time progress dashboards
- Cost tracking (tokens per agent, per task)
- Debugging (why did agent X call tool Y?)
- Automated alerting (fail-fast on errors)
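For instance, a per-agent tool-call tally needs only the ToolExecutionCompleted event shown earlier; a token-cost tracker would hook the LLM-call events the same way, assuming they expose usage figures:

use std::collections::HashMap;
use std::sync::Mutex;

// Counts tool invocations per agent; wrap in Arc and register via .with_event_handler(...)
struct ToolUsageHandler {
    counts: Mutex<HashMap<String, usize>>,
}

#[async_trait]
impl EventHandler for ToolUsageHandler {
    async fn on_agent_event(&self, event: &AgentEvent) {
        if let AgentEvent::ToolExecutionCompleted { agent_name, .. } = event {
            *self.counts.lock().unwrap().entry(agent_name.to_string()).or_insert(0) += 1;
        }
    }
    async fn on_orchestration_event(&self, _event: &OrchestrationEvent) {}
}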
System 2: AnthropicAgentTeams — Decentralized Task Coordination
Inspiration: Anthropic’s research team built a C compiler using collaborative AI agents with no central orchestrator. Each agent independently searched for work, claimed it, and reported completion. This inspired the AnthropicAgentTeams mode.
The Pattern:
Instead of a manager or moderator assigning work, agents autonomously discover and claim tasks from a shared Memory pool (a minimal sketch of the cycle follows the steps below):
1. Orchestration initializes task pool in Memory:
- teams:<pool_id>:unclaimed:<task_id> → <description>
- teams:<pool_id>:claimed:<task_id> → "<agent_id>:<timestamp>"
- teams:<pool_id>:completed:<task_id> → "<result_json>"
2. Each agent, each iteration:
- Query Memory: LIST teams:<pool_id>:unclaimed:*
- See available tasks
- Claim a task: PUT teams:<pool_id>:claimed:<task_id> <my_agent_id>
- Work on it
- Report completion: PUT teams:<pool_id>:completed:<task_id> <result>
3. Orchestration monitors Memory:
- Count completed tasks vs. total
- Exit when all done
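Here is that cycle against a plain map standing in for Memory. The key layout follows the convention above; in a real run the agent performs these steps through its Memory tool calls rather than direct Rust:

use std::collections::BTreeMap;

// Stand-in for the shared Memory pool, keyed as teams:<pool_id>:<state>:<task_id>
fn claim_and_complete(pool: &mut BTreeMap<String, String>, pool_id: &str, agent_id: &str) {
    // 1. Discover work: LIST teams:<pool_id>:unclaimed:*
    let prefix = format!("teams:{pool_id}:unclaimed:");
    let Some(key) = pool.keys().find(|k| k.starts_with(&prefix)).cloned() else {
        return; // nothing left to claim
    };
    let task_id = key[prefix.len()..].to_string();
    let description = pool.remove(&key).unwrap();

    // 2. Claim it: PUT teams:<pool_id>:claimed:<task_id> -> "<agent_id>:<timestamp>"
    pool.insert(
        format!("teams:{pool_id}:claimed:{task_id}"),
        format!("{agent_id}:1739145600"),
    );

    // 3. Do the work, then report: PUT teams:<pool_id>:completed:<task_id> -> result JSON
    pool.insert(
        format!("teams:{pool_id}:completed:{task_id}"),
        format!("{{\"task\":\"{task_id}\",\"summary\":\"completed: {description}\"}}"),
    );
}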
Why It’s Powerful:
- No central bottleneck: No orchestrator making decisions
- Truly autonomous: Each agent is responsible for finding and claiming work
- Atomic operations: Memory’s single-threaded design guarantees no task conflicts
- Transparent progress: Query Memory to see who’s working on what and what’s done
- Naturally scales: Add agents, they all independently discover work
System 3: HttpClientProtocol
Agents can now make HTTP requests via HttpClientProtocol, a wrapper around the existing HttpClient tool:
let http_client = Arc::new(HttpClient::new());
let http_protocol = Arc::new(HttpClientProtocol::new(http_client));
let mut registry = ToolRegistry::empty();
registry.add_protocol("http", http_protocol).await?;
Exposes 5 tools:
- http_get — Fetch data from a URL
- http_post — Send JSON data with optional body
- http_put — Update a resource
- http_delete — Delete a resource
- http_patch — Partial update
All return consistent JSON responses: { "status": 200, "body": "...", "headers": {...} }
Agent usage:
{
"tool": "http_get",
"parameters": {
"url": "https://api.github.com/repos/anthropics/anthropic-sdk-python"
}
}
Built with security in mind: domain allowlist/blocklist, timeout controls, size limits.
System 4: Expanded Breakout Examples
The breakout games grew from 10 tasks to 18 tasks, now orchestrated by both:
- examples/breakout_game_ralph.rs — RALPH mode with mixed-provider agents (2 Claude Haiku 4.5, 2 OpenAI GPT)
- examples/breakout_game_agent_teams.rs — AnthropicAgentTeams mode with 4 Claude Haiku 4.5 agents
18 Tasks Across 5 Categories:
Core Mechanics (6)
- HTML structure and canvas setup
- Game loop with requestAnimationFrame
- Paddle control (keyboard + touch/swipe)
- Ball physics with angle reflection
- Brick layout generation
- Collision detection (paddle, bricks, walls, bottom)
Audio System (2)
- Background music (Atari 2600-style chiptune via Web Audio API)
- Sound effects (collision, powerup, life lost, level complete)
Powerup System (3)
- Basic powerups: paddle size, speed boost, projectile launcher
- Advanced powerups: lava paddle, bomb, growth, mushroom
- Multiball coordination
Visual Effects (3)
- Particle systems (fire bursts, paddle jets, score displays)
- Paddle 3D rotation animation
- Level complete screen animation
Advanced Mechanics (4)
- Level progression with 10+ procedural brick patterns
- Dynamic difficulty scaling
- Mobile-responsive canvas
- High score tracking with milestones
Tool Ecosystem:
Both examples now include:
- Memory — Inter-agent coordination and shared state tracking
- Bash — Shell command execution for research and file operations (macOS/Linux)
- HttpClient — API calls for external data/research
- Custom write_game_file Tool — Persist game HTML to disk
This demonstrates how a real orchestration system uses multiple tools cooperatively.
Architectural Highlights
1. Multi-Protocol Tool Registry
Agents can access tools from multiple sources simultaneously:
let mut registry = ToolRegistry::empty();
registry.add_protocol("memory", memory_protocol).await?;
registry.add_protocol("bash", bash_protocol).await?;
registry.add_protocol("http", http_protocol).await?;
registry.add_protocol("custom", custom_tools).await?;
let agent = Agent::new("id", "name", client)
.with_tools(registry);
// Agent automatically discovers ALL tools and can use any of them
2. ThoughtChain: Persistent, Hash-Chained Agent Memory
Agents can record their thinking across sessions with tamper-evident integrity:
agent.commit(ThoughtType::Finding, "Latency increased 3x after deploy").await?;
agent.commit(ThoughtType::Decision, "Roll back to v2.3").await?;
// Later, resume from latest thought with full context graph
let resumed = Agent::resume_from_latest("id", "name", client, 128_000, chain)?;
3. Context Strategies: Automatic Conversation Pruning
As context windows fill, agents automatically invoke strategies:
- TrimStrategy (default) — Let LLMSession handle trimming
- SelfCompressionStrategy — LLM writes a structured save file
- NoveltyAwareStrategy — Only trigger compression when truly needed
4. Fork and Parallel Execution
Agents can be forked for parallel tasks:
let agent = Agent::new("id", "name", client);
let forked = agent.fork(); // Fresh session, shared tools and thought chain
// Use in parallel orchestration mode
Performance & Scalability
Token Efficiency
The per-agent session architecture (v0.9.0) reduced token consumption by ~30% in typical orchestrations by eliminating conversation history duplication.
Example: Breakout Game RALPH Run
- 4 agents (mixed OpenAI + Claude Haiku)
- 18 tasks across 5 categories
- 8 iterations (max)
- Runtime: 5-8 minutes
- Cost: $1-2 per run (Haiku is cheap!)
- Tokens: ~15,000-25,000 per run
- Output: Fully playable game HTML
AnthropicAgentTeams Scaling
Decentralized coordination scales better than centralized models:
- 4 agents, 8-task pool: 10-15 minutes
- Scales roughly linearly with agent count (each agent independently searches for work)
- No moderator bottleneck
Real-World Use Cases
1. AI-Assisted Code Generation
Use RALPH mode to build features incrementally:
Task: "Implement user authentication"
→ Agent designs API structure
→ Agent implements backend
→ Agent adds frontend
→ Agent writes tests
→ All tracked and automatically composed
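Using the RalphTask API from Part 1, that flow maps directly onto a PRD (task IDs and descriptions here are illustrative):

let tasks = vec![
    RalphTask::new("api", "API Design", "Design the authentication endpoints and token scheme"),
    RalphTask::new("backend", "Backend", "Implement the auth handlers and credential storage"),
    RalphTask::new("frontend", "Frontend", "Add login/logout UI wired to the new endpoints"),
    RalphTask::new("tests", "Tests", "Write integration tests covering the full auth flow"),
];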
2. Decentralized Research Teams
Use AnthropicAgentTeams for independent research agents:
Orchestration sets up task pool:
- Research question 1 (economics)
- Research question 2 (policy)
- Research question 3 (case studies)
Agents autonomously:
- Discover questions
- Claim and research them
- Use HTTP client to fetch data
- Store findings in Memory
- Orchestration collects results
3. Real-Time Agent Debugging
Use EventHandler to monitor agent behavior:
Handler logs:
- Which agent called which tool
- Tool latency and token usage
- Task claiming/completion in AnthropicAgentTeams
- Error rates and failures
Build dashboards, alerting, auto-remediation
4. Image-Generating Creative Workflows
Agents can now generate and save images directly:
let image = agent.generate("A sunset over the ocean", "openai").await?;
agent.save_image("sunset.png", image)?;
// Agents can also research (HTTP), coordinate (Memory), and record thoughts (ThoughtChain)
Migration Guide
From v0.7.x to v0.8.x
RALPH Mode is opt-in. Existing orchestrations continue to work unchanged.
Image Generation is additive:
// Old code
let agent = Agent::new("id", "name", client).with_tools(registry);
// New code (if you want image generation)
register_image_generation_tool(&protocol, image_client).await?;
From v0.8.x to v0.9.x
Per-agent sessions are transparent. No changes needed for most code. If you have custom orchestration modes, update to use agent.send() instead of agent.generate().
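In a custom mode the change looks roughly like this; the "user" role string and the return type are assumptions, so check the API docs for the exact signatures:

// v0.8.x-style custom mode: assemble the shared history yourself, then
// let reply = agent.generate(&shared_history).await?;

// v0.9.x+: inject only what this agent needs, then drive its own session
agent.receive_message("user", previous_reply);
let reply = agent.send("Continue the discussion based on the message above.").await?;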
From v0.9.x to v0.10.x
Event System is opt-in:
let handler = Arc::new(MyHandler);
let orchestration = Orchestration::new("id", "name")
.with_event_handler(handler); // Optional
AnthropicAgentTeams is a new mode:
.with_mode(OrchestrationMode::AnthropicAgentTeams {
pool_id: "my_pool".to_string(),
tasks: vec![...],
max_iterations: 10,
})
HttpClientProtocol is additive:
registry.add_protocol("http", http_protocol).await?;
// Agents automatically discover it
Looking Ahead
v0.11.0 (Planned)
- Streaming aggregation for Parallel mode (see results as they arrive)
- Distributed orchestration (agents on different machines coordinating via Memory)
- Cost-aware scheduling (choose agents by price/capability tradeoffs)
- Tool use optimization (agents learn which tools are most useful)
The Vision
CloudLLM is building toward a future where LLM agents are as natural to orchestrate as threads or processes. The same patterns that work for 2 agents should work for 20. The same debugging tools should work for simple sessions and complex orchestrations.
By providing structured collaboration patterns, atomic coordination primitives, and real-time observability, we’re enabling teams to build AI systems that are:
- Efficient — Only passing relevant context around
- Transparent — Every decision is observable and auditable
- Scalable — From 1 agent to many, the patterns hold
- Maintainable — Clear roles, clear task definitions, clear outputs
Thank You
To everyone who’s used CloudLLM to build agents, orchestrated multi-agent systems, and given feedback—thank you. This release represents months of architectural refinement inspired by real-world usage.
Special thanks to the Anthropic research team for the AnthropicAgentTeams inspiration (and for building Claude).
Getting Started
Installation
[dependencies]
cloudllm = "0.10.0"
Quick Example: RALPH Mode
use cloudllm::orchestration::{Orchestration, OrchestrationMode, RalphTask};
use cloudllm::Agent;
use cloudllm::clients::claude::ClaudeClient;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    let client = Arc::new(ClaudeClient::new_with_model_enum(
        &std::env::var("ANTHROPIC_KEY")?,
        cloudllm::clients::claude::Model::ClaudeHaiku45,
    ));

    let tasks = vec![
        RalphTask::new("task1", "Task 1", "Do the first thing"),
        RalphTask::new("task2", "Task 2", "Do the second thing"),
        RalphTask::new("task3", "Task 3", "Do the third thing"),
    ];

    let mut orchestration = Orchestration::new("example", "My Orchestration")
        .with_mode(OrchestrationMode::Ralph { tasks, max_iterations: 5 })
        .with_system_context("Complete all tasks systematically.");

    let agent = Agent::new("agent1", "Worker", client)
        .with_expertise("Getting things done");
    orchestration.add_agent(agent)?;

    let result = orchestration.run("Complete all tasks", 1).await?;
    println!(
        "Done! {} rounds, {}% complete",
        result.round,
        result.convergence_score.unwrap_or(0.0) * 100.0
    );
    Ok(())
}
Quick Example: AnthropicAgentTeams Mode
use cloudllm::orchestration::{Orchestration, OrchestrationMode, WorkItem};
use cloudllm::tools::Memory;
use cloudllm::tool_protocols::MemoryProtocol;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    let memory = Arc::new(Memory::new());
    let protocol = Arc::new(MemoryProtocol::new(memory));

    let tasks = vec![
        WorkItem {
            id: "task1".to_string(),
            description: "Research topic A".to_string(),
            acceptance_criteria: "2-page report".to_string(),
        },
        WorkItem {
            id: "task2".to_string(),
            description: "Analyze topic B".to_string(),
            acceptance_criteria: "5-point analysis".to_string(),
        },
    ];

    let mut orchestration = Orchestration::new("research", "Team Research")
        .with_mode(OrchestrationMode::AnthropicAgentTeams {
            pool_id: "research_2024".to_string(),
            tasks,
            max_iterations: 10,
        });

    // Add agents — they'll autonomously discover and claim work
    // orchestration.add_agent(agent1)?;
    // orchestration.add_agent(agent2)?;

    let result = orchestration.run("Research these topics", 1).await?;
    println!("All tasks complete: {}", result.is_complete);
    Ok(())
}
Full Examples
- examples/breakout_game_ralph.rs — Build a game with RALPH mode
- examples/breakout_game_agent_teams.rs — Build the same game with decentralized coordination
- examples/anthropic_teams.rs — Research team with 4 agents and 8-task pool
- examples/orchestration_demo.rs — All 7 orchestration modes in one file
Run with:
cargo run --example orchestration_demo
cargo run --example breakout_game_ralph
ANTHROPIC_API_KEY=your-key cargo run --example breakout_game_agent_teams
Documentation
- README.md — Platform overview and quick start
- ORCHESTRATION_TUTORIAL.md — Deep dive into all 7 collaboration modes
- examples/README.md — Complete 22-example guide
Run cargo doc --open for complete API documentation.
Community & Support
- GitHub: CloudLLM-ai/cloudllm
- Issues & Pull Requests welcome
- MIT License
Welcome to the era of intelligent, orchestrated, observable multi-agent systems.
Happy orchestration! 🚀
