# Agent Architectures
ReAct, plan-and-execute, reflection loops, multi-agent systems, and orchestration strategies.
## The 80/20
Agent architectures define how LLMs interact with tools, plan actions, and coordinate multiple reasoning steps. Most production systems use one of four core patterns: ReAct for simple tool use, plan-and-execute for complex multi-step tasks, reflection loops for self-improvement, and multi-agent systems for specialized coordination.
The key insight is that different tasks require different architectural approaches. Simple queries work with basic ReAct, complex workflows need planning, and tasks requiring quality need reflection. Choose the simplest architecture that meets your requirements—more complex patterns add latency and failure modes.
## The Agent Architecture Problem
Traditional LLMs are stateless—they receive a prompt, generate a response, and forget everything. But many real-world tasks require multiple steps, tool usage, memory, and the ability to recover from errors. Consider booking a flight:
- Search for flights on a specific date
- Compare prices across airlines
- Check seat availability
- Handle booking errors (sold out, payment issues)
- Confirm reservation details
This requires calling multiple APIs, maintaining state across steps, handling failures, and potentially replanning when things go wrong. A single LLM call can't handle this complexity.
```mermaid
graph TD
    subgraph "Single LLM Call"
        P[Prompt] --> L[LLM] --> R[Response]
    end
    subgraph "Agent Architecture"
        P2[Task] --> A[Agent]
        A --> T1[Tool Call 1]
        A --> T2[Tool Call 2]
        A --> T3[Tool Call 3]
        T1 --> A
        T2 --> A
        T3 --> A
        A --> R2[Final Result]
    end
end
```
Agent architectures solve this by adding control flow, memory, and tool integration around the LLM. The architecture determines how the agent reasons, acts, and learns from its actions.
## Core Agent Patterns
Four patterns handle the majority of agent use cases in production systems. Each makes different tradeoffs between simplicity, capability, and reliability.
### ReAct (Reasoning + Acting)
ReAct alternates between reasoning (thinking about what to do) and acting (using tools or taking actions). The agent follows a simple loop: think, act, observe, repeat until the task is complete.
```python
def react_loop(task, tools, max_steps=10):
    context = f"Task: {task}"
    for step in range(max_steps):
        # Reasoning step
        thought = llm.generate(f"{context}\nThought:")
        # Acting step
        action = llm.generate(f"{context}\nThought: {thought}\nAction:")
        if action.startswith("FINISH"):
            return action.replace("FINISH: ", "")
        # Observation step
        result = execute_tool(action, tools)
        context += f"\nThought: {thought}\nAction: {action}\nObservation: {result}"
    return "Task incomplete after maximum steps"
```
Example ReAct trace:
```
Task: What's the weather in San Francisco and should I bring an umbrella?

Thought: I need to check the current weather in San Francisco to see if it's raining or likely to rain.
Action: get_weather(location="San Francisco, CA")
Observation: Current weather in San Francisco: 72°F, partly cloudy, 10% chance of rain, wind 8 mph

Thought: The weather looks good with only a 10% chance of rain. I should provide a recommendation about the umbrella.
Action: FINISH: The weather in San Francisco is 72°F and partly cloudy with only a 10% chance of rain. You probably don't need an umbrella today, but it's always good to check the forecast again before heading out.
```
Strengths:
- Simple to implement and debug
- Works well for straightforward tool use
- Low latency for simple tasks
- Easy to add new tools
Weaknesses:
- No long-term planning
- Can get stuck in loops
- Inefficient for complex multi-step tasks
- Limited error recovery
ReAct works best for tasks that can be solved with 1-5 tool calls where each step naturally follows from the previous observation.
### Plan-and-Execute
Plan-and-execute separates high-level planning from step-by-step execution. The agent first creates a complete plan, then executes each step, potentially replanning when things go wrong.
```python
def plan_and_execute(task, tools):
    # Planning phase
    plan = llm.generate(f"""
    Task: {task}
    Available tools: {list(tools.keys())}
    Create a step-by-step plan to complete this task:
    """)
    steps = parse_plan(plan)
    results = []
    # Execution phase
    for i, step in enumerate(steps):
        try:
            result = execute_step(step, tools, results)
            results.append(result)
        except Exception as e:
            # Replanning on failure
            remaining_steps = steps[i:]
            new_plan = replan(task, remaining_steps, results, str(e))
            steps = steps[:i] + parse_plan(new_plan)
    return synthesize_results(results)

def replan(task, failed_steps, completed_results, error):
    return llm.generate(f"""
    Original task: {task}
    Completed so far: {completed_results}
    Failed step: {failed_steps[0]}
    Error: {error}
    Create a new plan to complete the remaining task:
    """)
```
Example Plan-and-Execute:
```
Task: Research and book a flight from NYC to London for next Friday

Plan:
1. Search for flights from NYC to London on [date]
2. Compare prices and flight times
3. Check baggage policies for top 3 options
4. Select best option based on price and convenience
5. Initiate booking process
6. Handle payment and confirmation

Execution:
Step 1: search_flights(origin="NYC", destination="London", date="2024-03-22")
Result: Found 15 flights, prices $450-$890
Step 2: compare_flights(top_n=3)
Result: British Airways $650 (direct), Virgin $580 (1 stop), Delta $720 (direct)
Step 3: get_baggage_policy(airlines=["British Airways", "Virgin", "Delta"])
Result: BA: 1 free bag, Virgin: 1 free bag, Delta: 1 free bag
Step 4: analyze_options()
Result: Virgin offers best value at $580 with acceptable 1-stop
Step 5: initiate_booking(flight_id="VS123", passenger_details=...)
Result: Booking initiated, payment required
Step 6: process_payment(booking_id="ABC123")
Result: Payment successful, confirmation #DEF456
```
Strengths:
- Handles complex multi-step tasks well
- Can replan when things go wrong
- More efficient than ReAct for complex workflows
- Clear separation of planning and execution
Weaknesses:
- Higher latency due to planning overhead
- Plans can become outdated quickly
- More complex to implement and debug
- May over-plan for simple tasks
Plan-and-execute excels at complex workflows with 5+ steps where upfront planning saves time and reduces errors.
### Reflection Loops
Reflection loops add self-critique and improvement to agent behavior. The agent performs an action, evaluates the result, and iteratively improves until reaching a satisfactory outcome.
```python
import re

def reflection_loop(task, tools, max_iterations=3):
    current_attempt = ""
    critique = ""
    for iteration in range(max_iterations):
        # Generate or improve attempt
        if iteration == 0:
            current_attempt = llm.generate(f"Task: {task}\nSolution:")
        else:
            current_attempt = llm.generate(f"""
            Task: {task}
            Previous attempt: {current_attempt}
            Critique: {critique}
            Improved solution:
            """)
        # Self-critique
        critique = llm.generate(f"""
        Task: {task}
        Solution: {current_attempt}
        Critique this solution. What could be improved?
        Rate quality 1-10 and explain issues:
        """)
        # Check if good enough
        if extract_rating(critique) >= 8:
            return current_attempt
    return current_attempt

def extract_rating(critique_text):
    # Extract numerical rating from critique; default to 5 if none found
    match = re.search(r'(\d+)/10|(\d+) out of 10', critique_text)
    return int(match.group(1) or match.group(2)) if match else 5
```
Example Reflection Loop:
```
Task: Write a professional email declining a job offer

Iteration 1:
Solution: "Thanks but no thanks. I found something better."
Critique: Rating 3/10. Too casual and potentially rude. Lacks professionalism, gratitude, and doesn't maintain relationships.

Iteration 2:
Solution: "Thank you for the offer. After careful consideration, I've decided to pursue another opportunity that better aligns with my career goals. I appreciate the time you invested in the process."
Critique: Rating 7/10. Much better tone and professionalism. Could add more specific gratitude and leave door open for future opportunities.

Iteration 3:
Solution: "Thank you so much for extending the offer for the [Position] role at [Company]. I truly appreciate the time you and your team invested in getting to know me throughout the interview process. After careful consideration, I've decided to accept another position that more closely aligns with my long-term career objectives. I was impressed by [Company] and hope we might have the opportunity to work together in the future."
Critique: Rating 9/10. Excellent professional tone, specific gratitude, maintains relationships.
```
Strengths:
- Improves output quality through iteration
- Self-correcting behavior
- Works well for creative and analytical tasks
- Can catch and fix its own mistakes
Weaknesses:
- High latency due to multiple LLM calls
- Expensive (3x+ the cost of single attempts)
- May over-optimize or get stuck in loops
- Critique quality depends on model capability
Reflection loops are ideal for high-stakes outputs where quality matters more than speed—writing, analysis, code review, and creative tasks.
### Multi-Agent Systems
Multi-agent systems coordinate multiple specialized agents to handle complex tasks. Each agent has specific capabilities and they communicate to achieve shared goals.
```python
class MultiAgentSystem:
    def __init__(self):
        self.agents = {
            'researcher': ResearchAgent(),
            'writer': WritingAgent(),
            'critic': CriticAgent(),
            'coordinator': CoordinatorAgent()
        }
        self.shared_memory = {}

    def execute_task(self, task):
        # Coordinator decides task breakdown
        plan = self.agents['coordinator'].create_plan(task)
        for step in plan.steps:
            agent_name = step.assigned_agent
            agent = self.agents[agent_name]
            # Execute step with access to shared memory
            result = agent.execute(step.instruction, self.shared_memory)
            # Update shared memory
            self.shared_memory[step.output_key] = result
            # Allow other agents to react/provide feedback
            if step.requires_review:
                feedback = self.agents['critic'].review(result)
                if feedback.needs_revision:
                    result = agent.revise(result, feedback.suggestions)
                    self.shared_memory[step.output_key] = result
        return self.agents['coordinator'].synthesize_results(self.shared_memory)

class ResearchAgent:
    def execute(self, instruction, shared_memory):
        # Specialized for information gathering
        return self.search_and_analyze(instruction)

class WritingAgent:
    def execute(self, instruction, shared_memory):
        # Specialized for content creation
        research_data = shared_memory.get('research_results', '')
        return self.write_content(instruction, research_data)
```
Example Multi-Agent Workflow:
```
Task: Create a comprehensive market analysis report for electric vehicles

Coordinator Plan:
1. Researcher: Gather market data, competitor analysis, trends
2. Researcher: Collect regulatory information and policy impacts
3. Writer: Create executive summary based on research
4. Writer: Write detailed analysis sections
5. Critic: Review report for accuracy and completeness
6. Writer: Revise based on feedback
7. Coordinator: Compile final report

Execution:
Researcher → Gathers EV market data ($X billion market, Y% growth)
Researcher → Finds policy info (tax incentives, emission regulations)
Writer → Creates executive summary using research data
Critic → Reviews: "Missing competitive positioning analysis"
Writer → Adds competitive analysis section
Coordinator → Compiles polished final report
```
Strengths:
- Handles very complex, multi-faceted tasks
- Agents can specialize and improve independently
- Parallel execution possible for some tasks
- Natural division of labor
Weaknesses:
- High complexity to implement and debug
- Coordination overhead and potential conflicts
- Expensive due to multiple agent calls
- Communication protocols can become complex
Multi-agent systems work best for complex projects requiring diverse skills—research reports, software development, content creation pipelines, and collaborative analysis.
## Implementation Considerations

### Context Management

All agent patterns must handle context growth as conversations extend. Long-running agents can exceed context windows quickly.
```python
class ContextManager:
    def __init__(self, max_tokens=8000):
        self.max_tokens = max_tokens
        self.conversation_history = []

    def add_interaction(self, thought, action, observation):
        interaction = {
            'thought': thought,
            'action': action,
            'observation': observation,
            'tokens': estimate_tokens(thought + action + observation)
        }
        self.conversation_history.append(interaction)
        self._trim_if_needed()

    def _trim_if_needed(self):
        total_tokens = sum(i['tokens'] for i in self.conversation_history)
        if total_tokens > self.max_tokens:
            # Keep recent interactions, summarize older ones
            recent = self.conversation_history[-5:]  # Keep last 5
            older = self.conversation_history[:-5]
            summary = self._summarize_interactions(older)
            self.conversation_history = [{'summary': summary}] + recent
```
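`estimate_tokens` above is left undefined. As a stand-in, a crude characters-per-token heuristic is often good enough for trimming decisions; the four-characters-per-token ratio is an assumption for English prose, and real tokenizers vary:

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic: ~4 characters per token for English text."""
    return max(1, len(text) // 4)
```

For exact counts, swap in the tokenizer of the model you are actually calling.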
### Error Handling

Production agents need robust error handling for tool failures, API timeouts, and invalid responses.
```python
import time

def robust_tool_call(tool_name, params, max_retries=3):
    # `tools`, `ToolTimeout`, and `ToolError` are assumed to be defined elsewhere
    for attempt in range(max_retries):
        try:
            return tools[tool_name](**params)
        except ToolTimeout:
            if attempt < max_retries - 1:
                time.sleep(2 ** attempt)  # Exponential backoff
                continue
            return f"Tool {tool_name} timed out after {max_retries} attempts"
        except ToolError as e:
            return f"Tool error: {str(e)}"
        except Exception as e:
            if attempt < max_retries - 1:
                continue
            return f"Unexpected error: {str(e)}"
```
### Cost Optimization

Agent architectures can be expensive due to multiple LLM calls. Optimize by caching, using smaller models for simple tasks, and batching when possible.
```python
class CostOptimizedAgent:
    def __init__(self):
        self.cache = {}
        self.small_model = "gpt-3.5-turbo"  # For simple tasks
        self.large_model = "gpt-4"          # For complex reasoning

    def choose_model(self, task_complexity):
        if task_complexity < 0.5:
            return self.small_model
        return self.large_model

    def cached_call(self, prompt, model):
        cache_key = hash(prompt + model)
        if cache_key in self.cache:
            return self.cache[cache_key]
        result = llm.generate(prompt, model=model)
        self.cache[cache_key] = result
        return result
```
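The hand-rolled cache above can also be expressed with the standard library's `functools.lru_cache`, which adds bounded size and eviction for free. A sketch, where `cached_generate` and its canned response are illustrative stand-ins for a real model call:

```python
from functools import lru_cache

call_count = {"n": 0}  # track how often the "model" is actually invoked

@lru_cache(maxsize=1024)
def cached_generate(prompt: str, model: str) -> str:
    call_count["n"] += 1
    return f"[{model}] response to: {prompt}"  # stand-in for llm.generate

first = cached_generate("summarize the report", "small-model")
second = cached_generate("summarize the report", "small-model")  # cache hit
```

Note that `lru_cache` requires hashable arguments, so structured tool parameters must be serialized before caching.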
## When to Use Which Pattern

### Decision Framework
```mermaid
graph TD
    A[Agent Task] --> B{Number of Steps?}
    B -->|1-3 steps| C{Quality Critical?}
    B -->|4-10 steps| D{Complex Planning Needed?}
    B -->|10+ steps| E{Multiple Specializations?}
    C -->|No| F[ReAct]
    C -->|Yes| G[Reflection Loop]
    D -->|Yes| H[Plan-and-Execute]
    D -->|No| F
    E -->|Yes| I[Multi-Agent System]
    E -->|No| H
```
### Task-Pattern Mapping
| Task Type | Best Pattern | Why |
|---|---|---|
| Simple Q&A with tools | ReAct | Direct tool use, minimal steps |
| Data analysis | Reflection Loop | Quality matters, iterative improvement |
| Complex workflows | Plan-and-Execute | Multi-step coordination needed |
| Creative projects | Multi-Agent | Diverse skills (research, writing, critique) |
| Code generation | Reflection Loop | Quality critical, self-debugging |
| Research tasks | Plan-and-Execute | Structured information gathering |
| Real-time chat | ReAct | Low latency requirements |
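The decision framework above can be collapsed into a small routing helper. This is a sketch of this section's heuristics, not a standard API; the trait names and thresholds are the ones used in the diagram:

```python
def choose_pattern(num_steps: int, quality_critical: bool = False,
                   needs_planning: bool = False, multi_skill: bool = False) -> str:
    """Map task traits to the simplest adequate pattern (mirrors the decision graph)."""
    if num_steps <= 3:
        return "Reflection Loop" if quality_critical else "ReAct"
    if num_steps <= 10:
        return "Plan-and-Execute" if needs_planning else "ReAct"
    return "Multi-Agent System" if multi_skill else "Plan-and-Execute"
```

In practice the trait estimates themselves may come from a cheap classifier call before the main agent runs.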
### Performance Characteristics
| Pattern | Latency | Cost | Quality | Complexity |
|---|---|---|---|---|
| ReAct | Low | Low | Medium | Low |
| Plan-and-Execute | Medium | Medium | High | Medium |
| Reflection Loop | High | High | Very High | Medium |
| Multi-Agent | Very High | Very High | Very High | Very High |
## Common Challenges

### Tool Selection and Routing

As agents gain access to more tools, choosing the right tool becomes critical. Poor tool selection leads to inefficient workflows and errors.
```python
class ToolRouter:
    def __init__(self, tools):
        self.tools = tools
        self.tool_descriptions = {
            name: tool.description for name, tool in tools.items()
        }

    def format_tool_descriptions(self):
        return "\n".join(f"- {name}: {desc}"
                         for name, desc in self.tool_descriptions.items())

    def select_tool(self, task, context):
        prompt = f"""
        Task: {task}
        Context: {context}
        Available tools:
        {self.format_tool_descriptions()}
        Which tool is most appropriate? Respond with just the tool name.
        """
        selected = llm.generate(prompt).strip()
        if selected not in self.tools:
            # Fallback to similarity matching
            selected = self.find_most_similar_tool(task)
        return selected
```
### Loop Detection and Prevention

Agents can get stuck in loops, repeatedly trying the same failed action. Implement loop detection and circuit breakers.
```python
class LoopDetector:
    def __init__(self, max_repeats=3):
        self.action_history = []
        self.max_repeats = max_repeats

    def check_action(self, action):
        self.action_history.append(action)
        # Keep only recent history
        if len(self.action_history) > 10:
            self.action_history = self.action_history[-10:]
        # Count recent repeats
        recent_actions = self.action_history[-self.max_repeats:]
        if len(set(recent_actions)) == 1 and len(recent_actions) == self.max_repeats:
            return False  # Loop detected
        return True  # Action allowed
```
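The exact-repeat check above misses alternating loops (tool A, tool B, tool A, tool B), which real agents also fall into. A sketch of a period-based variant that catches both; the `max_period` cutoff is an arbitrary choice:

```python
def detect_cycle(history, max_period=3):
    """True if the tail of `history` repeats with some period <= max_period."""
    actions = list(history)
    for period in range(1, max_period + 1):
        if len(actions) >= 2 * period:
            tail = actions[-2 * period:]
            # A cycle of this period means the last two windows are identical
            if tail[:period] == tail[period:]:
                return True
    return False
```

On detection, a circuit breaker might force a replanning step or return a partial answer rather than burn further tool calls.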
### State Management Across Steps

Complex agents need to maintain state across multiple interactions while avoiding context window overflow.
```python
class AgentState:
    def __init__(self):
        self.working_memory = {}   # Current task context
        self.episodic_memory = []  # Past interactions
        self.semantic_memory = {}  # Learned facts/patterns

    def update_working_memory(self, key, value):
        self.working_memory[key] = value
        # Prevent memory overflow
        if len(str(self.working_memory)) > 4000:  # ~1000 tokens
            self._compress_working_memory()

    def _compress_working_memory(self):
        # Summarize older entries
        summary = llm.generate(f"Summarize key points: {self.working_memory}")
        self.working_memory = {'summary': summary}
```
## Appendix: Additional Patterns

### Memory Patterns

Episodic Memory - Store and retrieve past experiences for learning and context.
```python
import time

class EpisodicMemory:
    def __init__(self):
        self.episodes = []

    def store_episode(self, situation, action, outcome, success):
        episode = {
            'situation': situation,
            'action': action,
            'outcome': outcome,
            'success': success,
            'timestamp': time.time()
        }
        self.episodes.append(episode)

    def retrieve_similar(self, current_situation, k=3):
        # Use embedding similarity to find relevant past episodes
        similarities = []
        for episode in self.episodes:
            sim = cosine_similarity(
                embed(current_situation),
                embed(episode['situation'])
            )
            similarities.append((sim, episode))
        # Sort on the score alone; comparing the episode dicts on ties would raise
        similarities.sort(key=lambda pair: pair[0], reverse=True)
        return similarities[:k]
```
Working Memory - Manage short-term context and attention.
```python
class WorkingMemory:
    def __init__(self, capacity=7):  # Miller's magic number
        self.items = []
        self.capacity = capacity

    def add_item(self, item, importance=1.0):
        self.items.append({'content': item, 'importance': importance})
        if len(self.items) > self.capacity:
            # Remove least important item
            self.items.sort(key=lambda x: x['importance'])
            self.items = self.items[1:]

    def get_context(self):
        return [item['content'] for item in self.items]
```
### Advanced Reasoning Patterns
Tree of Thoughts - Explore multiple reasoning paths simultaneously.
```python
def tree_of_thoughts(problem, depth=3, breadth=3):
    class ThoughtNode:
        def __init__(self, thought, parent=None):
            self.thought = thought
            self.parent = parent
            self.children = []
            self.value = None

    root = ThoughtNode("Initial problem analysis")

    def expand_node(node, current_depth):
        if current_depth >= depth:
            return
        # Generate multiple next thoughts
        thoughts = llm.generate(f"""
        Problem: {problem}
        Current reasoning: {node.thought}
        Generate {breadth} different next reasoning steps:
        """).split('\n')
        for thought in thoughts[:breadth]:
            child = ThoughtNode(thought.strip(), node)
            node.children.append(child)
            expand_node(child, current_depth + 1)

    expand_node(root, 0)

    # Evaluate all leaf nodes and backpropagate
    def evaluate_path(node):
        if not node.children:  # Leaf node
            path = []
            current = node
            while current:
                path.append(current.thought)
                current = current.parent
            score = llm.generate(f"""
            Problem: {problem}
            Reasoning path: {' -> '.join(reversed(path))}
            Rate this reasoning path 1-10:
            """)
            return float(score.strip())
        return max(evaluate_path(child) for child in node.children)

    return evaluate_path(root)
```
### Control Flow Patterns
State Machines - Explicit state management for complex agent behavior.
```python
class AgentStateMachine:
    def __init__(self):
        self.state = 'IDLE'
        self.transitions = {
            'IDLE': ['PLANNING', 'RESPONDING'],
            'PLANNING': ['EXECUTING', 'REPLANNING'],
            'EXECUTING': ['PLANNING', 'COMPLETED', 'ERROR'],
            'ERROR': ['PLANNING', 'COMPLETED'],
            'COMPLETED': ['IDLE']
        }

    def transition(self, new_state, context=None):
        if new_state in self.transitions[self.state]:
            old_state = self.state
            self.state = new_state
            self._on_state_change(old_state, new_state, context)
        else:
            raise ValueError(f"Invalid transition from {self.state} to {new_state}")

    def _on_state_change(self, old_state, new_state, context):
        # Handle state-specific logic
        if new_state == 'PLANNING':
            self._create_plan(context)
        elif new_state == 'EXECUTING':
            self._execute_current_step()
```
Behavior Trees - Hierarchical decision structures from game AI.
```python
class BehaviorNode:
    def execute(self, context):
        raise NotImplementedError

class SequenceNode(BehaviorNode):
    def __init__(self, children):
        self.children = children

    def execute(self, context):
        for child in self.children:
            result = child.execute(context)
            if result != 'SUCCESS':
                return result
        return 'SUCCESS'

class SelectorNode(BehaviorNode):
    def __init__(self, children):
        self.children = children

    def execute(self, context):
        for child in self.children:
            result = child.execute(context)
            if result == 'SUCCESS':
                return result
        return 'FAILURE'

class ActionNode(BehaviorNode):
    def __init__(self, action_func):
        self.action_func = action_func

    def execute(self, context):
        return self.action_func(context)
```
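Composed, selectors give fallback behavior for free. This standalone sketch (node classes repeated inline so it runs on its own; `check_cache` and `fetch` are hypothetical actions) tries a cached answer first and falls back to a fetch only on a miss:

```python
class SequenceNode:
    def __init__(self, children):
        self.children = children
    def execute(self, context):
        for child in self.children:
            result = child.execute(context)
            if result != 'SUCCESS':
                return result
        return 'SUCCESS'

class SelectorNode:
    def __init__(self, children):
        self.children = children
    def execute(self, context):
        for child in self.children:
            if child.execute(context) == 'SUCCESS':
                return 'SUCCESS'
        return 'FAILURE'

class ActionNode:
    def __init__(self, action_func):
        self.action_func = action_func
    def execute(self, context):
        return self.action_func(context)

def check_cache(context):  # hypothetical: succeed only if an answer is cached
    return 'SUCCESS' if 'cached_answer' in context else 'FAILURE'

def fetch(context):        # hypothetical: stand-in for a real tool call
    context['answer'] = 'fresh data'
    return 'SUCCESS'

# Selector tries children in order, stopping at the first success
tree = SelectorNode([ActionNode(check_cache), ActionNode(fetch)])
context = {}
status = tree.execute(context)  # cache miss, so fetch runs
```

Because the selector short-circuits, a warm cache skips the fetch action entirely, which is exactly the fallback semantics described above.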
These additional patterns provide specialized solutions for complex agent behaviors, but the four core patterns (ReAct, Plan-and-Execute, Reflection Loops, Multi-Agent) handle the majority of production use cases. Choose the simplest pattern that meets your requirements, and consider these advanced patterns only when the core patterns prove insufficient.