Overview

The LlmAgent is the foundational agent type in Fast Agent that provides core LLM interaction capabilities. It handles conversation management, message display, streaming responses, and stop reason handling. LlmAgent extends LlmDecorator with UI display methods, tool call tracking, and chat interaction patterns while delegating core LLM operations to the attached FastAgentLLMProtocol.

Key Features

  • Conversation Management: Maintains message history and handles multi-turn conversations
  • Streaming Support: Displays responses as they’re generated with configurable streaming modes
  • Stop Reason Handling: Gracefully handles different completion reasons (END_TURN, MAX_TOKENS, SAFETY, etc.)
  • Display Integration: Rich console display with syntax highlighting and formatting
  • Usage Tracking: Tracks token usage and context percentage
  • Message Rendering: Supports markdown rendering and custom message display

Architecture

The LlmAgent is part of a three-layer architecture:
LlmAgent (interaction & display)
    ↓ extends
LlmDecorator (core LLM logic)
    ↓ uses
FastAgentLLMProtocol (LLM provider interface)
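The relationship can be sketched roughly as follows. This is illustrative only: the class bodies are placeholders, and the `generate` signature on the protocol is an assumption, not the library's actual definition.

```python
from typing import Protocol

class FastAgentLLMProtocol(Protocol):
    """Provider interface: any LLM backend implements this (signature assumed)."""
    async def generate(self, messages, request_params=None): ...

class LlmDecorator:
    """Core LLM logic: holds the attached LLM and delegates calls to it."""
    def __init__(self):
        self._llm: FastAgentLLMProtocol | None = None

class LlmAgent(LlmDecorator):
    """Adds conversation display, streaming, and chat interaction on top."""
```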

Creating a Basic Agent

Simple Configuration

import asyncio
from fast_agent.agents.agent_types import AgentConfig
from fast_agent.agents.llm_agent import LlmAgent
from fast_agent.core import Core
from fast_agent.llm.model_factory import ModelFactory

async def main():
    core = Core()
    await core.initialize()
    
    # Create agent configuration
    config = AgentConfig(
        name="assistant",
        instruction="You are a helpful assistant.",
        model="gpt-4o-mini"
    )
    
    # Create the agent
    agent = LlmAgent(config, context=core.context)
    
    # Attach the LLM
    await agent.attach_llm(ModelFactory.create_factory("gpt-4o-mini"))
    
    # Send a message
    response = await agent.send("Hello, how are you?")
    print(response)
    
    await core.cleanup()

asyncio.run(main())

With Custom Instructions

config = AgentConfig(
    name="writer",
    instruction="""You are a professional technical writer.
    
    Guidelines:
    - Write clear, concise documentation
    - Use examples to illustrate concepts
    - Structure content with headings
    - Include code snippets when helpful
    """,
    model="claude-3-5-sonnet-20241022",
    use_history=True
)

agent = LlmAgent(config, context=core.context)
await agent.attach_llm(ModelFactory.create_factory("claude-3-5-sonnet-20241022"))

Configuration Options

AgentConfig Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| `name` | `str` | required | The name of the agent |
| `instruction` | `str` | `DEFAULT_AGENT_INSTRUCTION` | System prompt/instruction for the agent |
| `model` | `str` | | Model identifier (e.g., "gpt-4o-mini", "claude-3-5-sonnet") |
| `use_history` | `bool` | `True` | Whether to maintain conversation history |
| `description` | `str` | | Human-readable description of the agent's purpose |
| `default_request_params` | `RequestParams` | | Default parameters for LLM requests |
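Putting the documented fields together, a stateless one-shot agent might be configured like this (`RequestParams` is omitted to keep the sketch minimal; the field names match the table above):

```python
from fast_agent.agents.agent_types import AgentConfig

# A research agent that keeps no conversation state between calls
config = AgentConfig(
    name="researcher",
    instruction="Answer concisely and cite sources where possible.",
    model="gpt-4o-mini",
    use_history=False,  # stateless: each send() starts fresh
    description="One-shot research question answering",
)
```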

Working with Messages

Sending Messages

# Simple text message
response = await agent.send("What is Fast Agent?")

# Generate with full control
from fast_agent.core.prompt import Prompt

messages = [
    Prompt.user("Explain quantum computing"),
]
response = await agent.generate(messages, None)  # None = use the agent's default RequestParams
print(response.first_text())

Message History

# Access conversation history
history = agent.message_history
for msg in history:
    print(f"{msg.role}: {msg.content}")

# Clear history
agent.clear()

# Load custom history
from fast_agent.types import PromptMessageExtended

custom_history = [
    PromptMessageExtended(role="user", content="Hello"),
    PromptMessageExtended(role="assistant", content="Hi there!"),
]
agent.load_message_history(custom_history)

Display and Streaming

Display Configuration

The agent uses ConsoleDisplay for rich terminal output:
# Access display settings
display = agent.display

# Check streaming preferences
enabled, mode = display.resolve_streaming_preferences()
print(f"Streaming: {enabled}, Mode: {mode}")

Controlling Streaming

# Disable streaming for next turn
agent.force_non_streaming_next_turn(reason="debugging")

# Close active streaming display
agent.close_active_streaming_display(reason="parallel operations")

Stop Reasons

The agent handles various completion reasons:
| Stop Reason | Description | Agent Behavior |
|---|---|---|
| `END_TURN` | Normal completion | Display response |
| `MAX_TOKENS` | Token limit reached | Show warning |
| `TOOL_USE` | Tool call requested | Execute tools (if ToolAgent) |
| `SAFETY` | Safety filter triggered | Show error |
| `PAUSE` | LLM requested pause | Show notification |
| `ERROR` | Error occurred | Display error details |
| `CANCELLED` | User cancelled | Show cancellation |
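One way to route on these reasons in your own code is a small lookup, shown here as a self-contained sketch. Reading the reason via `response.stop_reason.name` is an assumption about the returned message's API, not a documented guarantee:

```python
def describe_stop_reason(name: str) -> str:
    """Map a stop-reason name (see table above) to a suggested action."""
    actions = {
        "END_TURN": "ok",
        "MAX_TOKENS": "warn: raise the token limit or shorten the prompt",
        "TOOL_USE": "run the requested tools (ToolAgent handles this)",
        "SAFETY": "error: content blocked by a safety filter",
        "PAUSE": "notify: the LLM requested a pause",
        "ERROR": "error: inspect the response for details",
        "CANCELLED": "cancelled by the user",
    }
    return actions.get(name, "unknown stop reason")

# Hypothetical usage, assuming a stop_reason attribute on the response:
#   print(describe_stop_reason(response.stop_reason.name))
```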

Advanced Usage

Structured Output

from pydantic import BaseModel

class WeatherReport(BaseModel):
    city: str
    temperature: int
    conditions: str

messages = [Prompt.user("What's the weather in Paris?")]
result, message = await agent.structured(
    messages,
    WeatherReport,
    None  # request_params: None = use the agent's defaults
)

if result:
    print(f"Temperature in {result.city}: {result.temperature}°C")

Custom Message Display

from fast_agent.types import PromptMessageExtended
from rich.text import Text

# Create a custom message
message = PromptMessageExtended(
    role="assistant",
    content="Custom response"
)

# Display with custom formatting
await agent.show_assistant_message(
    message,
    name="CustomAgent",
    model="gpt-4",
    additional_message=Text("Extra info", style="dim"),
    render_markdown=True
)

Usage Tracking

# Get usage accumulator
usage = agent.usage_accumulator

if usage:
    print(f"Input tokens: {usage.input_tokens}")
    print(f"Output tokens: {usage.output_tokens}")
    print(f"Total cost: ${usage.total_cost:.4f}")
    print(f"Context usage: {usage.context_usage_percentage:.1f}%")

Best Practices

Instructions

  • Keep instructions clear and specific
  • Include examples for complex tasks
  • Use structured formatting for guidelines
  • Test with various inputs to validate behavior

History Management

  • Clear history when starting new topics
  • Monitor context window usage
  • Use use_history=False for stateless interactions
  • Load custom history for specific workflows

Error Handling

  • Always check stop_reason for errors
  • Handle safety filters gracefully
  • Monitor token limits
  • Implement retry logic for transient failures
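Retry logic for transient failures can be kept outside the agent entirely. The helper below is a generic sketch with exponential backoff; `send` stands in for a callable such as `agent.send`, and retrying on bare `Exception` is a simplification (in practice you would catch only transient error types):

```python
import asyncio

async def send_with_retry(send, prompt, retries=3, base_delay=1.0):
    """Call `send(prompt)` with up to `retries` attempts and exponential backoff."""
    for attempt in range(retries):
        try:
            return await send(prompt)
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts: surface the failure
            await asyncio.sleep(base_delay * 2 ** attempt)

# Hypothetical usage:
#   response = await send_with_retry(agent.send, "Hello!")
```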

Next Steps

Tool Agent

Add function calling capabilities to your agent

MCP Agent

Connect to MCP servers for extended functionality

LLM Agent

Learn about the full LlmAgent API

Configuration

Explore all configuration options