
Overview

The Evaluator-Optimizer workflow implements iterative refinement through a feedback loop between two specialized agents: a generator that produces content and an evaluator that assesses quality and provides actionable feedback. The loop continues until the content meets the quality threshold or the maximum number of refinements is reached.
This pattern is inspired by Anthropic’s “Building Effective Agents” research and is ideal for high-quality content generation.

Key Features

  • Iterative Refinement: Continuous improvement through feedback loops
  • Quality-Driven: Stops when content meets specified quality rating
  • Structured Evaluation: Uses Pydantic models for consistent feedback
  • Actionable Feedback: Evaluator provides specific improvement areas
  • Best Response Tracking: Returns highest-quality result even if target not reached

Basic Usage

import asyncio
from fast_agent import FastAgent

fast = FastAgent("Evaluator-Optimizer")

# Define generator agent
@fast.agent(
    name="generator",
    instruction="""You are a career coach specializing in cover letter writing.
    You are tasked with generating a compelling cover letter given the job posting,
    candidate details, and company information. Tailor the response to the company 
    and job requirements.""",
    servers=["fetch"],
    model="gpt-5-nano.low",
    use_history=True,
)

# Define evaluator agent
@fast.agent(
    name="evaluator",
    instruction="""Evaluate the following response based on the criteria below:
    1. Clarity: Is the language clear, concise, and grammatically correct?
    2. Specificity: Does the response include relevant and concrete details?
    3. Relevance: Does the response align with the prompt?
    4. Tone and Style: Is the tone professional and appropriate?
    5. Persuasiveness: Does the response effectively highlight value?
    6. Grammar and Mechanics: Are there spelling or grammatical issues?
    7. Feedback Alignment: Has the response addressed previous feedback?

    For each criterion:
    - Provide a rating (EXCELLENT, GOOD, FAIR, or POOR)
    - Offer specific feedback or suggestions for improvement

    Summarize your evaluation with overall quality rating and specific feedback.""",
    model="o3-mini.medium",
)

# Define the evaluator-optimizer workflow
@fast.evaluator_optimizer(
    name="cover_letter_writer",
    generator="generator",
    evaluator="evaluator",
    min_rating="EXCELLENT",
    max_refinements=3,
)
async def main() -> None:
    async with fast.run() as agent:
        job_posting = (
            "Software Engineer at LastMile AI. Responsibilities include developing AI systems, "
            "collaborating with cross-functional teams, and enhancing scalability. Skills required: "
            "Python, distributed systems, and machine learning."
        )
        candidate_details = (
            "Alex Johnson, 3 years in machine learning, contributor to open-source AI projects, "
            "proficient in Python and TensorFlow. Motivated by building scalable AI systems."
        )
        company_information = (
            "Look up from the LastMile AI About page: https://lastmileai.dev/about"
        )

        await agent.cover_letter_writer.send(
            f"Write a cover letter for the following job posting: {job_posting}\n\n"
            f"Candidate Details: {candidate_details}\n\n"
            f"Company information: {company_information}",
        )

if __name__ == "__main__":
    asyncio.run(main())

Configuration Parameters

  • name (string, required): Name of the evaluator-optimizer workflow
  • generator (string, required): Name of the content generator agent
  • evaluator (string, required): Name of the evaluator agent
  • min_rating (QualityRating, default: "GOOD"): Minimum acceptable quality rating: EXCELLENT, GOOD, FAIR, or POOR
  • max_refinements (int, default: 3): Maximum number of refinement iterations
  • refinement_instruction (string, optional): Custom instruction for the refinement process

Quality Ratings

The workflow uses a structured quality rating system:
class QualityRating(str, Enum):
    POOR = "POOR"          # Major improvements needed
    FAIR = "FAIR"          # Several improvements needed
    GOOD = "GOOD"          # Minor improvements possible
    EXCELLENT = "EXCELLENT" # No improvements needed
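The members carry an implicit order from POOR up to EXCELLENT, which is what the min_rating threshold relies on. A minimal sketch of such a threshold check (RATING_ORDER and meets_threshold are illustrative helpers, not part of the framework's API):

```python
from enum import Enum

class QualityRating(str, Enum):
    POOR = "POOR"
    FAIR = "FAIR"
    GOOD = "GOOD"
    EXCELLENT = "EXCELLENT"

# Hypothetical ordering used to compare a rating against min_rating
RATING_ORDER = [QualityRating.POOR, QualityRating.FAIR,
                QualityRating.GOOD, QualityRating.EXCELLENT]

def meets_threshold(rating: QualityRating, min_rating: QualityRating) -> bool:
    """Return True when the evaluator's rating is at or above the threshold."""
    return RATING_ORDER.index(rating) >= RATING_ORDER.index(min_rating)

print(meets_threshold(QualityRating.GOOD, QualityRating.GOOD))       # True
print(meets_threshold(QualityRating.FAIR, QualityRating.EXCELLENT))  # False
```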

Evaluation Response Format

class EvaluationResult(BaseModel):
    rating: QualityRating
    feedback: str
    needs_improvement: bool
    focus_areas: List[str]
Example evaluation:
{
  "rating": "GOOD",
  "feedback": "Cover letter demonstrates strong technical alignment but could better highlight specific achievements and company research.",
  "needs_improvement": true,
  "focus_areas": [
    "Add quantifiable achievements from past projects",
    "Reference specific LastMile AI products or initiatives",
    "Strengthen the closing call-to-action"
  ]
}
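Because EvaluationResult is a Pydantic model, an evaluator reply like the JSON above can be validated directly. A sketch, assuming Pydantic v2 (model_validate_json raises a ValidationError on malformed or out-of-range values):

```python
from enum import Enum
from typing import List
from pydantic import BaseModel

class QualityRating(str, Enum):
    POOR = "POOR"
    FAIR = "FAIR"
    GOOD = "GOOD"
    EXCELLENT = "EXCELLENT"

class EvaluationResult(BaseModel):
    rating: QualityRating
    feedback: str
    needs_improvement: bool
    focus_areas: List[str]

raw = ('{"rating": "GOOD", '
       '"feedback": "Strong technical alignment but needs specifics.", '
       '"needs_improvement": true, '
       '"focus_areas": ["Add quantifiable achievements"]}')

# Parse and validate the evaluator's JSON reply in one step (Pydantic v2 API)
result = EvaluationResult.model_validate_json(raw)
print(result.rating, result.needs_improvement)
```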

How It Works

Workflow Steps:
  1. Initial Generation: Generator creates first version
  2. Evaluation: Evaluator assesses quality and provides feedback
  3. Quality Check: Compare rating against minimum threshold
  4. Refinement: If needed and refinements remain, generator improves based on feedback
  5. Iteration: Repeat evaluation and refinement
  6. Completion: Return best result when quality met or max refinements reached
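The steps above can be sketched as a plain control loop. This is a simplified illustration of the pattern, not the framework's actual implementation; the generate and evaluate callables and the rating list are stand-ins:

```python
from dataclasses import dataclass

RATING_ORDER = ["POOR", "FAIR", "GOOD", "EXCELLENT"]

@dataclass
class Evaluation:
    rating: str
    feedback: str

def refine_loop(generate, evaluate, min_rating="GOOD", max_refinements=3):
    """Run the generate -> evaluate -> refine cycle, keeping the best draft seen."""
    draft = generate(feedback=None)              # 1. initial generation
    best_draft, best_rank = draft, -1
    threshold = RATING_ORDER.index(min_rating)
    for attempt in range(max_refinements + 1):
        ev = evaluate(draft)                     # 2. evaluation
        rank = RATING_ORDER.index(ev.rating)
        if rank > best_rank:                     # track the best response so far
            best_draft, best_rank = draft, rank
        if rank >= threshold or attempt == max_refinements:
            break                                # 3/6. quality met or budget spent
        draft = generate(feedback=ev.feedback)   # 4/5. refine and re-evaluate
    return best_draft

# Toy generator/evaluator: the draft improves once feedback is applied
def generate(feedback=None):
    return "v2" if feedback else "v1"

def evaluate(draft):
    return Evaluation("EXCELLENT" if draft == "v2" else "FAIR", "add detail")

print(refine_loop(generate, evaluate, min_rating="EXCELLENT"))  # v2
```

Note that even when the threshold is never reached, the loop returns the highest-rated draft, matching the best-response-tracking behavior described above.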

Generator History Modes

The generator’s use_history setting affects refinement:

With History (use_history=True)

@fast.agent(
    name="generator",
    instruction="Your instruction here",
    use_history=True,  # Generator sees conversation context
)
Refinement prompt references feedback conversationally:
You are tasked with improving your previous response.
This is iteration 2 of the refinement process.

<fastagent:feedback>
  <rating>GOOD</rating>
  <details>Strong technical content but needs more specific examples</details>
  <focus-areas>
    * Add quantifiable achievements
    * Reference specific company initiatives
  </focus-areas>
</fastagent:feedback>

Create an improved version...

Without History (use_history=False)

The previous iteration’s full output is included in refinement prompt.
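A minimal sketch of how a history-free refinement prompt might be assembled. The exact template is internal to the framework; the previous-response tag name here is an assumption, while the feedback tags mirror the example above:

```python
def build_refinement_prompt(previous_output: str, rating: str,
                            feedback: str, iteration: int) -> str:
    """Embed the full previous output, since a history-free generator has no context."""
    return (
        f"You are tasked with improving the response below.\n"
        f"This is iteration {iteration} of the refinement process.\n\n"
        # Hypothetical tag name; the real template may differ
        f"<fastagent:previous-response>\n{previous_output}\n</fastagent:previous-response>\n\n"
        f"<fastagent:feedback>\n"
        f"  <rating>{rating}</rating>\n"
        f"  <details>{feedback}</details>\n"
        f"</fastagent:feedback>\n\n"
        f"Create an improved version..."
    )

prompt = build_refinement_prompt("Dear Hiring Manager...", "GOOD",
                                 "Add specific achievements", 2)
print("<fastagent:previous-response>" in prompt)  # True
```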

Advanced Examples

Research Report Generator

@fast.agent(
    "researcher",
    instruction="""Research a topic thoroughly using available sources.
    Produce a comprehensive, well-cited report with clear sections.""",
    servers=["fetch"],
    model="sonnet",
    use_history=True,
)
@fast.agent(
    "research_evaluator",
    instruction="""Evaluate research reports on:
    1. Depth of research and source quality
    2. Accuracy and factual correctness
    3. Organization and structure
    4. Citation quality and completeness
    5. Clarity and readability
    
    Provide detailed, actionable feedback for improvement.""",
    model="sonnet",
)
@fast.evaluator_optimizer(
    name="research_assistant",
    generator="researcher",
    evaluator="research_evaluator",
    min_rating="EXCELLENT",
    max_refinements=5,
)
async def main() -> None:
    async with fast.run() as agent:
        await agent.research_assistant.send(
            "Produce a comprehensive report on the environmental impact of cryptocurrency mining"
        )

Code Quality Improver

@fast.agent(
    "code_generator",
    instruction="""Generate clean, well-documented Python code that solves the given problem.
    Follow PEP 8 style guidelines and include docstrings.""",
    servers=["filesystem"],
    use_history=True,
)
@fast.agent(
    "code_reviewer",
    instruction="""Review code for:
    1. Correctness and bug-free implementation
    2. Code style and PEP 8 compliance
    3. Documentation quality
    4. Performance and efficiency
    5. Error handling
    6. Test coverage
    
    Provide specific line-level feedback.""",
    model="sonnet",
)
@fast.evaluator_optimizer(
    name="code_improver",
    generator="code_generator",
    evaluator="code_reviewer",
    min_rating="GOOD",
    max_refinements=4,
)
async def main() -> None:
    async with fast.run() as agent:
        await agent.code_improver.send(
            "Create a Python class for managing a connection pool with automatic retry logic"
        )

Marketing Copy Optimizer

@fast.agent(
    "copywriter",
    instruction="""Write compelling marketing copy that engages the target audience,
    highlights key benefits, and includes a strong call-to-action.""",
    use_history=True,
)
@fast.agent(
    "copy_critic",
    instruction="""Evaluate marketing copy on:
    1. Attention-grabbing headline
    2. Clear value proposition
    3. Emotional resonance
    4. Call-to-action effectiveness
    5. Brand voice alignment
    6. Grammar and readability
    
    Focus on conversion optimization.""",
)
@fast.evaluator_optimizer(
    name="copy_optimizer",
    generator="copywriter",
    evaluator="copy_critic",
    min_rating="EXCELLENT",
    max_refinements=3,
)
async def main() -> None:
    async with fast.run() as agent:
        await agent.copy_optimizer.send(
            "Write landing page copy for a new AI-powered project management tool"
        )

Custom Refinement Instructions

Provide domain-specific guidance for the refinement process:
CUSTOM_REFINEMENT = """
You are an academic writing specialist.
Each refinement should strengthen:
1. Thesis clarity and argumentation
2. Evidence quality and citation accuracy
3. Academic tone and formality
4. Logical flow between paragraphs
5. Critical analysis depth
"""

@fast.evaluator_optimizer(
    name="academic_writer",
    generator="writer",
    evaluator="evaluator",
    refinement_instruction=CUSTOM_REFINEMENT,
    min_rating="EXCELLENT",
    max_refinements=4,
)

Best Practices

Specific Evaluator Instructions

Define clear, measurable evaluation criteria for consistent feedback

Actionable Feedback

Ensure evaluator provides specific, implementable improvement suggestions

Appropriate Refinement Limits

Balance quality goals with cost - typically 3-5 refinements

Generator History Management

Use history mode when refinements build on conversation context

Performance Considerations

Cost Scaling: Each refinement iteration adds two LLM calls (one generator call plus one evaluator call). Example with max_refinements=3:
  • Initial: 2 calls (generate + evaluate)
  • Refinement 1: 2 calls
  • Refinement 2: 2 calls
  • Refinement 3: 2 calls
  • Total: 8 LLM calls
Use appropriate max_refinements and consider using cheaper models for initial iterations.
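The arithmetic above can be expressed as a tiny helper (total_llm_calls is illustrative, not a framework function):

```python
def total_llm_calls(refinements_used: int) -> int:
    """Each iteration (the initial pass plus each refinement) costs 2 calls:
    one generate and one evaluate."""
    return 2 * (1 + refinements_used)

print(total_llm_calls(3))  # 8
print(total_llm_calls(0))  # 2 (initial generation and evaluation only)
```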

Tracking Refinement History

Access the refinement history for debugging or analysis:
result = await agent.my_optimizer.send("Generate content...")

# Access refinement history
history = agent.my_optimizer.refinement_history
for iteration in history:
    print(f"Attempt {iteration['attempt']}")
    print(f"Rating: {iteration['evaluation']['rating']}")
    print(f"Feedback: {iteration['evaluation']['feedback']}")

Use Cases

  • Content Creation: Blog posts, articles, marketing copy
  • Technical Writing: Documentation, reports, specifications
  • Creative Writing: Stories, scripts, poetry with quality standards
  • Code Generation: Iteratively improve code quality and style
  • Academic Writing: Research papers, essays, thesis work
  • Legal Documents: Contracts, policies with accuracy requirements
  • Translation: Improve translation quality through feedback
  • Resume/Cover Letters: Personalized, high-quality job applications
Comparison with Related Patterns

| Feature         | Evaluator-Optimizer | Chain       | MAKER            | Orchestrator  |
|-----------------|---------------------|-------------|------------------|---------------|
| Feedback Loop   | ✅ Iterative        | ❌ One-shot | ❌ Voting only   | ❌ Linear     |
| Quality Control | ✅ Explicit         | ❌ None     | ✅ Statistical   | ❌ None       |
| Refinement      | ✅ Guided           | ❌ None     | ❌ None          | ❌ None       |
| Agent Count     | 2 (gen + eval)      | Multiple    | 1 + wrapper      | Multiple      |
| Best For        | Quality content     | Pipelines   | High reliability | Complex tasks |
  • MAKER - Statistical reliability through voting (different quality approach)
  • Chain - Multi-stage processing without feedback loops
  • Orchestrator - Complex task decomposition without iterative refinement