Wednesday, August 06, 2025

12 Factor Agents: Building Enterprise-Grade AI Systems

The Challenge: Most AI agents fail to meet production standards. They work great in demos but fall apart when faced with real-world enterprise requirements: reliability, scalability, maintainability, and security.

The Solution: 12 Factor Agents - a methodology inspired by the battle-tested 12 Factor App principles, adapted specifically for building production-ready AI agent systems.

Why Traditional Agent Frameworks Fall Short

After working with hundreds of AI builders and testing every major agent framework, a clear pattern emerges: 80% quality isn't good enough for customer-facing features. Most builders hit a wall where they need to reverse-engineer their chosen framework to achieve production quality, ultimately starting over from scratch.

"I've been surprised to find that most products billing themselves as 'AI Agents' are not all that agentic. A lot of them are mostly deterministic code, with LLM steps sprinkled in at just the right points to make the experience truly magical."
— Dex Horthy, Creator of 12 Factor Agents

The problem isn't the frameworks themselves: good agents are composed mostly of plain software, not the "here's your prompt, here's a bag of tools, loop until you hit the goal" pattern that many frameworks promote.

What Are 12 Factor Agents?

12 Factor Agents is a methodology that provides core engineering principles for building LLM-powered software that's reliable, scalable, and maintainable. Rather than enforcing a specific framework, it offers modular concepts that can be incorporated into existing products.

Key Insight: The fastest way to get high-quality AI software in customers' hands is to take small, modular concepts from agent building and incorporate them into existing products—not to rebuild everything from scratch.

The 12 Factors Explained

1 Natural Language to Tool Calls

Convert natural language directly into structured tool calls. This is the fundamental pattern that enables agents to reason about tasks and execute them deterministically.


"create a payment link for $750 to Jeff" 
→ 
{
  "function": "create_payment_link",
  "parameters": {
    "amount": 750,
    "customer": "cust_128934ddasf9",
    "memo": "Payment for service"
  }
}
        

2 Own Your Prompts

Don't outsource prompt engineering to frameworks. Treat prompts as first-class code that you can version, test, and iterate on. Black-box prompting limits your ability to optimize performance.

Benefits:

  • Full control over instructions
  • Testable and version-controlled prompts
  • Fast iteration based on real-world performance
  • Transparency in what your agent is working with
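
To make this concrete, here is a sketch of prompt-as-code in TypeScript; the module and function names are hypothetical, the point is that the prompt gets reviewed, diffed, and tested like any other source file:

// prompts/paymentAgent.ts -- a hypothetical versioned prompt module
export const PROMPT_VERSION = "2025-08-06.1";

// The prompt lives in source control next to the code that uses it, so a
// snapshot test on the rendered string catches accidental regressions.
export function buildSystemPrompt(toolSchemas: string): string {
  return [
    "You are a payments assistant.",
    "Respond ONLY with a JSON tool call matching one of these schemas:",
    toolSchemas,
    "If the request is ambiguous, ask for human input instead of guessing.",
  ].join("\n");
}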

3 Own Your Context Window

Don't rely solely on standard message formats. Engineer your context for maximum effectiveness—this is your primary interface with the LLM.

"At any given point, your input to an LLM in an agent is 'here's what's happened so far, what's the next step'"

Consider custom formats that optimize for:

  • Token efficiency
  • Information density
  • LLM comprehension
  • Easy human debugging
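
As a sketch of what owning the context can look like, here is a hypothetical TypeScript serializer that renders the event history in a compact XML-ish format instead of raw chat-message JSON (the format itself is illustrative):

type AgentEvent = { type: string; data: unknown };

// Hypothetical custom context format: denser than standard message JSON and
// easy for a human to scan when debugging a trace.
function renderContext(events: AgentEvent[]): string {
  return events
    .map((e) => `<${e.type}>\n${JSON.stringify(e.data)}\n</${e.type}>`)
    .join("\n\n");
}

// renderContext([{ type: "slack_message", data: { from: "jeff", text: "deploy?" } }]) yields:
// <slack_message>
// {"from":"jeff","text":"deploy?"}
// </slack_message>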

4 Tools Are Just Structured Outputs

Tools don't need to be complex. They're just structured JSON output from your LLM that triggers deterministic code. This creates clean separation between LLM decision-making and your application's actions.


# deterministic dispatch on the LLM's structured output
if next_step.intent == 'create_payment_link':
    stripe.PaymentLink.create(**next_step.parameters)
elif next_step.intent == 'wait_for_approval':
    pass  # pause and wait for human intervention
else:
    pass  # handle unknown tool calls
       

5 Unify Execution State and Business State

Simplify by unifying execution state (current step, waiting status) with business state (what's happened so far). This reduces complexity and makes systems easier to debug and maintain.

Benefits:

  • One source of truth for all state
  • Trivial serialization/deserialization
  • Complete history visibility
  • Easy recovery and forking
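
A minimal sketch, reusing the AgentEvent type from the Factor 3 example (the shape is illustrative, not prescribed by the methodology):

// One serializable structure holds both kinds of state, so persistence is
// just "load thread, append event, save thread".
interface Thread {
  id: string;
  status: "running" | "awaiting_human" | "done"; // execution state
  events: AgentEvent[]; // business state: everything that has happened
}

function appendEvent(thread: Thread, type: string, data: unknown): Thread {
  return { ...thread, events: [...thread.events, { type, data }] };
}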

6 Launch/Pause/Resume with Simple APIs

Agents should be easy to launch, pause when long-running operations are needed, and resume from where they left off. This enables durable, reliable workflows that can handle interruptions.
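
Continuing the sketch, the API surface can be three small endpoints. The Express routes below are purely illustrative, and createThread, loadThread, and resumeAgent are stand-ins for your own persistence and agent loop:

import express from "express";

// Hypothetical stand-ins for your own stack.
declare function createThread(input: unknown): Promise<Thread>;
declare function loadThread(id: string): Promise<Thread>;
declare function resumeAgent(thread: Thread): Promise<void>;

const app = express();
app.use(express.json());

// Launch: create a thread and start the loop.
app.post("/threads", async (req, res) => {
  const thread = await createThread(req.body);
  res.json({ id: thread.id });
});

// Pause is just persisted state; inspect it at any time.
app.get("/threads/:id", async (req, res) => {
  res.json(await loadThread(req.params.id));
});

// Resume: append an event (e.g. an approval) and continue where the agent left off.
app.post("/threads/:id/events", async (req, res) => {
  const thread = await loadThread(req.params.id);
  await resumeAgent(appendEvent(thread, req.body.type, req.body.data));
  res.sendStatus(202);
});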

7 Contact Humans with Tool Calls

Make human interaction just another tool call. Instead of forcing the LLM to choose between returning text or structured data, always use structured output with intents like request_human_input or done_for_now.

This enables:

  • Clear instructions for different types of human contact
  • Workflows that start with Agent→Human rather than Human→Agent
  • Multiple human coordination
  • Multi-agent communication
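
In code terms, the idea is one discriminated union for everything the model can return, human contact included. A hypothetical TypeScript sketch (the field names are illustrative; the payment intent echoes the Factor 1 example):

type NextStep =
  | { intent: "create_payment_link"; parameters: { amount: number; customer: string; memo: string } }
  | { intent: "request_human_input"; message: string; urgency: "low" | "high" }
  | { intent: "done_for_now"; summary: string };

// Routing on the intent is then ordinary code, whether the recipient is an
// API, a human in Slack, or another agent.
function isHumanStep(step: NextStep): boolean {
  return step.intent === "request_human_input" || step.intent === "done_for_now";
}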

8 Own Your Control Flow

Build custom control structures for your specific use case. Different tool calls may require breaking out of loops to wait for human responses or long-running tasks.

Critical capability: Interrupt agents between tool selection and tool invocation—essential for human approval workflows.
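
Putting the earlier sketches together, an owned loop might look like this; determineNextStep, needsApproval, and executeTool are hypothetical stand-ins for the LLM call, your approval policy, and your deterministic tool code:

declare function determineNextStep(context: string): Promise<NextStep>;
declare function needsApproval(step: NextStep): boolean;
declare function executeTool(step: NextStep): Promise<unknown>;

async function agentLoop(thread: Thread): Promise<Thread> {
  while (true) {
    const step = await determineNextStep(renderContext(thread.events));
    // Break between tool *selection* and tool *invocation*: persist the
    // thread and exit; a later event (e.g. an approval) resumes the loop.
    if (step.intent === "request_human_input" || needsApproval(step)) {
      return { ...thread, status: "awaiting_human" };
    }
    if (step.intent === "done_for_now") {
      return { ...thread, status: "done" };
    }
    const result = await executeTool(step);
    thread = appendEvent(thread, `${step.intent}_result`, result);
  }
}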

9 Compact Errors into Context Window

When errors occur, compact them into useful context rather than letting them break the agent loop. This improves reliability and enables agents to learn from and recover from failures.
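
Continuing the running sketch, one way to implement this (the three-strike cap is an arbitrary illustration):

async function tryTool(step: NextStep, thread: Thread): Promise<Thread> {
  try {
    const result = await executeTool(step);
    return appendEvent(thread, `${step.intent}_result`, result);
  } catch (err) {
    // If the last three events were all errors, stop self-healing and escalate.
    const recentErrors = thread.events.slice(-3).filter((e) => e.type === "error");
    if (recentErrors.length === 3) throw err;
    // Compact representation: tool name and message, not a full stack trace.
    return appendEvent(thread, "error", { tool: step.intent, message: String(err) });
  }
}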

10 Small, Focused Agents

Build agents that do one thing well. Even as LLMs get more powerful, focused agents are easier to debug, test, and maintain than monolithic ones.

11 Trigger from Anywhere, Meet Users Where They Are

Agents should be triggerable from any interface—webhooks, cron jobs, Slack, email, APIs. Don't lock users into a single interaction mode.

12 Make Your Agent a Stateless Reducer

Design your agent as a pure function that takes the current state and an event, returning the new state. This functional approach improves testability and reasoning about agent behavior.
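
Closing out the running sketch, the whole agent turn collapses to a pure function (the approval_denied event type is hypothetical):

// (state, event) -> state: no hidden instance variables, no side effects.
function reduce(thread: Thread, event: AgentEvent): Thread {
  const next = appendEvent(thread, event.type, event.data);
  return event.type === "approval_denied" ? { ...next, status: "done" } : next;
}

Replaying a recorded event list through reduce reproduces any historical state exactly, which is what makes testing, forking, and debugging cheap.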

Enterprise Benefits

🔒 Security & Compliance

Human-in-the-loop approvals for sensitive operations, audit trails through structured state, and controlled execution environments.

📊 Observability

Complete visibility into agent decision-making, structured logs, and easy debugging through unified state management.

⚡ Reliability

Graceful error handling, pause/resume capabilities, and deterministic execution for mission-critical operations.

🔧 Maintainability

Version-controlled prompts, testable components, and modular architecture that evolves with your needs.

📈 Scalability

Stateless design, simple APIs, and focused agents that can be deployed and scaled independently.

๐Ÿค Integration

Works with existing systems, doesn't require complete rewrites, and meets users where they already work.

Real-World Implementation

Unlike theoretical frameworks, 12 Factor Agents has emerged from real production experience. The methodology comes from builders who have:

  • Built and deployed customer-facing AI agents
  • Tested every major agent framework
  • Worked with hundreds of technical founders
  • Learned from production failures and successes
"Most of them are rolling the stack themselves. I don't see a lot of frameworks in production customer-facing agents."

Getting Started

The beauty of 12 Factor Agents is that you don't need to implement all factors at once. Start with the factors most relevant to your current challenges:

  1. Experiencing prompt issues? Start with Factor 2 (Own Your Prompts)
  2. Need human oversight? Implement Factor 7 (Contact Humans with Tool Calls)
  3. Debugging problems? Focus on Factor 5 (Unify State) and Factor 3 (Own Context Window)
  4. Reliability concerns? Implement Factor 6 (Launch/Pause/Resume) and Factor 8 (Own Control Flow)

The Future of Enterprise AI

As AI becomes critical infrastructure for enterprises, the principles that made web applications reliable and scalable become essential for AI systems too. 12 Factor Agents provides that foundation—battle-tested engineering practices adapted for the unique challenges of LLM-powered applications.

Key Takeaway: Great agents aren't just about having the right model or the perfect prompt. They're about applying solid software engineering principles to create systems that work reliably in the real world.

The methodology acknowledges that even as LLMs continue to get exponentially more powerful, there will always be core engineering techniques that make LLM-powered software more reliable, scalable, and maintainable.

Learn More

The complete 12 Factor Agents methodology, including detailed examples, code samples, and workshops, is available at github.com/humanlayer/12-factor-agents. The project is open source and actively maintained by the community.

For enterprises looking to implement production-grade AI agents, 12 Factor Agents provides the roadmap from proof-of-concept to production-ready system—one factor at a time.

Friday, August 01, 2025

Building a Modern React Frontend for Movie Vibes: A Journey Through CSS Frameworks, AI Timeouts, and Real-World Development

How it started ...

A couple of days ago, I shared the creation of Movie Vibes, an AI-powered Spring Boot application that analyzes movie "vibes" using Spring AI and Ollama. The backend was working beautifully, but it was time to build a proper user interface. What started as a simple "add React + Tailwind" task turned into an educational journey through modern frontend development challenges, framework limitations, and the beauty of getting back to fundamentals.


How it's going ... 

The Original Plan: React + Tailwind CSS

The plan seemed straightforward:

  • ✅ React 18 + TypeScript for the frontend
  • ✅ Tailwind CSS for rapid styling
  • ✅ Modern, responsive design
  • ✅ Quick development cycle

How hard could it be? Famous last words.


The Tailwind CSS Nightmare

The Promise vs. Reality

Tailwind CSS markets itself as a "utility-first CSS framework" that accelerates development. In theory, you get: 

  • Rapid prototyping with utility classes
  • Consistent design tokens
  • Smaller CSS bundles
  • No context switching between CSS and HTML

In practice, with Create React App and Tailwind v4, we got:

  • 🚫 Build failures due to PostCSS plugin incompatibilities
  • 🚫 Cryptic error messages about plugin configurations
  • 🚫 Hours of debugging CRACO configurations
  • 🚫 Version conflicts between Tailwind v4 and CRA's PostCSS setup

The Technical Issues

The error that started it all:
Error: Loading PostCSS Plugin failed: tailwindcss directly as a PostCSS plugin has moved to @tailwindcss/postcss

We tried multiple solutions:

  1. CRACO configuration - Failed with plugin conflicts
  2. Downgrading to Tailwind v3 - Still had PostCSS issues
  3. Custom PostCSS config - Broke Create React App's build process
  4. Ejecting CRA - Nuclear option, but defeats the purpose

The Breaking Point

After spending more time debugging Tailwind than actually building features, I made a decision: dump Tailwind entirely. Sometimes the best solution is the simplest one.

The Pure CSS Renaissance

Going Back to Fundamentals

Instead of fighting with framework abstractions, we built a custom CSS design system that:

  • Compiles instantly - No build step complications
  • Full control - Every pixel exactly where we want it
  • No dependencies - Zero external CSS frameworks
  • Better performance - Only the CSS we actually use
  • Maintainable - Clear, semantic class names

The CSS Architecture


          /* Semantic, maintainable class names */
          .movie-card {
            background: white;
            border-radius: 12px;
            box-shadow: 0 4px 6px -1px rgba(0, 0, 0, 0.1);
            transition: box-shadow 0.3s ease;
          }

          .movie-card:hover {
            box-shadow: 0 20px 25px -5px rgba(0, 0, 0, 0.1);
          }

          /* Responsive design without utility class bloat */
          @media (max-width: 768px) {
            .movie-card {
              /* Mobile-specific styles */
            }
          }
          

Compare this to Tailwind's approach:


<!-- Tailwind: Utility class soup -->
<div className="bg-white rounded-xl shadow-lg p-6 hover:shadow-2xl 
            	transition-shadow duration-300 md:p-8 lg:p-10">
        
Our approach is more readable, maintainable, and debuggable.

The AI Timeout Challenge

The Problem

Once the UI was working, we discovered a new issue: AI operations take time. Our local Ollama model could take 30-60 seconds to analyze a movie and generate recommendations. The frontend was timing out before the AI finished processing.

The Solution

We implemented a comprehensive timeout strategy:

// 2-minute timeout for AI operations ('/api/analyze' stands in for the real endpoint)
const controller = new AbortController();
const timeoutId = setTimeout(() => controller.abort(), 120000);
const response = await fetch('/api/analyze', { signal: controller.signal });
clearTimeout(timeoutId);

// User-friendly loading messages
<p className="loading-text">
  Please wait, this process can take 30-60 seconds while our AI agent 
  analyzes the movie and generates recommendations ✨
</p>

Key improvements:

  • ⏱️ Extended timeout to 2 minutes for AI operations
  • 🎯 Clear user expectations with realistic time estimates
  • 🔄 Graceful error handling with timeout-specific messages
  • 📱 Loading states that don't feel broken

The Poster Image Quest

Backend Enhancement

The original backend only returned movie titles in recommendations. Users expect to see poster images! We enhanced the system to:

  1. Fetch complete metadata for the main movie ✅
  2. Parse AI-generated recommendations to extract movie titles
  3. Query OMDb API for each recommendation's metadata
  4. Include poster URLs in the API response (see the sketch below)
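
The production code for steps 2 through 4 is Java/Spring, but the shape of the enrichment step is small enough to sketch in TypeScript; the OMDb t and apikey query parameters are real, while parseTitles stands in for however the recommendation titles get extracted from the AI text:

declare function parseTitles(aiText: string): string[]; // hypothetical title extractor

async function enrichRecommendations(aiText: string, apiKey: string) {
  const titles = parseTitles(aiText).slice(0, 5); // cap at 5 to limit API calls
  return Promise.all(
    titles.map(async (title) => {
      const res = await fetch(
        `https://www.omdbapi.com/?t=${encodeURIComponent(title)}&apikey=${apiKey}`
      );
      const movie = await res.json();
      // Fall back gracefully when OMDb has no match for a recommended title.
      return movie.Response === "True"
        ? { title: movie.Title, poster: movie.Poster, year: movie.Year, imdbRating: movie.imdbRating }
        : { title, poster: null };
    })
  );
}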

Performance Optimization

To balance richness with performance:

  • 🎯 Limit to 5 recommendations to avoid excessive API calls
  • 🛡️ Fallback handling when movie metadata isn't found
  • 📊 Detailed logging for debugging and monitoring

The Final Architecture

Frontend Stack

  • React 18 + TypeScript - Modern, type-safe development
  • Pure CSS - Custom utility system, no framework dependencies
  • Responsive Design - Mobile-first approach
  • Error Boundaries - Graceful handling of failures

Backend Enhancements

  • Spring Boot 3.x - Robust, production-ready API
  • Spring AI + Ollama - Local LLM for movie analysis
  • OMDb API Integration - Rich movie metadata
  • Intelligent Caching - Future enhancement opportunity

API Evolution


          {
            "movie": {
              "title": "Mission: Impossible",
              "poster": "https://...",
              "year": "1996",
              "imdbRating": "7.2",
              "plot": "Full plot description..."
            },
            "vibeAnalysis": "An exhilarating action-adventure...",
            "recommendations": [
              {
                "title": "The Bourne Identity",
                "poster": "https://...",
                "year": "2002",
                "imdbRating": "7.9"
              }
            ]
          }
         

Lessons Learned

1. Framework Complexity vs. Value

Tailwind's Promise: Rapid development with utility classes
Reality: Build system complexity that outweighs benefits

Sometimes vanilla CSS is the better choice. Modern CSS is incredibly powerful:

  • CSS Grid and Flexbox for layouts
  • CSS Custom Properties for theming
  • CSS Container Queries for responsive design
  • CSS-in-JS when you need dynamic styles

2. AI UX Considerations

Building AI-powered applications requires different UX patterns:

  • Longer wait times are normal and expected
  • 📢 Clear communication about processing time
  • 🔄 Progressive disclosure of results
  • 🛡️ Robust error handling for AI failures

3. API Design Evolution

Starting simple and evolving based on frontend needs:

  • 🎯 Backend-driven initially (simple JSON responses)
  • 🎨 Frontend-driven enhancement (rich metadata)
  • 🔄 Backward compatibility during transitions

4. The Beauty of Fundamentals

Modern development often pushes us toward complex abstractions, but sometimes the simplest solution is the best:

  • Pure CSS over CSS frameworks
  • Semantic HTML over div soup
  • Progressive enhancement over JavaScript-heavy approaches

Performance Results

After our optimizations:

  • 🚀 Build time: 3 seconds (was 45+ seconds with Tailwind debugging)
  • 📦 Bundle size: 15% smaller without Tailwind dependencies
  • Development experience: Hot reload works consistently
  • 🎯 User experience: Clear loading states, beautiful poster images

What's Next?

The Movie Vibes application is now production-ready with:

  • ✅ Beautiful, responsive UI
  • ✅ AI-powered movie analysis
  • ✅ Rich movie metadata with posters
  • ✅ Robust error handling
  • ✅ 2-minute AI operation support

Future enhancements could include:

  • 🗄️ Caching layer for popular movies
  • 👥 User accounts and favorites
  • 🌙 Dark mode theme
  • 🐳 Docker deployment setup
  • 🧪 Comprehensive testing suite

Conclusion: Embrace Simplicity

This journey reinforced a fundamental principle: complexity should solve real problems, not create them.

Tailwind CSS promised to accelerate our development but instead became a roadblock. Pure CSS, with its directness and simplicity, delivered exactly what we needed without the framework overhead.

Building AI-powered applications comes with unique challenges - long processing times, complex data transformations, and user experience considerations that traditional web apps don't face. Focus on solving these real problems rather than fighting your tools.

Sometimes the best framework is no framework at all.

Try Movie Vibes yourself:

  • Backend: mvn spring-boot:run
  • Frontend: npm start
  • Search for your favorite movie and discover its vibe! 🎬✨

What's your experience with CSS frameworks? Have you found cases where vanilla CSS outperformed framework solutions? Share your thoughts in the comments!

Tech Stack:

  • Spring Boot 3.x + Spring AI
  • React 18 + TypeScript
  • Pure CSS (Custom Design System)
  • Ollama (Local LLM)
  • OMDb API


GitHub: tyrell/movievibes 

Building a Model Context Protocol (MCP) Server for Movie Data: A Deep Dive into Modern AI Integration



The Challenge: Bringing Movie Data to AI Assistants


As AI assistants become increasingly sophisticated, there's a growing need for them to access real-time, structured data from external APIs. While many AI models have impressive knowledge, they often lack access to current information or specialized databases. This is where the Model Context Protocol (MCP) comes in—a standardized way for AI systems to interact with external data sources and tools.

Today, I want to share my experience building an MCP server that bridges AI assistants with the Open Movie Database (OMDB) API, allowing any MCP-compatible AI to search for movies, retrieve detailed film information, and provide users with up-to-date movie data.


What is the Model Context Protocol?

The Model Context Protocol is an emerging standard that enables AI assistants to safely and efficiently interact with external tools and data sources. Think of it as a universal translator that allows AI models to:

  • 🔍 Search external databases
  • 🛠️ Execute specific tools and functions
  • 📊 Retrieve real-time data
  • Integrate seamlessly with existing systems

MCP servers act as intermediaries, exposing external APIs through a standardized JSON-RPC interface that AI assistants can understand and interact with safely.


The Project: OMDB MCP Server

I decided to build an MCP server for the Open Movie Database (OMDB) API—a comprehensive movie database that provides detailed information about films, TV shows, and series. The goal was to create a production-ready server that would allow AI assistants to:

  1. Search for movies by title, year, and type
  2. Get detailed movie information including plot, cast, ratings, and awards
  3. Look up movies by IMDB ID for precise identification


Technical Architecture


Core Technologies

  • Spring Boot 3.5.4 - For the robust web framework
  • Java 21 - Taking advantage of modern language features
  • WebFlux & Reactive WebClient - For non-blocking, asynchronous API calls
  • Maven - For dependency management and build automation

MCP Protocol Implementation

The server implements three core MCP endpoints:


1. Protocol Handshake (initialize)

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2024-11-05",
    "capabilities": {},
    "clientInfo": {"name": "ai-client", "version": "1.0.0"}
  }
}

2. Tool Discovery (tools/list)

Returns available tools that the AI can use:

  • search_movies
  • get_movie_details
  • get_movie_by_imdb_id

3. Tool Execution (tools/call)

Executes the requested tool with provided arguments and returns formatted results.
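
To make the round trip concrete, here is a minimal TypeScript client posting a tools/call request to the server's /mcp endpoint (the same endpoint the curl tests below exercise):

async function callTool(name: string, args: Record<string, unknown>) {
  const res = await fetch("http://localhost:8080/mcp", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      jsonrpc: "2.0",
      id: 1,
      method: "tools/call",
      params: { name, arguments: args },
    }),
  });
  return res.json();
}

// e.g. await callTool("search_movies", { title: "Matrix" })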


Smart Error Handling

One of the key challenges was implementing robust error handling. The server includes:

  • Input validation for required parameters
  • Graceful API failure handling with meaningful error messages
  • Timeout configuration to prevent hanging requests
  • Detailed logging for debugging and monitoring


Real-World Challenges and Solutions


Challenge 1: HTTPS Migration

Initially, the OMDB API calls were failing due to (my AI assistant 🤨 ) using HTTP instead of HTTPS. Modern APIs increasingly require secure connections.

Solution: Updated all API calls to use HTTPS and configured the WebClient with proper SSL handling.
 

Challenge 2: DNS Resolution on macOS

Encountered Netty DNS resolution warnings that could impact performance on macOS systems.

Solution: Added the native macOS DNS resolver dependency:

<dependency>
    <groupId>io.netty</groupId>
    <artifactId>netty-resolver-dns-native-macos</artifactId>
    <classifier>osx-aarch_64</classifier>
</dependency>

Challenge 3: Response Formatting

Raw OMDB API responses needed to be formatted for optimal AI consumption.

Solution: Created custom formatters that present movie data in a structured, readable format:

private String formatMovieDetails(OmdbMovie movie) {
    StringBuilder sb = new StringBuilder();
    sb.append("๐ŸŽฌ ").append(movie.getTitle()).append(" (").append(movie.getYear()).append(")\n\n");
    
    if (movie.getRated() != null) sb.append("Rating: ").append(movie.getRated()).append("\n");
    if (movie.getRuntime() != null) sb.append("Runtime: ").append(movie.getRuntime()).append("\n");
    // ... additional formatting
    
    return sb.toString();
}

Example Usage

Once deployed, AI assistants can interact with the server naturally:

User: "Find movies about artificial intelligence from the 1990s"

AI Assistant (via MCP): Calls search_movies with parameters:

{
  "title": "artificial intelligence", 
  "year": "1990s"
}

Result: Formatted list of AI-themed movies from the 1990s with IMDB IDs for further lookup.


Key Features


🚀 Production Ready

  • Comprehensive error handling
  • Input validation
  • Configurable timeouts
  • Detailed logging

Performance Optimized

  • Reactive, non-blocking architecture
  • Connection pooling
  • Efficient memory usage

🔧 Developer Friendly

  • Complete documentation
  • Test scripts included
  • Easy configuration
  • Docker-ready

๐ŸŒ Standards Compliant

  • Full MCP 2024-11-05 specification compliance
  • JSON-RPC 2.0 protocol
  • RESTful API design


Testing and Validation

The project includes comprehensive testing:

# Health check
curl http://localhost:8080/mcp/health

# Search for movies
curl -X POST http://localhost:8080/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc": "2.0", "method": "tools/call", 
"params": {"name": "search_movies", 
"arguments": {"title": "Matrix"}}}'

Lessons Learned


1. Protocol Standards Matter

Following the MCP specification exactly ensured compatibility with different AI clients without modification.

2. Error Handling is Critical

In AI integrations, clear error messages help both developers and AI systems understand and recover from failures.

3. Documentation Drives Adoption

Comprehensive documentation with examples makes the difference between a useful tool and one that sits unused.

4. Modern Java is Powerful

Java 21 features like pattern matching and records significantly improved code readability and maintainability.


Future Enhancements

The current implementation is just the beginning. Future enhancements could include:

  • Caching layer for frequently requested movies
  • Rate limiting to respect API quotas
  • Additional data sources (e.g., The Movie Database API)
  • Advanced search features (genre filtering, rating ranges)
  • Recommendation engine integration


Try It Yourself

The complete source code is available on GitHub: github.com/tyrell/omdb-mcp-server

To get started:

  1. Clone the repository
  2. Get a free OMDB API key from omdbapi.com
  3. Set your API key: export OMDB_API_KEY=your-key
  4. Run: mvn spring-boot:run
  5. Test: curl http://localhost:8080/mcp/health 


Conclusion

Building this MCP server was an excellent introduction to the Model Context Protocol and its potential for enhancing AI capabilities. The project demonstrates how modern Java frameworks like Spring Boot can be used to create robust, production-ready integrations between AI systems and external APIs.

As AI assistants become more prevalent, tools like MCP servers will become essential infrastructure—bridging the gap between AI intelligence and real-world data. The movie database server is just one example, but the same patterns can be applied to any API or data source.

The future of AI isn't just about smarter models; it's about giving those models access to the vast ecosystem of data and tools that power our digital world. MCP servers are a key piece of that puzzle.



Want to discuss this project or share your own MCP server experiences? Feel free to reach out or contribute to the project on GitHub!


Technical Specifications

  • Language: Java 21
  • Framework: Spring Boot 3.5.4
  • Protocol: MCP 2024-11-05
  • API: OMDB (Open Movie Database)
  • Architecture: Reactive, Non-blocking
  • License: MIT
  • Status: Production Ready

Repository Structure

omdb-mcp-server/
├── src/main/java/co/tyrell/omdb_mcp_server/
│   ├── controller/     # REST endpoints
│   ├── service/        # Business logic
│   ├── model/          # Data models
│   └── config/         # Configuration
├── README.md           # Complete documentation
├── test-scripts/       # Testing utilities
└── LICENSE             # MIT License

GitHub: https://github.com/tyrell/omdb-mcp-server