Wednesday, August 06, 2025

12 Factor Agents: Building Enterprise-Grade AI Systems

The Challenge: Most AI agents fail to meet production standards. They work great in demos but fall apart when faced with real-world enterprise requirements: reliability, scalability, maintainability, and security.

The Solution: 12 Factor Agents - a methodology inspired by the battle-tested 12 Factor App principles, adapted specifically for building production-ready AI agent systems.

Why Traditional Agent Frameworks Fall Short

After working with hundreds of AI builders and testing every major agent framework, a clear pattern emerges: 80% quality isn't good enough for customer-facing features. Most builders hit a wall where they need to reverse-engineer their chosen framework to achieve production quality, ultimately starting over from scratch.

"I've been surprised to find that most products billing themselves as 'AI Agents' are not all that agentic. A lot of them are mostly deterministic code, with LLM steps sprinkled in at just the right points to make the experience truly magical."
— Dex Horthy, Creator of 12 Factor Agents

The problem isn't with frameworks themselves—it's that good agents are mostly just software, not the "here's your prompt, here's a bag of tools, loop until you hit the goal" pattern that many frameworks promote.

What Are 12 Factor Agents?

12 Factor Agents is a methodology that provides core engineering principles for building LLM-powered software that's reliable, scalable, and maintainable. Rather than enforcing a specific framework, it offers modular concepts that can be incorporated into existing products.

Key Insight: The fastest way to get high-quality AI software in customers' hands is to take small, modular concepts from agent building and incorporate them into existing products—not to rebuild everything from scratch.

The 12 Factors Explained

1 Natural Language to Tool Calls

Convert natural language directly into structured tool calls. This is the fundamental pattern that enables agents to reason about tasks and execute them deterministically.


"create a payment link for $750 to Jeff" 
→ 
{
  "function": "create_payment_link",
  "parameters": {
    "amount": 750,
    "customer": "cust_128934ddasf9",
    "memo": "Payment for service"
  }
}
        

2 Own Your Prompts

Don't outsource prompt engineering to frameworks. Treat prompts as first-class code that you can version, test, and iterate on. Black-box prompting limits your ability to optimize performance.

Benefits:

  • Full control over instructions
  • Testable and version-controlled prompts
  • Fast iteration based on real-world performance
  • Transparency in what your agent is working with
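
Concretely, a minimal sketch of "prompts as code" might look like the following (the names here are hypothetical, not taken from the 12-factor repo): the prompt is an ordinary function in your repo, so it gets version history, diffs, and unit tests like any other code.

# Hypothetical sketch: the prompt lives in the codebase as plain code.
def next_step_prompt(context: str) -> str:
    return (
        "You are a deployment assistant.\n\n"
        "Here is everything that has happened so far:\n"
        f"{context}\n\n"
        "Decide the single next step and respond with a JSON tool call."
    )


def test_prompt_asks_for_tool_call():
    # Prompts as code means prompts get tests too.
    assert "JSON tool call" in next_step_prompt("user asked to deploy api v1.2.3")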

3 Own Your Context Window

Don't rely solely on standard message formats. Engineer your context for maximum effectiveness—this is your primary interface with the LLM.

"At any given point, your input to an LLM in an agent is 'here's what's happened so far, what's the next step'"

Consider custom formats that optimize for:

  • Token efficiency
  • Information density
  • LLM comprehension
  • Easy human debugging
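
As one illustration (the format itself is entirely up to you, and these event names are made up), the event history can be rendered into a dense, readable context block instead of a standard chat-message array:

# Hypothetical sketch: render the agent's event history as a compact,
# human-readable context block rather than standard chat messages.
def render_context(events):
    lines = []
    for event in events:
        # e.g. {"type": "slack_message", "data": "deploy the backend please"}
        lines.append(f"<{event['type']}>\n{event['data']}\n</{event['type']}>")
    lines.append("What is the next step?")
    return "\n\n".join(lines)


print(render_context([
    {"type": "slack_message", "data": "can you deploy the backend?"},
    {"type": "list_git_tags_result", "data": "v1.2.3 (latest), v1.2.2"},
]))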

4 Tools Are Just Structured Outputs

Tools don't need to be complex. They're just structured JSON output from your LLM that triggers deterministic code. This creates clean separation between LLM decision-making and your application's actions.


# Deterministic code routes the structured output to the right action
if nextStep.intent == 'create_payment_link':
    stripe.PaymentLink.create(**nextStep.parameters)
elif nextStep.intent == 'wait_for_approval':
    pass  # pause and wait for human intervention
else:
    pass  # handle unknown tool calls
       

5 Unify Execution State and Business State

Simplify by unifying execution state (current step, waiting status) with business state (what's happened so far). This reduces complexity and makes systems easier to debug and maintain.

Benefits:

  • One source of truth for all state
  • Trivial serialization/deserialization
  • Complete history visibility
  • Easy recovery and forking
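
A minimal sketch of this idea, assuming a single append-only event list (field names are illustrative, not prescribed by the methodology):

# Hypothetical sketch: one append-only thread holds both business events
# (what happened) and execution state (where the agent is right now).
import json

thread = {
    "events": [
        {"type": "user_request", "data": "refund order #1234"},
        {"type": "tool_call", "data": {"intent": "lookup_order", "order_id": "1234"}},
        {"type": "tool_result", "data": {"status": "shipped"}},
        {"type": "awaiting_approval", "data": "refund requires human sign-off"},
    ]
}

# Serialization is trivial, so pausing, resuming, or forking the agent
# is just a matter of persisting and reloading this one structure.
snapshot = json.dumps(thread)
restored = json.loads(snapshot)
assert restored == thread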

6 Launch/Pause/Resume with Simple APIs

Agents should be easy to launch, pause when long-running operations are needed, and resume from where they left off. This enables durable, reliable workflows that can handle interruptions.

7 Contact Humans with Tool Calls

Make human interaction just another tool call. Instead of forcing the LLM to choose between returning text or structured data, always use structured output with intents like request_human_input or done_for_now.

This enables:

  • Clear instructions for different types of human contact
  • Workflows that start with Agent→Human rather than Human→Agent
  • Multiple human coordination
  • Multi-agent communication
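
A small sketch of what this can look like, with hypothetical intent and field names: contacting a human is handled by the same deterministic dispatch code as any other tool.

# Hypothetical sketch: the model always emits a structured step, and
# contacting a human is just another intent handled deterministically.
def handle_step(step: dict) -> None:
    intent = step["intent"]
    if intent == "request_human_input":
        # e.g. post to Slack or email, then park the thread until a reply arrives
        print(f"Asking a human: {step['question']}")
    elif intent == "done_for_now":
        print(f"Agent finished: {step['summary']}")
    else:
        print(f"Executing tool: {intent} with {step.get('parameters', {})}")


handle_step({"intent": "request_human_input",
             "question": "Can I proceed with the production deploy?"})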

8 Own Your Control Flow

Build custom control structures for your specific use case. Different tool calls may require breaking out of loops to wait for human responses or long-running tasks.

Critical capability: Interrupt agents between tool selection and tool invocation—essential for human approval workflows.
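
Here is a rough sketch of an owned loop that pauses between tool selection and tool invocation (function and field names are illustrative):

# Hypothetical sketch of an owned control loop: the agent breaks out
# between selecting a tool and invoking it when approval is required.
APPROVAL_REQUIRED = {"create_payment_link", "deploy_to_production"}


def agent_loop(thread, llm_next_step, execute_tool):
    while True:
        step = llm_next_step(thread)          # LLM picks the next tool call
        thread["events"].append({"type": "tool_call", "data": step})

        if step["intent"] == "done_for_now":
            return thread
        if step["intent"] in APPROVAL_REQUIRED:
            thread["status"] = "awaiting_approval"
            return thread                     # pause here; resume after a human approves

        result = execute_tool(step)           # deterministic tool execution
        thread["events"].append({"type": "tool_result", "data": result})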

9 Compact Errors into Context Window

When errors occur, compact them into useful context rather than letting them break the agent loop. This improves reliability and enables agents to learn from and recover from failures.
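
One possible shape for this, sketched with illustrative names and a simple retry budget:

# Hypothetical sketch: failed tool calls are summarized back into the
# context window so the model can self-correct, up to a retry budget.
def run_tool_with_recovery(thread, step, execute_tool, max_errors=3):
    errors = 0
    while errors < max_errors:
        try:
            result = execute_tool(step)
            thread["events"].append({"type": "tool_result", "data": result})
            return result
        except Exception as exc:
            errors += 1
            # Compact the failure into a short, useful event instead of a raw stack trace
            thread["events"].append({
                "type": "error",
                "data": f"{step['intent']} failed ({type(exc).__name__}): {exc}",
            })
    raise RuntimeError("too many consecutive tool failures; escalate to a human")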

10 Small, Focused Agents

Build agents that do one thing well. Even as LLMs get more powerful, focused agents are easier to debug, test, and maintain than monolithic ones.

11 Trigger from Anywhere, Meet Users Where They Are

Agents should be triggerable from any interface—webhooks, cron jobs, Slack, email, APIs. Don't lock users into a single interaction mode.

12 Make Your Agent a Stateless Reducer

Design your agent as a pure function that takes the current state and an event, returning the new state. This functional approach improves testability and reasoning about agent behavior.
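
A minimal sketch of the reducer idea (the state shape and event types are hypothetical): given the same state and event, the function always returns the same new state, which makes unit testing and replaying histories straightforward.

# Hypothetical sketch: the agent step is a pure function of (state, event).
def agent_reducer(state: dict, event: dict) -> dict:
    events = state.get("events", []) + [event]
    status = "awaiting_approval" if event.get("type") == "approval_needed" else "running"
    return {**state, "events": events, "status": status}


state = {"events": [], "status": "running"}
state = agent_reducer(state, {"type": "user_request", "data": "summarize this PR"})
state = agent_reducer(state, {"type": "approval_needed", "data": "posting a public comment"})
assert state["status"] == "awaiting_approval"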

Enterprise Benefits

🔒 Security & Compliance

Human-in-the-loop approvals for sensitive operations, audit trails through structured state, and controlled execution environments.

📊 Observability

Complete visibility into agent decision-making, structured logs, and easy debugging through unified state management.

⚡ Reliability

Graceful error handling, pause/resume capabilities, and deterministic execution for mission-critical operations.

🔧 Maintainability

Version-controlled prompts, testable components, and modular architecture that evolves with your needs.

📈 Scalability

Stateless design, simple APIs, and focused agents that can be deployed and scaled independently.

🤝 Integration

Works with existing systems, doesn't require complete rewrites, and meets users where they already work.

Real-World Implementation

Unlike theoretical frameworks, 12 Factor Agents has emerged from real production experience. The methodology comes from builders who have:

  • Built and deployed customer-facing AI agents
  • Tested every major agent framework
  • Worked with hundreds of technical founders
  • Learned from production failures and successes
"Most of them are rolling the stack themselves. I don't see a lot of frameworks in production customer-facing agents."

Getting Started

The beauty of 12 Factor Agents is that you don't need to implement all factors at once. Start with the factors most relevant to your current challenges:

  1. Experiencing prompt issues? Start with Factor 2 (Own Your Prompts)
  2. Need human oversight? Implement Factor 7 (Contact Humans with Tool Calls)
  3. Debugging problems? Focus on Factor 5 (Unify State) and Factor 3 (Own Context Window)
  4. Reliability concerns? Implement Factor 6 (Launch/Pause/Resume) and Factor 8 (Own Control Flow)

The Future of Enterprise AI

As AI becomes critical infrastructure for enterprises, the principles that made web applications reliable and scalable become essential for AI systems too. 12 Factor Agents provides that foundation—battle-tested engineering practices adapted for the unique challenges of LLM-powered applications.

Key Takeaway: Great agents aren't just about having the right model or the perfect prompt. They're about applying solid software engineering principles to create systems that work reliably in the real world.

The methodology acknowledges that even as LLMs continue to get exponentially more powerful, there will always be core engineering techniques that make LLM-powered software more reliable, scalable, and maintainable.

Learn More

The complete 12 Factor Agents methodology, including detailed examples, code samples, and workshops, is available at github.com/humanlayer/12-factor-agents. The project is open source and actively maintained by the community.

For enterprises looking to implement production-grade AI agents, 12 Factor Agents provides the roadmap from proof-of-concept to production-ready system—one factor at a time.

Friday, August 01, 2025

Building a Modern React Frontend for Movie Vibes: A Journey Through CSS Frameworks, AI Timeouts, and Real-World Development

How it started ...

A couple of days ago, I shared the creation of Movie Vibes, an AI-powered Spring Boot application that analyzes movie "vibes" using Spring AI and Ollama. The backend was working beautifully, but it was time to build a proper user interface. What started as a simple "add React + Tailwind" task turned into an educational journey through modern frontend development challenges, framework limitations, and the beauty of getting back to fundamentals.


How it's going ... 

The Original Plan: React + Tailwind CSS

The plan seemed straightforward:

  • ✅ React 18 + TypeScript for the frontend
  • ✅ Tailwind CSS for rapid styling
  • ✅ Modern, responsive design
  • ✅ Quick development cycle

How hard could it be? Famous last words.


The Tailwind CSS Nightmare

The Promise vs. Reality

Tailwind CSS markets itself as a "utility-first CSS framework" that accelerates development. In theory, you get: 

  • Rapid prototyping with utility classes
  • Consistent design tokens
  • Smaller CSS bundles
  • No context switching between CSS and HTML

In practice, with Create React App and Tailwind v4, we got:

  • 🚫 Build failures due to PostCSS plugin incompatibilities
  • 🚫 Cryptic error messages about plugin configurations
  • 🚫 Hours of debugging CRACO configurations
  • 🚫 Version conflicts between Tailwind v4 and CRA's PostCSS setup

The Technical Issues

The error that started it all:
Error: Loading PostCSS Plugin failed: tailwindcss directly as a PostCSS plugin has moved to @tailwindcss/postcss

We tried multiple solutions:

  1. CRACO configuration - Failed with plugin conflicts
  2. Downgrading to Tailwind v3 - Still had PostCSS issues
  3. Custom PostCSS config - Broke Create React App's build process
  4. Ejecting CRA - Nuclear option, but defeats the purpose

The Breaking Point

After spending more time debugging Tailwind than actually building features, I made a decision: dump Tailwind entirely. Sometimes the best solution is the simplest one.

The Pure CSS Renaissance

Going Back to Fundamentals

Instead of fighting with framework abstractions, we built a custom CSS design system that:

  • Compiles instantly - No build step complications
  • Full control - Every pixel exactly where we want it
  • No dependencies - Zero external CSS frameworks
  • Better performance - Only the CSS we actually use
  • Maintainable - Clear, semantic class names

The CSS Architecture


          /* Semantic, maintainable class names */
          .movie-card {
            background: white;
            border-radius: 12px;
            box-shadow: 0 4px 6px -1px rgba(0, 0, 0, 0.1);
            transition: box-shadow 0.3s ease;
          }

          .movie-card:hover {
            box-shadow: 0 20px 25px -5px rgba(0, 0, 0, 0.1);
          }

          /* Responsive design without utility class bloat */
          @media (max-width: 768px) {
            .movie-card {
              /* Mobile-specific styles */
            }
          }
          

Compare this to Tailwind's approach:


<!-- Tailwind: Utility class soup -->
<div className="bg-white rounded-xl shadow-lg p-6 hover:shadow-2xl 
            	transition-shadow duration-300 md:p-8 lg:p-10">
        
Our approach is more readable, maintainable, and debuggable.

The AI Timeout Challenge

The Problem

Once the UI was working, we discovered a new issue: AI operations take time. Our local Ollama model could take 30-60 seconds to analyze a movie and generate recommendations. The frontend was timing out before the AI finished processing.

The Solution

We implemented a comprehensive timeout strategy:

// 2-minute timeout for AI operations (inside an async fetch helper)
const controller = new AbortController();
const timeoutId = setTimeout(() => controller.abort(), 120000);

// Pass the signal to fetch so the request aborts on timeout
// (the endpoint path shown here is illustrative)
const response = await fetch(`/api/vibe?title=${encodeURIComponent(title)}`, {
  signal: controller.signal,
});
clearTimeout(timeoutId);

// User-friendly loading messages
<p className="loading-text">
  Please wait, this process can take 30-60 seconds while our AI agent
  analyzes the movie and generates recommendations ✨
</p>
Key improvements:
  • ⏱️ Extended timeout to 2 minutes for AI operations
  • 🎯 Clear user expectations with realistic time estimates
  • 🔄 Graceful error handling with timeout-specific messages
  • 📱 Loading states that don't feel broken

The Poster Image Quest

Backend Enhancement

The original backend only returned movie titles in recommendations. Users expect to see poster images! We enhanced the system to:

  1. Fetch complete metadata for the main movie ✅
  2. Parse AI-generated recommendations to extract movie titles
  3. Query OMDb API for each recommendation's metadata
  4. Include poster URLs in the API response

Performance Optimization

To balance richness with performance:

  • 🎯 Limit to 5 recommendations to avoid excessive API calls
  • 🛡️ Fallback handling when movie metadata isn't found
  • 📊 Detailed logging for debugging and monitoring
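
The real enrichment code lives in the Spring Boot backend; as a rough, language-agnostic sketch of the flow (shown here in Python, using OMDb's documented t and apikey query parameters, with the five-recommendation cap and fallback described above), it looks roughly like this:

# Conceptual sketch only (the actual implementation is Java/Spring Boot):
# look up each recommended title on OMDb and fall back gracefully.
import os
import requests

def fetch_poster_metadata(titles, limit=5):
    results = []
    for title in titles[:limit]:                      # cap API calls at 5
        resp = requests.get(
            "https://www.omdbapi.com/",
            params={"t": title, "apikey": os.environ["OMDB_API_KEY"]},
            timeout=10,
        )
        data = resp.json()
        if data.get("Response") == "True":
            results.append({"title": data["Title"], "poster": data.get("Poster"),
                            "year": data.get("Year"), "imdbRating": data.get("imdbRating")})
        else:
            results.append({"title": title, "poster": None})  # fallback when not found
    return results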

The Final Architecture

Frontend Stack

  • React 18 + TypeScript - Modern, type-safe development
  • Pure CSS - Custom utility system, no framework dependencies
  • Responsive Design - Mobile-first approach
  • Error Boundaries - Graceful handling of failures

Backend Enhancements

  • Spring Boot 3.x - Robust, production-ready API
  • Spring AI + Ollama - Local LLM for movie analysis
  • OMDb API Integration - Rich movie metadata
  • Intelligent Caching - Future enhancement opportunity

API Evolution


          {
            "movie": {
              "title": "Mission: Impossible",
              "poster": "https://...",
              "year": "1996",
              "imdbRating": "7.2",
              "plot": "Full plot description..."
            },
            "vibeAnalysis": "An exhilarating action-adventure...",
            "recommendations": [
              {
                "title": "The Bourne Identity",
                "poster": "https://...",
                "year": "2002",
                "imdbRating": "7.9"
              }
            ]
          }
         

Lessons Learned

1. Framework Complexity vs. Value

Tailwind's Promise: Rapid development with utility classes
Reality: Build system complexity that outweighs the benefits

Sometimes vanilla CSS is the better choice. Modern CSS is incredibly powerful:

  • CSS Grid and Flexbox for layouts
  • CSS Custom Properties for theming
  • CSS Container Queries for responsive design
  • CSS-in-JS when you need dynamic styles

2. AI UX Considerations

Building AI-powered applications requires different UX patterns:

  • Longer wait times are normal and expected
  • 📢 Clear communication about processing time
  • 🔄 Progressive disclosure of results
  • 🛡️ Robust error handling for AI failures

3. API Design Evolution

Starting simple and evolving based on frontend needs:

  • 🎯 Backend-driven initially (simple JSON responses)
  • 🎨 Frontend-driven enhancement (rich metadata)
  • 🔄 Backward compatibility during transitions

4. The Beauty of Fundamentals

Modern development often pushes us toward complex abstractions, but sometimes the simplest solution is the best:

  • Pure CSS over CSS frameworks
  • Semantic HTML over div soup
  • Progressive enhancement over JavaScript-heavy approaches

Performance Results

After our optimizations:

  • 🚀 Build time: 3 seconds (was 45+ seconds with Tailwind debugging)
  • 📦 Bundle size: 15% smaller without Tailwind dependencies
  • Development experience: Hot reload works consistently
  • 🎯 User experience: Clear loading states, beautiful poster images

What's Next?

The Movie Vibes application is now production-ready with:

  • ✅ Beautiful, responsive UI
  • ✅ AI-powered movie analysis
  • ✅ Rich movie metadata with posters
  • ✅ Robust error handling
  • ✅ 2-minute AI operation support

Future enhancements could include:

  • 🗄️ Caching layer for popular movies
  • 👥 User accounts and favorites
  • 🌙 Dark mode theme
  • 🐳 Docker deployment setup
  • 🧪 Comprehensive testing suite

Conclusion: Embrace Simplicity

This journey reinforced a fundamental principle: complexity should solve real problems, not create them.

Tailwind CSS promised to accelerate our development but instead became a roadblock. Pure CSS, with its directness and simplicity, delivered exactly what we needed without the framework overhead.

Building AI-powered applications comes with unique challenges - long processing times, complex data transformations, and user experience considerations that traditional web apps don't face. Focus on solving these real problems rather than fighting your tools.

Sometimes the best framework is no framework at all.
Try Movie Vibes yourself:
  • Backend: mvn spring-boot:run
  • Frontend: npm start
  • Search for your favorite movie and discover its vibe! 🎬✨

What's your experience with CSS frameworks? Have you found cases where vanilla CSS outperformed framework solutions? Share your thoughts in the comments!

Tech Stack:

  • Spring Boot 3.x + Spring AI
  • React 18 + TypeScript
  • Pure CSS (Custom Design System)
  • Ollama (Local LLM)
  • OMDb API

 

GitHub: tyrell/movievibes 

Building a Model Context Protocol (MCP) Server for Movie Data: A Deep Dive into Modern AI Integration


 

The Challenge: Bringing Movie Data to AI Assistants


As AI assistants become increasingly sophisticated, there's a growing need for them to access real-time, structured data from external APIs. While many AI models have impressive knowledge, they often lack access to current information or specialized databases. This is where the Model Context Protocol (MCP) comes in—a standardized way for AI systems to interact with external data sources and tools.

Today, I want to share my experience building an MCP server that bridges AI assistants with the Open Movie Database (OMDB) API, allowing any MCP-compatible AI to search for movies, retrieve detailed film information, and provide users with up-to-date movie data.

 

What is the Model Context Protocol?

The Model Context Protocol is an emerging standard that enables AI assistants to safely and efficiently interact with external tools and data sources. Think of it as a universal translator that allows AI models to:

  • 🔍 Search external databases
  • 🛠️ Execute specific tools and functions
  • 📊 Retrieve real-time data
  • Integrate seamlessly with existing systems

MCP servers act as intermediaries, exposing external APIs through a standardized JSON-RPC interface that AI assistants can understand and interact with safely.

 

The Project: OMDB MCP Server

I decided to build an MCP server for the Open Movie Database (OMDB) API—a comprehensive movie database that provides detailed information about films, TV shows, and series. The goal was to create a production-ready server that would allow AI assistants to:

  1. Search for movies by title, year, and type
  2. Get detailed movie information including plot, cast, ratings, and awards
  3. Look up movies by IMDB ID for precise identification

 

Technical Architecture

 

Core Technologies

  • Spring Boot 3.5.4 - For the robust web framework
  • Java 21 - Taking advantage of modern language features
  • WebFlux & Reactive WebClient - For non-blocking, asynchronous API calls
  • Maven - For dependency management and build automation
 

MCP Protocol Implementation

The server implements three core MCP endpoints:

 

1. Protocol Handshake (initialize)

{
  "jsonrpc": "2.0",
  "method": "initialize",
  "params": {
    "protocolVersion": "2024-11-05",
    "capabilities": {},
    "clientInfo": {"name": "ai-client", "version": "1.0.0"}
  }
}
 

2. Tool Discovery (tools/list)

Returns available tools that the AI can use:

  • search_movies
  • get_movie_details
  • get_movie_by_imdb_id
 

3. Tool Execution (tools/call)

Executes the requested tool with provided arguments and returns formatted results.
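
As an illustration, a client-side call looks like the following sketch; it mirrors the curl test shown later in this post, which is where the /mcp endpoint and payload shape come from.

# Illustrative JSON-RPC 2.0 tools/call request to the server's /mcp endpoint
import requests

payload = {
    "jsonrpc": "2.0",
    "method": "tools/call",
    "params": {
        "name": "search_movies",
        "arguments": {"title": "Matrix"},
    },
}

response = requests.post("http://localhost:8080/mcp", json=payload, timeout=30)
print(response.json())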

 

Smart Error Handling

One of the key challenges was implementing robust error handling. The server includes:

  • Input validation for required parameters
  • Graceful API failure handling with meaningful error messages
  • Timeout configuration to prevent hanging requests
  • Detailed logging for debugging and monitoring

 

Real-World Challenges and Solutions

 

Challenge 1: HTTPS Migration

Initially, the OMDB API calls were failing due to (my AI assistant 🤨) using HTTP instead of HTTPS. Modern APIs increasingly require secure connections.

Solution: Updated all API calls to use HTTPS and configured the WebClient with proper SSL handling.

 

Challenge 2: DNS Resolution on macOS

Encountered Netty DNS resolution warnings that could impact performance on macOS systems.

Solution: Added the native macOS DNS resolver dependency:

<dependency>
    <groupId>io.netty</groupId>
    <artifactId>netty-resolver-dns-native-macos</artifactId>
    <classifier>osx-aarch_64</classifier>
</dependency>
 

Challenge 3: Response Formatting

Raw OMDB API responses needed to be formatted for optimal AI consumption.

Solution: Created custom formatters that present movie data in a structured, readable format:

private String formatMovieDetails(OmdbMovie movie) {
    StringBuilder sb = new StringBuilder();
    sb.append("🎬 ").append(movie.getTitle()).append(" (").append(movie.getYear()).append(")\n\n");
    
    if (movie.getRated() != null) sb.append("Rating: ").append(movie.getRated()).append("\n");
    if (movie.getRuntime() != null) sb.append("Runtime: ").append(movie.getRuntime()).append("\n");
    // ... additional formatting
    
    return sb.toString();
}
 

Example Usage

Once deployed, AI assistants can interact with the server naturally:

User: "Find movies about artificial intelligence from the 1990s"

AI Assistant (via MCP): Calls search_movies with parameters:

{
  "title": "artificial intelligence", 
  "year": "1990s"
}

Result: Formatted list of AI-themed movies from the 1990s with IMDB IDs for further lookup.

 

Key Features

 

🚀 Production Ready

  • Comprehensive error handling
  • Input validation
  • Configurable timeouts
  • Detailed logging

Performance Optimized

  • Reactive, non-blocking architecture
  • Connection pooling
  • Efficient memory usage

🔧 Developer Friendly

  • Complete documentation
  • Test scripts included
  • Easy configuration
  • Docker-ready

🌐 Standards Compliant

  • Full MCP 2024-11-05 specification compliance
  • JSON-RPC 2.0 protocol
  • RESTful API design

 

Testing and Validation

The project includes comprehensive testing:

# Health check
curl http://localhost:8080/mcp/health

# Search for movies
curl -X POST http://localhost:8080/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc": "2.0", "method": "tools/call", "params": {"name": "search_movies", "arguments": {"title": "Matrix"}}}'
 

Lessons Learned

 

1. Protocol Standards Matter

Following the MCP specification exactly ensured compatibility with different AI clients without modification.

2. Error Handling is Critical

In AI integrations, clear error messages help both developers and AI systems understand and recover from failures.

3. Documentation Drives Adoption

Comprehensive documentation with examples makes the difference between a useful tool and one that sits unused.

4. Modern Java is Powerful

Java 21 features like pattern matching and records significantly improved code readability and maintainability.

 

Future Enhancements

The current implementation is just the beginning. Future enhancements could include:

  • Caching layer for frequently requested movies
  • Rate limiting to respect API quotas
  • Additional data sources (e.g., The Movie Database API)
  • Advanced search features (genre filtering, rating ranges)
  • Recommendation engine integration

 

Try It Yourself

The complete source code is available on GitHub: github.com/tyrell/omdb-mcp-server

To get started:

  1. Clone the repository
  2. Get a free OMDB API key from omdbapi.com
  3. Set your API key: export OMDB_API_KEY=your-key
  4. Run: mvn spring-boot:run
  5. Test: curl http://localhost:8080/mcp/health 

 

Conclusion

Building this MCP server was an excellent introduction to the Model Context Protocol and its potential for enhancing AI capabilities. The project demonstrates how modern Java frameworks like Spring Boot can be used to create robust, production-ready integrations between AI systems and external APIs.

As AI assistants become more prevalent, tools like MCP servers will become essential infrastructure—bridging the gap between AI intelligence and real-world data. The movie database server is just one example, but the same patterns can be applied to any API or data source.

The future of AI isn't just about smarter models; it's about giving those models access to the vast ecosystem of data and tools that power our digital world. MCP servers are a key piece of that puzzle.


 

Want to discuss this project or share your own MCP server experiences? Feel free to reach out or contribute to the project on GitHub!

 

Technical Specifications

  • Language: Java 21
  • Framework: Spring Boot 3.5.4
  • Protocol: MCP 2024-11-05
  • API: OMDB (Open Movie Database)
  • Architecture: Reactive, Non-blocking
  • License: MIT
  • Status: Production Ready
 

Repository Structure

omdb-mcp-server/
├── src/main/java/co/tyrell/omdb_mcp_server/
│   ├── controller/     # REST endpoints
│   ├── service/        # Business logic
│   ├── model/          # Data models
│   └── config/         # Configuration
├── README.md           # Complete documentation
├── test-scripts/       # Testing utilities
└── LICENSE             # MIT License
GitHub: https://github.com/tyrell/omdb-mcp-server 
 
 

Tuesday, July 29, 2025

Building MovieVibes: A Vibe Coding Journey with Agentic AI

 "At first it was just a fun idea — what if a movie recommendation engine could understand the vibe of a film, not just its genre or rating?"

That simple question kicked off one of my most rewarding experiments in Vibe Coding and Agentic AI — powered entirely by Ollama running locally on my machine.

 

Motivation: Coding by Vibe, not by Ticket

Lately, I’ve been inspired by the idea of "Vibe Coding" — a freeform, creative development style where we start with a concept or feeling and let the code evolve organically, often in partnership with an AI assistant. It’s not about Jira tickets or rigid specs; it’s about prototyping fast and iterating naturally.

My goal was to build a movie recommendation app where users enter a movie title and get back a vibe-based summary and some thoughtful movie suggestions — not just by keyword match, but by understanding why someone liked the original movie.

 

Stage 1: The Big Idea

I started with a prompt:

"Take a movie name from the user, determine its vibe using its genre, plot, and characters, and recommend similar movies."

The app needed to:

  • Fetch movie metadata from the OMDb API
  • Use a local LLM (via Ollama) to generate a vibe summary and similar movie suggestions
  • Serve results via a clean JSON API

We scaffolded a Spring Boot project, created REST controllers and services, and started building out the logic to integrate with both the OMDb API and the locally running Ollama LLM.

 

Stage 2: Engineering the Integration

Things were going smoothly until they weren’t. 😅

Compilation Errors

When we added the OmdbMovieResponse model, our service layer suddenly couldn't find the getTitle(), getPlot(), etc. methods — even though they clearly existed. The culprit? Missing getters (at least that's what we thought at the time...).


We tried:

  • Manually writing getters ✅
  • Using Lombok’s @Getter annotation ✅
  • Cleaning and rebuilding Maven ✅

Still, values were null at runtime.

 

The Root Cause

Turns out the problem was with URL encoding of the title parameter. Movie titles with spaces (like The Matrix) weren’t properly encoded, which broke the API call. Once we fixed that, everything clicked into place. 🎯
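
For illustration (shown in Python for brevity; the app itself is Java), the difference is simply whether the title is percent-encoded before it lands in the query string:

# A title with spaces must be percent-encoded before it goes into the URL
from urllib.parse import quote_plus

title = "The Matrix"
broken = f"https://www.omdbapi.com/?apikey=KEY&t={title}"            # raw space broke the call
fixed = f"https://www.omdbapi.com/?apikey=KEY&t={quote_plus(title)}"  # t=The+Matrix

print(broken)
print(fixed)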

Note: The AI would never have figured this out by itself. This was just my natural instincts kicking in to guide the AI as I would direct any other human developer. Also, it has been ages since I worked on a Spring Boot project with Maven. However, the usual gotchas are still there in 2025 🙄.

 

Stage 3: Talking to the LLM (via Ollama)

This was where things got really fun.

Instead of relying on cloud APIs like OpenAI, I used Ollama, a local runtime for open-source LLMs. It let me:

  • Run a model like LLaMA or Mistral locally
  • Avoid API keys and cloud latency
  • Iterate on prompts rapidly without rate limits

The app sends movie metadata (genre, plot, characters) to the local LLM with a tailored prompt. The LLM returns:

  • A summarized “vibe” of the movie
  • A list of recommended films with similar emotional or narrative energy

The results were surprisingly nuanced and human-like.

 


Tests, Cleanups, and Git Prep

To make the app production-ready:

  • We wrote integration tests using MockMvc
  • Hid API keys in .env files and excluded them via .gitignore
  • Structured the MovieVibeRecommendationResponse as a list of objects, not just strings
  • Wrote a solid README.md for onboarding others

 

Going Agentic

With the basic loop working, I asked:

How can this app become Agentic AI?

We designed the logic to act more like an agent than a pipeline:

  1. It fetches movie metadata
  2. Synthesizes emotional and narrative themes
  3. Determines recommendations with intent — not just similarity

This emergent behavior made the experience feel more conversational and human, despite being fully automated and offline.

 

Reflections

This project was peak Vibe Coding — no rigid architecture upfront, just a flowing experiment with a clear purpose and evolving ideas.

The use of Ollama was especially empowering. Running an LLM locally gave me:

  • Full control of the experience
  • No API costs or usage caps
  • A deeper understanding of how AI can enhance personal and creative tools

 

Next Steps

For future improvements, I'd love to:

  • Add a slick front-end UI (maybe with React or Tailwind)
  • Let users rate and fine-tune their recommendations
  • Persist data for returning visitors
  • Integrate retrieval-augmented generation for even smarter results

But even as an MVP, the app feels alive. It understands vibe. And that’s the magic. I committed the code to my GitHub at https://github.com/tyrell/movievibes. All this was done within a few hours of publishing my previous post about Spring AI.


A Word on Spring AI

While this project used a more manual approach to interact with Ollama, I’m excited about the emerging capabilities of Spring AI. It promises to simplify agentic workflows by integrating LLMs seamlessly into Spring-based applications — with features like prompt templates, model abstractions, embeddings, and even memory-backed agents.

As Spring AI matures, I see it playing a major role in production-grade, AI-powered microservices. It aligns well with Spring’s core principles: abstraction, convention over configuration, and testability. 

 

Try the idea. Build something weird. Talk to your code. Let it talk back. Locally.

 

UPDATE (01/AUG/2025): Read the sequel to this post here

 

Monday, July 28, 2025

Introduction to Spring AI: Bringing the Power of AI to the Spring Ecosystem

Artificial Intelligence is no longer a niche capability—it’s rapidly becoming a foundational element across enterprise applications. Whether you're building smarter chatbots, generating insights from unstructured content, or integrating Large Language Models (LLMs) into your workflows, developers increasingly need streamlined ways to plug AI into real-world systems.

 

That’s where Spring AI steps in.

In this blog post, I’ll introduce Spring AI, a new project from the Spring team that brings first-class support for integrating generative AI and foundation models into Spring-based applications. It’s an exciting addition to the Spring ecosystem that aims to make AI integration as natural as working with data sources or messaging.

 

What is Spring AI?

Spring AI is an open-source project that provides a unified and consistent programming model to work with modern AI capabilities like:

  • Large Language Models (LLMs) such as OpenAI, Azure OpenAI, Hugging Face, and Ollama
  • Embedding Models for semantic search
  • Vector Stores (like Redis, Milvus, Qdrant, Pinecone, and PostgreSQL with pgvector)
  • Prompt Templates, RAG (Retrieval-Augmented Generation) workflows, and tool execution

The project is deeply inspired by Spring Data and Spring Cloud, and brings that same level of abstraction and consistency to AI workflows.

 

Key Features of Spring AI

 

1. Unified LLM API

Spring AI provides a consistent interface across multiple LLM providers like:

  • OpenAI
  • Azure OpenAI
  • Hugging Face
  • Ollama

This allows you to write code once and switch providers with minimal changes.

var response = chatClient.call(new Prompt("Tell me a joke about Spring Boot"));
System.out.println(response.getResult());

2. Prompt Engineering Made Easy

PromptTemplate template = new PromptTemplate("Translate this text to French: {text}");
template.add("text", "Hello, world!");

3. Support for RAG (Retrieval-Augmented Generation)

Integrate AI responses with external knowledge sources using vector search. Spring AI supports various vector stores and offers abstractions for embedding, storing, and retrieving content semantically.

Embedding embedding = embeddingClient.embed("Spring is great for microservices!");
vectorStore.add(new EmbeddingDocument("id-1", embedding, metadata));

4. Integration with Spring Boot

Spring AI is a first-class citizen in the Spring ecosystem. It works seamlessly with Spring Boot and supports features like:

  • Declarative configuration using application.yml
  • Integration with Actuator and Observability
  • Use of @Bean, @Configuration, and dependency injection

5. Tool Execution and Function Calling

Spring AI supports tool calling and function execution—critical for agent-based applications.

 

A Simple Use Case

Let’s say you’re building a customer support chatbot. With Spring AI, you can:

  1. Use OpenAI to handle natural language queries.
  2. Store support articles in a vector database.
  3. Implement RAG to enhance responses using your private knowledge base.
  4. Define functions (e.g., "create support ticket") that the model can call programmatically.

The entire pipeline is manageable using familiar Spring idioms.

 

Why Use Spring AI?

If you’re already using Spring Boot in your backend stack, Spring AI provides:

  • Consistency: Familiar APIs and configuration patterns.
  • Portability: Swap providers or vector stores with minimal refactoring.
  • Flexibility: Fine-grained control over prompts, embeddings, and function calls.
  • Productivity: Rapid prototyping and integration without boilerplate.
 

Getting Started

To get started:

  1. Add Spring AI to your Maven or Gradle project:
<dependency>
  <groupId>org.springframework.ai</groupId>
  <artifactId>spring-ai-openai-spring-boot-starter</artifactId>
</dependency>
  2. Configure your provider:
spring:
  ai:
    openai:
      api-key: ${OPENAI_API_KEY}
  3. Inject and use the ChatClient, EmbeddingClient, or other components in your service.

Official guide: https://docs.spring.io/spring-ai/reference

 

The Future of Enterprise AI with Spring

Spring AI represents a big leap in making AI accessible for mainstream enterprise developers. Instead of reinventing the wheel, teams can build intelligent systems using familiar patterns and strong ecosystem support.

Whether you’re building smart assistants, enhancing search, or enabling decision support, Spring AI offers a solid foundation.

 

I’ll be diving deeper into use cases and tutorials in future posts—stay tuned!

Wednesday, July 23, 2025

AI‑Assisted Coding vs Vibe Coding: Understanding the Costs, Benefits, and Risks in the Modern Enterprise

Over the last decade, large enterprises have consistently evolved their engineering practices—from Agile to DevOps, from Microservices to Platform Engineering. In 2025, two trends dominate the software development landscape: AI‑Assisted Coding and Vibe Coding. While they may overlap in tooling and intent, their emphasis and enterprise implications are different.

This post unpacks and contrasts these two approaches in terms of costs, benefits, and risks, with added context on their origins and key voices shaping the conversation.

 

AI‑Assisted Coding: Origins & Philosophy

The term "AI-assisted coding" became common in the late 2010s, as tools emerged to enhance developer productivity using machine learning. Early tools offered code completion, bug detection, and automated refactoring, trained on public code repositories. GitHub Copilot (2021), Amazon CodeWhisperer (2022), and now AI-native IDEs like Replit have integrated deep AI support.

“The Copilot team found that developers completed tasks 55% faster with AI suggestions.” – GitHub Research, 2023

 

Vibe Coding: Origins & Philosophy


Coined by Andrej Karpathy (OpenAI cofounder and former Tesla AI director) on X in February 2025:

“There’s a new kind of coding I call ‘vibe coding’, where you fully give in to the vibes, embrace exponentials, and forget that the code even exists.”

The approach emphasizes working in "flow" with AI companions, focusing on intent and outcome over code syntax. It resonates with those building prototypes, MVPs, or experimenting creatively with LLMs.

“It’s not really coding – I just see stuff, say stuff, run stuff, and copy-paste stuff, and it mostly works.” – Karpathy

Critics argue that vibe coding trivializes the rigor of engineering:

"The term is misleading... It implies ease, but in practice it's mentally exhausting." – Andrew Ng, LangChain Conf 2025

 

Comparison Table: AI‑Assisted Coding vs Vibe Coding

Aspect               | AI‑Assisted Coding                                           | Vibe Coding
---------------------|--------------------------------------------------------------|------------------------------------------------------------------
Origins & Philosophy | ML-powered developer tools since ~2018                       | Coined by Karpathy in 2025; emphasizes flow and experimentation
Primary Focus        | Automating code & improving quality                          | Prototyping, fast ideation, and flow
Costs                | Tool licensing, training, integration, code review overhead  | DevX investment, non-standard environments, prompt engineering
Benefits             | Speed, quality, better onboarding, fewer bugs                | Innovation, accessibility, creativity, team morale
Risks                | IP & security concerns, skill atrophy, AI hallucination      | Lack of accountability, fragile prototypes, inconsistent quality
Enterprise Fit       | Structured SDLC tasks, junior dev enablement                 | MVPs, PoCs, experimental sprints, hackathons


 

Strategic Guidance for Enterprises

  1. Adopt AI‑Assisted Coding tactically: Use in boilerplate-heavy domains, onboarding flows, or test generation.
  2. Enable Vibe Coding safely: Encourage in low-risk environments, MVP tracks, or labs.
  3. Create DevX Guardrails: Establish standards around LLM prompts, code quality, and model trust boundaries.
  4. Measure Beyond Vibes: Track actual productivity metrics (defect rate, rework, test coverage) alongside developer satisfaction.

 

Conclusion

AI-assisted coding is here to stay. It augments developers and supports production-quality delivery. Vibe coding, meanwhile, embodies a cultural shift—toward creativity, rapid feedback, and AI-human interaction. Together, they represent the next evolution of software development.

The challenge for tech leaders is to harness both trends intentionally—balancing structure with spontaneity, and quality with speed.

 

How is your team adapting to this new era of development? 

Saturday, June 14, 2025

Is AI Shifting CI/CD Left?

Exploring the New Frontiers of Intelligent DevOps

In the (ever-evolving) world of enterprise technology and software engineering, Continuous Integration and Continuous Deployment (CI/CD) have become foundational pillars of modern delivery pipelines. But a new trend is emerging — driven by the rise of AI-powered tooling — that’s challenging conventional boundaries of when CI/CD begins.

There’s growing consensus among technologists and product engineers that AI is shifting CI/CD left. But is this really happening? And if so, what does it mean in practice?

Let’s unpack the hypothesis and explore how artificial intelligence is transforming the way software is designed, tested, and deployed.

 

What Does "Shifting Left" Mean in CI/CD?

The concept of "shifting left" refers to moving critical activities such as testing, security checks, compliance validation, and performance analysis earlier in the software development lifecycle (SDLC) — ideally, before code even reaches the integration pipeline.

Traditionally, CI/CD begins after a developer writes code and pushes it to a shared repository. From there, pipelines run automated tests, perform builds, and deploy the code into various environments.

But AI is now disrupting that sequence.

 

How is AI Shifting CI/CD Further Left?

 

1. AI-Driven Code Generation with Built-In CI Hygiene

Tools like GitHub Copilot, Amazon CodeWhisperer, and Tabnine are more than just autocomplete helpers. They're becoming context-aware copilots that can:

  • Generate code with appropriate logging, error handling, and testing hooks built-in
  • Suggest fixes and improvements aligned with CI linting and formatting rules
  • Alert developers to potential build or test failures before the first commit

In essence, these tools bring aspects of CI/CD directly into the IDE.

 

2. Automated Test Generation at Design Time

One of the most exciting frontiers is AI-generated tests:

  • Given a function or method, AI can propose unit tests, integration tests, and mocks on the fly
  • Some tools even analyze user stories or acceptance criteria and write tests from natural language requirements

This means test coverage is no longer an afterthought — it’s embedded into development workflows from the very start, reinforcing CI-readiness even before integration begins.

 

3. Security and Compliance: Shift-Left DevSecOps via AI

AI is making DevSecOps truly shift-left by:

  • Flagging security misconfigurations or dependency vulnerabilities in real time
  • Detecting hardcoded secrets or license violations in the editor
  • Aligning code with enterprise compliance policies automatically

This reduces the friction between developers and security teams, embedding governance early in the dev lifecycle.

 

4. Intelligent CI/CD Pipeline Creation

Writing CI/CD YAML configurations can be complex and error-prone. AI is now helping by:

  • Translating natural language inputs into valid GitHub Actions, GitLab CI, or Jenkinsfiles
  • Tailoring pipelines to specific build environments, test suites, and deployment patterns
  • Making it easier for teams to adopt best practices without deep DevOps expertise

Some platforms even use AI to recommend pipeline improvements based on historical failures or bottlenecks.

 

5. Infrastructure and Deployment Insights — Before a Line is Deployed

AI can now assist in designing Infrastructure as Code (IaC) and deployment topologies before infra is provisioned:

  • Suggesting Terraform or CloudFormation templates aligned with the application
  • Recommending container orchestration, secrets management, or observability toolchains based on the architecture

This collapses the gap between software design and production readiness.

 

Architecture Realization in the Age of AI-Driven CI/CD

One of the core responsibilities of Enterprise and Solution Architects is to ensure Architecture Realization — the translation of abstract blueprints and target-state models into working, compliant, and sustainable solutions in production.

However, realizing architecture has often been challenging due to the disconnect between upfront architectural intent and downstream engineering execution. The farther downstream architectural principles are checked — in code reviews, test reports, or go-live readiness — the more diluted they become.

This is precisely where AI's shift-left impact on CI/CD can become a game-changer for architecture teams.


Embedding Architectural Guardrails Upstream

AI-enhanced developer tools can now detect — and in some cases enforce — architectural decisions at the point of code authoring:

  • Suggesting correct usage of shared libraries, patterns, or design principles
  • Flagging violations of architectural standards (e.g., synchronous calls to asynchronous systems)
  • Mapping low-level implementations back to solution blueprints or enterprise guidelines

This empowers architects to shift architectural governance leftward, embedding compliance and alignment within the development flow.

 

AI as a Realization Accelerator


LLMs can help solution architects generate baseline infrastructure-as-code, API contracts, or sequence diagrams directly from architecture models or user stories. This:

  • Reduces the handoff gap between architecture and engineering
  • Improves traceability from high-level decisions to code artifacts
  • Accelerates the iterative refinement of architecture through working prototypes

 

Intelligent Feedback Loops for Architecture Evolution

With AI embedded in CI/CD telemetry, architects can access new insights such as:

  • Which architectural decisions correlate with slower deployments or more defects
  • Where design intent is being consistently ignored or misinterpreted
  • Whether technical debt is accumulating around specific architecture choices

This creates continuous architecture feedback loops, essential for adapting and evolving architecture in real time.

 

Empowering Federated Architecture Models

In large-scale agile enterprises, centralized architecture can’t scale alone. AI tooling that shifts CI/CD left also enables federated architecture practices, where delivery teams take more responsibility for alignment and realization — with AI acting as an intelligent guide.

This supports architecture operating models such as the Architecture Owner role in SAFe, or the concept of architecture as code in platform teams.

 

The Bottom Line

As AI pushes CI/CD left, architecture is no longer a PowerPoint exercise — it becomes executable, testable, and enforceable much earlier in the lifecycle.

This marks the dawn of a new software development paradigm — one where automation is intelligent, feedback is immediate, and DevOps is embedded from the start.

For Enterprise and Solution Architects, this is a profound opportunity to:

  • Ensure traceable realization of architectural intent
  • Accelerate delivery while reducing risk
  • Continuously improve architecture with real-world signals

 

What’s Next?

In future posts, we’ll explore:

  • Real-world tools and plugins enabling this shift
  • AI-powered DevSecOps in action
  • How to redesign CI/CD governance in an AI-augmented world

Thursday, June 05, 2025

2025 Internet Trends: The AI Surge – Key Takeaways from Mary Meeker's Report

Mary Meeker and the BOND team have released their much-anticipated 2025 Internet Trends report—and this year, the focus is clear: Artificial Intelligence. What began as a collection of “disparate data-points” turned into a sweeping 300+ page document detailing how AI is transforming everything—from internet usage and enterprise software to labor markets and geopolitics (p. 2).

 

1. AI Adoption is Outpacing the Internet Era

In just 17 months, OpenAI’s ChatGPT scaled from 100 million to 800 million weekly active users, an 8x growth rate that dwarfs the pace of early internet platforms (p. 55).

“AI user and usage trending is ramping materially faster…and the machines can outpace us.” – Mary Meeker, p. 2

To put it into context, while the Internet took 23 years to reach 90% of its global user base outside North America, ChatGPT did it in just three years (p. 56).

 

2. Capital Expenditure in AI is Exploding

Tech giants—Apple, NVIDIA, Microsoft, Alphabet, Amazon (AWS), and Meta—are projected to spend a massive $212B in CapEx in 2024, a 63% increase over the last decade (p. 97). This is not just infrastructure—it’s a full-scale arms race to define the future of AI platforms.

 

3. Performance Up, Costs Down

Training compute has grown at 360% annually over the past 15 years (p. 15), while inference costs per token have steadily fallen. This convergence has spurred developer participation: NVIDIA’s AI ecosystem has grown to 6 million developers (p. 38), and Google’s Gemini AI ecosystem now boasts 7 million developers, a 5x year-over-year increase (p. 39).

 

4. Simultaneous Global Adoption

Unlike the first wave of the internet, AI adoption isn’t starting in Silicon Valley and diffusing globally—it’s going global from day one. China is not only a key competitor but also a significant contributor to open-source models and industrial robotics (p. 289, p. 293).

“AI leadership could beget geopolitical leadership – and not vice-versa.” – p. 8

 

5. AI Is Reshaping the Workforce

AI job postings in the U.S. have surged by +448% since 2018, while non-AI tech roles are actually down 9% (p. 302). Across enterprises, over 75% of global CMOs are already using or testing generative AI tools (p. 70). And legacy players like JP Morgan and Kaiser Permanente are modernizing entire systems using AI (p. 72, p. 73).

 

6. AI Has Gone Human

In a March 2025 study, 73% of participants mistook AI responses as human in Turing-style tests (p. 42). ChatGPT and other models are now matching or surpassing human performance on reasoning benchmarks like MMLU (p. 41), and generating realistic images, audio, and even translated voices (p. 44–47).

 

7. Risks Are Real, But So Is Optimism

The report is also clear about the risks: algorithmic bias, employment displacement, surveillance, and AI weaponisation. But the long-term view leans optimistic:

“Success in creating AI could be the biggest event in the history of our civilization. But it could also be the last – unless we learn how to avoid the risks.” – Stephen Hawking, p. 51

 

Final Thoughts

AI is no longer a lab experiment—it is the defining technology of our time. As Mary Meeker puts it, the compounding power of AI is now layered on top of decades of internet infrastructure. The result? Faster adoption, broader impact, and massive change.

Whether you’re a technologist, policymaker, or curious citizen, the message is clear: It’s AI-first now.

 

Full Report:  https://www.bondcap.com/report/tai/#view/0 

Saturday, March 08, 2025

Are Flat Hierarchies the Future of Work?


The traditional organizational structure, with its multiple layers of management, is increasingly being challenged by a new model: the flat hierarchy. In a flat hierarchy, there are fewer layers of management between the top and bottom of the organization, and individual contributors are given more autonomy and decision-making power.

This trend is being driven by several factors, including the need for organizations to be more agile and responsive to change, the increasing availability of technology that enables employees to work more independently, and the growing desire of employees for more autonomy and control over their work.


The Pandemic's Impact: Exposing Inefficiencies

The COVID-19 pandemic served as a massive, unplanned experiment in remote work, and it illuminated some critical truths about organizational structures. One of the most significant revelations was the limited value that many middle management layers provide in today's work environment, especially in organizations where knowledge workers are the main producers of the organization's output.

  • Increased Autonomy:
    • With forced remote work, individual contributors had to become more self-reliant. Many discovered they could effectively manage their tasks and collaborate with colleagues without constant supervision.
    • This demonstrated that, with the right tools and clear goals, employees can thrive with greater autonomy.
     
  • Reduced Need for Oversight:
    • The pandemic revealed that much of the perceived need for middle management oversight was rooted in presenteeism—the idea that physical presence equates to productivity.
    • When output was measured by results rather than hours spent in the office, the necessity of constant managerial monitoring diminished.
     
  • Streamlined Communication:
    • Remote work forced organizations to adopt digital communication tools, which often bypassed traditional hierarchical communication channels.
    • This resulted in more direct and efficient information flow, highlighting the potential for streamlined communication in flatter organizations.
     
  • Andy Jassy and Amazon:
    • Amazon's CEO has since moved in this direction as well, asking teams to increase the ratio of individual contributors to managers as part of a broader effort to flatten the organization.

 

Benefits of Flat Hierarchies

There are a number of benefits to adopting a flat hierarchy. One of the most significant is that it can help to improve communication and collaboration within an organization. When there are fewer layers of management, information can flow more freely between employees, and it is easier for employees to connect with each other and work together towards common goals.

Flat hierarchies can also help to increase employee engagement and motivation. When employees are given more autonomy and control over their work, they are more likely to feel invested in their jobs and to be motivated to perform at their best.

Finally, flat hierarchies can help to reduce costs. When there are fewer managers, organizations can save money on salaries and other overhead costs. In my view, this not only reduces costs, but also frees up budget to reward those individual contributors who are directly responsible for the output.


How to Implement a Flat Hierarchy

If you are considering implementing a flat hierarchy in your organization, there are a few things you need to do. First, you need to clearly define roles and responsibilities. This will help to ensure that everyone knows what they are responsible for and that there is no duplication of effort.

Second, you need to invest in training and development for your employees. This will help them to develop the skills they need to succeed in a flat hierarchy, such as decision-making, problem-solving, and communication.

Finally, you need to create a culture of trust and transparency. This will help to ensure that employees feel comfortable taking risks and making decisions.

 

Conclusion

Flat hierarchies are becoming increasingly common in organizations of all sizes. The pandemic has accelerated this trend, demonstrating the limitations of traditional management structures and the benefits of empowering individual contributors. By reducing the number of layers of management and empowering individual contributors, organizations can become more agile, efficient, and responsive to change.

 

Additional reading ...

  1. The Rise of Flat Organizational Structures
  2. The Benefits of Flat Organizational Structures
  3. How to Implement a Flat Organizational Structure 

 

Sunday, January 05, 2025

ReAct Prompting: Elevating Large Language Models with Reasoning and Action

Large Language Models (LLMs) have revolutionized how we interact with machines, but they often struggle with tasks that require complex reasoning, decision-making, and interaction with the real world. Enter ReAct Prompting, a novel approach that empowers LLMs to exhibit more human-like intelligence by incorporating reasoning, action, and observation into their decision-making process.


What is ReAct Prompting?

ReAct Prompting is a framework that guides LLMs to perform tasks by:

  1. Reasoning: The LLM first analyzes the given task and generates a sequence of thoughts or reasoning steps. This involves breaking down the problem, identifying relevant information, and considering potential solutions.

  2. Action: Based on its reasoning, the LLM decides on an action to take. This could involve retrieving information from a knowledge base, performing a calculation, or interacting with an external tool or API.

  3. Observation: After performing the action, the LLM observes the outcome and updates its internal state accordingly. This feedback loop allows the model to refine its understanding of the situation and adjust its subsequent actions.

Key Advantages of ReAct Prompting:

  • Enhanced Reasoning and Decision-Making: By explicitly modeling reasoning and action, ReAct enables LLMs to tackle complex problems that require multi-step planning and decision-making.
  • Improved Task Performance: ReAct has demonstrated significant improvements in various tasks, including question answering, dialogue systems, and robotic control.
  • Increased Transparency and Explainability: The explicit reasoning steps generated by the LLM provide insights into its decision-making process, making it easier to understand and debug.
  • Greater Flexibility and Adaptability: ReAct can be easily adapted to different tasks and environments by simply modifying the available actions and the observation feedback mechanism.


Example: ReAct Prompting for a Restaurant Recommendation

Imagine you're using an LLM to find a restaurant for dinner. A ReAct Prompting approach might involve the following steps:

  1. Reasoning:

    • "I need to find a restaurant that serves Italian food and is within walking distance of my hotel."
    • "I should check online reviews to see which restaurants are highly rated."
  2. Action:

    • "Search Google Maps for 'Italian restaurants near [hotel address]'."
    • "Read the top 3 reviews for each of the top-rated restaurants."
  3. Observation:

    • "Restaurant A has excellent reviews but is a bit pricey."
    • "Restaurant B has good reviews and is more affordable."
  4. Reasoning:

    • "I'm on a budget, so Restaurant B seems like a better option."
  5. Action:

    • "Make a reservation at Restaurant B."

 

An example written using Python

 
from langchain.agents import AgentType, initialize_agent, load_tools
from langchain.llms import OpenAI

# Requires OPENAI_API_KEY (and SERPAPI_API_KEY for the search tool) in the environment
llm = OpenAI(temperature=0.7)
tools = load_tools(["serpapi"], llm=llm)

# A ReAct-style agent: the LLM alternates between reasoning steps and tool use
react_agent = initialize_agent(
    tools,
    llm,
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
    max_iterations=3,
)

# Example usage:
prompt = "Find me the best Italian restaurant near Times Square in New York City."
result = react_agent.run(prompt)

print(result)

How it works:

  • The agent will internally guide the LLM through a series of reasoning and action steps.
  • The LLM will generate thoughts, such as "I need to find Italian restaurants near Times Square," and then decide on an action, such as searching for "Italian restaurants near Times Square".
  • The search tool will run that query, and the results will be fed back to the LLM as observations.
  • The LLM will then analyze the search results, potentially refine its reasoning, and decide on further actions or generate the final recommendation.

 

Conclusion

ReAct Prompting represents a significant step towards creating more intelligent and versatile LLMs. By incorporating reasoning, action, and observation into their decision-making process, these models can tackle increasingly complex tasks and exhibit more human-like behavior. As research in this area continues to advance, we can expect to see even more sophisticated and capable AI systems that can seamlessly integrate with and navigate the real world.