The AI Hype Hangover: Why Pragmatic Devs are Choosing Boring Code over LLMs

Hey everyone, Alex here. Welcome back to another edition of Coding with Alex at sysseder.com. If you’ve spent any time on Hacker News, TechCrunch, or your team's Slack channels lately, you’ve probably felt the pressure to "AI-ify" every single layer of your stack. The current narrative suggests that if you aren't routing your database queries through an LLM agent, writing your CSS with Copilot, or replacing your deterministic heuristics with a generative model, you're somehow falling behind.

But a fascinating reality check has been bubbling up to the top of the developer community lately: No, everyone is not using AI for everything. In fact, we are starting to enter the "trough of disillusionment" phase of the generative AI hype cycle, and it’s the best thing that could have happened to software engineering.

Today, we’re going to step back from the marketing hype and look at this from a pragmatic, engineering-first perspective. We’ll talk about why top-tier engineering teams are actively pulling back from LLM-dependent architectures, the hidden costs of non-deterministic systems, and how to write clean, deterministic code to solve problems that AI is currently over-complicating. Let's dive in.

The Hidden Tax of the "AI Everywhere" Architecture

When GPT-4 dropped, the immediate reaction was to treat it as a universal API endpoint. Need to parse some messy user input? Send it to the LLM. Need to categorize support tickets? LLM. Need to extract structured JSON from a legacy PDF? LLM.

On paper, this looks like incredibly fast prototyping. In production, however, developers quickly ran into three massive roadblocks: latency, cost, and non-determinism.

1. The Latency and Throughput Bottleneck

In web development, we fight for millisecond optimizations. We implement Redis caching, optimize indexing, and minimize bundle sizes to keep our Time to Interactive (TTI) low. Passing a user request through an LLM API instantly blows up your latency budget. Even the fastest hosted models (like Claude 3 Haiku or GPT-4o mini) will routinely take 500ms to 2 seconds to return a full response. For synchronous web flows, this is a terrible user experience.
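If an LLM call does have to sit in a synchronous flow, one way to keep it honest is to give it an explicit latency budget. Here is a minimal sketch, assuming a hypothetical helper named `withLatencyBudget` (my own name, not part of any SDK) that rejects when a promise takes longer than the budget:

```typescript
// Hypothetical helper: settles with the result of `work`, or rejects
// if `work` has not settled within `budgetMs` milliseconds.
function withLatencyBudget<T>(work: Promise<T>, budgetMs: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(
      () => reject(new Error(`Latency budget of ${budgetMs}ms exceeded`)),
      budgetMs,
    );
    work.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      },
    );
  });
}
```

Note that this only stops waiting; it does not cancel the underlying request. If your client library accepts an abort signal or a timeout option, wiring that in as well is probably the better choice.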

2. The Cost of Tokens at Scale

While API pricing has plummeted, it is still orders of magnitude more expensive than running traditional code. Running a complex prompt with system instructions, few-shot examples, and a large context window across millions of API requests per month will result in a shocking AWS or OpenAI bill. Conversely, a compiled Go binary or a Node.js microservice running on a cheap ECS cluster can process millions of requests for pennies.
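To make that concrete, here is a back-of-the-envelope estimator. Every number in the example call is a placeholder I made up for illustration, not real pricing from any provider, so plug in your own traffic and the current price sheet:

```typescript
// All fields are inputs you supply; nothing here reflects actual vendor pricing.
interface LlmCostAssumptions {
  requestsPerMonth: number;
  inputTokensPerRequest: number;
  outputTokensPerRequest: number;
  usdPerMillionInputTokens: number;
  usdPerMillionOutputTokens: number;
}

// Monthly cost = requests x (input token cost + output token cost) per request.
function estimateMonthlyLlmCostUsd(a: LlmCostAssumptions): number {
  const inputCost = (a.inputTokensPerRequest / 1_000_000) * a.usdPerMillionInputTokens;
  const outputCost = (a.outputTokensPerRequest / 1_000_000) * a.usdPerMillionOutputTokens;
  return a.requestsPerMonth * (inputCost + outputCost);
}

// Placeholder scenario: 5M requests, a 1,500-token prompt, a 200-token reply.
const monthlyUsd = estimateMonthlyLlmCostUsd({
  requestsPerMonth: 5_000_000,
  inputTokensPerRequest: 1_500,
  outputTokensPerRequest: 200,
  usdPerMillionInputTokens: 1,
  usdPerMillionOutputTokens: 4,
});

console.log(monthlyUsd.toFixed(2)); // prints 11500.00 with these placeholder inputs
```

The useful part is the shape of the formula: cost scales linearly with request volume and prompt size, which is why long system prompts and few-shot examples hurt so much at scale.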

3. The Nightmare of Non-Determinism

As developers, we love predictability. If f(x) = y today, we expect f(x) = y tomorrow. With LLMs, even with the temperature set to 0, you cannot guarantee absolute determinism. Prompt drift, model updates behind the API, and edge cases in user inputs can lead to unexpected, structure-breaking outputs that slip past your validation layers and crash your frontend.
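The practical defense is to treat model output as untrusted input and check its shape before anything downstream touches it. Here is a minimal sketch with no dependencies; the `LlmTransaction` shape and the `parseLlmTransaction` name are assumptions for illustration, modeled on the expense example later in this post:

```typescript
// Assumed shape of the JSON we asked the model to return.
interface LlmTransaction {
  amount: number;
  currency: string;
  merchant: string;
}

// Returns a typed object only when every field has the expected shape;
// returns null for malformed JSON or any missing or mistyped field.
function parseLlmTransaction(rawJson: string): LlmTransaction | null {
  let data: unknown;
  try {
    data = JSON.parse(rawJson);
  } catch {
    return null; // not valid JSON at all
  }
  if (typeof data !== "object" || data === null) return null;

  const { amount, currency, merchant } = data as Record<string, unknown>;
  if (typeof amount !== "number" || !Number.isFinite(amount)) return null;
  if (typeof currency !== "string" || !/^[A-Z]{3}$/.test(currency)) return null;
  if (typeof merchant !== "string" || merchant.trim() === "") return null;

  return { amount, currency, merchant: merchant.trim() };
}
```

A null result is your signal to retry, fall back, or surface an error, instead of letting a half-formed object reach the UI.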

A Real-World Example: Parsing Complex Data

Let's look at a concrete scenario. Suppose you are building an expense tracking app. Users upload a raw string of text (e.g., from an SMS alert or an email receipt), and you need to parse out the amount, the currency, and the merchant.

The "hype-driven" approach is to write a prompt, send it to an LLM with JSON mode enabled, and hope the returned JSON has the structure you asked for. Here is what that looks like in Node.js using the OpenAI SDK:

// The Over-Engineered AI Approach
import OpenAI from 'openai';

const openai = new OpenAI();

async function parseTransactionWithAI(rawText) {
  const response = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [
      {
        role: "system",
        content: "Extract transaction details as JSON with keys: amount (float), currency (3-letter code), merchant (string)."
      },
      {
        role: "user",
        content: rawText
      }
    ],
    response_format: { type: "json_object" }
  });

  return JSON.parse(response.choices[0].message.content);
}

This works, but it’s slow, costs money per API call, requires internet access, and introduces a dependency on a third-party API. Now, let’s look at how a pragmatic developer solves the exact same problem for 95% of use cases using a simple, fast, deterministic TypeScript function with regular expressions and structured parsing.

// The Pragmatic, Boring, Ultra-Fast Approach
interface Transaction {
  amount: number | null;
  currency: string;
  merchant: string | null;
}

export function parseTransactionDeterministic(rawText: string): Transaction {
  const result: Transaction = {
    amount: null,
    currency: 'USD', // Default fallback
    merchant: null
  };

  // 1. Extract Amount and Currency
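  // Matches $120.50, €99, 45.00 EUR, 1500 JPY, and grouped amounts such as $1,500.00.
  // Limitation: decimal commas (e.g. "45,00 EUR") are not handled.
  const num = String.raw`(?:\d{1,3}(?:,\d{3})+|\d+)(?:\.\d{1,2})?`;
  const cur = String.raw`(?:\$|€|£|¥|USD|EUR|GBP|JPY)`;
  const currencyRegex = new RegExp(`${cur}\\s?(${num})|(${num})\\s?${cur}`, 'i');
  const match = rawText.match(currencyRegex);

  if (match) {
    const rawMatch = match[0];
    // Extract the numeric value, stripping thousands separators before parsing
    const numberText = match[1] ?? match[2];
    if (numberText) {
      result.amount = parseFloat(numberText.replace(/,/g, ''));
    }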

    // Determine currency
    if (rawMatch.includes('$') || rawMatch.toUpperCase().includes('USD')) result.currency = 'USD';
    else if (rawMatch.includes('€') || rawMatch.toUpperCase().includes('EUR')) result.currency = 'EUR';
    else if (rawMatch.includes('£') || rawMatch.toUpperCase().includes('GBP')) result.currency = 'GBP';
    else if (rawMatch.includes('¥') || rawMatch.toUpperCase().includes('JPY')) result.currency = 'JPY';
  }

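  // 2. Extract Merchant (Heuristics based on common patterns)
  // E.g., "Spent $10.00 at Starbucks" or "Payment to Uber"
  // \b stops "at"/"to" from matching inside words such as "that". The capture is
  // limited to consecutive capitalized words, so "at Starbucks on 12/05" yields
  // "Starbucks". Case-sensitive on purpose: all-lowercase merchants are not matched.
  const merchantRegex = /\b(?:at|to|from|charged?\s+by)\s+([A-Z0-9][A-Za-z0-9'.-]*(?:\s+[A-Z0-9][A-Za-z0-9'.-]*)*)/;
  const merchantMatch = rawText.match(merchantRegex);
  if (merchantMatch && merchantMatch[1]) {
    // Drop trailing punctuation left over from the end of a sentence; cap at 30 chars
    result.merchant = merchantMatch[1].replace(/[.'-]+$/, '').slice(0, 30);
  }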

  return result;
}

Let's compare the two approaches side-by-side:

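  • Latency: The AI approach needs a network round trip plus model inference, typically hundreds of milliseconds to a couple of seconds. The regex-based approach runs in-process and typically finishes in well under a millisecond.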
  • Cost: The AI approach costs real money per call. The deterministic approach is free and runs locally on the CPU.
  • Security: The deterministic approach has zero risk of prompt injection or data leakage to a third-party API.
  • Reliability: The deterministic approach returns the same output for the same input every time, so you can write unit tests for every edge case without worrying about silent model updates.
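That last point is worth showing. Below is a rough sketch of a table-driven test, using a simplified stand-in for the amount-extraction step (`extractAmount` is a cut-down helper I wrote for this illustration, not the full parser above). The test cases are made-up sample strings:

```typescript
// Simplified stand-in for the amount-extraction step (symbol-prefixed amounts only).
function extractAmount(text: string): number | null {
  const digits = text.match(/(?:\$|€|£|¥)\s?(\d+(?:\.\d{1,2})?)/)?.[1];
  return digits === undefined ? null : parseFloat(digits);
}

// Each row pairs an input with the exact output we expect on every run.
const amountCases: Array<[string, number | null]> = [
  ["Spent $10.00 at Starbucks", 10],
  ["Paid €99 to Lidl", 99],
  ["Card ending 4821 was declined", null],
];

// Returns a description of every failing case; an empty array means all passed.
function runAmountCases(): string[] {
  const failures: string[] = [];
  for (const [input, expected] of amountCases) {
    const actual = extractAmount(input);
    if (actual !== expected) {
      failures.push(`${JSON.stringify(input)}: expected ${expected}, got ${actual}`);
    }
  }
  return failures;
}

console.log(runAmountCases()); // prints [] when every case passes
```

Every time a user reports a receipt format you missed, you add a row and fix the regex. With an LLM in the loop, the same regression suite can start failing without any change on your side.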

The Hybrid "Fallback" Architecture

Am I saying you should never use AI? Absolutely not. AI is incredible at handling highly unstructured, fuzzy, and semantic tasks where traditional algorithms fall apart. However, the best engineers aren't replacing their entire backend with AI; they are building hybrid systems where traditional code handles the heavy lifting, and LLMs are used only as a fallback.

Here is what a robust, cost-effective hybrid architecture looks like:

[Figure: Hybrid architecture diagram]

In this architecture, when data enters the system:

  1. The request first hits our ultra-fast, deterministic parser.
  2. If the deterministic parser successfully extracts the required data with high confidence, we return the result instantly (taking < 1ms).
  3. Only if the deterministic parser fails or returns low-confidence flags do we route the request to the LLM agent.
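The three steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the names `parseTransactionHybrid` and `isConfident` are mine, the two parsers are passed in as functions so the sketch stays self-contained, and "high confidence" is defined here simply as "both amount and merchant were extracted", which you would likely tune for real data:

```typescript
interface ParsedTransaction {
  amount: number | null;
  currency: string;
  merchant: string | null;
}

type FastParser = (rawText: string) => ParsedTransaction;
type LlmParser = (rawText: string) => Promise<ParsedTransaction>;

interface HybridResult {
  transaction: ParsedTransaction;
  source: "deterministic" | "llm"; // useful for metrics on how much traffic is deflected
}

// Assumption: a result counts as confident when both key fields were found.
function isConfident(t: ParsedTransaction): boolean {
  return t.amount !== null && t.merchant !== null;
}

async function parseTransactionHybrid(
  rawText: string,
  parseFast: FastParser,
  parseWithLlm: LlmParser,
): Promise<HybridResult> {
  // Steps 1 and 2: try the deterministic parser and return early if it is confident.
  const fast = parseFast(rawText);
  if (isConfident(fast)) {
    return { transaction: fast, source: "deterministic" };
  }
  // Step 3: only now pay for the LLM call.
  const slow = await parseWithLlm(rawText);
  return { transaction: slow, source: "llm" };
}
```

Logging the `source` field gives you a direct measurement of how many requests never touch the LLM, which is the number that drives both your bill and your tail latency.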

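By implementing this pattern, you keep most traffic away from the LLM. If the deterministic parser handles 90% of requests, for example, only the remaining 10% incur an API call, so API spend drops roughly in proportion. You also speed up response times for the vast majority of your users and keep a predictable, easily debugged core backend.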

When You SHOULD (and Shouldn't) Reach for AI

To help you navigate architectural decisions at your job or on your side projects, here is a quick rule-of-thumb checklist of when to stick to traditional code versus when to leverage generative AI models.

Do Not Use AI For:

  • Mathematical calculations: LLMs are notoriously unreliable at math because they predict tokens rather than execute formulas. Always use code.
  • State management and business logic: Keep your routing, state machines, and business rules inside deterministic code (e.g., database constraints, state libraries).
  • Structured data mapping (known schemas): If you are mapping API responses from one JSON schema to another, write a standard mapping function or use tools like Zod.
  • Simple search: For searching database fields, use Postgres indexes, full-text search, or Elasticsearch before jumping straight to vector databases and semantic embedding models.
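For the structured-mapping case in particular, the boring version is usually a handful of lines. Here is a small sketch; both schemas and all field names are hypothetical, invented only to show the pattern:

```typescript
// Hypothetical shape returned by an upstream payments API.
interface ProviderPayment {
  txn_id: string;
  amount_cents: number;
  currency_code: string;
}

// Hypothetical shape used inside our own app.
interface InternalPayment {
  id: string;
  amount: number;
  currency: string;
}

// A plain mapping function: typed, instant, and trivially unit-testable.
function toInternalPayment(p: ProviderPayment): InternalPayment {
  return {
    id: p.txn_id,
    amount: p.amount_cents / 100,
    currency: p.currency_code.toUpperCase(),
  };
}

console.log(toInternalPayment({ txn_id: "t1", amount_cents: 1250, currency_code: "usd" }));
```

When both schemas are known ahead of time, the compiler catches a renamed field at build time, which is a guarantee no prompt can give you.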

Do Use AI For:

  • Unstructured semantic search: When users want to search your database using abstract concepts (e.g., searching a photo library for "cozy winter feelings" rather than specific tags).
  • Creative generation and summarization: Summarizing lengthy articles, generating personalized email drafts, or brainstorming.
  • Fuzzy translation and classification: Classifying incoming support tickets into high-level sentiment categories or translating text between natural languages.

Conclusion

As developers, our ultimate goal is to build reliable, maintainable, cost-effective, and fast software that solves real problems. Sometimes, the most exciting tool in the developer toolkit is not the shiny new LLM—it is a well-optimized SQL index, a robust regular expression, or a carefully designed state machine.

Don’t let the hype make you feel insecure about writing traditional, "boring" code. The most successful startups and enterprise teams are the ones that use AI surgically, keeping their core architectures predictable and efficient.

What about you? Have you had to roll back an "AI feature" in favor of a deterministic solution? Or have you successfully built a hybrid architecture that keeps latency down? Let me know in the comments below!

Until next time, happy coding!
