
Why ai-armor?

The Problem

Building AI-powered applications in production means dealing with problems that don't exist in prototypes:

  • Cost spirals -- A single misconfigured loop can burn through $1,000 in minutes. Without per-user budgets, one power user can exhaust your entire API allocation.
  • No rate limiting -- AI provider rate limits are per-API-key, not per-user. You need application-level rate limiting to protect your service.
  • Zero observability -- You have no idea which models cost the most, which users are heaviest, or whether your cache is actually saving money.
  • Safety gaps -- Prompt injection, PII leakage, and runaway token counts are real attack vectors in production AI systems.

Every team ends up building these guardrails from scratch. The result is thousands of lines of bespoke infrastructure code that is hard to test, easy to get wrong, and impossible to share across projects.

The Solution

ai-armor is a single TypeScript package that handles all of these concerns:

```ts
import { createArmor } from 'ai-armor'

const armor = createArmor({
  rateLimit: {
    strategy: 'sliding-window',
    rules: [{ key: 'user', limit: 30, window: '1m' }],
  },
  budget: {
    daily: 50,
    monthly: 500,
    perUser: 10,
    onExceeded: 'downgrade-model',
    downgradeMap: { 'gpt-4o': 'gpt-4o-mini' },
  },
  cache: {
    enabled: true,
    strategy: 'exact',
    ttl: 3600,
  },
  logging: {
    enabled: true,
    include: ['model', 'tokens', 'cost', 'latency'],
  },
})
```

One configuration object. All guardrails active. Works with any AI provider, any framework, any runtime.
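
For intuition, the 'sliding-window' strategy configured above can be sketched in a few lines of plain TypeScript. This is a toy illustration of the technique, not ai-armor's implementation; the class name and method shape are invented for this example:

```typescript
// Toy sliding-window rate limiter: allow at most `limit` events per
// `windowMs` milliseconds, tracking exact timestamps rather than
// fixed buckets, so the window "slides" with each request.
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>()
  private limit: number
  private windowMs: number

  constructor(limit: number, windowMs: number) {
    this.limit = limit
    this.windowMs = windowMs
  }

  allow(key: string, now = Date.now()): boolean {
    const cutoff = now - this.windowMs
    // Drop timestamps that have aged out of the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff)
    if (recent.length >= this.limit) {
      this.hits.set(key, recent)
      return false
    }
    recent.push(now)
    this.hits.set(key, recent)
    return true
  }
}

const limiter = new SlidingWindowLimiter(3, 60_000)
console.log(limiter.allow('user-1')) // true
console.log(limiter.allow('user-1')) // true
console.log(limiter.allow('user-1')) // true
console.log(limiter.allow('user-1')) // false: 4th call inside the 1m window
```

Unlike a fixed-window counter, this never admits a burst of 2x the limit straddling a window boundary, which is why it is the safer default for per-user limits.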

Comparison

| Feature | ai-armor | Build from scratch | LiteLLM (Python) |
| --- | --- | --- | --- |
| Language | TypeScript-native | Your language | Python |
| Rate limiting | Sliding window | Must implement | Basic |
| Cost tracking | 69 models, auto pricing | Manual pricing table | Yes |
| Budget controls | Daily/monthly/per-user + auto downgrade | Must implement | Basic |
| Response caching | O(1) LRU with TTL | Must implement | Redis required |
| Safety guardrails | Prompt injection, PII, token limits | Must implement | No |
| Model routing | Aliases + tier-based routing | Must implement | Yes |
| Logging | Structured logs with callbacks | Must implement | Yes |
| AI SDK integration | First-class middleware | N/A | N/A |
| Nuxt module | `@ai-armor/nuxt` | N/A | N/A |
| Setup time | 5 minutes | Days/weeks | 30 minutes |
| Dependencies | 1 (`gpt-tokenizer`) | Many | Heavy |
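
The "O(1) LRU with TTL" row refers to a standard technique: a Map whose insertion order doubles as recency order, with an expiry timestamp per entry. A minimal sketch, not ai-armor's code; the class and field names are illustrative:

```typescript
// Toy LRU cache with TTL. JavaScript Maps iterate in insertion order,
// so re-inserting on access keeps the least recently used entry first.
// All operations are O(1) amortized.
class LruTtlCache<V> {
  private store = new Map<string, { value: V; expires: number }>()
  private maxSize: number
  private ttlMs: number

  constructor(maxSize: number, ttlMs: number) {
    this.maxSize = maxSize
    this.ttlMs = ttlMs
  }

  get(key: string, now = Date.now()): V | undefined {
    const entry = this.store.get(key)
    if (!entry) return undefined
    if (entry.expires <= now) {
      // Expired: drop it and report a miss.
      this.store.delete(key)
      return undefined
    }
    // Re-insert to mark as most recently used.
    this.store.delete(key)
    this.store.set(key, entry)
    return entry.value
  }

  set(key: string, value: V, now = Date.now()): void {
    this.store.delete(key)
    this.store.set(key, { value, expires: now + this.ttlMs })
    if (this.store.size > this.maxSize) {
      // Evict the least recently used entry (first in insertion order).
      const oldest = this.store.keys().next().value as string
      this.store.delete(oldest)
    }
  }
}

const cache = new LruTtlCache<string>(1000, 3_600_000)
cache.set('prompt-hash', 'cached response')
console.log(cache.get('prompt-hash')) // 'cached response'
```

The TTL bound matters for AI responses: model outputs for the same prompt can legitimately change (model updates, system prompt edits), so cached entries should not live forever.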

Key Differentiators

TypeScript-native

ai-armor is written in strict TypeScript with full type inference. Every configuration option, every callback, every return type is fully typed. No `any`, no runtime surprises.

```ts
// Full IntelliSense for all config options
const armor = createArmor({
  budget: {
    daily: 50,
    onExceeded: 'downgrade-model', // autocomplete: 'block' | 'warn' | 'downgrade-model'
    downgradeMap: {
      'gpt-4o': 'gpt-4o-mini',
    },
  },
})

// Return types are fully typed
const result = await armor.checkBudget('gpt-4o', { userId: 'user-1' })
// result.allowed: boolean
// result.action: string
// result.suggestedModel?: string
```

Framework-agnostic

ai-armor works with any TypeScript project. Use it directly with provider SDKs, as Vercel AI SDK middleware, or as HTTP middleware for Express/Hono/Fastify:

  • Direct SDK -- Call `armor.checkRateLimit()`, `armor.trackCost()`, etc. manually
  • AI SDK middleware -- `wrapLanguageModel()` with automatic protection
  • HTTP middleware -- `createArmorHandler()` for any Connect-compatible server
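
The Direct SDK mode can be sketched as a small wrapper around a provider call. The method names `checkRateLimit` and `trackCost` come from the list above, but the argument and return shapes here are assumptions for illustration, with a stub interface standing in for the real armor instance:

```typescript
// Hypothetical shape of the guardrail results -- not the documented API.
type RateLimitResult = { allowed: boolean }

interface ArmorLike {
  checkRateLimit(ctx: { userId: string }): Promise<RateLimitResult>
  trackCost(model: string, ctx: { userId: string }): Promise<void>
}

// Check the rate limit before the provider call, record cost after it.
async function guardedCall(
  armor: ArmorLike,
  userId: string,
  run: () => Promise<string>,
): Promise<string> {
  const rl = await armor.checkRateLimit({ userId })
  if (!rl.allowed) throw new Error('rate limited')
  const text = await run() // your provider SDK call goes here
  await armor.trackCost('gpt-4o', { userId })
  return text
}
```

The AI SDK and HTTP middleware modes wrap this same check-call-track cycle for you, so the choice between the three is about where in your stack the enforcement lives, not what gets enforced.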

Zero-config defaults

Every feature is opt-in. Start with just rate limiting, add cost tracking later, enable caching when you need it. Sensible defaults mean you don't need to configure everything upfront.
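
For example, a minimal setup might enable only rate limiting, using the same option names as the full configuration shown earlier; budget, cache, and logging simply stay off until you add them:

```typescript
import { createArmor } from 'ai-armor'

// Only rate limiting is configured; all other guardrails remain opt-in.
const armor = createArmor({
  rateLimit: {
    strategy: 'sliding-window',
    rules: [{ key: 'user', limit: 30, window: '1m' }],
  },
})
```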

Minimal dependencies

The core package depends only on `gpt-tokenizer` (for token counting). No heavy frameworks, no runtime bloat.

Next Steps

Released under the MIT License.