From the team

Yet Another AI Gateway?

Roll your own AI gateway with full control over models, providers, routing logic, guardrails, observability, and more…

January 27, 2026

There’s no shortage of AI gateways today. If you just want something that works, cloud options like OpenRouter or Vercel AI Gateway get you moving fast. If you need a more enterprise-ready setup, self-hosted solutions like LiteLLM and Portkey are well-established.

The problem we kept running into: they’re still off-the-shelf gateways. You can tweak settings, but once you want to extend them in a meaningful way, you hit a wall.

Hebo Gateway exists for teams that want the gateway to be part of their app: teams that need full extensibility.

This is the v0.1 release of Hebo Gateway, an open-source, self-hostable AI gateway designed to be embedded as a framework into your own app. You fully own your providers, model catalog, routing logic, and request lifecycle without forking or fighting a vendor.

It’s not trying to replace the established gateway solutions. It’s for the cases where those tools are almost right — but not quite flexible enough.

If you know better-auth, this should feel familiar: better-auth is “auth you embed in your app,” not a hosted auth platform you integrate around. Hebo Gateway applies the same idea to AI gateways — an embeddable, composable core that puts you in full control.

What Hebo Gateway provides

Hebo Gateway is a configurable AI gateway. You can use it (i) as a standalone self-hosted gateway service, (ii) embedded inside an existing API or backend, or (iii) just for its low-level schema and converter helpers. It includes:

  • 🌐 OpenAI-compatible /chat/completions, /embeddings, and /models
  • 🔌 Works with any Vercel AI SDK provider
  • 🧭 Canonical model IDs and normalized parameters
  • 🗂️ Extensible model catalog with rich metadata
  • 🪝 Hook system for auth, routing, rate limits, and observability
  • 🧩 Framework-native integration (Elysia, Hono, Next.js, TanStack, WinterCG)

Quick Start

Install the Package

We love Bun, but npm, pnpm, and yarn work just as well.

bun add @hebo-ai/gateway ai @ai-sdk/groq elysia
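
The npm equivalent:

npm install @hebo-ai/gateway ai @ai-sdk/groq elysia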

Configure Providers and Models

This sets up a provider registry, a canonical model catalog, and OpenAI-compatible endpoints.

import { gateway } from "@hebo-ai/gateway";
import { createGroqWithCanonicalIds } from "@hebo-ai/gateway/providers/groq";
import { gptOss20b } from "@hebo-ai/gateway/models/gpt-oss";

export const gw = gateway({
    providers: {
        groq: createGroqWithCanonicalIds({
            apiKey: process.env.GROQ_API_KEY,
        }),
    },
    models: {
        ...gptOss20b({
            providers: ["groq"]
        })
    },
});

Mount the gateway

Hebo Gateway exposes a WinterCG-compatible handler. You mount it where it makes sense.

ElysiaJS example

import { Elysia } from "elysia";
// `gw` is the gateway instance exported in the previous snippet

const app = new Elysia().mount("/v1/gateway/", gw.handler).listen(3000);

console.log("🐒 Hebo Gateway running on http://localhost:3000");

No new server model. No framework lock-in. It also works with Hono, Next.js, TanStack Start, and almost any other framework; see the Hono sketch below.
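
For comparison, here is a minimal Hono sketch, assuming gw.handler follows the standard (request: Request) => Promise<Response> fetch signature, which Hono's mount() accepts:

import { Hono } from "hono";

const app = new Hono();

// Forward everything under /v1/gateway to the WinterCG handler
app.mount("/v1/gateway", gw.handler);

// Bun (and most WinterCG runtimes) serve the default export
export default app;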

Call it like OpenAI

Since Hebo Gateway exposes OpenAI-compatible endpoints, you can use most existing AI SDKs unchanged (including Vercel AI SDK and OpenAI's own SDK).

import { createOpenAICompatible } from "@ai-sdk/openai-compatible";
import { generateText } from "ai";

const hebo = createOpenAICompatible({
  name: "hebo",
  baseURL: "http://localhost:3000/v1/gateway",
});

const { text } = await generateText({
  // Notice how this uses 'openai/gpt-oss-20b' instead of 'gpt-oss-20b':
  // the gateway automatically maps canonical model IDs to the upstream provider's own IDs
  model: hebo("openai/gpt-oss-20b"),
  prompt: "Tell me a joke about monkeys",
});

console.log(text);
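
The same endpoint also works with OpenAI's official SDK. A minimal sketch, assuming the mount from above and no auth hook configured (so any placeholder API key passes):

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://localhost:3000/v1/gateway",
  apiKey: "placeholder", // illustrative; enforce real keys via hooks (see below)
});

const completion = await client.chat.completions.create({
  model: "openai/gpt-oss-20b",
  messages: [{ role: "user", content: "Tell me a joke about monkeys" }],
});

console.log(completion.choices[0].message.content);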

Hooks: where the flexibility lives

Hooks let you plug into the gateway lifecycle without modifying the core. You can enforce authentication, implement custom rate limits, route dynamically between providers, resolve model aliases, add observability or logging, and transform requests and responses.

const gw = gateway({
    providers: { ... },
    models: { ... },
    hooks: {
        before: async (ctx: { request: Request }): Promise<RequestPatch | Response | void> => {
            // auth, rate limits, request shaping
            return undefined;
        },

        resolveModelId: async (ctx: { modelId: string }): Promise<ModelId | void> => {
            // alias → canonical ID
            return undefined;
        },

        resolveProvider: async (ctx: {
            providers: ProviderRegistryProvider;
            models: ModelCatalog;
            modelId: string;
            operation: "text" | "embeddings";
        }): Promise<ProviderV3 | void> => {
            // custom routing logic
            return undefined;
        },

        after: async (ctx: { response: Response }): Promise<Response | void> => {
            // response logging / transformation
            return undefined;
        },
    },
});
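
For example, a before hook can short-circuit unauthenticated requests by returning a Response directly. A minimal sketch; the GATEWAY_TOKEN env var and the static-token comparison are illustrative, not part of the API:

const gw = gateway({
    providers: { ... },
    models: { ... },
    hooks: {
        before: async ({ request }) => {
            // Illustrative check against a hypothetical GATEWAY_TOKEN env var
            const auth = request.headers.get("authorization");
            if (auth !== `Bearer ${process.env.GATEWAY_TOKEN}`) {
                return new Response("Unauthorized", { status: 401 });
            }
            // Returning nothing (void) lets the request proceed unchanged
        },
    },
});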

For the full documentation and advanced scenarios, see the README.md in our GitHub repository.

What’s Next

v0.1 was focused on getting the foundation right. The goal was to provide a solid, composable core that teams can confidently build on without committing to a rigid gateway runtime.

Looking ahead, our immediate focus is on hardening that core. We’re working to ensure full OpenAI compatibility across all request types and edge cases, including streaming behavior, reasoning modes, and less-common API paths. After that, we plan to add support for the recently announced Open Responses API.

Hebo Gateway is early and evolving quickly. You can follow the progress and report any issues on GitHub, talk to the team on Discord, or follow updates on X (@heboai).
