Streaming delivers tokens as the model generates them, so users see output immediately instead of waiting for the full response. It improves perceived latency, supports long responses without timeouts, and enables real-time UX such as typing indicators.
Vercel AI SDK
import { createOpenAICompatible } from "@ai-sdk/openai-compatible";
import { streamText } from "ai";

// Point an OpenAI-compatible provider at the Hebo gateway.
const hebo = createOpenAICompatible({
  name: "hebo", // provider name, required by createOpenAICompatible
  apiKey: process.env.HEBO_API_KEY,
  baseURL: "https://gateway.hebo.ai/v1",
});

// streamText starts the request and returns immediately;
// tokens arrive incrementally on result.textStream.
const result = streamText({
  model: hebo("openai/gpt-oss-20b"),
  prompt: "Tell me a very long story about monkeys",
});

// Write each chunk to stdout as soon as it is generated.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}
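Under the hood, an OpenAI-compatible endpoint streams responses as Server-Sent Events: each line is `data: {json}` carrying a delta, terminated by a `data: [DONE]` sentinel. The AI SDK parses this for you, but a minimal sketch of the format can help when debugging raw responses. Here `extractDeltas` is a hypothetical helper for illustration, not part of the AI SDK:

```typescript
// Shape of one OpenAI-style streaming chunk (only the fields used here).
interface StreamChunk {
  choices: { delta: { content?: string } }[];
}

// Hypothetical helper: pull the text deltas out of a raw SSE body.
function extractDeltas(sseBody: string): string[] {
  const deltas: string[] = [];
  for (const line of sseBody.split("\n")) {
    if (!line.startsWith("data: ")) continue; // skip blank lines and comments
    const payload = line.slice("data: ".length).trim();
    if (payload === "[DONE]") break; // end-of-stream sentinel
    const chunk = JSON.parse(payload) as StreamChunk;
    const content = chunk.choices[0]?.delta?.content;
    if (content !== undefined) deltas.push(content);
  }
  return deltas;
}

// Example: two content deltas followed by the [DONE] sentinel.
const body = [
  'data: {"choices":[{"delta":{"content":"Hel"}}]}',
  'data: {"choices":[{"delta":{"content":"lo"}}]}',
  "data: [DONE]",
].join("\n");

console.log(extractDeltas(body).join("")); // "Hello"
```

Concatenating the deltas in order reconstructs the full response, which is exactly what iterating `result.textStream` gives you.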