Streaming delivers tokens as the model generates them, so users see output immediately instead of waiting for the full response. It improves perceived latency, supports long responses without timeouts, and enables real-time UX such as typing indicators.
Vercel AI SDK
import { createOpenAICompatible } from "@ai-sdk/openai-compatible";
import { streamText } from "ai";

// Point an OpenAI-compatible provider at the Hebo gateway.
const hebo = createOpenAICompatible({
  name: "hebo", // provider name, required by createOpenAICompatible
  apiKey: process.env.HEBO_API_KEY,
  baseURL: "https://gateway.hebo.ai/v1",
});

// streamText starts the request and returns immediately;
// tokens arrive incrementally on result.textStream.
const result = streamText({
  model: hebo("openai/gpt-oss-20b"),
  prompt: "Tell me a very long story about monkeys",
});

// Write each chunk to stdout as soon as it is generated.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}
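Under the hood, an OpenAI-compatible endpoint streams responses as Server-Sent Events: each line is `data: {json}` carrying a delta, terminated by a `data: [DONE]` sentinel. The AI SDK parses this for you, but a minimal sketch of the format can help when debugging raw responses. Here `extractDeltas` is a hypothetical helper for illustration, not part of the AI SDK:

```typescript
// Shape of one OpenAI-style streaming chunk (only the fields used here).
interface StreamChunk {
  choices: { delta: { content?: string } }[];
}

// Hypothetical helper: pull the text deltas out of a raw SSE body.
function extractDeltas(sseBody: string): string[] {
  const deltas: string[] = [];
  for (const line of sseBody.split("\n")) {
    if (!line.startsWith("data: ")) continue; // skip blank lines and comments
    const payload = line.slice("data: ".length).trim();
    if (payload === "[DONE]") break; // end-of-stream sentinel
    const chunk = JSON.parse(payload) as StreamChunk;
    const content = chunk.choices[0]?.delta?.content;
    if (content !== undefined) deltas.push(content);
  }
  return deltas;
}

// Example: two content deltas followed by the [DONE] sentinel.
const body = [
  'data: {"choices":[{"delta":{"content":"Hel"}}]}',
  'data: {"choices":[{"delta":{"content":"lo"}}]}',
  "data: [DONE]",
].join("\n");

console.log(extractDeltas(body).join("")); // "Hello"
```

Concatenating the deltas in order reconstructs the full response, which is exactly what iterating `result.textStream` gives you.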