When building the observability layer for Hebo Platform, we evaluated several databases typically used for telemetry and analytics.
One name inevitably comes up in those discussions: ClickHouse.
ClickHouse is extremely fast and widely used across observability platforms. Many modern systems — from log analytics to product telemetry — are built on top of it.
But after evaluating several options, we decided to build Hebo’s observability stack on GreptimeDB instead. The reason comes down to a fundamental difference in what these databases were designed for.
The Name Tells Half the Story
GreptimeDB was designed from the beginning as a time-series database. Timestamps drive the storage layout: data is partitioned by time ranges, time columns are automatically indexed, and the write path is optimized for the append-heavy, monotonically increasing pattern that telemetry produces. Compaction, TTL, and downsampling are built into the engine rather than bolted on.
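As a concrete sketch of what "timestamps drive the storage layout" means, GreptimeDB's DDL marks the time column explicitly and takes retention as a table option. The table name, columns, and TTL value below are illustrative; check the exact syntax against the GreptimeDB docs:

```sql
-- Hypothetical telemetry table. The TIME INDEX constraint marks the
-- timestamp column that drives partitioning and indexing.
CREATE TABLE IF NOT EXISTS inference_latency (
  ts         TIMESTAMP TIME INDEX,  -- time column, indexed automatically
  model      STRING,
  latency_ms DOUBLE,
  PRIMARY KEY (model)               -- tag column for series identity
) WITH (ttl = '30d');               -- expired data is dropped by the engine
```

Retention and downsampling are declared to the engine rather than implemented as external cleanup jobs.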
Compare that to ClickHouse, which started as a general-purpose analytical database optimized for large-scale OLAP queries. It stores data in a columnar format and excels at aggregations across massive datasets, but time is just another dimension — not a first-class concern in the storage engine or query planner.
Observability data is fundamentally time-series data. Every trace, span, metric, and log entry is anchored in time — and when your entire workload revolves around time-based ingestion and range queries, a storage engine built for that access pattern has a real advantage.
We are ingesting traces, token usage, inference latency, model metadata, and many other signals generated by AI systems. This workload looks much more like time-series telemetry than traditional analytics, and GreptimeDB’s architecture reflects that directly.
Object Storage Native (Without Cost Explosion)
Observability data grows quickly. A system that records traces, logs, and AI telemetry can accumulate terabytes of data in a relatively short time.
Most databases handle this by eventually tiering cold data to object storage, but that usually means operating two separate systems — the primary DB on block storage, and a separate cold tier in S3 — with synchronization logic, duplicated bytes, and two sets of infrastructure to maintain.
GreptimeDB’s architecture is different. It uses a disaggregated storage model where the storage layer is separated from the compute layer from the start. Under the hood, GreptimeDB runs a WAL for durability, an in-memory memtable for recent writes, and flushes SST files (sorted string tables) directly to S3-compatible object storage. A local cache layer keeps hot segments fast. There’s no second system to operate and no duplicated data — the object store is the database storage.
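The write path described above can be sketched in a few lines of Python. This is a toy model, not GreptimeDB's implementation: a plain dict stands in for the S3 bucket, and file formats, compaction, and the cache layer are simplified away:

```python
import json

class MiniLsm:
    """Toy sketch of a WAL + memtable + SST-flush write path.
    A dict stands in for the S3-compatible object store."""

    def __init__(self, flush_threshold=3):
        self.wal = []            # append-only log for durability
        self.memtable = {}       # recent writes, held in memory
        self.object_store = {}   # "S3": immutable sorted files
        self.flush_threshold = flush_threshold
        self._sst_seq = 0

    def put(self, ts, value):
        self.wal.append((ts, value))       # durability first
        self.memtable[ts] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Sort by timestamp and write an immutable SST "file" to the store.
        sst = sorted(self.memtable.items())
        key = f"sst/{self._sst_seq:06d}.json"
        self.object_store[key] = json.dumps(sst)
        self._sst_seq += 1
        self.memtable.clear()
        self.wal.clear()                   # safe: data now lives in object storage

    def get(self, ts):
        if ts in self.memtable:            # hot path: recent writes
            return self.memtable[ts]
        for blob in self.object_store.values():   # cold path: flushed SSTs
            for k, v in json.loads(blob):
                if k == ts:
                    return v
        return None

db = MiniLsm()
for i in range(4):
    db.put(i, f"span-{i}")
```

The point of the sketch is the ownership model: once flushed, the object store holds the only copy, so there is no second cold tier to synchronize.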
This is the same architectural direction that modern cloud-native databases like Snowflake and Neon have taken, and it’s the right model for workloads with large retention requirements. For observability data that accumulates continuously and is queried infrequently beyond a recent window, it maps well.
All of that comes with strong performance: in the Billion JSON-Document Challenge, GreptimeDB outperformed every other database tested, including ClickHouse.
A Simpler Observability Architecture
The other major reason we chose GreptimeDB is that it dramatically simplifies the ingestion pipeline.
If you look at the Langfuse v3 infrastructure evolution post, you’ll see a representative serious observability stack: Kafka for buffering, ingestion workers, transformation pipelines, ClickHouse for storage, and S3 for cold data. That architecture is well-reasoned and handles Langfuse’s scale. It also means operating five or six separate systems.
GreptimeDB collapses most of that because it exposes a native OTLP endpoint — the same protocol your SDK already speaks. Traces are written directly over HTTP, with no intermediate queue or processing workers.
Standard observability pipeline:
SDK → queue → workers → transformation → database
Our pipeline with GreptimeDB:
SDK → OpenTelemetry → GreptimeDB
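To make the direct path concrete, here is a stdlib-only sketch that builds an OTLP/HTTP request by hand using OTLP's JSON encoding. In practice an OpenTelemetry SDK exporter does this for you; the endpoint path and database header follow GreptimeDB's docs but should be treated as assumptions to verify, and the service and model names are made up:

```python
import json, time, urllib.request

def otlp_trace_payload(model: str, duration_ns: int) -> dict:
    """Minimal OTLP/HTTP JSON payload: one resource, one scope, one span."""
    now = time.time_ns()
    return {
        "resourceSpans": [{
            "resource": {"attributes": [
                {"key": "service.name", "value": {"stringValue": "hebo-gateway"}},
            ]},
            "scopeSpans": [{
                "scope": {"name": "hebo.observability"},
                "spans": [{
                    "traceId": "0af7651916cd43dd8448eb211c80319c",
                    "spanId": "b7ad6b7169203331",
                    "name": "chat gpt-4o",
                    "startTimeUnixNano": str(now - duration_ns),
                    "endTimeUnixNano": str(now),
                    "attributes": [
                        {"key": "gen_ai.operation.name", "value": {"stringValue": "chat"}},
                        {"key": "gen_ai.request.model", "value": {"stringValue": model}},
                    ],
                }],
            }],
        }]
    }

payload = otlp_trace_payload("gpt-4o", 250_000_000)

# Assumed GreptimeDB OTLP endpoint; verify the path and headers against the docs.
req = urllib.request.Request(
    "http://localhost:4000/v1/otlp/v1/traces",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json", "x-greptime-db-name": "public"},
)
# urllib.request.urlopen(req)  # uncomment with a running GreptimeDB
```

Everything between the SDK and the database is gone; the span goes straight from process memory to a single HTTP POST.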
This works because GreptimeDB handles the separation of concerns internally. It’s built around three roles: a stateless frontend that handles query and ingestion protocols, datanodes that store the actual data, and metasrv that coordinates routing across the cluster. These scale independently of each other, so you don’t need to build that horizontal scalability yourself with Kafka and workers — the database provides it. The result is fewer moving parts, less operational overhead, and a system that’s much easier to reason about.
Built for OpenTelemetry (and Gen-AI Observability)
At Hebo, OpenTelemetry is our primary signal format. GreptimeDB has native OTLP support and maps trace and span data directly to its columnar storage model.
This integrates cleanly with the emerging Gen-AI semantic conventions, which define a standard schema for LLM observability. Fields like:
- gen_ai.operation.name — the type of operation (chat, completion, embedding, etc.)
- gen_ai.request.model / gen_ai.response.model — the model requested and the model that actually served the response
- gen_ai.usage.input_tokens / gen_ai.usage.output_tokens — token counts for cost and quota tracking
- gen_ai.input.messages / gen_ai.output.messages — the full message input and output
- gen_ai.server.request.duration — inference latency per request
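As a small illustration of how these fields combine in practice, here is a client-side sketch that rolls up token usage per model from span attributes. The span data is made up; in production this aggregation happens in the database:

```python
from collections import defaultdict

# Sample span attributes following the Gen-AI semantic conventions (made-up values).
spans = [
    {"gen_ai.response.model": "gpt-4o",
     "gen_ai.usage.input_tokens": 1200, "gen_ai.usage.output_tokens": 300},
    {"gen_ai.response.model": "gpt-4o",
     "gen_ai.usage.input_tokens": 800, "gen_ai.usage.output_tokens": 150},
    {"gen_ai.response.model": "claude-sonnet",
     "gen_ai.usage.input_tokens": 500, "gen_ai.usage.output_tokens": 250},
]

# Sum input/output tokens per serving model, e.g. for cost tracking.
totals = defaultdict(lambda: {"input": 0, "output": 0})
for span in spans:
    model = span["gen_ai.response.model"]
    totals[model]["input"] += span["gen_ai.usage.input_tokens"]
    totals[model]["output"] += span["gen_ai.usage.output_tokens"]

# totals["gpt-4o"] → {"input": 2000, "output": 450}
```

Because the field names are standardized, the same rollup works across any instrumentation that follows the conventions.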
One particularly useful property here is that GreptimeDB uses a dynamic schema. When a new span attribute arrives that hasn't been seen before, the column is created automatically — no migrations, no schema management. As the Gen-AI semantic conventions evolve and new fields get added, they just appear without any changes on our side.
All of this lands as queryable SQL columns in the opentelemetry_traces table. Span attributes are flattened directly into columns, so you can query trace data directly, with no custom schema design or ETL step required:

```sql
SELECT "span_attributes.gen_ai.request.model",
       AVG("span_attributes.gen_ai.server.request.duration")
FROM opentelemetry_traces
GROUP BY "span_attributes.gen_ai.request.model";
```
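Such queries can also be run programmatically over GreptimeDB's HTTP SQL API. The endpoint path and form parameter below are taken from the docs but should be treated as assumptions to verify; the sketch only builds the request, leaving the actual call commented out:

```python
import urllib.parse, urllib.request

def sql_request(host: str, sql: str, db: str = "public") -> urllib.request.Request:
    """Build a POST against GreptimeDB's HTTP SQL API (assumed path /v1/sql)."""
    return urllib.request.Request(
        f"http://{host}/v1/sql?db={db}",
        data=urllib.parse.urlencode({"sql": sql}).encode(),
        method="POST",
    )

query = (
    'SELECT "span_attributes.gen_ai.request.model", '
    'AVG("span_attributes.gen_ai.server.request.duration") '
    "FROM opentelemetry_traces "
    'GROUP BY "span_attributes.gen_ai.request.model"'
)
req = sql_request("localhost:4000", query)
# urllib.request.urlopen(req)  # returns JSON rows when GreptimeDB is running
```

The same endpoint serves dashboards, ad-hoc analysis, and automated reporting, so no separate query service is needed.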
Great Developer Experience from Laptop to Cluster
GreptimeDB transitions smoothly from development to production.
Locally, it runs as a single Docker container in standalone mode — all three roles bundled into one process. That’s sufficient for development and works fine for smaller production deployments too.
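For reference, standalone mode is typically started like this; the image name, ports, and flags reflect the current install guide but should be verified against it:

```shell
# Single process: frontend, datanode, and metasrv bundled together.
docker run -p 4000-4003:4000-4003 \
  -v "$(pwd)/greptimedb_data:/greptimedb_data" \
  --name greptime \
  greptime/greptimedb standalone start \
  --http-addr 0.0.0.0:4000
```

One container gives you the HTTP, OTLP, and SQL endpoints locally, which is all a development loop needs.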
When you need to scale, Helm charts deploy cluster mode, where the three roles described above split into independently scalable components. Heavy query load? Scale the frontends. More ingestion volume? Add datanodes. You don’t have to scale everything together.
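Scaling a role independently then comes down to a replica count in the chart values. The keys below are illustrative and should be checked against the greptimedb-cluster Helm chart:

```yaml
# Hypothetical values.yaml fragment for the greptimedb-cluster chart.
frontend:
  replicas: 3   # scale for query and ingestion-protocol load
datanode:
  replicas: 5   # scale for storage and write volume
meta:
  replicas: 3   # coordination layer; small and stable
```

Adjusting one number scales one role, which is exactly the independence the architecture promises.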
The open-source version covers all of this. The enterprise edition adds features like read replicas for read/write separation under high query concurrency, and there’s GreptimeCloud if you’d rather not operate the infrastructure yourself. The architecture is the same across all tiers, so there’s no lock-in to a particular deployment model.
An Extremely Responsive Team
An impressive aspect of working with GreptimeDB has been the responsiveness of the team.
Their Slack community is active, and when we ran into a Unicode encoding issue during querying, the team responded quickly. Their CTO opened a GitHub issue to track it directly.
For a younger project, that kind of engagement matters a lot. We’re also looking forward to the upcoming JSON v2 data type, which will make working with structured observability metadata — tool call arguments, model parameters, custom attributes — significantly cleaner.
What’s Next
Observability is the first use case, but not the last. We’re planning to use GreptimeDB for the upcoming Hebo /conversations API as well — storing and querying large-scale AI interaction histories. The same architecture that handles telemetry applies directly to interaction logs at scale.
If you’re building AI infrastructure, observability platforms, or LLM tooling, GreptimeDB is worth a serious look. The combination of a time-series-native storage engine, built-in object storage, and native OTLP ingestion is a strong fit for this class of workload.
And if you want to see it in action, try the Hebo observability dashboard — traces, token usage, inference latency, and Gen-AI signals, all in one place.