Hebo Evals: Evaluate Prompts / LLMs / Agents

Hebo Evals: Markdown for Evals, a human-first format

We explored existing evaluation solutions. While powerful, most felt built for developers. But who ultimately owns the outcomes? The business does.

Effective evals shouldn’t live in code. They need to be written, reviewed, and iterated on by business teams — the people who actually define what “good” looks like.

And honestly: who enjoys writing evals full of curly braces, JSON, and DSLs?

Agents already speak Markdown. Humans do too. Markdown is not only a natural format for agents; it is also easier for people to read, write, and reason about.

On top of that, Markdown has a rich ecosystem of tooling, enabling Notion-like editing experiences that lower the barrier even further.

That’s why we introduced .MDE — Markdown for Evals. A simple, human-first format for defining evaluation logic that both business and technical teams can understand and evolve together.

This is v0.1, and we’re actively looking for early feedback. Give it a try — and let us know what you think via X (@heboai) or on Discord.