Open-source platform from Comet for debugging and evaluating LLM and agent apps: full tracing of calls, tools, and agent steps, LLM-as-a-judge and heuristic evals, prompt management, and production dashboards. Self-host via Docker or Kubernetes, or use Comet's hosted cloud.
Observability · Comet
Opik
Open-source LLM evaluation, tracing, and monitoring.
Model support
BYO key / model
Framework-agnostic tracing; LLM-as-a-judge metrics run on a model you configure.
Where it runs
- Web
- API
Tags
- #observability
- #evaluation
- #tracing
- #open-source
Related in Observability
View Arize Phoenix details ObservabilityFREEMIUMOpen coreARArize Phoenix
Arize AI
Open-source LLM tracing + evaluation. Strong on retrieval debugging.
Phoenix is Arize's open-source observability — run locally in a notebook or as a service. Especially strong for inspecting RAG pipelines, finding bad chunks, and tracking retrieval quality over time.
AI insight: Spins up inside a Jupyter notebook and is sharpest at RAG debugging — finding the bad chunk that poisoned a retrieval.
- open-source
- tracing
- rag
- retrieval-debugging
View Helicone details ObservabilityFREEMIUMOpen coreHEHelicone
Helicone
Drop-in LLM proxy with logging, caching, and cost tracking.
One-line integration — change your OpenAI/Anthropic base URL and get a dashboard with every prompt, response, latency, and dollar tracked. Adds caching and rate-limit handling without code changes.
AI insight: Integrate by changing one base-URL line — no SDK wrapper — and it's open-source, so you can self-host the proxy.
- proxy
- logging
- caching
- cost-tracking
View Langfuse details ObservabilityFREEMIUMOpen coreLALangfuse
Langfuse
Open-source LLM observability. Self-hostable, OpenTelemetry-native.
Tracing, evals, prompt management, and dataset tooling for LLM apps — self-host on your own infra or use Langfuse Cloud. The open-source default when you want full ownership of your observability stack.
AI insight: The self-hostable, OpenTelemetry-native answer to LangSmith — pick it when observability data has to stay on your own infra.
- open-source
- tracing
- evals
- self-hosted
View LangSmith details ObservabilityFREEMIUMLALangSmith
LangChain
LangChain's hosted observability + eval platform.
Tracing, dataset management, eval orchestration, and prompt playground from the LangChain team. Pairs naturally if LangChain or LangGraph already runs in your stack, but works standalone via SDKs.
AI insight: Despite the name it works without LangChain in your stack — but it's cloud-only, where Langfuse lets you self-host.
- tracing
- evals
- datasets
- langchain