Local-First Observability. Why Your LLM Traffic Should Stay Home

Orchid Team5 min read

Think about what actually flows through your LLM calls. System prompts that encode your business logic and competitive edge. User messages that may contain customer names, account details, and health or financial information. Retrieved documents from your internal knowledge base. Tool outputs from your production databases.

Now consider what most LLM observability platforms ask you to do with that traffic. Send a copy of all of it to their cloud.

For some teams that's an acceptable trade. For many others, in healthcare, finance, legal, or anywhere with a serious data processing agreement, it's a non-starter that turns a tooling decision into a months-long vendor review. And even when it's allowed, every additional copy of sensitive data is an additional thing that can leak.

Orchid takes a different position. Observability data should live where it's generated. On your machine, in your VPC, under your control.

What Local-First Means in Practice

Orchid is a single Rust binary. It runs as a proxy between your application and your LLM providers, records every exchange, and serves a web visualizer, a query API, and an MCP server. The whole story is in Record, Inspect, Replay. Architecturally, the local-first commitment breaks down into a few concrete guarantees.

Requests go only where they were already going. The proxy forwards traffic exclusively to the upstream APIs your application was calling anyway. There is no side channel, no telemetry endpoint, no phone-home. Orchid adds zero new destinations to your network diagram.

Everything recorded stays in a local SQLite database. Captured exchanges are written to a single database file inside the container or on your mounted volume. There is no cloud backend because there is no cloud. Backup, retention, and access control work with the tools you already use for files.

Secrets are scrubbed before disk. The Authorization header carrying your provider API key is forwarded untouched to the upstream but never stored. Headers, query strings, and body fields with secret-like names, such as keys, tokens, passwords, credentials, and cookies, are stored as [REDACTED]. The scrubbing happens in memory before anything is written.

One honest caveat, because trust requires honesty. Redaction works by recognizing field names, not by scanning prompt contents. Prompt and completion text is recorded verbatim, which is the entire point of a forensic recorder. If someone pastes a secret into a prompt, it will be stored with the rest of the prompt text. Treat the database file with the same care as your application logs.

Small Enough to Audit, Boring Enough to Trust

A security posture is easier to evaluate when the system is simple. Orchid's runtime footprint is one process and one file.

One binary. The proxy, the query API, the embedded React visualizer, and the MCP server all ship in a single executable. There is no fleet of services to harden.
One SQLite file. Not a managed database, not a message queue, not an object store. A file you can copy, encrypt, inspect with standard tools, or delete.
Bounded by design. Built-in retention automatically prunes old sessions, oldest first, by age and by database size cap. Recordings don't grow without limit, and the active session is never pruned mid-run.

This simplicity also makes deployment flexible. Run it on a laptop during development, as a sidecar in your cluster, or on a small VM shared by the team. The data stays wherever you put it.

No Vendor Lock-In, by Construction

Local-first has a quieter benefit beyond privacy. Your data is yours in format, not just in location.

Recordings live in SQLite, arguably the most durable and widely supported storage format in software. Sessions export to plain JSON fixtures through the API or the visualizer, which is how the zero-cost replay testing workflow moves recordings into CI. If you stop using Orchid tomorrow, your historical data remains readable with tools that will exist for decades.

Compare that with cloud observability platforms, where your traces live behind a proprietary API and an export is whatever the vendor's rate limits allow.

Local Doesn't Mean Limited

It's reasonable to assume a local tool trades away capability. Here the recording is the product, and it's all local, so nothing is traded away.

The embedded visualizer gives you timeline views, payload inspection, search, and session comparison at http://localhost:4321.
The MCP server lets your coding assistant query recordings, and it runs on the same port with the same auth.
Cost tracking computes USD attribution per exchange with a pricing schema you control.
Replay mode serves recorded responses for offline tests and clean performance benchmarks.

And because the proxy is header-driven, all of it works from any language without an SDK.

Questions Worth Asking Any Observability Vendor

Whether or not you choose Orchid, these questions are worth putting to anything that touches your LLM traffic.

Where exactly does my prompt data rest, and who can query it?
What happens to stored API keys and credentials in captured requests?
Can I export everything in an open format, today, without a support ticket?
What is the retention story, and do I control it?
How many new network destinations does this add to my architecture?

Orchid's answers are short. Your disk, never stored, yes and it's JSON and SQLite, you control it with two flags, and zero.

If those are the answers you want, the proxy is one Docker command away. Your traffic never leaves home, and you stop trading privacy for visibility. For more detail on security, API key handling, and data retention, see the FAQ. Get started at orchidtrace.xyz or visit the GitHub repository.