Platform

Tech stack

The specific pieces we use (Cloudflare Workers, Claude, AI Gateway) and how we built the platform so we can swap any of them.

A short tour of the technologies under Luna’s agent platform, what each one does for us, and why we chose it.

Cloudflare — where the whole thing runs

Every Luna-owned piece of this system runs on Cloudflare. We picked it for three reasons:

  • One account, many building blocks. Workers (stateless compute), Durable Objects (stateful storage at the edge), R2 (object storage), Pages (static sites — including this intranet), AI Gateway (LLM audit + rate limiting), Access (Zero Trust auth). They all share billing, authentication, and observability, which means we don’t stitch together five different vendors.
  • Service bindings. Internal Workers can call each other without going over the public internet. When luna-router calls luna-agent-basal, there’s no DNS lookup, no TLS handshake — it’s an edge-private call. Faster and harder to attack.
  • Cost structure. We pay per request and per GB, not per instance. An idle agent incurs no compute charges.
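The service-binding call described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual luna-router code: the binding name `AGENT_BASAL` and the `ServiceBinding` interface are hypothetical (the interface only mirrors the `fetch` method that a Cloudflare service binding exposes).

```typescript
// Minimal shape of a service binding: it exposes fetch(), but the call
// is dispatched inside Cloudflare rather than over the public internet.
interface ServiceBinding {
  fetch(request: Request): Promise<Response>;
}

// Hypothetical environment for luna-router; the binding name is an assumption.
interface RouterEnv {
  AGENT_BASAL: ServiceBinding;
}

// Hand the incoming request to the agent Worker through the binding.
function forwardToAgent(request: Request, env: RouterEnv): Promise<Response> {
  return env.AGENT_BASAL.fetch(request);
}
```

Because the binding is injected through `env`, the router code never contains a public URL for the agent, which is what keeps the call edge-private.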

The specific Cloudflare primitives we use

Primitive | What we use it for
--- | ---
Workers | Every Luna service (luna-slack-dm, luna-router, luna-agent-basal, luna-ai-proxy) is a Worker. The Basal agent is currently deployed under the legacy name luna-agent-pa: same Worker, name will catch up to the brand on the next pass.
Durable Objects | One per person. Holds your Basal conversation history in SQLite.
R2 | Three buckets: luna-agent-config (prompt templates + agent config, etag-cached), luna-audit-log (Workers Logpush trace events at 100% sampling), and luna-agent-archive (long-term conversation archives). No egress fees.
AI Gateway | The luna-agents gateway sits in front of every model call. Rate limits, cost caps, audit trail, provider-key injection.
Analytics Engine | Per-Worker datasets (luna_agent_pa_events, luna_ai_proxy_events, luna_router_events) capture latency, model, status, and hashed user id for every request.
Workers Logpush | Streams Workers Trace Events to the luna-audit-log R2 bucket at 100% sampling. Full request-level observability without an external logging vendor.
Access | Zero Trust auth. Gates this intranet (and the future admin dashboard) behind Google Workspace SSO for @lunadiabetes.com.
Pages | Hosts this intranet. Static HTML built with Astro, deployed on every push.
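To make the "one per person" Durable Object entry concrete, here is a minimal sketch of how a Worker might look up a person's object. The binding name `CONVERSATIONS`, the `userKey` parameter, and the `...Like` interfaces are assumptions for illustration; the interfaces mirror only the `idFromName` and `get` methods of a Durable Object namespace.

```typescript
// Minimal slices of the Durable Object namespace API used in this sketch.
interface DurableObjectIdLike {
  toString(): string;
}
interface DurableObjectStubLike {
  fetch(request: Request): Promise<Response>;
}
interface DurableObjectNamespaceLike {
  idFromName(name: string): DurableObjectIdLike;
  get(id: DurableObjectIdLike): DurableObjectStubLike;
}

// Hypothetical binding for the per-person conversation store.
interface AgentEnv {
  CONVERSATIONS: DurableObjectNamespaceLike;
}

// idFromName is deterministic, so the same user key always resolves
// to the same Durable Object (and therefore the same history).
function conversationFor(userKey: string, env: AgentEnv): DurableObjectStubLike {
  const id = env.CONVERSATIONS.idFromName(userKey);
  return env.CONVERSATIONS.get(id);
}
```

Deriving the object id from a stable per-person key is what gives each person exactly one object without a separate lookup table.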

Claude — the model behind the answers

The language model powering Basal today is Claude, from Anthropic. Specifically, we default to Claude Opus 4.7 (Anthropic’s flagship reasoning model), with the ability to fall back to Claude Sonnet 4.6 for lower-latency work and Claude Haiku 4.5 for cheap, fast classification tasks.

We picked Claude for four reasons specific to Luna’s situation:

  • Safety and refusal behavior. Medical context is full of edge cases where an AI could give dangerous advice. Claude is trained to be careful in exactly those situations without being uselessly hedged.
  • Zero Data Retention. Luna’s Anthropic workspace has ZDR enabled, meaning our interactions are never logged, stored, or used to train Anthropic’s models. This is a contractual guarantee, not a setting.
  • Long context. Claude’s 200K-token context window means we can feed it long documents, transcripts, or conversation histories without awkward chunking.
  • Tool use. Claude is built to call tools (APIs, searches, code) reliably. That matters a lot for the next wave of agents, such as the Data Agent, which depends on calling a SQL tool correctly.

Model switching — designed for it from day one

Here’s an architectural choice worth calling out: the agents (luna-agent-basal, future ones) don’t know which model they’re talking to. They just post a message to luna-ai-proxy and get a response.

The proxy is where model selection happens. Switching from Claude Opus 4.7 to Sonnet 4.6, or adding a Google or Workers AI model alongside, is a configuration change in one place — not a rewrite across every agent.
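One way to picture "a configuration change in one place" is a single lookup table inside the proxy. The sketch below is hypothetical: the tier names, the `AgentMessage` shape, and `selectModel` are assumptions, and the model strings are the display names used on this page rather than real API model identifiers.

```typescript
// Hypothetical task tiers a request can fall into.
type Tier = "default" | "low-latency" | "classification";

// The one place where tiers map to models. Changing a model means
// editing this table, not the agents.
const MODEL_BY_TIER: Record<Tier, string> = {
  "default": "Claude Opus 4.7",
  "low-latency": "Claude Sonnet 4.6",
  "classification": "Claude Haiku 4.5",
};

// What an agent sends: note there is no model field.
interface AgentMessage {
  text: string;
  tier?: Tier;
}

// Resolve the model for a message, falling back to the default tier.
function selectModel(message: AgentMessage): string {
  return MODEL_BY_TIER[message.tier ?? "default"];
}
```

Under this design, swapping the default from one model to another is a one-line edit to `MODEL_BY_TIER`.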

Providers wired up through the gateway today:

  • Anthropic — Claude (default for conversational work).
  • Google AI Studio — available as an alternate provider.
  • Workers AI — Cloudflare’s own model catalog, used for cheaper classification and embedding tasks as we add them.

The proxy selects a provider based on the request path it receives (/anthropic/, /google-ai-studio/, /workers-ai/). Agents send a message and the proxy decides, which means we can A/B test, fail over automatically, or swap providers entirely without changing any agent code.
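The path-based selection could look something like the sketch below. Only the three path prefixes come from this page; the `Provider` type and the `providerFromPath` helper are hypothetical names, and how the real proxy handles unknown paths is not documented here (this sketch returns `null`).

```typescript
// The three providers named on this page, keyed by their path prefix.
type Provider = "anthropic" | "google-ai-studio" | "workers-ai";

const PROVIDERS: Provider[] = ["anthropic", "google-ai-studio", "workers-ai"];

// Return the provider named by the first path segment, or null if the
// segment is missing or not a known provider.
function providerFromPath(pathname: string): Provider | null {
  // "/anthropic/foo" splits to ["", "anthropic", "foo"]; drop empty segments.
  const segments = pathname.split("/").filter((segment) => segment.length > 0);
  const first = segments[0];
  for (const provider of PROVIDERS) {
    if (provider === first) return provider;
  }
  return null;
}
```

For example, `providerFromPath("/workers-ai/")` yields `"workers-ai"`, while an unrecognized prefix yields `null` so the caller can reject the request.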

This site

The intranet you’re reading runs on the same stack:

  • Astro — static-site generator, builds pure HTML.
  • Tailwind CSS — styling, keyed to Luna’s design system.
  • Cloudflare Pages — hosts the built HTML.
  • Cloudflare Pages Functions — the /api/chat endpoint that powers the in-browser chat box. It lives at the network edge and forwards requests into luna-ai-proxy, so the browser never sees any model API keys.
  • Cloudflare Access — wraps everything, enforcing Google Workspace SSO before any request reaches the site.
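The /api/chat forwarding step can be sketched as a small handler. Everything named here is an assumption for illustration: the `AI_PROXY` binding, the `handleChat` function, and the placeholder upstream URL are hypothetical, and the real endpoint may validate or reshape the body differently.

```typescript
// Minimal shape of a service binding (mirrors its fetch method).
interface ServiceBinding {
  fetch(request: Request): Promise<Response>;
}

// Hypothetical environment for the Pages Function.
interface ChatEnv {
  AI_PROXY: ServiceBinding;
}

async function handleChat(request: Request, env: ChatEnv): Promise<Response> {
  if (request.method !== "POST") {
    return new Response("Method Not Allowed", { status: 405 });
  }
  const body = await request.text();
  // Build a fresh upstream request so that only the chat body is forwarded.
  // No model API key appears here: keys are added further upstream, so the
  // browser never handles them. The URL is a placeholder.
  const upstream = new Request("https://luna-ai-proxy.internal/chat", {
    method: "POST",
    headers: { "content-type": "application/json" },
    body,
  });
  return env.AI_PROXY.fetch(upstream);
}
```

In a Pages project, a handler like this would typically be wired up by exporting `onRequestPost` from the function file and passing it `context.request` and `context.env`.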

See the architecture page for the request flow, or the privacy page for what’s kept private and what isn’t.