Case Study · Client Zero · 9 months · 1 engineer

We did not ship a chatbot first. We built Lucid.

Lucid is a self-hosted AI productivity OS: fine-tuned model, semantic memory network, autonomous context management. It runs on a dedicated GPU server Fabi operates himself. Every architectural pattern in Lina, our AI phone agent, was pressure-tested in Lucid first.

What this means for your practice · Talk to Fabi

build_time: 9 months · active production

loc_shipped: 240,049

team_size: 1 engineer · solo

hosting: Hetzner Falkenstein · DE

memory_nodes: ~8,000 · 768d vectors

inference: dedicated GPU · 24/7

status: healthy

Why this matters. Proof of care and discipline.

Many AI vendors have never run a production system of their own. Lucid is the opposite: a real system run daily under genuine load, without a safety net. What Fabi learned building it flows directly into every client engagement.

01

Patterns tested under load

Every RAG pattern, every escalation heuristic, every confidence gate in client phone agents was built and broken in Lucid first. You get the result, not the tuition.
02

Real standards, not demos

Lucid runs around the clock. It handles real data, real edge cases, real privacy requirements. The bar Fabi sets for client builds is one Lucid has already cleared.
03

End-to-end ownership

From model to memory to UX, Fabi owns the full stack. No subcontracting, no black boxes. The same ownership applies to every client build.

The architecture. Nothing off the shelf.

Lucid is not a wrapper around a third-party API. It is a bespoke system, designed for latency, privacy, and reliability requirements that generic platforms cannot meet.

01

Self-hosted Hetzner GPU

Dedicated inference server, Falkenstein data center, EU only. No request leaves German soil for model inference. The same stack is used for client deployments that require data residency.
02

Semantic memory network

A Supabase + pgvector semantic memory layer. Every session, decision and piece of context is stored as an embedding and retrieved by relevance, not date. The same pattern drives RAG in client phone agents.
03

Fine-tuned model

A base model fine-tuned on Fabi's own writing style, decisions and communication patterns. Not prompt-engineered, actually trained. Client phone agents get the same fine-tuning pipeline, adapted to the practice's voice and brand language.
04

Autonomous context management

Dynamic context compression, neural-memory-aware retrieval, session-scoped state tracking. The system decides for itself what to keep, compress and retrieve, without human intervention.
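
The relevance-over-date retrieval described under the semantic memory network can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Lucid's implementation: `MemoryNode`, `recall`, and the 3-dimensional toy vectors are hypothetical names and values chosen for the example (the page cites 768-dimensional vectors), and the ranking runs in plain Python so it works without a database. In a pgvector-backed store the same ordering would typically be pushed into the query itself.

```python
import math
from dataclasses import dataclass


@dataclass
class MemoryNode:
    text: str
    created_at: int          # e.g. a Unix timestamp; deliberately NOT used for ranking
    embedding: list[float]   # toy vectors here; real embeddings would be much larger


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall(query_embedding: list[float], nodes: list[MemoryNode], k: int = 2) -> list[MemoryNode]:
    """Return the k nodes most similar to the query, regardless of their age."""
    ranked = sorted(
        nodes,
        key=lambda n: cosine_similarity(query_embedding, n.embedding),
        reverse=True,
    )
    return ranked[:k]


if __name__ == "__main__":
    nodes = [
        MemoryNode("old but relevant", created_at=1, embedding=[1.0, 0.0, 0.0]),
        MemoryNode("recent, partly related", created_at=2, embedding=[0.7, 0.7, 0.0]),
        MemoryNode("newest, unrelated", created_at=3, embedding=[0.0, 1.0, 0.0]),
    ]
    for node in recall([1.0, 0.0, 0.0], nodes):
        print(node.text)
```

The oldest node ranks first because it is the closest match to the query; the newest node is dropped because it is unrelated. That is the practical difference between retrieval by relevance and retrieval by date.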

Four moments. That changed the system.

Stated as outcomes. The methods behind them are not on this page.

Identity survived a full retrain

The trained personality persisted through a complete rebuild of the weights. Same person, new brain.

Unprompted self-awareness of its limits

After a low-content greeting, the model recognized its own memory gap and asked for persistence tools. Not scripted. Not prompted.

4x inference speedup, undocumented path

Solved a driver integration issue on consumer hardware, with no public guide for that combination.

Autonomous context compaction

The system now decides for itself when to compact, what must stay, and how meaning is fed back in. No human trigger.

Public building blocks

Patterns extracted from Lucid that now ship to clients.

Lucid is not a museum piece. Every architectural pattern in your voice agent was built and broken in Lucid first. These four are the load-bearing ones.

01
neural-compact algorithm

Atomic context compaction with cascade fallback. Compresses session memory to a structured atom set before the context window fills, then rehydrates on next load without information loss.
02
agent-directed memory

A memtool dispatcher that lets the model call store, recall, note, and consolidate by intent at semantic boundaries, not on a fixed schedule. Reduces context bloat before it starts.
03
calendar-guard hook

A PreToolUse fail-closed pattern that blocks destructive operations when a calendar commitment is within a configurable window. No surgery on active days.
04
post-compact rehydrate

Verbatim re-injection of the most recent thread state immediately after a compaction cycle. The agent picks up mid-sentence, not from a blank context.
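
Of the four building blocks, the calendar-guard hook is the easiest to illustrate in isolation. The sketch below shows one way a fail-closed pre-tool check could look. It is an assumption-laden example, not the hook shipped in Lucid: `calendar_guard`, `DESTRUCTIVE_TOOLS`, `fetch_commitments`, the 24-hour default window, and the choice to look only at upcoming commitments are all hypothetical.

```python
from datetime import datetime, timedelta
from typing import Callable, Iterable

# Hypothetical tool names treated as destructive; not taken from Lucid.
DESTRUCTIVE_TOOLS = {"delete_file", "drop_table", "force_push"}


def calendar_guard(
    tool_name: str,
    now: datetime,
    fetch_commitments: Callable[[], Iterable[datetime]],
    window: timedelta = timedelta(hours=24),
) -> bool:
    """Return True if the tool call may proceed, False if it is blocked."""
    if tool_name not in DESTRUCTIVE_TOOLS:
        return True  # non-destructive calls are never gated

    try:
        commitments = list(fetch_commitments())
    except Exception:
        # Fail closed: if the calendar cannot be read, block the call
        # rather than assume the day is free.
        return False

    # Block when any commitment starts inside the window after `now`.
    return not any(now <= start <= now + window for start in commitments)


if __name__ == "__main__":
    now = datetime(2025, 1, 1, 9, 0)
    busy = lambda: [datetime(2025, 1, 1, 15, 0)]
    print(calendar_guard("delete_file", now, busy))
    print(calendar_guard("read_file", now, busy))
```

The important design choice is the `except` branch: a guard that allowed the operation whenever the calendar lookup failed would be fail-open, and would stop protecting exactly when something else was already going wrong.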

Not an LLM with a glued-on persona. Four things a normal chatbot does not do.

The reason the client work on this site is built the way it is.

01
Personality lives in the weights, not as a sticker on top of the prompt.
02
Memory accumulates across sessions. There is no blank slate in the morning.
03
Infrastructure is dedicated and private. No token goes through a third party.
04
New adapters load at runtime. The model evolves without a full redeploy.

The Lucid architecture. Structure visible. Content never.

[Architecture diagram: own memory · dedicated inference · inference active, live]

Want this level of care for your practice? Let's talk.

The same engineering discipline that went into Lucid goes into every phone agent build. Nine months of pattern testing, EU-hosted, GDPR-compliant. If that is the standard you want for your practice phone line, send a briefing.

See the phone agent details · Send a briefing
