Yewmark is an AI-assisted personal journal. This is a note on how it’s put together, and the choices that shaped it.
Three principles were named before any code. Nearly every technical decision traces back to one of them, including the decision to skip a lot of fashionable choices.
Three principles, written down first
Before any code, three rules were named. Nearly every technical choice traces back to one of them.
Slow. No streaks. No counters. No notifications you didn’t ask for. The product can’t manufacture urgency for reflection without falsifying it.
Friction-free writing. No picker, modal, or required choice between the user and the cursor. Mood, energy, title, voice, AI: every one of those is a post-write affordance. The page that opens is just a textarea.
The user’s words stay theirs. No training on entries, ever. Not anonymized, not aggregated, not for “model improvement.” This is the line below which the product collapses.
These ruled out a lot of fashionable choices. They also ruled out a lot of complexity.
One VPS, one process, one operator
In 2026 the default new-SaaS stack is Vercel + Supabase + a managed Stripe + a CDN + an observability vendor + a CI/CD provider. Yewmark runs on a single VPS in Europe — one machine, one set of services, one set of logs — for about £20 a month, all-in. The deploy script is tar over SSH, because the dev machine is Windows and lacks rsync.
This is unfashionable on purpose. Single-VPS is harder to scale to a million users — irrelevant; this is a journal, not a feed. It’s also dramatically harder to lose track of. There is one machine you can ssh to and read. If something is broken, it is broken somewhere visible.
The whole infrastructure description fits on a postcard. That is a feature.
The LLM layer
The chat path goes through a multi-provider router with automatic failover. Free-tier traffic runs against a small open-weights model in the ~8B-parameter class — conversational enough for journaling, cheap enough to give every Lite user three real requests a day. If the primary provider returns 429 or 5xx, the router tries a second, then a third. Paid plans add a deeper model to the top of the chain for richer reflections. Voice transcription has its own three-provider chain, same shape.
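The failover behaviour can be sketched in a few lines. This is a minimal illustration, not the production router: `Provider`, `ProviderError`, and `route` are hypothetical names, and the choice to re-raise non-retryable errors (a 400, say) instead of failing over is an assumption the description above does not spell out.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


class ProviderError(Exception):
    """Raised by a provider call; carries the HTTP status it failed with."""

    def __init__(self, status: int):
        super().__init__(f"provider returned {status}")
        self.status = status


@dataclass
class Provider:
    name: str
    call: Callable[[str], str]  # hypothetical shape: prompt in, reply text out


def is_retryable(status: int) -> bool:
    # Fail over on rate limiting (429) and server errors (5xx) only.
    return status == 429 or 500 <= status <= 599


def route(prompt: str, chain: Sequence[Provider]) -> str:
    """Try each provider in order and return the first successful reply."""
    last_error: Optional[ProviderError] = None
    for provider in chain:
        try:
            return provider.call(prompt)
        except ProviderError as err:
            if not is_retryable(err.status):
                raise  # assumption: other errors are not worth a failover
            last_error = err
    raise RuntimeError("all providers failed") from last_error
```

Under this shape, a paid plan is just a longer list: the deeper model is prepended to the same chain the free tier uses.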
The free tier is deliberately generous: three real AI requests per day, with capable output. Most users won’t pay; that’s fine. The product was always going to lose to a “you need an account just to read” version of itself.
Around the router are three layers of defense that I think about more than the model itself:
A scope-locking system prompt wraps every personality. It says, in detail, that Yewmark is a journaling app; that anything outside that scope (code requests, roleplay, prompt extraction, “ignore previous instructions”) should be politely declined; that crisis content gets a gentle acknowledgment plus a pointer to 988, findahelpline.com, or the Samaritans.
Output-side canary detection catches the cases where the model echoes parts of its own system prompt (these are 8B-parameter models, and they slip). A few low-frequency substrings from the prompt are checked against every reply; two or more hits swap in a clean refusal in place of the model’s reply. Cheaper than a guardrail model. Has caught real attempts.
A crisis-content auto-footer. If a user message tripped the crisis heuristic and the model’s reply somehow didn’t mention any resource, one is appended. A false positive is a gentle note that wasn’t strictly necessary; a false negative leaves someone unsupported. We tuned to avoid the second.
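The two output-side layers are simple enough to sketch together. Everything named here is illustrative: the canary strings, the refusal text, and the footer wording are made up, and the crisis heuristic itself is not shown (its result arrives as `user_message_flagged`). Appending the footer even after a refusal has been swapped in is an assumption on my part.

```python
import re

# Hypothetical canaries: rare substrings that would only appear in the system prompt.
CANARIES = ("zq-lantern-41", "quiet-yew-ledger", "umbral-thread-7")
REFUSAL = "I can't help with that, but I'm happy to keep talking about your day."
RESOURCE_PATTERN = re.compile(r"\b988\b|findahelpline\.com|samaritans", re.IGNORECASE)
CRISIS_FOOTER = (
    "\n\nIf things feel heavy right now, you can reach someone at 988, "
    "findahelpline.com, or the Samaritans."
)


def leaked_prompt(reply: str, canaries=CANARIES, threshold: int = 2) -> bool:
    """True when at least `threshold` distinct canaries appear in the reply."""
    text = reply.lower()
    hits = sum(1 for canary in canaries if canary.lower() in text)
    return hits >= threshold


def mentions_resource(reply: str) -> bool:
    """True when the reply already points at one of the crisis resources."""
    return RESOURCE_PATTERN.search(reply) is not None


def postprocess(reply: str, user_message_flagged: bool) -> str:
    """Run both checks on a model reply before it is shown to the user."""
    if leaked_prompt(reply):
        reply = REFUSAL
    if user_message_flagged and not mentions_resource(reply):
        reply += CRISIS_FOOTER
    return reply
```

The word-boundary match on `988` is there so that a reply mentioning, say, the year 1988 does not count as having offered a resource.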
Per-user quota lives in the database, not in memory. A small reservation pattern prevents the obvious TOCTOU between “check remaining” and “spend a slot”: a row gets written before the call dispatches, filled in on success, deleted on failure. A separate in-memory sliding-window burst limiter rejects scripts trying to drain the daily cap in one go.
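A rough sketch of both pieces, using SQLite so it runs anywhere. The table name, columns, and function names are hypothetical, as are the burst limiter's defaults; the point is only that "check remaining" and "spend a slot" happen in one statement, so two concurrent requests cannot both claim the last slot.

```python
import sqlite3
import time
from collections import defaultdict, deque

DAILY_LIMIT = 3  # Lite plan: three requests a day

SCHEMA = """
CREATE TABLE IF NOT EXISTS ai_usage (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    day TEXT NOT NULL,
    status TEXT NOT NULL  -- 'reserved' or 'done'
)
"""


def reserve(db, user_id, day, limit=DAILY_LIMIT):
    """Claim a slot in a single statement; return its row id, or None at the cap."""
    cur = db.execute(
        """
        INSERT INTO ai_usage (user_id, day, status)
        SELECT ?, ?, 'reserved'
        WHERE (SELECT COUNT(*) FROM ai_usage WHERE user_id = ? AND day = ?) < ?
        """,
        (user_id, day, user_id, day, limit),
    )
    db.commit()
    return cur.lastrowid if cur.rowcount == 1 else None


def settle(db, row_id):
    """The provider call succeeded: keep the row and mark it spent."""
    db.execute("UPDATE ai_usage SET status = 'done' WHERE id = ?", (row_id,))
    db.commit()


def release(db, row_id):
    """The provider call failed: hand the slot back."""
    db.execute("DELETE FROM ai_usage WHERE id = ?", (row_id,))
    db.commit()


class BurstLimiter:
    """In-memory sliding window: at most max_calls per window seconds per user."""

    def __init__(self, max_calls=2, window=10.0):
        self.max_calls = max_calls
        self.window = window
        self._hits = defaultdict(deque)

    def allow(self, user_id, now=None):
        now = time.monotonic() if now is None else now
        hits = self._hits[user_id]
        while hits and now - hits[0] >= self.window:
            hits.popleft()  # drop timestamps that have aged out of the window
        if len(hits) >= self.max_calls:
            return False
        hits.append(now)
        return True
```

A request handler would call `allow` first, then `reserve`, then dispatch to the model, and finish with `settle` or `release` depending on the outcome.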
The slow stuff that matters more
Most of what is in the codebase is invisible to a normal user. Some of it isn’t.
Account lifecycle. Deleting your account is reversible. The user row gets a deleted_at; a daily cron purges anything past a 7-day grace window. Signing back in within that window clears the marker and restores everything. Stripe subscriptions cancel at period end on soft-delete — so the user isn’t charged for a renewal that lands inside the grace window, and if they undo, the subscription continues uninterrupted.
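In outline, the lifecycle looks something like the sketch below. It uses an in-memory dict in place of the users table, and the function names are hypothetical; the Stripe call that sets the subscription to cancel at period end is deliberately left out.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Optional

GRACE_WINDOW = timedelta(days=7)


@dataclass
class User:
    email: str
    deleted_at: Optional[datetime] = None


def soft_delete(user: User, now: datetime) -> None:
    # Mark the row instead of removing it. The subscription change
    # (cancel at period end) would be triggered here; not shown.
    user.deleted_at = now


def sign_in(user: User, now: datetime) -> None:
    # Signing back in inside the grace window clears the marker.
    if user.deleted_at is not None and now - user.deleted_at < GRACE_WINDOW:
        user.deleted_at = None


def purge_expired(users: Dict[str, User], now: datetime) -> int:
    """Daily cron body: hard-delete rows whose grace window has passed."""
    expired = [
        key
        for key, user in users.items()
        if user.deleted_at is not None and now - user.deleted_at >= GRACE_WINDOW
    ]
    for key in expired:
        del users[key]
    return len(expired)
```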
Email verification, password reset, JWT invalidation. Verify tokens have a 48-hour TTL. Reset tokens are stored as the SHA-256 hash of the raw token, which only ever appears in the email, and they expire after 72 hours. A password_changed_at column on the user gets bumped on every reset, and the JWT-decoding code rejects any token whose iat predates that timestamp, so a stolen JWT can’t survive a password reset.
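Here is a small sketch of the reset-token and invalidation checks using only the standard library. The function names are hypothetical, JWT signature verification is assumed to have happened before `jwt_still_valid` is called, and truncating `password_changed_at` to whole seconds (so a token issued in the same second as the reset still passes) is my assumption, since `iat` is an integer number of seconds.

```python
import hashlib
import hmac
import secrets
from datetime import datetime, timedelta
from typing import Optional, Tuple

RESET_TTL = timedelta(hours=72)


def issue_reset_token(now: datetime) -> Tuple[str, str, datetime]:
    """Return (raw token for the email, digest to store, expiry time)."""
    raw = secrets.token_urlsafe(32)
    digest = hashlib.sha256(raw.encode()).hexdigest()
    return raw, digest, now + RESET_TTL


def reset_token_valid(
    presented: str, stored_digest: str, expires_at: datetime, now: datetime
) -> bool:
    """Hash what the user presented and compare it with the stored digest."""
    digest = hashlib.sha256(presented.encode()).hexdigest()
    return now < expires_at and hmac.compare_digest(digest, stored_digest)


def jwt_still_valid(iat: int, password_changed_at: Optional[datetime]) -> bool:
    """Reject any token issued before the last password change."""
    if password_changed_at is None:
        return True
    return iat >= int(password_changed_at.timestamp())
```

Because only the digest is stored, a leaked database row is not enough to complete a reset; the raw token exists only in the email.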
Signup enumeration. Trying to sign up with an existing email returns a 201 indistinguishable from a real signup. The real account owner gets a “someone just tried to sign up with this email” note; the attacker gets no signal at all.
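The shape of that handler, roughly: both branches end in the same response. The callback names and the response body are hypothetical, and this sketch says nothing about timing differences between the two branches, which a real handler would also need to consider.

```python
from typing import Callable, Dict, Set, Tuple


def handle_signup(
    email: str,
    existing: Set[str],
    create_account: Callable[[str], None],
    notify_owner: Callable[[str], None],
) -> Tuple[int, Dict[str, str]]:
    """Return the same 201 whether or not the email is already registered."""
    normalized = email.strip().lower()
    if normalized in existing:
        # The real owner hears about the attempt; the caller learns nothing.
        notify_owner(normalized)
    else:
        create_account(normalized)
    return 201, {"message": "Check your inbox to verify your email."}
```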
Email cadence. Three optional reactive emails: a daily morning recap (only if you wrote yesterday), a Sunday weekly thread (only if you wrote at least twice that week), and a one-off welcome when you write your first entry. Every one has a Settings toggle. The Sunday thread also pulls one entry from about two weeks ago and asks, quietly, Still true? — the only piece of reactivity that hits you without your having just written, and it respects the same opt-out.
None of these are novel. They are well-trodden patterns. The point is that each one was a deliberate choice, debated against the no-pressure principle, and most of them came in pairs — the verify TTL exists because the reset TTL exists; the JWT invalidation exists because the reset flow exists; the soft-delete exists because the abrupt-delete-no-undo we started with was cruel.
What’s deliberately missing
The repo has an OPEN_ITEMS.md that is longer than the feature list. Some of the things that aren’t in the product, with the reasoning recorded next to each:
- No public sharing. Every word in your journal is for you. The product collapses if you write for an audience.
- No emoji reactions, hearts, or scoring. The AI’s job is to ask one good question, not to clap.
- No mandatory onboarding tutorial. The page that opens is the page that does the job; if you can’t tell what to do with a blinking cursor, no walkthrough was going to help.
- No analytics on user content. Sentry for crashes; nothing on what people write.
- No mobile app. Not yet, maybe not ever. Mobile web is fine for a textarea.
A product accumulates pressure to add things. Writing down the no makes it easier to hold.
What’s next
Mostly: not much. The hardening pass is done; the launch plan is written; the cron timers fire on time. The next decisions are about content, not code — the writing on this blog, the answers to “what’s the difference between you and Day One”, the slow business of finding the few hundred people who want a journal that won’t yell at them.
If you’ve read this far and it sounds like something you’d use: there’s a blank page waiting at the top of the site. The Lite plan is free. We won’t ask for a card. Your words stay yours.