If you've been using an AI for collaborative fiction, you've run into the wall. The model forgets what happened three scenes ago. It breaks character. It writes "His eyes widened in shock." for the fourth time in a row. The context fills up and suddenly your character doesn't know her own backstory.
Lagoon was built to fix all of that. It's not a chat app with a system prompt — it's a writing environment where the system is actively working to keep your story coherent, your character in voice, and your prose quality high. Every turn, automatically.
Bring your own models. Venice.ai · Ollama (local) · Any OpenAI-compatible endpoint. Paste your API keys and run. No subscriptions, no cloud accounts, no data leaving your machine.
01 Your character, persistent across hundreds of turns
In Lagoon, a character is more than a system prompt. It's a complete profile — voice, personality, backstory, world context, tone rules — that travels with every conversation. Your character doesn't reset between sessions. The same voice, the same rules, every time you open a chat.
System Prompt
Who they are, how they speak, what they never do. Permanent — it never gets pruned from context no matter how long the story gets.
World Context
The setting, the rules of your world, background the model needs to know. Also permanent in context. Separate from the character voice.
Opening Line
Set an intro statement — the character's first words before you've said anything. Establishes voice and scene from turn zero.
Avatar
Generate a portrait from your character description with one click. Stored locally. Persists across every chat with that character.
One config, many stories. Character configs are reusable. Open a new chat with the same character and everything carries over — voice, rules, lore, tone. The story history is separate; the character is not.
02 Memory — the story remembers itself
After every exchange, Lagoon chunks the conversation and builds a searchable memory of what happened. Before each new turn, it finds the chunks most relevant to what you just said and quietly injects them into context. The story recalls its own past — without you having to remind it.
Without memory
You: "What did she say about her brother?"
AI: "I'm not sure — could you remind me what was discussed earlier in our conversation?"
With Lagoon RAG memory
You: "What did she say about her brother?"
AI: (retrieves the exchange from turn 23 automatically) "She mentioned he left for the coast before the war. Never came back."
How it works — without the jargon
Every few turns, Lagoon takes your conversation and breaks it into chunks. Each chunk gets a fingerprint. When you write something new, Lagoon finds the chunks whose fingerprints most closely match — and quietly drops them into context, right before your turn. You never see it. The model just knows.
100%
Local — runs on your machine, no API calls for memory
~8–10
Turns before memory starts contributing meaningfully
∞
No session limit — memory grows with the story
Configurable. Adjust how many past chunks to retrieve, how closely they need to match, how many tokens they're allowed to use — all live in Memory Settings, no restart needed. Or turn it off entirely.
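The retrieve-and-inject step can be sketched in a few lines. Bag-of-words cosine similarity stands in here for the unspecified fingerprinting; the function names, `top_k`, and `min_score` threshold are illustrative, not Lagoon's internals:

```python
import re
from collections import Counter
from math import sqrt

def fingerprint(text: str) -> Counter:
    # Stand-in "fingerprint": a bag of words. A local embedding model
    # would slot into the same flow with better semantic matching.
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(chunks, query, top_k=2, min_score=0.1):
    # Score every stored chunk against the new message, keep the closest
    # matches, and hand them back for silent injection into context.
    q = fingerprint(query)
    scored = [(cosine(fingerprint(c), q), c) for c in chunks]
    scored = [(s, c) for s, c in scored if s >= min_score]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:top_k]]
```

A question about "her brother" pulls back the turn-23 chunk because they share vocabulary, while an unrelated chunk scores below the cutoff and costs nothing.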
03 Anchors — world knowledge that doesn't waste tokens
You've got world-building facts, character histories, faction details, location descriptions. If you put them all in the system prompt, they eat your context budget on every single turn — even when they're irrelevant. Anchors only fire when you need them.
Write a lore entry. Assign it keywords. When those keywords appear in the recent conversation, the entry injects silently as context for that turn. When they don't appear, it costs nothing.
Without Anchors
Option A: cram everything into the system prompt, blowing 2,000 tokens on backstory the model doesn't need this turn.
Option B: leave it out, and watch the model invent history on the fly.
With Anchors
"The Kael family history" entry has keywords: Kael, family, parents, Mira's father.
It only appears in context when someone mentions the Kaels. Otherwise: 0 tokens used.
- ⚡ Priority order. Multiple entries can fire at once. Higher-priority entries inject first. Put "who she is at her core" above "that obscure noble house she mentioned once."
- 💰 Token budget. Set a cap on how many tokens Anchors can use per turn. If you hit the limit, lower-priority entries get dropped. Your context budget stays predictable.
- 🔍 Scan depth. Controls how far back in the conversation Lagoon looks for keywords. Default: last 15 messages. Adjust if your story has long gaps between references.
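The keyword trigger, priority ordering, and token budget described above can be sketched as follows. The entry schema (`text`, `keywords`, `priority`) and the crude word-count token estimate are assumptions for illustration, not Lagoon's actual data model:

```python
def fire_anchors(entries, recent_messages, scan_depth=15, token_budget=500):
    # Scan the last `scan_depth` messages for each entry's keywords.
    window = " ".join(recent_messages[-scan_depth:]).lower()
    matched = [e for e in entries if any(k.lower() in window for k in e["keywords"])]
    # Higher-priority entries inject first.
    matched.sort(key=lambda e: e["priority"], reverse=True)
    injected, used = [], 0
    for e in matched:
        cost = len(e["text"].split())  # crude stand-in for a real token count
        if used + cost > token_budget:
            break  # lower-priority entries are dropped once over budget
        injected.append(e["text"])
        used += cost
    return injected
```

When nothing in the window mentions a keyword, the function returns an empty list: zero tokens spent on lore the turn doesn't need.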
04 Character Awareness — automatic reveal tracking Unique
This is the feature that doesn't exist anywhere else. Mark a lore entry as "character not yet aware." From that point on, the model is told: this is true, but your character doesn't know it yet. When the reveal finally happens in the story — when the model writes it — Lagoon detects it automatically and marks the entry as known. No manual tracking. No forgetting to update it. It just works.
You write: "Her brother is still alive — she doesn't know yet"
→
Mark as: "character not yet aware"
Lagoon injects: "This is true, but Elena doesn't know it. If she learns this turn, signal it."
→
Model writes the reveal. Lagoon catches the signal.
Entry flips to: "aware"
→
Toast notification: "Elena now knows: her brother is alive."
Why this matters: In a long story with multiple reveals, tracking "does she know this yet?" manually is exhausting and error-prone. Lagoon handles it automatically. The model knows what the character knows. The moment that changes, so does the system.
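One plausible way to implement this loop, assuming a hypothetical `[REVEAL:id]` signal token (the actual prompt wording and detection mechanism are not documented here):

```python
import re

def awareness_note(entry):
    # Hedged sketch: both the wording and the signal token are assumptions.
    return (f"Hidden truth: {entry['text']} The character does not know this yet. "
            f"If the character learns it this turn, append [REVEAL:{entry['id']}].")

def detect_reveals(response, entries):
    # Catch any signal tokens the model emitted and flip those entries.
    revealed = set(re.findall(r"\[REVEAL:(\w+)\]", response))
    for entry in entries:
        if entry["id"] in revealed:
            entry["aware"] = True  # entry flips; a toast would fire here
    # Strip the signal tokens before the prose reaches the reader.
    return re.sub(r"\s*\[REVEAL:\w+\]", "", response)
```

The model never has to be asked "does she know yet?": the injected note changes automatically the moment the flag flips.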
05 Style Overseer — fixing AI's bad prose habits Unique
AI models have consistent, repeatable bad habits. They double-space after sentences. They write single dramatic lines as standalone paragraphs. They slip out of third person. They open responses by restating what you just said. They write unattributed dialogue that floats with no action anchor. Over a long session, these patterns compound and the prose degrades.
The Style Overseer reviews every response after it streams, flags each violation, and lets you fix it in place.
What you get without intervention
She looked at him.

Her eyes widened in shock.

She couldn't believe what she was hearing.
Three sentences. Three paragraphs. The middle one is a cliché standing alone for dramatic effect.
After the Overseer flags and you accept
She looked at him. She couldn't believe what she was hearing.
Isolated dramatic sentence merged into the paragraph. Cliché gone. One click.
Six built-in rules (on by default, optional)
- ✗ Isolated dramatic sentences. "She was already gone." standing alone as its own paragraph. The Overseer flags it. You decide if it stays.
- ✗ Double spacing. Two spaces after a period. Small thing, constant thing. Caught automatically.
- ✗ POV slippage. A third-person story suddenly reads "I felt a chill run down my spine." The model broke perspective. Flagged.
- ✗ Correction acknowledgment. "You're right, I apologize for the confusion earlier—" The model stepped outside the story to respond to your direction. Stripped.
- ✗ Verbatim echo. The AI opens by paraphrasing what you just said. "You reached for the door — and as your hand closed around the handle…"
- ✗ Unattributed dialogue. A line of dialogue with no action, tag, or anchor to show who's speaking or moving in the scene.
And your own rules, in plain English
Add rules like: "Do not use the phrase 'a mix of'" or "Never open a paragraph with a gerund" or "This character does not use contractions." Written in plain English. Applied every turn.
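A naive sketch of two of the built-in checks plus a banned-phrase custom rule; the real Overseer's detection is presumably more robust than these regexes:

```python
import re

def check_style(text: str, banned_phrases=()) -> list[str]:
    """Flag style violations. Illustrative only: regex stand-ins for
    a few of the checks described above."""
    flags = []
    # Double spacing after a sentence-ending punctuation mark.
    if re.search(r"[.!?]  +\S", text):
        flags.append("double spacing")
    # Isolated dramatic sentence: a paragraph that is one short sentence.
    for para in text.split("\n\n"):
        sentences = re.findall(r"[^.!?]+[.!?]", para)
        if len(sentences) == 1 and len(para.split()) <= 6:
            flags.append(f"isolated dramatic sentence: {para.strip()!r}")
    # Custom plain-English rules, reduced here to banned phrases.
    for phrase in banned_phrases:
        if phrase.lower() in text.lower():
            flags.append(f"banned phrase: {phrase!r}")
    return flags
```

Each flag carries an excerpt, which is what makes the one-click accept-and-replace step possible.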
How accepting a correction works
Overseer flags a violation with an excerpt
→
You hit Accept
Offending text replaced in-place in the message bubble
→
Rule added to Author's Note: "DO NOT: [excerpt]"
Author's Note injects near generation point next turn
→
Model steered away from the pattern, turn after turn. Corrections compound.
Auto-accept mode: Enable it and the Overseer runs silently on every response — no badge, no prompt. Every violation is fixed automatically. Corrections still accumulate in the Author's Note. The prose quality just steadily improves on its own.
Model presets: GLM, DeepSeek, Llama, and Qwen each have known bad habits. Load the preset for your model and those model-specific rules are added automatically.
06 Author's Note — your persistent nudge
Some instructions need to be close to where the model generates, not buried at the top of a long context. The Author's Note re-injects at a configurable depth from the bottom on every single turn — right before the model writes. Perfect for tone reminders, current scene context, or things you don't want the model to forget mid-session.
Without it
You put tone guidance in the system prompt 200 messages ago. The model has long since drifted. You're now getting warm, gentle responses from a character who should be cold and controlled.
Author's Note
[Tone] She is controlled, not warm. Every kindness she shows costs her something. Do not let her soften.
Re-injected 4 messages from the bottom. Every turn. She doesn't soften.
Depth is configurable. Default is 4 messages from the bottom — close enough to matter, not so close it crowds the model. For fast-paced scenes, move it to 2. For more stable characters, push it out to 8.
Session-only override: The sidebar Author's Note panel is session-only. Changes there don't touch the saved character config. Close the chat, they're gone. Use it for temporary scene-specific notes — the config version handles the permanent voice rules.
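The re-injection mechanic is simple to sketch. Message dicts follow the OpenAI chat format here, which is an assumption about Lagoon's internals; `depth=4` is the documented default:

```python
def inject_authors_note(messages, note, depth=4):
    # Insert the note `depth` messages from the bottom, close to where
    # the model generates, without mutating the stored history.
    msgs = list(messages)
    pos = max(0, len(msgs) - depth)
    msgs.insert(pos, {"role": "system", "content": f"[Author's Note] {note}"})
    return msgs
```

Because the note is rebuilt every turn, it never drifts toward the top of context the way a one-time instruction does.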
07 Long stories don't die
Every AI has a context limit — a cap on how much text it can hold in its attention at once. When you hit it, something has to go. Lagoon handles this gracefully, with your approval, so you never lose the thread of a long story.
What happens when the context fills up
Context hits 75% of the model's limit (configurable)
→
Lagoon generates a summary of the conversation so far
Summary is flagged as pending. A banner appears.
→
Nothing is deleted until you approve
You approve. Older messages are pruned.
→
Summary injects as context. Story continues.
Summaries stack. Each new one is told what's already been summarized so it doesn't repeat. The character's permanent system prompt and world context are never touched — only the conversation history gets pruned.
Manual trigger available. Don't wait for the threshold. Prompt Monitor tab → Summarize Now. Use it to create a chapter break, clean up a sprawling scene, or just keep the context tidy on your own schedule.
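The threshold check and approval-gated pruning might look like this, assuming OpenAI-style message dicts and treating every `system` message as permanent (a simplification of Lagoon's actual rules):

```python
def needs_summary(used_tokens, context_limit, threshold=0.75):
    # 75% is the documented default trigger; the threshold is configurable.
    return used_tokens >= context_limit * threshold

def prune_after_approval(messages, summary, keep_recent=10):
    # Permanent blocks (system prompt, world context) are never pruned;
    # older history is replaced by the approved summary.
    permanent = [m for m in messages if m["role"] == "system"]
    history = [m for m in messages if m["role"] != "system"]
    return (permanent
            + [{"role": "system", "content": f"[Summary] {summary}"}]
            + history[-keep_recent:])
```

Nothing runs until the user approves, so a bad auto-summary can never silently eat the story.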
08 Dual Model — characters talking to each other
Set up two characters — each with their own model, system prompt, and temperature — and let them run an automated conversation. You write the opening line. They do the rest until you stop them.
Each character brings its full setup. Model A responds; Model B receives that response as its prompt and replies; they alternate. You control the pace: pause at any moment, resume, add a turn limit, or stop and redirect. Manual intervention — editing or regenerating any message — auto-pauses the exchange so you don't lose your place.
Where it shines: generating a scene between two characters where you want both voices to be reactive to each other, not scripted. Two models, each playing their role, each responding to what the other actually said. The dialogue stays organic.
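The alternation loop itself is easy to sketch. `model_a` and `model_b` are hypothetical callables standing in for two fully configured model sessions, each wrapping its own model, system prompt, and temperature:

```python
def dual_model_run(model_a, model_b, opening_line, max_turns=4):
    # You write the opening line; the two sides then alternate,
    # each receiving the other's last line as its prompt.
    transcript = [("user", opening_line)]
    models = [("A", model_a), ("B", model_b)]
    last = opening_line
    for turn in range(max_turns):
        name, model = models[turn % 2]
        last = model(last)  # the previous line becomes the next prompt
        transcript.append((name, last))
    return transcript
```

The `max_turns` cap plays the role of the turn limit; a pause button would simply break out of the loop and hold `last` for resumption.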
09 Model freedom
Your character config stores the model. Switch mid-story by editing the config — the conversation history carries over unchanged. Venice today, a local Ollama model when you're offline. One writing environment, all your models.
Venice.ai
Ollama (local)
KoboldCpp
llama.cpp
LM Studio
Any OpenAI-compatible endpoint
BYOK — paste your API keys in settings. No accounts, no Lagoon subscription. Lagoon is a local application, not a service.
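A hedged sketch of what BYOK endpoint configs might look like. The schema is an assumption, and the base URLs follow each provider's commonly documented OpenAI-compatible paths (verify against your own setup):

```python
# Illustrative endpoint table — not Lagoon's actual config format.
ENDPOINTS = {
    "venice": {"base_url": "https://api.venice.ai/api/v1", "api_key": "YOUR_VENICE_KEY"},
    "ollama": {"base_url": "http://localhost:11434/v1", "api_key": "ollama"},
    "koboldcpp": {"base_url": "http://localhost:5001/v1", "api_key": "not-needed"},
}
```

Any client that speaks the OpenAI chat format can point its `base_url` at whichever entry is active, which is why switching models mid-story leaves the conversation history untouched.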
Venice E2EE: For Venice's TEE models, Lagoon sets up a full end-to-end encrypted session — your messages are encrypted before they leave your machine and decrypted only inside the secure enclave. Venice cannot see the content.
10 Getting started
Requirements: Python 3.10+. No Node.js, no Docker, no Electron. Runs entirely on your machine.
Windows
1. Download LagoonSetup.exe from the releases page and run it.
2. A console window will open and install Python dependencies. This takes a few minutes — don't close it.
3. Click the Start Menu shortcut — Lagoon starts in the background and opens in your browser automatically.
4. Enter your Venice API key in Settings and create your first character.
Mac / Linux
1. Clone the repo.
2. Run python setup.py — creates a virtual environment, installs dependencies, walks you through API key setup.
3. Run: python app.py
4. Open https://localhost:5007 — create your first character and start writing.
First character setup: Start with a system prompt that describes who they are in their own voice, not what they should do. Add a world context block for setting details. Drop an Author's Note with your current tone requirements. You're writing within two messages.
LAGOON v1.3 · LOCAL-FIRST · BYOK · NO CLOUD
Built by a solo contractor who writes fiction and got tired of the context wall.
Every feature here exists because it was needed, not because it was impressive.