Lagoon v1.3 · For AI Authors

// Your Story Deserves Better Than a Chat Window

Lagoon is a local AI writing workspace built specifically for long-form collaborative fiction. Character-driven, memory-equipped, prose-quality obsessed.

If you've been using an AI for collaborative fiction, you've run into the wall. The model forgets what happened three scenes ago. It breaks character. It writes "His eyes widened in shock." for the fourth time in a row. The context fills up and suddenly your character doesn't know her own backstory.

Lagoon was built to fix all of that. It's not a chat app with a system prompt — it's a writing environment where the system is actively working to keep your story coherent, your character in voice, and your prose quality high. Every turn, automatically.

Bring your own models. Venice.ai · Ollama (local) · Any OpenAI-compatible endpoint. Paste your API keys and run. No subscriptions, no cloud accounts, no data leaving your machine.

What's in here
01 Your character, persistent across hundreds of turns
02 Memory — the story remembers itself
03 Anchors — world knowledge that doesn't waste tokens
04 Character Awareness — automatic reveal tracking
05 Style Overseer — fixing AI's bad prose habits
06 Author's Note — your persistent nudge
07 Long stories don't die
08 Writing tools — fork, nudge, regenerate
09 Dual Model — characters talking to each other
10 Model freedom
11 Getting started

01 Your character, persistent across hundreds of turns

In Lagoon, a character is more than a system prompt. It's a complete profile — voice, personality, backstory, world context, tone rules — that travels with every conversation. Your character doesn't reset between sessions. The same voice, the same rules, every time you open a chat.

System Prompt
Who they are, how they speak, what they never do. Permanent — it never gets pruned from context no matter how long the story gets.
World Context
The setting, the rules of your world, background the model needs to know. Also permanent in context. Separate from the character voice.
Opening Line
Set an intro statement — the character's first words before you've said anything. Establishes voice and scene from turn zero.
Avatar
Generate a portrait from your character description with one click. Stored locally. Persists across every chat with that character.
One config, many stories. Character configs are reusable. Open a new chat with the same character and everything carries over — voice, rules, lore, tone. The story history is separate; the character is not.

02 Memory — the story remembers itself

After every exchange, Lagoon chunks the conversation and builds a searchable memory of what happened. Before each new turn, it finds the chunks most relevant to what you just said and quietly injects them into context. The story recalls its own past — without you having to remind it.

Without memory
You: "What did she say about her brother?"
AI: "I'm not sure — could you remind me what was discussed earlier in our conversation?"

With Lagoon RAG memory
You: "What did she say about her brother?"
AI: (retrieves the exchange from turn 23 automatically) "She mentioned he left for the coast before the war. Never came back."

How it works — without the jargon

Every few turns, Lagoon takes your conversation and breaks it into chunks. Each chunk gets a fingerprint. When you write something new, Lagoon finds the chunks whose fingerprints most closely match — and quietly drops them into context, right before your turn. You never see it. The model just knows.
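The retrieval idea can be sketched in a few lines. This is illustrative only — not Lagoon's actual implementation — and it uses a toy bag-of-words similarity where Lagoon's real fingerprints would be embeddings:

```python
import math
import re
from collections import Counter

def fingerprint(text: str) -> Counter:
    # Toy stand-in for an embedding: a bag-of-words vector.
    return Counter(re.findall(r"[a-z]+", text.lower()))

def similarity(a: Counter, b: Counter) -> float:
    # Cosine similarity between two fingerprints.
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(chunks: list[str], query: str, top_k: int = 2) -> list[str]:
    # Rank stored chunks against the new user turn; the winners get
    # injected into context before generation.
    q = fingerprint(query)
    return sorted(chunks, key=lambda c: similarity(fingerprint(c), q),
                  reverse=True)[:top_k]

chunks = [
    "Turn 23: She mentioned her brother left for the coast before the war.",
    "Turn 30: The storm kept them in the lighthouse for three days.",
]
print(retrieve(chunks, "What did she say about her brother?", top_k=1))
```

Ask about the brother and the turn-23 chunk wins; ask about the storm and a different chunk surfaces. The same ranking step is all that's happening under the hood, just with better fingerprints.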

100% local — memory runs on your machine, no API calls
~8–10 turns before memory starts contributing meaningfully
No session limit — memory grows with the story
Configurable. Adjust how many past chunks to retrieve, how closely they need to match, how many tokens they're allowed to use — all live in Memory Settings, no restart needed. Or turn it off entirely.

03 Anchors — world knowledge that doesn't waste tokens

You've got world-building facts, character histories, faction details, location descriptions. If you put them all in the system prompt, they eat your context budget on every single turn — even when they're irrelevant. Anchors only fire when you need them.

Write a lore entry. Assign it keywords. When those keywords appear in the recent conversation, the entry injects silently as context for that turn. When they don't appear, it costs nothing.

Without Anchors
Option A: cram everything into the system prompt, blowing 2,000 tokens on backstory the model doesn't need this turn.
Option B: leave it out, and watch the model invent history on the fly.

With Anchors
"The Kael family history" entry has keywords: Kael, family, parents, Mira's father.
It only appears in context when someone mentions the Kaels. Otherwise: 0 tokens used.

04 Character Awareness — automatic reveal tracking (unique to Lagoon)

This is the feature that doesn't exist anywhere else. Mark a lore entry as "character not yet aware." From that point on, the model is told: this is true, but your character doesn't know it yet. When the reveal finally happens in the story — when the model writes it — Lagoon detects it automatically and marks the entry as known. No manual tracking. No forgetting to update it. It just works.

You write: "Her brother is still alive — she doesn't know yet"
Mark as: "character not yet aware"
Lagoon injects: "This is true, but Elena doesn't know it. If she learns this turn, signal it."
Model writes the reveal. Lagoon catches the signal.
Entry flips to: "aware"
Toast notification: "Elena now knows: her brother is alive."
Why this matters: In a long story with multiple reveals, tracking "does she know this yet?" manually is exhausting and error-prone. Lagoon handles it automatically. The model knows what the character knows. The moment that changes, so does the system.

05 Style Overseer — fixing AI's bad prose habits (unique to Lagoon)

AI models have consistent, repeatable bad habits. They double-space after sentences. They write single dramatic lines as their own paragraph. They slip out of third-person. They open responses by restating what you just said. They write unattributed dialogue that floats with no action anchor. After a long session, these patterns compound and the prose degrades.

The Style Overseer reviews every response after it streams, flags each violation, and lets you fix it in place.

What you get without intervention

She looked at him.

Her eyes widened in shock.

She couldn't believe what she was hearing.

Three sentences. Three paragraphs. The middle one is a cliché standing alone for dramatic effect.

After the Overseer flags and you accept

She looked at him. She couldn't believe what she was hearing.

Isolated dramatic sentence merged into the paragraph. Cliché gone. One click.

Six built-in rules (on by default, each can be switched off)

And your own rules, in plain English

Add rules like: "Do not use the phrase 'a mix of'" or "Never open a paragraph with a gerund" or "This character does not use contractions." Written in plain English. Applied every turn.

How accepting a correction works

Overseer flags a violation with an excerpt
You hit Accept
Offending text replaced in-place in the message bubble
Rule added to Author's Note: "DO NOT: [excerpt]"
Author's Note injects near generation point next turn
Model steered away from the pattern. Corrections compound.
Auto-accept mode: Enable it and the Overseer runs silently on every response — no badge, no prompt. Every violation is fixed automatically. Corrections still accumulate in the Author's Note. The prose quality just steadily improves on its own.
Model presets: GLM, DeepSeek, Llama, and Qwen each have known bad habits. Load the preset for your model and those model-specific rules are added automatically.
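A stripped-down sketch of the flag-and-accept loop, assuming a simple phrase-ban rule (Lagoon's real rule format and matching are richer than this):

```python
import re

banned_phrases = ["a mix of", "eyes widened in shock"]  # plain-English bans
author_note: list[str] = []

def flag_violations(text: str) -> list[str]:
    # Return the sentences that contain a banned phrase.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    return [s for s in sentences if any(p in s.lower() for p in banned_phrases)]

def accept(text: str, excerpt: str) -> str:
    # Accepting a correction: cut the excerpt in place and log the
    # pattern so the Author's Note steers the next turn away from it.
    author_note.append(f"DO NOT: {excerpt}")
    return " ".join(text.replace(excerpt, "").split())

msg = ("She looked at him. Her eyes widened in shock. "
       "She couldn't believe what she was hearing.")
for excerpt in flag_violations(msg):
    msg = accept(msg, excerpt)

print(msg)          # She looked at him. She couldn't believe what she was hearing.
print(author_note)  # ['DO NOT: Her eyes widened in shock.']
```

Note where the value accumulates: the cleaned message is the immediate win, but the growing `author_note` list is what bends future turns away from the habit.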

06 Author's Note — your persistent nudge

Some instructions need to be close to where the model generates, not buried at the top of a long context. The Author's Note re-injects at a configurable depth from the bottom on every single turn — right before the model writes. Perfect for tone reminders, current scene context, or things you don't want the model to forget mid-session.

Without it
You put tone guidance in the system prompt 200 messages ago. The model has long since drifted. You're now getting warm, gentle responses from a character who should be cold and controlled.

With the Author's Note
[Tone] She is controlled, not warm. Every kindness she shows costs her something. Do not let her soften.

Re-injected 4 messages from the bottom. Every turn. She doesn't soften.

Depth is configurable. Default is 4 messages from the bottom — close enough to matter, not so close it crowds the model. For fast-paced scenes, move it to 2. For more stable characters, push it out to 8.
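Mechanically, depth injection is just a list splice. A minimal sketch — the message shape and function name are illustrative, not Lagoon's internals:

```python
def inject_note(messages: list[dict], note: str, depth: int = 4) -> list[dict]:
    # Re-insert the Author's Note `depth` messages from the bottom,
    # close to the point where the model generates.
    pos = max(len(messages) - depth, 0)
    return messages[:pos] + [{"role": "system", "content": note}] + messages[pos:]

history = [{"role": "user", "content": f"turn {i}"} for i in range(10)]
out = inject_note(history, "[Tone] She is controlled, not warm.", depth=4)
print(out[6]["content"])  # the note, four messages from the bottom
```

Lower depth puts the note closer to the next generated token; higher depth trades immediacy for less interference with the flow of the scene.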

Session-only override: The sidebar Author's Note panel is session-only. Changes there don't touch the saved character config. Close the chat, they're gone. Use it for temporary scene-specific notes — the config version handles the permanent voice rules.

07 Long stories don't die

Every AI has a context limit — a cap on how much text it can hold in its attention at once. When you hit it, something has to go. Lagoon handles this gracefully, with your approval, so you never lose the thread of a long story.

What happens when the context fills up

Context hits 75% of the model's limit (configurable)
Lagoon generates a summary of the conversation so far
Summary is flagged as pending. A banner appears.
Nothing is deleted until you approve
You approve. Older messages are pruned.
Summary injects as context. Story continues.

Summaries stack. Each new one is told what's already been summarized so it doesn't repeat. The character's permanent system prompt and world context are never touched — only the conversation history gets pruned.

Manual trigger available. Don't wait for the threshold. Prompt Monitor tab → Summarize Now. Use it to create a chapter break, clean up a sprawling scene, or just keep the context tidy on your own schedule.

08 Writing tools — fork, nudge, regenerate

Every assistant response has a toolbar. These aren't chat features — they're writing features. The difference matters in a long story.

Fork
Branch the story from any response. A new chat is created containing the full history up to that point. The original is untouched. Explore parallel paths without committing.
Nudge & Regenerate
Write an out-of-character direction — "make her angrier", "slow this scene down", "he would never say that, try again." It injects invisibly for the regeneration only. Never appears in the visible story.
Writing Tools Menu
Rewrite with a specific directive — expand, condense, shift tone, change perspective. One click, regenerates immediately using that directive. The menu repositions smartly so it's always visible.
Keep for Export
Mark specific responses. When you're done, export only the kept messages. Build a clean draft from your best exchanges without manually copying and pasting. The mark persists between sessions.
Edit in Place
Click any response to edit it directly. The edited version replaces what was there and saves to the chat history. Fix a sentence without regenerating the whole thing.
Delete Pair
Remove an exchange you don't want in the history — both the user turn and the response. The story rolls back cleanly to the previous exchange.
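Two of these tools — Fork and Delete Pair — are really just history operations, which a short sketch makes concrete (data shapes are illustrative, not Lagoon's storage format):

```python
import copy

def fork(chat: dict, at_index: int) -> dict:
    # Branch: a new chat with the full history up to and including
    # `at_index`; the original chat is untouched.
    return {"title": chat["title"] + " (fork)",
            "history": copy.deepcopy(chat["history"][:at_index + 1])}

def delete_pair(history: list[dict], response_index: int) -> list[dict]:
    # Remove a user turn and its response together, rolling the story
    # back cleanly to the previous exchange.
    return history[:response_index - 1] + history[response_index + 1:]

chat = {"title": "Lighthouse", "history": [
    {"role": "user", "content": "u1"}, {"role": "assistant", "content": "a1"},
    {"role": "user", "content": "u2"}, {"role": "assistant", "content": "a2"},
]}
print(len(fork(chat, at_index=1)["history"]))  # 2 — history up to a1
print(len(delete_pair(chat["history"], 3)))    # 2 — u2/a2 removed
```

The deep copy is the point of Fork: the branch owns its own history, so editing one timeline can never corrupt the other.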

09 Dual Model — characters talking to each other

Set up two characters — each with their own model, system prompt, and temperature — and let them run an automated conversation. You write the opening line. They do the rest until you stop them.

Each character brings their full setup. Model A responds, Model B receives that as a prompt and responds, they alternate. You control the pace: pause at any moment, resume, add a turn limit, or stop and redirect. Manual intervention — editing or regenerating any message — auto-pauses so you don't lose your place.
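The alternation is a small loop. Here's a sketch with stub functions standing in for the two configured models:

```python
def model_a(prompt: str) -> str:  # stand-in for character A's model
    return f"A replies to: {prompt}"

def model_b(prompt: str) -> str:  # stand-in for character B's model
    return f"B replies to: {prompt}"

def dual_run(opening: str, turn_limit: int = 4) -> list[str]:
    # Each reply becomes the other model's next prompt. A real run
    # streams actual model calls and honors pause/stop/intervene.
    transcript, prompt = [], opening
    models = [model_a, model_b]
    for turn in range(turn_limit):
        reply = models[turn % 2](prompt)
        transcript.append(reply)
        prompt = reply
    return transcript

for line in dual_run("The door creaks open.", turn_limit=2):
    print(line)
```

The turn limit is the simplest of the stop conditions; pause, resume, and manual intervention are checks inside the same loop.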

Where it shines: generating a scene between two characters where you want both voices to be reactive to each other, not scripted. Two models, each playing their role, each responding to what the other actually said. The dialogue stays organic.

10 Model freedom

Your character config stores the model. Switch mid-story by editing the config — the conversation history carries over unchanged. Venice today, a local Ollama model when you're offline. One writing environment, all your models.

Venice.ai · Ollama (local) · KoboldCpp · llama.cpp · LM Studio · Any OpenAI-compatible endpoint

BYOK — paste your API keys in settings. No accounts, no Lagoon subscription. Lagoon is a local application, not a service.
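"OpenAI-compatible" means every backend listed here accepts the same request shape; only the base URL, model name, and key change. A minimal sketch of the payload (request construction only — not Lagoon's networking code, and the local URL shown is just Ollama's default):

```python
import json

def build_request(base_url: str, model: str, messages: list[dict],
                  temperature: float = 0.8) -> dict:
    # One payload shape for every OpenAI-compatible backend.
    return {
        "url": f"{base_url.rstrip('/')}/v1/chat/completions",
        "body": json.dumps({"model": model, "messages": messages,
                            "temperature": temperature}),
    }

req = build_request("http://localhost:11434", "llama3",
                    [{"role": "user", "content": "Hello"}])
print(req["url"])  # http://localhost:11434/v1/chat/completions
```

Swap the base URL for Venice's and the same code talks to the cloud; that interchangeability is what makes mid-story model switching a config edit rather than a migration.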

Venice E2EE: For Venice's TEE models, Lagoon sets up a full end-to-end encrypted session — your messages are encrypted before they leave your machine and decrypted only inside the secure enclave. Venice cannot see the content.

11 Getting started

Requirements: Python 3.10+. No Node.js, no Docker, no Electron. Runs entirely on your machine.

Available on Windows, Mac, and Linux.

First character setup: Start with a system prompt that describes who they are in their own voice, not what they should do. Add a world context block for setting details. Drop an Author's Note with your current tone requirements. You're writing within two messages.

LAGOON v1.3 · LOCAL-FIRST · BYOK · NO CLOUD

Built by a solo contractor who writes fiction and got tired of the context wall.
Every feature here exists because it was needed, not because it was impressive.