The 200-Line Lie: Why My Agent Kept Forgetting My Goals

For weeks I had the same quiet frustration. I’d tell my agent to remember a goal, an idea, a preference. It would say it saved it. And the next session it would act like the conversation never happened. I’d repeat myself, it would save again, and the cycle continued. I started to assume this was just what long-running AI was: confident in the moment, amnesiac the next morning.

I was wrong about the cause, and the real one is worth a post because anyone building on these tools will hit some version of it. The memory file was correct on disk and wrong in practice.

What was actually happening

My setup gives the agent a persistent memory: a plain text file it reads at the start of every session and appends to as it learns things about me and my work.

The harness only loads the first 200 lines (or 25KB, whichever it hits first) of that file into context at startup. Everything past that boundary never reaches the model. No error. No warning I noticed. The file on disk was complete and correct. The agent just never saw the bottom of it.

And because new memories were appended to the end, the newest things I saved were exactly the ones falling off the invisible edge. The agent wasn’t ignoring my goals. It literally could not see them. From its side, they had never been written.

That is the failure mode worth naming: not a crash, but a silent gap between what you believe is stored and what the system actually reads.

The fix: two tiers and a guard that yells

Split the store into two tiers. One tier is an always-loaded policy: only the core rules and high-frequency facts, deliberately kept under the load budget with headroom so it never spills past the cap. The other is a lookup store, a complete catalog of everything (mine is up to 257 entries) that is never auto-loaded, so its size doesn’t matter. The agent reaches into it when a topic comes up. “Always present” and “fetch when relevant” are two different jobs, and cramming them into one file is what broke.

Make the silent failure loud. The guard is a short script that counts the always-loaded file’s lines and bytes, compares them against the cap, and exits with an error if it’s over, or if a stored memory exists that nothing in the catalog points to (an orphan you’d only find by luck). It converts an invisible degradation into something that stops me. If a system can silently drop your data, the first thing to build is the thing that refuses to let it.

The lesson worth keeping

Every byte you hand a model competes for a fixed budget, and the boundary is usually enforced quietly. Truncation, summarization, eviction: most of it happens without a stack trace. Two rules I’m keeping.

First, never trust “it’s saved” to mean “it’s loaded.” Those are different claims, and the gap between them is where weeks of frustration hide.

Second, separate state from context. State is everything you’ve persisted; context is the slice the model actually sees this turn. Budget that slice like the scarce resource it is, and give the agent a reliable way to fetch the rest on demand.

I’d been blaming the model for being forgetful. I’d built it a memory it was structurally unable to read past line 200.