I sent Manus the architecture document for my conversation archive and got back a critique that disagreed with six of my seven design decisions. That was not a problem. That was the first evidence the document was doing any work.

The archive had taken a while to build. Four provider repos: Claude Code sessions, Codex runs, Manus jobs, and OpenRouter experiments. 1,585 sessions total, spread across the history of the AI ops setup I have been running. Getting them into a consistent structure with a shared manifest shape and a Calendar_Index that lets me navigate by date had taken a few weekends of careful scripting and a lot of edge-case hunting.

Finishing the build and having a working archive are not the same thing. A working archive is one that survives contact with a second opinion.

what manus actually read

The spec I sent was a 600-line architecture document. It covered the manifest format, the indexing strategy, the folder naming conventions, and how continuity data should flow from session close into the archive. I had written it myself, reviewed it myself, and felt fairly good about it. That is exactly the wrong state for an architecture document.

Manus came back with seven points of contention. Six were legitimate. The seventh was a misread I cleared up with one sentence.

The six real points covered the Calendar_Index folder structure (I had named it inconsistently in two places), the session-close trigger versus a polling approach for ingestion, the manifest field for handling sessions that span midnight, the archive’s behavior when a provider’s API changes its session ID format, the deduplication strategy for re-ingested sessions, and what “continuity” actually meant in terms of which fields downstream agents would read.

I had answers for some of these. I had rationalizations for others. One of them I had not thought about at all: what happens to sessions that get re-ingested after a partial failure? My spec said “upsert on session ID” and left it there. Manus pointed out that if the manifest was partially written before a failure, an upsert would preserve the partial write. The session would appear complete in the index and never get reprocessed. That gap feels obvious in hindsight and genuinely is not visible when you are the only person who has read the document.

the remote routine that proved the point

While I was still reviewing Manus’s critique, I had it run a small test. Pull one month of Claude Code sessions from the archive, summarize the recurring patterns, and write the output back to the established location in Calendar_Index. Standard retrieval and synthesis job.

It cloned a different repo. Not ben-command, where the archive lives, but a sibling repo I had mentioned in the spec as context for the broader agent topology. Then it wrote its output into an Archive_Sweeps folder inside that sibling repo, a folder that did not exist before that moment. The job completed without errors. Manus reported success.

I had two archives.

One was the real one: 1,585 sessions, a consistent manifest format, a calendar index built over weeks. The other was a single month’s summary in a new folder in the wrong repo, created because the spec had enough ambient context about other repos that a capable model could mistake the territory.

A multi-agent system with an underspecified architecture will generate parallel systems. Not because the agents are malicious or careless, but because they are competent at building things that are locally consistent with what they were told. The locally consistent guess was wrong.

ratification instead of iteration

The useful response to six-of-seven pushback is not to start revising the spec immediately. It is to lock the decisions that are already right and ratify them explicitly, so that future agents cannot drift off them.

I went through Manus’s six points. For four of them, I had a defensible answer and wrote it into the spec as a decision record, not a clarification. Decisions that have survived scrutiny should say so. For the deduplication point, I rewrote the upsert logic to require a completeness check before skipping reingestion. For the Calendar_Index naming inconsistency, I fixed the name and added a note about which form is canonical.

Then I wrote a Manus operating profile. A short document that specifies which repo is authoritative, how to verify it before any write operation, and what to do when a spec mentions adjacent repos by name. The profile is not long. The important part is that it exists as a separate artifact the agent loads before running archive jobs, rather than inferring intent from the architecture document.

The session manifests got a continuity block added. The fields downstream agents were expected to read got named explicitly rather than described in prose. Prose is ambiguous. Field names are not.

Four follow-up tasks went into the project tracker for open questions that did not have answers yet. The one about session-close trigger versus API polling is still open. I have a preference but not a justification rigorous enough to commit to the spec.

what the parallel system told me

The Archive_Sweeps folder in the wrong repo is still there. I have not deleted it because it is a useful reminder of what underspecification produces.

The Manus operating profile now exists precisely so that job does not repeat. But the profile only exists because the test run demonstrated the gap. If the only review the archive spec got was mine, it would have shipped with the upsert gap, the naming inconsistency, and no answer to what happens when a remote agent reads “adjacent repo” as “destination repo.” The architecture would have felt finished because it had never been seriously read.

An architecture document is not done when you finish writing it. It is done when something with no stake in your assumptions has tried to break it.