Storage formats

Pick the format that fits the data rather than defaulting to one. CSV is a common starting point when no stronger signal exists, but spreadsheet-first or SQLite-first is equally valid when the workflow calls for it. Migration is always available: when a format starts to hurt, migrate that specific logbook.

CSV

Best when every row has the same flat columns, values are short strings or numbers, and you want to open it in Excel or any text editor. Weakest when a column frequently needs nested sub-fields or when you need cross-logbook joins.

Use for: ideation logbooks, feedback collectors, flat decision logs.
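
As a rough illustration of the flat, fixed-column case, the sketch below appends rows with Python's standard csv module. The column names (id, idea, score) and the append_row helper are hypothetical, and an in-memory buffer stands in for a real file.

```python
import csv
import io

FIELDS = ["id", "idea", "score"]  # hypothetical column set for an ideation logbook


def append_row(stream, row, write_header=False):
    """Append one flat row. Keys outside FIELDS raise ValueError."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    if write_header:
        writer.writeheader()
    writer.writerow(row)


# In-memory stand-in for a file opened with open(path, "a", newline="").
buf = io.StringIO()
append_row(buf, {"id": 1, "idea": "cache warmup", "score": 3}, write_header=True)
append_row(buf, {"id": 2, "idea": "retry, with backoff", "score": 5})

# Reading back: values containing commas are quoted on write and survive intact.
rows = list(csv.DictReader(io.StringIO(buf.getvalue())))
```

Note that every value comes back as a string, which is one reason CSV suits short strings and numbers better than nested data.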

JSON Lines (.jsonl)

One JSON object per line. Best when rows have optional or nested fields, or when column sets vary across rows. Still appendable, still greppable, still human-readable. Weakest when you want to open it in Excel or run column-based CLI ops.

Use for: logbooks where the schema is soft, where one row might have 4 fields and the next has 8.
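
A minimal sketch of the soft-schema case, assuming one JSON object per line and using only the standard json module. The field names and the append_entry and read_entries helpers are made up for illustration; the first entry has 4 fields and the second has 8.

```python
import io
import json


def append_entry(stream, entry):
    """Write one entry as a single JSON line."""
    stream.write(json.dumps(entry, ensure_ascii=False) + "\n")


def read_entries(stream):
    """Parse every non-blank line back into a dict."""
    return [json.loads(line) for line in stream if line.strip()]


# In-memory stand-in for a file opened in append mode.
buf = io.StringIO()
append_entry(buf, {"id": 1, "ts": "2024-01-01", "note": "flaky test", "owner": "sam"})
append_entry(buf, {
    "id": 2, "ts": "2024-01-02", "note": "slow build", "owner": "li",
    "tags": ["ci", "build"], "links": {"pr": 12},  # nested fields are fine
    "severity": "high", "resolved": False,
})

buf.seek(0)
entries = read_entries(buf)
```

Readers should use entry.get("field") rather than entry["field"] for anything optional, since absent keys are normal in this format.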

SQLite

Best when a logbook naturally spans multiple connected tables, when you need real queries (joins, aggregates, GROUP BY, indexes), or when row volume exceeds a few thousand. You trade away open-in-any-editor inspectability and gain what a relational database gives: typed columns, constraints, indexes, and SQL.

Use for: multi-table logbooks, logbooks that are clearly databases in disguise, or when the CLI query layer starts reimplementing SQL badly.
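
To make the "real queries" point concrete, here is a small sketch using Python's built-in sqlite3 module with an in-memory database. The two tables (runs, observations) and their columns are hypothetical, chosen only to show a join with GROUP BY.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # a real logbook would use a file path
conn.executescript("""
CREATE TABLE runs (
    run_id     INTEGER PRIMARY KEY,
    skill_name TEXT NOT NULL
);
CREATE TABLE observations (
    id       INTEGER PRIMARY KEY,
    run_id   INTEGER NOT NULL REFERENCES runs(run_id),
    severity TEXT NOT NULL
);
CREATE INDEX idx_observations_run ON observations(run_id);
""")

conn.executemany("INSERT INTO runs VALUES (?, ?)",
                 [(1, "summarize"), (2, "triage")])
conn.executemany("INSERT INTO observations (run_id, severity) VALUES (?, ?)",
                 [(1, "high"), (1, "low"), (2, "high")])

# Observation count per skill: a join plus an aggregate in one statement.
counts = conn.execute("""
    SELECT r.skill_name, COUNT(*)
    FROM observations AS o
    JOIN runs AS r ON r.run_id = o.run_id
    GROUP BY r.skill_name
    ORDER BY r.skill_name
""").fetchall()
```

Doing the same over two CSV files means loading both and joining by hand, which is the point at which a CLI layer starts reimplementing SQL.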

Spreadsheet

Best when humans are primary editors or reviewers, you need visual sorting/filtering/comments, and the team prefers a UI over a CLI. Valid backend for the same logbook pattern.

Use for: human-heavy collaboration, low engineering overhead, non-engineers who need to interact with the data.

Markdown tables

Most human-readable, worst to append programmatically, painful to query. Only use when the logbook is tiny (under 20 rows), hand-maintained, and read more often than written. Rarely the right call for agent-written logbooks.


Designing a stable entry contract

Row identity

Every row needs a stable identity that survives edits, reordering, and export. Decide early: is identity a sequential id, a natural key (skill_name + run_id), or a generated UUID? Natural keys are readable but brittle if the key fields change. Sequential ids are simple but meaningless outside the logbook. Generated UUIDs stay unique across logbooks and exports but are unreadable. Pick one rule and document it in the logbook header or README.
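
The three identity rules could look roughly like this. The function names and the "skill_name:run_id" separator are assumptions for illustration; only uuid.uuid4 is a standard library call.

```python
import uuid


def sequential_id(existing_ids):
    """Next integer id; simple, but only meaningful inside this logbook."""
    return max(existing_ids, default=0) + 1


def natural_key(skill_name, run_id):
    """Readable key built from row fields; breaks if those fields are edited."""
    return f"{skill_name}:{run_id}"


def generated_id():
    """Random UUID; stable and unique across exports, but opaque to humans."""
    return str(uuid.uuid4())


next_id = sequential_id([1, 2, 5])
key = natural_key("triage", 7)
opaque = generated_id()
```

Whichever rule is chosen, the id should be assigned once at creation and never recomputed from row contents afterwards.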

Field semantics

When two contributors both write a priority column, do they mean the same thing? Define each column in one sentence at logbook creation. If the definition drifts — one person uses "priority" for business value and another for implementation urgency — the logbook is silently corrupted. The fix is to split the column or rename it, not to hope for convergence.

Partial rows

Not every contributor knows every field. Decide whether missing fields are empty strings, explicit nulls, or "unknown." Pick one convention per logbook. Mixed conventions make filters unreliable.
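
One way to enforce a single convention is to normalize rows at write time. The sketch below assumes the logbook has chosen the empty string as its missing marker; the alias list and helper names are hypothetical.

```python
MISSING = ""  # the one convention chosen for this logbook (an assumption)
MISSING_ALIASES = {"", "null", "none", "unknown", "n/a"}  # spellings to fold in


def is_missing(value):
    """True for None and for strings that spell 'missing' some other way."""
    if value is None:
        return True
    return isinstance(value, str) and value.strip().lower() in MISSING_ALIASES


def normalize_row(row, fields):
    """Return a row with every expected field present and one missing marker."""
    return {
        field: MISSING if is_missing(row.get(field)) else row[field]
        for field in fields
    }


clean = normalize_row(
    {"id": 1, "owner": "Unknown", "score": None},
    fields=["id", "owner", "score", "notes"],
)
```

With every writer passing rows through the same function, a filter such as owner == "" finds all rows with no known owner instead of a subset.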

Corrections vs. supersession

For audit-sensitive logbooks (retro observations, decision logs), append a correction row and mark the original. For refinement-heavy logbooks (backlog shaping, ideation scoring), patch in place — the current state matters more than the history. State the rule in the logbook header.
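
The two rules can be sketched side by side. The status and supersedes fields are hypothetical names for "mark the original" and "point back at it"; the logbook is modeled as a plain list of dicts.

```python
def append_correction(rows, original_id, changes, new_id):
    """Audit-sensitive rule: keep the original, mark it, append a corrected copy."""
    original = next(r for r in rows if r["id"] == original_id)
    original["status"] = "superseded"  # the old content itself is left untouched
    corrected = {**original, **changes,
                 "id": new_id, "status": "active", "supersedes": original_id}
    rows.append(corrected)
    return corrected


def patch_in_place(rows, row_id, changes):
    """Refinement-heavy rule: overwrite fields; no history is kept."""
    row = next(r for r in rows if r["id"] == row_id)
    row.update(changes)
    return row


retro_log = [{"id": 1, "observation": "deploy took 40 min", "status": "active"}]
fixed = append_correction(retro_log, original_id=1,
                          changes={"observation": "deploy took 14 min"}, new_id=2)

backlog = [{"id": 1, "idea": "cache warmup", "score": 2}]
patch_in_place(backlog, row_id=1, changes={"score": 4})
```

Queries against the audit-style log then filter on status == "active" for current state and follow supersedes for history.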

Schema versioning

When you add a column, existing rows won't have it. When you rename a column, old queries break. The principle: a schema change that silently breaks existing queries is worse than a schema that stays slightly imperfect.
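
One possible way to keep old rows and old queries working is a small upgrade step applied on read. This is a sketch under assumptions: the RENAMED and DEFAULTS tables, and the column names in them, are invented for the example.

```python
RENAMED = {"prio": "priority"}             # old column name -> current name
DEFAULTS = {"priority": "", "owner": ""}   # columns that older rows may lack


def upgrade_row(row):
    """Map an old-schema row onto the current schema without losing data."""
    upgraded = {RENAMED.get(key, key): value for key, value in row.items()}
    for column, default in DEFAULTS.items():
        upgraded.setdefault(column, default)  # fill only what is absent
    return upgraded


old_row = {"id": "1", "prio": "high"}
new_row = {"id": "2", "priority": "low", "owner": "sam"}
rows = [upgrade_row(old_row), upgrade_row(new_row)]
```

Keeping the rename map in one place means a query written against the current names works on every row, whenever it was written.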

Anti-pattern: backend denial

When you need joins, constraints, migrations, or stronger concurrency, the answer is not to abandon the logbook — it's to upgrade the backend. Keep the logbook abstraction but move from CSV to SQLite or Postgres. The opposite failure is calling a database a logbook: once you grow real foreign-key constraints, transactional updates, or an ORM, you've earned a database and should call it that.
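
Upgrading the backend can be a small step. The sketch below copies a CSV logbook into a SQLite table using only the standard library; the migrate_csv_to_sqlite helper, the table name, and the choice to store every column as TEXT are assumptions, not a prescribed migration path.

```python
import csv
import io
import sqlite3


def quote_ident(name):
    """Quote a table or column name for use in SQL."""
    return '"' + name.replace('"', '""') + '"'


def migrate_csv_to_sqlite(csv_stream, conn, table="logbook"):
    """Create `table` from the CSV header and copy all rows; return row count."""
    reader = csv.DictReader(csv_stream)
    columns = reader.fieldnames
    if not columns:
        raise ValueError("CSV has no header row")
    column_sql = ", ".join(f"{quote_ident(c)} TEXT" for c in columns)
    conn.execute(f"CREATE TABLE {quote_ident(table)} ({column_sql})")
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(
        f"INSERT INTO {quote_ident(table)} VALUES ({placeholders})",
        ([row[c] for c in columns] for row in reader),
    )
    conn.commit()
    return conn.execute(f"SELECT COUNT(*) FROM {quote_ident(table)}").fetchone()[0]


source = io.StringIO("id,idea,score\n1,cache warmup,3\n2,retry,5\n")
conn = sqlite3.connect(":memory:")
migrated = migrate_csv_to_sqlite(source, conn)
idea = conn.execute("SELECT idea FROM logbook WHERE id = '2'").fetchone()[0]
```

The logbook's columns and row ids carry over unchanged, so the abstraction stays the same while joins and constraints become available.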

"Isn't this just Notion or Airtable?"

A Notion database or Airtable base passes most of the logbook criteria: shared, structured, queryable, multi-contributor, persistent. They are valid backends and belong in the "Spreadsheet" category above. What they don't provide is the methodology that makes a logbook operational for agent workflows: schema discipline, tool-based queries, execution actions, and orchestration. Those four are roles, not infrastructure.