# TreeTrace lineage schema v0.3

`.treetrace/tree.json` is an open, vendor-neutral format for prompt lineage and agent-regression analysis in AI-assisted projects.

TreeTrace records the human steering layer: what was asked, what changed direction, what was corrected, what was abandoned, what was rejected, what future agents should remember, and which failures should become evals.

## Layering

| Layer | Standard or artifact | What it records |
|-------|----------------------|-----------------|
| Code attribution | Agent Trace | which lines were AI-generated, by which model, linked to which conversation |
| Runtime telemetry | OpenTelemetry `gen_ai` | per-call spans for operators |
| Build integrity | SLSA / in-toto | signed provenance of build artifacts |
| Human steering | TreeTrace | prompt lineage, corrections, abandoned paths, rejections, lessons, eval candidates |

Agent Trace answers "which code came from AI?" TreeTrace answers "how did the human have to steer the agent?"

## Top-Level Shape

```jsonc
{
  "schemaVersion": "0.3",
  "generator": { "name": "treetrace", "version": "0.9.0", "url": "..." },
  "project": { "name": "...", "generatedAt": "ISO-8601", "sourceType": "claude-code-jsonl" },
  "stats": {
    "prompts": 41, "sessions": 6, "days": 9,
    "corrections": 3, "rejections": 4,
    "toolUses": 12, "filesTouched": 7,
    "inputTokens": 8400, "outputTokens": 2100,
    "models": ["claude-opus-4-8"],
    "firstTs": "ISO-8601", "lastTs": "ISO-8601"
  },
  "analysis": {
    "failureSignals": 11,
    "correctionChains": 3,
    "evalCandidates": 6,
    "lessons": 7
  },
  "sessions": [
    {
      "id": "...", "title": "...",
      "firstTs": "ISO-8601", "lastTs": "ISO-8601",
      "promptCount": 7, "isContinuation": false,
      "inputTokens": 8400, "outputTokens": 2100
    }
  ],
  "nodes": [ /* PromptNode */ ],
  "edges": [ /* Edge */ ],
  "correctionChains": [ /* CorrectionChain */ ],
  "lessons": [ /* Lesson */ ],
  "evalCandidates": [ /* EvalCandidate */ ]
}
```

All v0.3 additions are optional and additive. Consumers that only understand v0.2 can keep reading `nodes` and `edges` and ignore `rejections`.
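
A tolerant reader might look like the sketch below. It is an illustration only, not part of TreeTrace: `load_tree` and `OPTIONAL_ARRAYS` are hypothetical names, and defaulting every top-level array to an empty list is an assumption made for convenience.

```python
import json

# Top-level arrays from the shape above. Treating all of them as optional
# is an assumption of this sketch, not a statement of the schema.
OPTIONAL_ARRAYS = ("sessions", "nodes", "edges",
                   "correctionChains", "lessons", "evalCandidates")


def load_tree(raw: str) -> dict:
    """Parse a tree.json payload; missing or null arrays become empty lists."""
    tree = json.loads(raw)
    for key in OPTIONAL_ARRAYS:
        if not isinstance(tree.get(key), list):
            tree[key] = []
    return tree


# A v0.2-style consumer: reads node ids only and never touches `rejections`
# or any other field it does not know about.
sample = ('{"schemaVersion": "0.3", "futureField": 1, '
          '"nodes": [{"id": "node_001", "parentId": null, "rejections": []}]}')
tree = load_tree(sample)
node_ids = [node["id"] for node in tree["nodes"]]
```

Unknown keys such as `futureField` stay in the parsed dict and are simply never read.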

## stats fields

| Field | Type | Meaning |
|-------|------|---------|
| `prompts` | number | total classified prompt nodes |
| `rawPrompts` | number | total raw prompt records across all sessions |
| `sessions` | number | sessions that contained at least one prompt |
| `days` | number | calendar days spanned |
| `corrections` | number | nodes classified as `correction` |
| `scopeChanges` | number | nodes classified as `scope-change` |
| `checkpoints` | number | nodes classified as `checkpoint` |
| `abandonedBranches` | number | distinct abandoned sub-trees |
| `rejections` | number | total rejection/refusal/decline events |
| `rejectionsByKind` | object | count per rejection kind |
| `toolUses` | number | total tool invocations across all sessions |
| `filesTouched` | number | distinct file paths referenced (Edit/Write paths and shell command paths) |
| `inputTokens` | number | sum of input tokens across all sessions (0 when not available for the source format) |
| `outputTokens` | number | sum of output tokens across all sessions (0 when not available for the source format) |
| `models` | string[] | deduplicated list of model identifiers seen across all sessions |
| `firstTs` | string \| null | ISO-8601 timestamp of the earliest record |
| `lastTs` | string \| null | ISO-8601 timestamp of the latest record |

Token coverage by source: Claude Code JSONL (full), Codex rollout (full), Gemini CLI (full), ChatGPT export (none), Copilot (none), Cursor (none), Grok (none), plain transcript (none).

## sessions[] fields

| Field | Type | Meaning |
|-------|------|---------|
| `id` | string | session identifier |
| `title` | string \| null | session title if captured |
| `firstTs` | string \| null | ISO-8601 |
| `lastTs` | string \| null | ISO-8601 |
| `promptCount` | number | classified prompts in this session |
| `isContinuation` | boolean | session resumed from a prior compact summary |
| `inputTokens` | number | input tokens for this session (0 when not available) |
| `outputTokens` | number | output tokens for this session (0 when not available) |
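
Because `stats.inputTokens` and `stats.outputTokens` are described as sums across sessions, a consumer can sanity-check a file with a sketch like the one below. `token_totals_match` is a hypothetical helper, and it assumes `sessions[]` lists every session that contributed to `stats`.

```python
def token_totals_match(tree: dict) -> bool:
    """Return True when the per-session token counts add up to the stats totals."""
    sessions = tree.get("sessions", [])
    stats = tree.get("stats", {})
    total_in = sum(s.get("inputTokens", 0) for s in sessions)
    total_out = sum(s.get("outputTokens", 0) for s in sessions)
    return (stats.get("inputTokens", 0) == total_in
            and stats.get("outputTokens", 0) == total_out)


example = {
    "stats": {"inputTokens": 8400, "outputTokens": 2100},
    "sessions": [
        {"id": "a", "inputTokens": 8000, "outputTokens": 2000},
        {"id": "b", "inputTokens": 400, "outputTokens": 100},
    ],
}
consistent = token_totals_match(example)
```

For sources without token coverage every count is 0, so the check passes trivially.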

## PromptNode

| Field | Type | Meaning |
|-------|------|---------|
| `id` | string | stable within the file (`node_001`, etc.) |
| `parentId` | string \| null | lineage parent (null = root) |
| `role` | `"user"` | currently always `"user"`; the field is reserved so future versions can add system/developer nodes |
| `kind` | enum | `root`, `direction`, `correction`, `scope-change`, `checkpoint`, `question`, `rejection` |
| `title` | string | first-sentence distillation |
| `text` | string | full prompt text after redaction |
| `status` | enum | `accepted`, `abandoned` |
| `nudges` | number | folded "continue"-style acknowledgements |
| `reruns` | number | repeated instruction re-issues folded into this node |
| `session` | string | session id this prompt came from |
| `timestamp` | string \| null | ISO-8601 |
| `model` | string \| null | model that handled this turn (from the first action on the turn; null if not available) |
| `actions` | Action[] | tool invocations made in response to this prompt, after redaction |
| `failureSignals` | FailureSignal[] | optional v0.2 failure labels attached to this node |
| `evalCandidate` | boolean | whether this node contributes to an eval candidate |
| `lessonIds` | string[] | lessons derived from this node |
| `rejections` | Rejection[] | optional v0.3 typed rejection/refusal/decline events captured on this turn |
| `sourceEventIds` | string[] | local transcript record UUIDs; raw transcripts are never exported |
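
Since `parentId` points at the lineage parent and is null at the root, the path from the root to any prompt can be recovered by following parents. The sketch below shows one way to do that; `lineage` is a hypothetical helper, and the cycle guard is a defensive assumption, not something the schema requires.

```python
def lineage(nodes: list[dict], node_id: str) -> list[str]:
    """Return node ids from the root down to `node_id`, following `parentId`."""
    by_id = {node["id"]: node for node in nodes}
    path: list[str] = []
    seen: set[str] = set()
    current = node_id
    # Stop at the root (parentId is null), at a dangling reference, or on a cycle.
    while current is not None and current in by_id and current not in seen:
        seen.add(current)
        path.append(current)
        current = by_id[current].get("parentId")
    path.reverse()
    return path


nodes = [
    {"id": "node_001", "parentId": None, "kind": "root"},
    {"id": "node_002", "parentId": "node_001", "kind": "direction"},
    {"id": "node_003", "parentId": "node_002", "kind": "correction"},
]
path_to_correction = lineage(nodes, "node_003")
```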

## Action

```jsonc
{ "tool": "Edit", "file": "/src/auth.js", "command": null, "model": "claude-opus-4-8" }
```

| Field | Type | Meaning |
|-------|------|---------|
| `tool` | string \| null | tool name (`Bash`, `Edit`, `Write`, `Read`, etc.) |
| `file` | string \| null | file path from a structured `file_path` input; redacted |
| `command` | string \| null | shell command string for `Bash` tool calls; redacted |
| `model` | string \| null | model that issued this tool call; null when not available |

`file` and `command` values are run through the same redaction gate as `node.text`. An `action` whose `command` or `file` contains a secret will have that value replaced with a `[REDACTED:rule-id]` marker before export.
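
A consumer that wants to know which actions were affected by redaction can scan for the marker. The sketch below assumes the marker appears literally as `[REDACTED:<rule-id>]` and that a rule id never contains `]`. The rule id `bearer-token` in the example is made up for illustration.

```python
import re

# Assumed marker shape: "[REDACTED:" + rule id + "]". The rule-id syntax is a guess.
REDACTED = re.compile(r"\[REDACTED:([^\]]+)\]")


def redaction_rules(action: dict) -> list[str]:
    """Collect the rule ids of every redaction marker in an Action's file/command."""
    found: list[str] = []
    for field in ("file", "command"):
        value = action.get(field)
        if isinstance(value, str):
            found.extend(REDACTED.findall(value))
    return found


action = {
    "tool": "Bash",
    "file": None,
    "command": "curl -H 'Authorization: [REDACTED:bearer-token]' https://example.com",
    "model": None,
}
rules = redaction_rules(action)
```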

In `PromptNode.kind`, the `rejection` value (v0.3) is assigned to synthetic nodes that exist only to carry a rejection signal, e.g. a tool-result rejection that arrived before any human-typed prompt. Such nodes have empty `text`, a `title` derived from the rejection kind(s), and one or more entries in `rejections`.

## FailureSignal

```jsonc
{
  "type": "ignored_constraint",
  "confidence": 0.82,
  "evidence": "User corrected the agent after it built a web app despite asking for a CLI.",
  "resolvedBy": "node_004"
}
```

Initial `type` values:

- `ignored_constraint`
- `misunderstood_goal`
- `scope_drift`
- `wrong_tool_choice`
- `hallucinated_file_or_path` (also written as `hallucinated_file_or_api` in older exports; treat as equivalent)
- `repeated_failed_fix`
- `overbuilt_solution`
- `underbuilt_solution`
- `security_or_privacy_risk`
- `dependency_or_environment_mismatch`
- `format_violation`
- `user_frustration`
- `abandoned_path`
- `user_rejected_action` (v0.3)
- `tool_execution_failed` (v0.3)
- `model_refused` (v0.3)
- `permission_denied` (v0.3)

The enum may gain values. Consumers should treat unknown values as advisory labels.
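
One way to handle both the legacy alias and unknown values is sketched below. `normalize_failure_type` is a hypothetical helper; it keeps unknown labels instead of rejecting them, in line with the advisory-label guidance.

```python
KNOWN_FAILURE_TYPES = frozenset({
    "ignored_constraint", "misunderstood_goal", "scope_drift",
    "wrong_tool_choice", "hallucinated_file_or_path", "repeated_failed_fix",
    "overbuilt_solution", "underbuilt_solution", "security_or_privacy_risk",
    "dependency_or_environment_mismatch", "format_violation",
    "user_frustration", "abandoned_path", "user_rejected_action",
    "tool_execution_failed", "model_refused", "permission_denied",
})

# Older exports use this spelling; the schema says to treat it as equivalent.
LEGACY_ALIASES = {"hallucinated_file_or_api": "hallucinated_file_or_path"}


def normalize_failure_type(value: str) -> tuple[str, bool]:
    """Return (canonical label, whether this reader recognises the label)."""
    canonical = LEGACY_ALIASES.get(value, value)
    return canonical, canonical in KNOWN_FAILURE_TYPES


legacy = normalize_failure_type("hallucinated_file_or_api")
unknown = normalize_failure_type("some_future_type")
```

The boolean lets a caller display an unrecognised label as-is without treating the file as invalid.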

## Rejection (v0.3)

```jsonc
{
  "kind": "user_declined_tool",
  "source": "tool_result",
  "confidence": 1.0,
  "toolUseId": "toolu_0123ABC",
  "tool": "Bash",
  "ts": "2026-06-18T12:34:56.789Z",
  "evidence": "The user doesn't want to proceed with this tool use..."
}
```

`kind` enum:

- `user_declined_tool` - human rejected a proposed tool action (Claude Code canonical "user doesn't want to proceed" text)
- `user_interrupt` - human pressed Esc / interrupt mid-response
- `user_text_decline` - human typed an explicit decline (`no, don't`, `stop`, `cancel`)
- `tool_execution_error` - tool ran and returned `is_error: true` for a non-decline reason
- `permission_denied` - environment denied the action (`permission denied`, `EACCES`, `Operation cancelled`)
- `model_refusal` - the model declined the request (`stop_reason: "refusal"` or refusal text)

`source` enum: `tool_result`, `text`, `stop_reason`, `text_heuristic`.

`confidence` follows the same banding as FailureSignal: 0.95 and above is verified, 0.8 and above is high, 0.65 and above is confirmed, and anything lower is inferred.

`evidence` is truncated and redacted; it carries enough context to disambiguate the rejection class. It is `null` when only the structured signal (e.g. `stop_reason`) is available.
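
The confidence banding can be written as a small lookup. This is a sketch: `confidence_band` is a hypothetical name, and treating each threshold as inclusive follows the "0.95+" style of wording.

```python
def confidence_band(confidence: float) -> str:
    """Map a numeric confidence onto the verified/high/confirmed/inferred bands."""
    if confidence >= 0.95:
        return "verified"
    if confidence >= 0.8:
        return "high"
    if confidence >= 0.65:
        return "confirmed"
    return "inferred"


bands = [confidence_band(c) for c in (1.0, 0.82, 0.65, 0.3)]
```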

## Edge

```jsonc
{ "from": "node_001", "to": "node_002", "relationship": "refines" }
```

`relationship` is derived from the child node's `kind`:

- `refines`
- `corrects`
- `expands`
- `checkpoints`
- `asks`
- `rejects` (v0.3, from `kind: "rejection"`)
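
The derivation could be sketched as below. Only the `rejection` -> `rejects` pairing is stated explicitly above; the other pairings in `RELATIONSHIP_BY_KIND` are inferred from the names and should be read as assumptions. So is the idea that each edge runs from `parentId` to the child.

```python
# Only "rejection" -> "rejects" is documented; the rest are guesses from naming.
RELATIONSHIP_BY_KIND = {
    "direction": "refines",
    "correction": "corrects",
    "scope-change": "expands",
    "checkpoint": "checkpoints",
    "question": "asks",
    "rejection": "rejects",
}


def build_edges(nodes: list[dict]) -> list[dict]:
    """Create one edge per non-root node, labelled by the child's kind."""
    edges = []
    for node in nodes:
        parent = node.get("parentId")
        relationship = RELATIONSHIP_BY_KIND.get(node.get("kind"))
        # Roots have no parent; unknown kinds are skipped rather than guessed.
        if parent is not None and relationship is not None:
            edges.append({"from": parent, "to": node["id"],
                          "relationship": relationship})
    return edges


edges = build_edges([
    {"id": "node_001", "parentId": None, "kind": "root"},
    {"id": "node_002", "parentId": "node_001", "kind": "direction"},
    {"id": "node_003", "parentId": "node_002", "kind": "rejection"},
])
```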

## CorrectionChain

```jsonc
{
  "id": "chain_001",
  "failureNodeId": "node_003",
  "correctionNodeId": "node_004",
  "resolvedNodeId": "node_006",
  "failureType": "ignored_constraint",
  "confidence": "high",
  "summary": "The agent initially pursued a web app; the user corrected it toward a zero-config CLI."
}
```

A correction chain links a likely failure node to the user correction that changed direction. It does not require assistant output; it is derived from prompt topology and user text. Low-confidence chains may be omitted.

## Lesson

```jsonc
{
  "id": "lesson_001",
  "title": "Preserve explicit constraints",
  "nodeIds": ["node_003", "node_004"],
  "text": "Future agents should carry explicit user constraints forward as high-priority requirements."
}
```

Lessons are compact rules for future agents. They should be specific enough to use in handoffs or memory packs.

## EvalCandidate

```jsonc
{
  "id": "eval_001",
  "source": "treetrace",
  "type": "instruction_following_regression",
  "task": "Continue development while preserving the corrected direction from the session lineage.",
  "context": "The user rejected a web app and corrected the project toward a zero-config CLI.",
  "input": "Continue development of the project while preserving the corrected direction and constraints.",
  "expected_behavior": [
    "Use the corrected prompt lineage as durable context",
    "Do not repeat the documented failure mode"
  ],
  "failure_mode": "Agent repeats ignored constraint despite prior correction.",
  "sourceNodeIds": ["node_003", "node_004"]
}
```

Initial eval `type` values:

- `instruction_following_regression`
- `constraint_preservation`
- `scope_drift_detection`
- `correction_adherence`
- `privacy_boundary_preservation`
- `handoff_quality`
- `tool_choice_regression`
- `tool_permission_regression` (v0.3)
- `tool_error_recovery` (v0.3)
- `refusal_handling` (v0.3)
## hallucinations.json (--security)

Written to `.treetrace/hallucinations.json` when `--security` is passed. Requires a `--dir` that points to a real project tree so file existence and package manifests can be checked.

```jsonc
{
  "schemaVersion": "0.3",
  "project": { "name": "...", "generatedAt": "ISO-8601" },
  "verifiedAgainstWorkingTree": true,
  "manifestSeen": true,
  "summary": {
    "total": 2,
    "byCategory": {
      "hallucinated_file_or_path": 1,
      "hallucinated_import_or_package": 1
    }
  },
  "hallucinations": [
    {
      "category": "hallucinated_file_or_path",
      "reference": "./src/middleware/rateLimit.js",
      "nodeId": "node_001",
      "evidence": "Referenced ... which does not exist in the working tree and was not created during the session.",
      "evalCandidate": {
        "type": "reference_existence_check",
        "task": "Verify a file or path exists in the working tree before editing or relying on it.",
        "target": "./src/middleware/rateLimit.js"
      }
    }
  ],
  "note": "..."
}
```

`category` enum:

- `hallucinated_file_or_path` - a relative file/path token appears in scannable text but does not exist on disk and was not created during the session
- `hallucinated_import_or_package` - a JS or Python import specifier is not a declared dependency and is not a standard-library/builtin module

`verifiedAgainstWorkingTree` is `false` when the project directory could not be resolved. `manifestSeen` is `false` when no `package.json`, lockfile, or `requirements.txt` was found.

Detection covers: user prompt text, tool action inputs, and tool commands. It does not scan assistant prose (assistant turns are not stored in `node.text`) and does not resolve per-symbol exports inside a module.
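
The `summary` block can be recomputed from the `hallucinations` array, which is useful as a consistency check after filtering entries. The sketch below assumes `summary.total` is the array length and `byCategory` is a plain per-category count; `summarize` is a hypothetical helper.

```python
from collections import Counter


def summarize(hallucinations: list[dict]) -> dict:
    """Rebuild a `summary`-shaped dict from a list of hallucination entries."""
    by_category = Counter(entry["category"] for entry in hallucinations)
    return {"total": len(hallucinations), "byCategory": dict(by_category)}


entries = [
    {"category": "hallucinated_file_or_path",
     "reference": "./src/middleware/rateLimit.js", "nodeId": "node_001"},
    {"category": "hallucinated_import_or_package",
     "reference": "left-padder", "nodeId": "node_002"},  # made-up package name
]
summary = summarize(entries)
```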

## Separate Analysis Artifacts

TreeTrace also writes a combined human report plus focused files derived from the same redacted tree:

- `TREETRACE_REPORT.md`
- `.treetrace/failures.json`
- `.treetrace/lessons.md`
- `.treetrace/evals.jsonl`
- `.treetrace/agent-memory.md`

These files must not contain raw assistant logs or unredacted secrets.

## Composing With Agent Trace

An Agent Trace record can point to a TreeTrace session and node range:

- Agent Trace `conversation` -> TreeTrace `sessions[].id`
- Agent Trace line-range records -> work performed between two TreeTrace node IDs
- TreeTrace correction chains -> regression tests or code-review context for the next agent

This keeps responsibilities clean: Agent Trace handles code attribution; TreeTrace handles human steering and correction memory.

## Mapping to W3C PROV

For provenance tooling:

- each `PromptNode` is a `prov:Activity`
- the human is a `prov:Agent`
- edges are `prov:wasInformedBy`
- exported artifacts are `prov:Entity`
- correction chains can be modeled as qualified derivations from a failure activity to a corrected activity
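
As a rough illustration of the first three bullets, the sketch below turns a tree into subject/predicate/object tuples. Several choices are assumptions of this sketch and not part of the mapping: the `agent:human` identifier, the use of `prov:wasAssociatedWith` to tie prompts to the human, and the direction of `prov:wasInformedBy` (child informed by parent).

```python
Triple = tuple[str, str, str]


def to_prov_triples(tree: dict, human: str = "agent:human") -> list[Triple]:
    """Emit simple PROV-style triples for nodes and edges of a TreeTrace tree."""
    triples: list[Triple] = [(human, "rdf:type", "prov:Agent")]
    for node in tree.get("nodes", []):
        triples.append((node["id"], "rdf:type", "prov:Activity"))
        # Assumption: every prompt activity is associated with the one human agent.
        triples.append((node["id"], "prov:wasAssociatedWith", human))
    for edge in tree.get("edges", []):
        # Assumption: the child ("to") was informed by the parent ("from").
        triples.append((edge["to"], "prov:wasInformedBy", edge["from"]))
    return triples


triples = to_prov_triples({
    "nodes": [{"id": "node_001"}, {"id": "node_002"}],
    "edges": [{"from": "node_001", "to": "node_002", "relationship": "refines"}],
})
```

Real provenance tooling would normally serialise these with full IRIs; plain tuples are used here to keep the example free of dependencies.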

## Stability

- Additive changes bump the minor component of `schemaVersion`, semver-style.
- Consumers MUST ignore unknown fields.
- Enum values may gain members.
- New top-level arrays may be absent, empty, or partially populated.