Harden Rust tlog recovery

Fail closed when recovered local transparency-log records are malformed, out of sequence, or hash-mismatched, and preserve exact leaf bytes for new records.

Document the local tlog leaf record shape and registry burn-in impact.

Co-authored-by: Codex (GPT-5.4) <noreply@openai.com>
24763eb Zion Boggan committed on May 25, 2026
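
The recovery rule in the commit message can be sketched in isolation. This is an illustrative sketch only, not the code in `oversight-tlog`: `Rec`, `RecoverError`, `recover`, and `toy_hash` are hypothetical names, records are assumed to be already parsed and hex-decoded, and `toy_hash` is a placeholder rather than the `SHA-256(0x00 || leaf_bytes)` hash the real log uses.

```rust
/// Hypothetical, simplified stand-in for an on-disk leaf record
/// (hash and leaf bytes are assumed to be decoded already).
#[derive(Debug, Clone, PartialEq)]
pub struct Rec {
    pub index: usize,
    pub leaf_hash: [u8; 32],
    pub leaf_bytes: Vec<u8>,
}

/// Hypothetical error type mirroring the two integrity failures.
#[derive(Debug, PartialEq)]
pub enum RecoverError {
    IndexMismatch { expected: usize, found: usize },
    HashMismatch(usize),
}

/// Rebuild the in-memory leaf list, stopping at the first bad record
/// instead of skipping it (fail closed).
pub fn recover<H>(records: &[Rec], leaf_hash: H) -> Result<Vec<[u8; 32]>, RecoverError>
where
    H: Fn(&[u8]) -> [u8; 32],
{
    let mut leaves = Vec::with_capacity(records.len());
    for rec in records {
        // Indexes must be contiguous and start at 0.
        let expected = leaves.len();
        if rec.index != expected {
            return Err(RecoverError::IndexMismatch { expected, found: rec.index });
        }
        // The stored hash must match the hash recomputed from the leaf bytes.
        if leaf_hash(&rec.leaf_bytes) != rec.leaf_hash {
            return Err(RecoverError::HashMismatch(rec.index));
        }
        leaves.push(rec.leaf_hash);
    }
    Ok(leaves)
}

/// Placeholder hash for the sketch only. NOT cryptographic and NOT SHA-256.
pub fn toy_hash(data: &[u8]) -> [u8; 32] {
    let mut out = [0u8; 32];
    for (i, b) in data.iter().enumerate() {
        out[i % 32] ^= b.wrapping_add(i as u8);
    }
    out
}
```

Returning on the first bad record is what separates this from the previous behavior, where an unparseable or wrong-length line was dropped and the remaining leaves were still loaded.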
CHANGELOG.md +3 -1
@@ -34,7 +34,9 @@
   validation checks missing or out-of-range event tlog indexes against the
   on-disk tlog size. Validation also compares event rows to the corresponding
   tlog leaf payload so an index cannot point at unrelated evidence and still
-  pass burn-in checks.
+  pass burn-in checks. Local tlog recovery now rejects malformed records,
+  non-contiguous indexes, and leaf-hash mismatches instead of silently
+  ignoring corrupted lines during startup or validation.
 - **GitHub Actions runtime hygiene.** Main CI workflows opt into the GitHub
   Actions Node 24 runtime before the hosted runner default changes.
 - **Rust policy test parity.** Fixed the `oversight-policy` crate's manifest
README.md +5 -3
@@ -101,6 +101,8 @@ Rust registry writes now fail closed if the local transparency log cannot
append, so new evidence rows cannot silently lose their audit trail.
The validator also checks that event rows point at matching tlog leaf payloads,
not just in-range indexes.
+The local transparency log now fails closed when recovered leaf records are
+malformed, out of sequence, or hash-mismatched.
The next Rust-registry gate is operational burn-in: longer-running deployment
tests against real operator databases and a final wire-format stability
@@ -439,13 +441,13 @@ current stable line.
| Rust oversight-formats | 40 | green |
| Rust oversight-manifest | 3 | green |
| Rust oversight-policy | 7 | green |
-| Rust oversight-registry | 10 | green |
+| Rust oversight-registry | 11 | green |
| Rust oversight-rekor | 10 | green |
| Rust oversight-semantic | 8 | green |
-| Rust oversight-tlog | 7 | green |
+| Rust oversight-tlog | 12 | green |
| Rust oversight-watermark | 4 | green |
| Cross-language conformance | 3 | green |
-| Total automated Rust unit tests | 127 | all green |
+| Total automated Rust unit tests | 133 | all green |
## Design principles (what Oversight never does)
docs/REGISTRY_DEPLOYMENT.md +2 -1
@@ -137,7 +137,8 @@ beacons, watermarks, events, corpus rows, identity mismatches, malformed
event `extra` JSON, malformed corpus metadata JSON, duplicate or negative
tlog indexes, missing event tlog indexes, event tlog indexes outside the
on-disk tlog size, event rows whose indexed tlog leaf carries unrelated
-evidence, malformed manifest JSON, invalid manifest signatures, and
+evidence, malformed or non-contiguous local tlog leaf records, tlog leaf hash
+mismatches, malformed manifest JSON, invalid manifest signatures, and
manifest/file ID divergence. Keep the Python database as a rollback artifact
until validation, live conformance, and evidence-bundle checks pass against
the Rust service.
docs/ROADMAP.md +3 -0
@@ -248,6 +248,9 @@ As of 2026-05-22, registry writes fail closed when tlog append fails and
`--validate-db` compares event tlog indexes against the on-disk tlog size.
As of 2026-05-24, validation also checks that each event's indexed tlog leaf
matches the event row rather than unrelated evidence.
+As of 2026-05-25, local tlog recovery rejects malformed leaf records,
+non-contiguous indexes, and leaf-hash mismatches instead of silently ignoring
+corrupted lines.
Remaining work: longer-running deployment tests and a wire-format stability
declaration before declaring v1.0 ready.
docs/spec/registry-v1.md +5 -0
@@ -280,6 +280,11 @@ These expose the local transparency log so a federated verifier can
monitor it without relying on the registry's own query responses.
The signed tree head MUST be Ed25519-signed by the registry identity
key advertised at `/.well-known/oversight-registry`.
+`/tlog/range` entries carry `index`, `leaf_hash`, `leaf_data`, and MAY
+carry `leaf_data_hex`. `leaf_data_hex`, when present, is the exact leaf
+bytes encoded as lowercase hex. Verifiers MUST recompute
+`SHA-256(0x00 || leaf_bytes)` and compare it to `leaf_hash`; legacy
+entries without `leaf_data_hex` use the UTF-8 bytes of `leaf_data`.
## Beacon endpoints
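
The verifier rule added to `docs/spec/registry-v1.md` above implies a byte-selection step before hashing. A minimal sketch of that step follows, using only the standard library. `decode_lower_hex` and `leaf_bytes` are hypothetical helper names, and rejecting uppercase hex digits is an assumption drawn from the spec's "lowercase hex" wording (the Rust recovery code in this commit decodes with `hex::decode`). The final `SHA-256(0x00 || leaf_bytes)` comparison is not shown because the standard library has no SHA-256.

```rust
/// Decode a lowercase hex string into bytes.
/// Returns None on odd length, uppercase digits, or any non-hex character.
pub fn decode_lower_hex(s: &str) -> Option<Vec<u8>> {
    fn nibble(c: u8) -> Option<u8> {
        match c {
            b'0'..=b'9' => Some(c - b'0'),
            b'a'..=b'f' => Some(c - b'a' + 10),
            _ => None,
        }
    }
    let raw = s.as_bytes();
    if raw.len() % 2 != 0 {
        return None;
    }
    raw.chunks(2)
        .map(|pair| Some((nibble(pair[0])? << 4) | nibble(pair[1])?))
        .collect()
}

/// Pick the bytes a verifier should hash for one `/tlog/range` entry:
/// the decoded `leaf_data_hex` when present, otherwise the UTF-8 bytes of
/// `leaf_data` (legacy entries). A present but invalid hex field yields
/// None rather than falling back to `leaf_data`.
pub fn leaf_bytes(leaf_data: &str, leaf_data_hex: Option<&str>) -> Option<Vec<u8>> {
    match leaf_data_hex {
        Some(hex) => decode_lower_hex(hex),
        None => Some(leaf_data.as_bytes().to_vec()),
    }
}
```

Not falling back when `leaf_data_hex` is present but malformed keeps the verifier fail-closed, in line with the recovery behavior this commit introduces.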
oversight-rust/oversight-tlog/src/lib.rs +118 -13
@@ -46,6 +46,12 @@ pub enum TlogError {
     BadKeyLength(usize),
     #[error("index {0} out of range (tree_size={1})")]
     IndexOutOfRange(usize, usize),
+    #[error("leaf index mismatch: expected {expected}, got {found}")]
+    LeafIndexMismatch { expected: usize, found: usize },
+    #[error("leaf hash length for index {index}: expected 32, got {len}")]
+    BadLeafHashLength { index: usize, len: usize },
+    #[error("leaf hash mismatch at index {0}")]
+    LeafHashMismatch(usize),
 }
 pub type Result<T> = std::result::Result<T, TlogError>;
@@ -149,10 +155,13 @@ pub fn verify_inclusion_proof(
 /// On-disk leaf record format.
 #[derive(Debug, Clone, Serialize, Deserialize)]
+#[non_exhaustive]
 pub struct LeafRecord {
     pub index: usize,
     pub leaf_hash: String,
     pub leaf_data: String,
+    #[serde(default, skip_serializing_if = "Option::is_none")]
+    pub leaf_data_hex: Option<String>,
 }
 /// Signed tree head.
@@ -208,15 +217,27 @@ impl TransparencyLog {
                 if line.trim().is_empty() {
                     continue;
                 }
-                if let Ok(rec) = serde_json::from_str::<LeafRecord>(&line) {
-                    if let Ok(bytes) = hex::decode(&rec.leaf_hash) {
-                        if bytes.len() == 32 {
-                            let mut arr = [0u8; 32];
-                            arr.copy_from_slice(&bytes);
-                            leaves.push(arr);
-                        }
-                    }
+                let rec: LeafRecord = serde_json::from_str(&line)?;
+                let expected_index = leaves.len();
+                if rec.index != expected_index {
+                    return Err(TlogError::LeafIndexMismatch {
+                        expected: expected_index,
+                        found: rec.index,
+                    });
                 }
+                let bytes = hex::decode(&rec.leaf_hash)?;
+                if bytes.len() != 32 {
+                    return Err(TlogError::BadLeafHashLength {
+                        index: rec.index,
+                        len: bytes.len(),
+                    });
+                }
+                let mut arr = [0u8; 32];
+                arr.copy_from_slice(&bytes);
+                if arr != leaf_hash_for_data(&leaf_data_bytes(&rec)?) {
+                    return Err(TlogError::LeafHashMismatch(rec.index));
+                }
+                leaves.push(arr);
             }
         }
@@ -247,11 +268,7 @@ impl TransparencyLog {
         let mut leaves = self.leaves.lock().unwrap();
         let index = leaves.len();
-        // RFC 6962 leaf prefix
-        let mut prefixed = Vec::with_capacity(1 + leaf_data.len());
-        prefixed.push(0x00);
-        prefixed.extend_from_slice(leaf_data);
-        let leaf_hash = h(&prefixed);
+        let leaf_hash = leaf_hash_for_data(leaf_data);
         leaves.push(leaf_hash);
         // Invalidate cached root
@@ -262,6 +279,7 @@ impl TransparencyLog {
             index,
             leaf_hash: hex::encode(leaf_hash),
             leaf_data: String::from_utf8_lossy(leaf_data).to_string(),
+            leaf_data_hex: Some(hex::encode(leaf_data)),
         };
         let line = serde_json::to_string(&record)? + "\n";
         let mut f = OpenOptions::new()
@@ -370,6 +388,22 @@ impl TransparencyLog {
     }
 }
+fn leaf_hash_for_data(leaf_data: &[u8]) -> [u8; 32] {
+    let mut prefixed = Vec::with_capacity(1 + leaf_data.len());
+    prefixed.push(0x00);
+    prefixed.extend_from_slice(leaf_data);
+    h(&prefixed)
+}
+
+fn leaf_data_bytes(rec: &LeafRecord) -> Result<Vec<u8>> {
+    rec.leaf_data_hex
+        .as_deref()
+        .map(hex::decode)
+        .transpose()
+        .map(|bytes| bytes.unwrap_or_else(|| rec.leaf_data.as_bytes().to_vec()))
+        .map_err(TlogError::Hex)
+}
+
 // serde_json needs this little helper for custom errors
 trait JsonErrorExt {
     fn custom(msg: &'static str) -> Self;
@@ -495,6 +529,77 @@ mod tests {
         assert_eq!(tl2.size(), 2);
     }
+    #[test]
+    fn reopen_rejects_malformed_leaf_record() {
+        let dir = TempDir::new().unwrap();
+        std::fs::write(dir.path().join("leaves.jsonl"), "{not-json}\n").unwrap();
+        let err = match TransparencyLog::open(dir.path()) {
+            Ok(_) => panic!("malformed tlog opened"),
+            Err(err) => err,
+        };
+        assert!(matches!(err, TlogError::Json(_)));
+    }
+
+    #[test]
+    fn reopen_rejects_noncontiguous_leaf_index() {
+        let dir = TempDir::new().unwrap();
+        let data = "event_a";
+        let rec = LeafRecord {
+            index: 1,
+            leaf_hash: hex::encode(leaf_hash_for_data(data.as_bytes())),
+            leaf_data: data.into(),
+            leaf_data_hex: None,
+        };
+        std::fs::write(
+            dir.path().join("leaves.jsonl"),
+            serde_json::to_string(&rec).unwrap() + "\n",
+        )
+        .unwrap();
+        let err = match TransparencyLog::open(dir.path()) {
+            Ok(_) => panic!("noncontiguous tlog opened"),
+            Err(err) => err,
+        };
+        assert!(matches!(
+            err,
+            TlogError::LeafIndexMismatch {
+                expected: 0,
+                found: 1
+            }
+        ));
+    }
+
+    #[test]
+    fn reopen_rejects_leaf_hash_mismatch() {
+        let dir = TempDir::new().unwrap();
+        let rec = LeafRecord {
+            index: 0,
+            leaf_hash: hex::encode(leaf_hash_for_data(b"event_a")),
+            leaf_data: "event_b".into(),
+            leaf_data_hex: None,
+        };
+        std::fs::write(
+            dir.path().join("leaves.jsonl"),
+            serde_json::to_string(&rec).unwrap() + "\n",
+        )
+        .unwrap();
+        let err = match TransparencyLog::open(dir.path()) {
+            Ok(_) => panic!("hash-mismatched tlog opened"),
+            Err(err) => err,
+        };
+        assert!(matches!(err, TlogError::LeafHashMismatch(0)));
+    }
+
+    #[test]
+    fn survives_reopen_with_non_utf8_leaf_data() {
+        let dir = TempDir::new().unwrap();
+        {
+            let tl = TransparencyLog::open(dir.path()).unwrap();
+            tl.append(&[0xff, 0x00, 0xfe]).unwrap();
+        }
+        let tl2 = TransparencyLog::open(dir.path()).unwrap();
+        assert_eq!(tl2.size(), 1);
+    }
+
     #[test]
     fn empty_tree_has_zero_root() {
         let (_d, tl) = mktlog();
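
The `survives_reopen_with_non_utf8_leaf_data` test relies on `leaf_data_hex` because `leaf_data` is written with `String::from_utf8_lossy`, which replaces invalid UTF-8 sequences with U+FFFD. The standard-library-only sketch below shows why the lossy string alone could not reproduce the original leaf bytes (and so could not reproduce the leaf hash) for the bytes used in that test. `encode_lower_hex` and `lossy_round_trip` are hypothetical helper names, not functions from the crate.

```rust
/// Encode bytes as lowercase hex, the form described for `leaf_data_hex`.
pub fn encode_lower_hex(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}

/// Simulate what a reader gets back if only the lossy `leaf_data` string
/// is stored: bytes -> lossy UTF-8 string -> bytes.
pub fn lossy_round_trip(leaf: &[u8]) -> Vec<u8> {
    String::from_utf8_lossy(leaf).into_owned().into_bytes()
}
```

For the leaf `[0xff, 0x00, 0xfe]`, `lossy_round_trip` yields `[0xef, 0xbf, 0xbd, 0x00, 0xef, 0xbf, 0xbd]` (each invalid byte becomes the three-byte encoding of U+FFFD), while `encode_lower_hex` yields `"ff00fe"`, which decodes back to the exact input. Valid UTF-8 leaves such as `b"event_a"` round-trip unchanged, which is why legacy records without the hex field can still pass the hash check.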