| @@ -58,6 +58,9 @@ SESSION_RESUME.md | ||
| HANDOFF_*.md | ||
| NOTES_*.md | ||
| *_HANDOFF.md | ||
| + | CODEX_*.md | |
| + | CLAUDE_*.md | |
| + | START_HERE.md | |
| RUNBOOK.md | ||
| docs/RUNBOOK.md | ||
| docs/META/ |
| @@ -1,5 +1,43 @@ | ||
| # Oversight CHANGELOG | ||
| + | ## Unreleased | |
| + | ||
| + | - **Browser inspector: hybrid (post-quantum) decrypt shipped (2026-05-03).** | |
| + | The viewer at `oversight-protocol.github.io/oversight/viewer/` now decrypts | |
| + | `OSGT-HYBRID-v1` sealed files end-to-end, in addition to the | |
| + | `OSGT-CLASSIC-v1` path that shipped earlier. Implementation reuses | |
| + | WebCrypto X25519 + HKDF-SHA256 and the existing vendored `@noble/ciphers` | |
| + | XChaCha20-Poly1305, plus a newly vendored `@noble/post-quantum` ML-KEM-768 | |
| + | for the post-quantum half of the KEM. KEK is bound X-wing-style over both | |
| + | shared secrets and both ephemeral inputs (`ss_x || ss_pq || eph_pub || | |
| + | mlkem_ct`), matching `oversight_core.crypto.hybrid_wrap_dek`. New files | |
| + | on the site: `viewer/vendor/noble-post-quantum-ml-kem-0.6.1.js` (+ three | |
| + | vendored transitive deps from `@noble/hashes` and `@noble/curves`), | |
| + | `viewer/samples/tutorial-hybrid.sealed`, and | |
| + | `viewer/samples/tutorial-hybrid-identity.json`. New "Load hybrid tutorial | |
| + | identity" button surfaces the test fixture. New tooling: | |
| + | `tools/gen_hybrid_sample.py` (self-contained sample generator that mirrors | |
| + | the production hybrid wrap construction and runs anywhere `oqs` and | |
| + | `cryptography` are available), and `tools/test_hybrid_decrypt_node.mjs` | |
| + | (Node-based end-to-end smoke test against Node's WebCrypto). | |
| + | - `oversight-rust/oversight-registry`: added the missing registry v1 | |
| + | read-only and beacon surface (`/.well-known/oversight-registry`, | |
| + | `/evidence/{file_id}`, `/tlog/head|proof|range`, `/p/{token_id}.png`, | |
| + | `/r/{token_id}`, `/v/{token_id}`, `/candidates/semantic`) and tightened | |
| + | CORS to the public browser-inspector origins with GET/OPTIONS only. The | |
| + | Axum server now passes the existing 33-check | |
| + | `tests/test_registry_conformance.py` harness in live-URL mode. | |
| + | - `oversight-rust/oversight-manifest`: added `canonical_content_hash` and | |
| + | `l3_policy` to the signed manifest model so Rust verifies Python-signed | |
| + | v0.4.5+ manifests without dropping signed fields before canonicalization, | |
| + | while retaining a fallback verification path for older manifests that were | |
| + | signed without those fields. | |
| + | - `oversight-rust/oversight-formats`: fixed Rust text/image watermark | |
| + | regressions that were failing the workspace test suite. Text embedding now | |
| + | keeps L2 trailing-whitespace marks at physical line endings after L1 | |
| + | zero-width insertion, and image LSB embedding avoids duplicate pixel slots | |
| + | that could overwrite earlier payload bits. | |
| + | ||
| ## v0.4.8 - 2026-04-29 Mobile-build portability and rustls-webpki security bump | ||
| Patch release covering two upstream-driven fixes that landed on `main` |
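Reviewer sketch for the Unreleased hybrid-decrypt entry above: the changelog lists the KEK binding input as `ss_x || ss_pq || eph_pub || mlkem_ct`. The snippet below only illustrates that ordering as a plain byte concatenation. The function name `hybrid_kek_input` is hypothetical, and how `oversight_core.crypto.hybrid_wrap_dek` actually splits these values between HKDF input keying material, salt, and info is not shown in this diff, so treat this as an assumption rather than the production construction. The lengths are the standard sizes for X25519 (32-byte shared secret and public key) and ML-KEM-768 (32-byte shared secret, 1088-byte ciphertext).

```rust
/// Hypothetical helper: concatenates the four binding inputs in the order
/// named in the changelog (ss_x || ss_pq || eph_pub || mlkem_ct).
/// The HKDF-SHA256 step that would consume these bytes is intentionally omitted.
fn hybrid_kek_input(
    ss_x: &[u8; 32],        // X25519 shared secret
    ss_pq: &[u8; 32],       // ML-KEM-768 shared secret
    eph_pub: &[u8; 32],     // sender's ephemeral X25519 public key
    mlkem_ct: &[u8; 1088],  // ML-KEM-768 ciphertext
) -> Vec<u8> {
    let mut out = Vec::with_capacity(32 + 32 + 32 + 1088);
    out.extend_from_slice(ss_x);
    out.extend_from_slice(ss_pq);
    out.extend_from_slice(eph_pub);
    out.extend_from_slice(mlkem_ct);
    out
}
```

Because every component has a fixed length, the concatenation is unambiguous: no two distinct input tuples can produce the same byte string.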
| @@ -26,8 +26,12 @@ threat-model honesty, not on a calendar date. | ||
| byte-identical to Python, optional registry lookups, and full | ||
| decryption of classic-suite sealed files using WebCrypto X25519 + HKDF-SHA256 | ||
| with a vendored `@noble/ciphers` XChaCha20-Poly1305. Post-decrypt | ||
| - | SHA-256 matches `content_hash` or the flow aborts. Hybrid (post-quantum) | |
| - | in-browser decrypt is the follow-up milestone. | |
| + | SHA-256 matches `content_hash` or the flow aborts. Hybrid | |
| + | (post-quantum) in-browser decrypt **shipped 2026-05-03** using a | |
| + | vendored ML-KEM-768 from `@noble/post-quantum` for the post-quantum | |
| + | half of the KEM, with X-wing-style HKDF binding over both shared | |
| + | secrets. The viewer now decrypts both `OSGT-CLASSIC-v1` and | |
| + | `OSGT-HYBRID-v1` sealed files. | |
| 5. **Outlook add-in first** for the first ecosystem integration. Drive, | ||
| Box, SharePoint, and Teams plugins are deferred until a maintainer or | ||
| design partner funds them. | ||
| @@ -39,8 +43,9 @@ threat-model honesty, not on a calendar date. | ||
| ## Public launch sequence | ||
| 1. L3 safety and collusion documentation. **Shipped in v0.4.5.** | ||
| - | 2. Browser inspector and drag-drop share workflow. **Inspector and | |
| - | classic-suite decrypt shipped**; hybrid decrypt pending. | |
| + | 2. Browser inspector and drag-drop share workflow. **Shipped** - | |
| + | inspector, classic-suite decrypt, and hybrid (post-quantum) decrypt | |
| + | are all live. | |
| 3. Outlook add-in. **Next up.** | ||
| 4. One regulated-industry design-partner deployment. | ||
| 5. SOC 2 Type 1 scoping in parallel with the design partner. | ||
| @@ -183,15 +188,6 @@ place. | ||
| ## Next | ||
| - | ### Hybrid (post-quantum) decrypt in the browser | |
| - | ||
| - | Classic suite works today. Hybrid adds ML-KEM-768 decapsulation for the | |
| - | second shared secret alongside X25519. WebCrypto does not expose | |
| - | ML-KEM yet. Candidate path: a wasm build of `liboqs` for the KEM step | |
| - | with the rest of the primitives handled as they are today. Viewer | |
| - | already surfaces a stub error for hybrid inputs so the UX degrades | |
| - | gracefully until the KEM is available. | |
| - | ||
| ### Outlook add-in | ||
| Microsoft add-in manifest, JS SDK surface, hosted manifest URL, and a | ||
| @@ -212,10 +208,11 @@ material. | ||
| ### Registry in Rust | ||
| `oversight-rust/oversight-registry` is scaffolded with all endpoints | ||
| - | implemented under `#![forbid(unsafe_code)]`. Remaining work: integration | |
| - | testing, migration tooling from the Python registry, and wire-format | |
| - | stability declaration. The conformance harness is the acceptance gate | |
| - | for declaring v1.0 ready. | |
| + | implemented under `#![forbid(unsafe_code)]`. As of 2026-05-03, the Axum | |
| + | server passes the existing 33-check `tests/test_registry_conformance.py` | |
| + | harness in live-URL mode against the registry v1 surface. Remaining work: | |
| + | migration tooling from the Python registry, longer-running deployment tests, | |
| + | and a wire-format stability declaration before declaring v1.0 ready. | |
| --- | ||
| @@ -285,7 +282,7 @@ via VM and retype, hardware-key pull mid-open. | ||
| | 6 | SIEM export: Splunk, Sentinel, ECS | Shipped (v0.4.6) | | ||
| | 7 | Registry v1 spec + conformance harness + CORS | Shipped (v0.4.7) | | ||
| | 8 | Browser inspector, classic-suite decrypt, opsec scanner + CI | Shipped | | ||
| - | | 9 | Hybrid PQ decrypt in browser | Next | | |
| + | | 9 | Hybrid PQ decrypt in browser | Shipped (2026-05-03) | | |
| | 10 | Outlook add-in | Next | | ||
| | 11 | Hardware KeyProvider in Rust | In progress | | ||
| | 12 | Rust Axum registry, migration tooling | In progress | |
| @@ -53,6 +53,7 @@ | ||
| use crate::{FormatAdapter, FormatError, WatermarkCandidate}; | ||
| use image::{DynamicImage, GenericImageView, ImageFormat, Pixel}; | ||
| use sha2::{Digest, Sha256}; | ||
| + | use std::collections::HashSet; | |
| use std::io::Cursor; | ||
| /// Default mark_id length in bytes for extraction. | ||
| @@ -163,16 +164,36 @@ fn set_y_lsb(r: u8, g: u8, b: u8, target_bit: u8) -> (u8, u8, u8) { | ||
| if (y & 1) == target_bit { | ||
| return (r, g, b); // Already correct | ||
| } | ||
| - | // Need to flip the Y LSB. Adjust green by +1 or -1. | |
| - | let new_g = if g < 255 { g + 1 } else { g - 1 }; | |
| - | // Verify the flip happened; if not (edge case), try adjusting red. | |
| - | let new_y = rgb_to_y(r, new_g, b); | |
| - | if (new_y & 1) == target_bit { | |
| - | return (r, new_g, b); | |
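| + | // Try every +/-1 perturbation of (r, g, b) and keep the cheapest one whose Y LSB equals target_bit. | |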
| + | let deltas = [0i16, -1, 1]; | |
| + | let mut best: Option<(u16, u8, u8, u8)> = None; | |
| + | for dg in deltas { | |
| + | for dr in deltas { | |
| + | for db in deltas { | |
| + | let nr = r as i16 + dr; | |
| + | let ng = g as i16 + dg; | |
| + | let nb = b as i16 + db; | |
| + | if !(0..=255).contains(&nr) || !(0..=255).contains(&ng) || !(0..=255).contains(&nb) | |
| + | { | |
| + | continue; | |
| + | } | |
| + | let nr = nr as u8; | |
| + | let ng = ng as u8; | |
| + | let nb = nb as u8; | |
| + | if (rgb_to_y(nr, ng, nb) & 1) != target_bit { | |
| + | continue; | |
| + | } | |
| + | let cost = dr.unsigned_abs() + dg.unsigned_abs() + db.unsigned_abs(); | |
| + | match best { | |
| + | Some((best_cost, _, _, _)) if best_cost <= cost => {} | |
| + | _ => best = Some((cost, nr, ng, nb)), | |
| + | } | |
| + | } | |
| + | } | |
| } | ||
| - | // Fallback: adjust red | |
| - | let new_r = if r < 255 { r + 1 } else { r - 1 }; | |
| - | (new_r, g, b) | |
| + | ||
| + | best.map(|(_, nr, ng, nb)| (nr, ng, nb)) | |
| + | .unwrap_or((r, g, b)) | |
| } | ||
| // --------------------------------------------------------------------------- | ||
| @@ -187,6 +208,7 @@ fn set_y_lsb(r: u8, g: u8, b: u8, target_bit: u8) -> (u8, u8, u8) { | ||
| fn pixel_positions(mark_id: &[u8], width: u32, height: u32, count: usize) -> Vec<(u32, u32)> { | ||
| let total_pixels = (width as u64) * (height as u64); | ||
| let mut positions = Vec::with_capacity(count); | ||
| + | let mut seen = HashSet::with_capacity(count); | |
| let mut counter: u64 = 0; | ||
| while positions.len() < count { | ||
| @@ -205,7 +227,9 @@ fn pixel_positions(mark_id: &[u8], width: u32, height: u32, count: usize) -> Vec | ||
| let idx = val % total_pixels; | ||
| let x = (idx % width as u64) as u32; | ||
| let y = (idx / width as u64) as u32; | ||
| - | positions.push((x, y)); | |
| + | if seen.insert((x, y)) { | |
| + | positions.push((x, y)); | |
| + | } | |
| } | ||
| counter += 1; | ||
| } | ||
| @@ -239,7 +263,9 @@ pub fn embed_lsb(image_bytes: &[u8], mark_id: &[u8]) -> Result<Vec<u8>, FormatEr | ||
| if total_bits as u64 > total_pixels { | ||
| return Err(FormatError::EmbedFailed(format!( | ||
| "image too small: need {} pixels for {} payload bits, have {}", | ||
| - | total_bits, payload.len(), total_pixels | |
| + | total_bits, | |
| + | payload.len(), | |
| + | total_pixels | |
| ))); | ||
| } | ||
| @@ -340,7 +366,9 @@ pub fn embed_lsb_blind(image_bytes: &[u8], mark_id: &[u8]) -> Result<Vec<u8>, Fo | ||
| if total_bits as u64 > total_pixels { | ||
| return Err(FormatError::EmbedFailed(format!( | ||
| "image too small: need {} pixels for {} payload bits, have {}", | ||
| - | total_bits, payload.len(), total_pixels | |
| + | total_bits, | |
| + | payload.len(), | |
| + | total_pixels | |
| ))); | ||
| } | ||
| @@ -459,9 +487,7 @@ mod tests { | ||
| #[test] | ||
| fn blind_embed_extract_round_trip() { | ||
| // Create a small test image (32x32 light gray) | ||
| - | let img = image::RgbaImage::from_fn(32, 32, |_x, _y| { | |
| - | image::Rgba([200, 200, 200, 255]) | |
| - | }); | |
| + | let img = image::RgbaImage::from_fn(32, 32, |_x, _y| image::Rgba([200, 200, 200, 255])); | |
| let mut buf = Cursor::new(Vec::new()); | ||
| img.write_to(&mut buf, ImageFormat::Png).unwrap(); | ||
| let png_bytes = buf.into_inner(); | ||
| @@ -492,7 +518,10 @@ mod tests { | ||
| let extracted = extract_lsb(&png_bytes, 8).unwrap(); | ||
| // Very likely None since random pixels won't have our magic header | ||
| // (probability of false positive: 2^-16 per attempt) | ||
| - | assert!(extracted.is_none(), "unmarked image should not yield a watermark"); | |
| + | assert!( | |
| + | extracted.is_none(), | |
| + | "unmarked image should not yield a watermark" | |
| + | ); | |
| } | ||
| #[test] |
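Reviewer sketch of the two image fixes in this file, under stated assumptions. The real `rgb_to_y` is not shown in the diff, so the version below is a hypothetical integer BT.601 luma. The position generator uses a 64-bit LCG as a stand-in for the hash-driven generator in `pixel_positions`. Both are for illustration only and are not the crate's implementation.

```rust
use std::collections::HashSet;

/// Hypothetical integer BT.601 luma (assumption; the real `rgb_to_y` is not in this diff).
fn rgb_to_y(r: u8, g: u8, b: u8) -> u8 {
    ((299 * r as u32 + 587 * g as u32 + 114 * b as u32) / 1000) as u8
}

/// Search all 27 perturbations in {-1, 0, +1}^3 and return the in-range pixel
/// with the smallest total channel change whose luma LSB equals `target_bit`.
fn set_y_lsb(r: u8, g: u8, b: u8, target_bit: u8) -> (u8, u8, u8) {
    let mut best: Option<(u16, (u8, u8, u8))> = None;
    for dr in [0i16, -1, 1] {
        for dg in [0i16, -1, 1] {
            for db in [0i16, -1, 1] {
                let (nr, ng, nb) = (r as i16 + dr, g as i16 + dg, b as i16 + db);
                if [nr, ng, nb].iter().any(|c| !(0..=255).contains(c)) {
                    continue; // would leave the u8 range
                }
                let px = (nr as u8, ng as u8, nb as u8);
                if (rgb_to_y(px.0, px.1, px.2) & 1) != target_bit {
                    continue; // LSB still wrong
                }
                let cost = dr.unsigned_abs() + dg.unsigned_abs() + db.unsigned_abs();
                if best.map_or(true, |(c, _)| cost < c) {
                    best = Some((cost, px));
                }
            }
        }
    }
    best.map(|(_, px)| px).unwrap_or((r, g, b))
}

/// Pick `count` distinct pixel positions. The LCG is a stand-in generator
/// (assumption); the point is the `HashSet` that rejects repeated slots so a
/// later payload bit can never overwrite an earlier one.
fn unique_positions(seed: u64, width: u32, height: u32, count: usize) -> Vec<(u32, u32)> {
    let total = width as u64 * height as u64;
    assert!(count as u64 <= total, "cannot pick more unique pixels than exist");
    let mut seen = HashSet::with_capacity(count);
    let mut positions = Vec::with_capacity(count);
    let mut state = seed;
    while positions.len() < count {
        state = state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        let idx = (state >> 33) % total;
        let pos = ((idx % width as u64) as u32, (idx / width as u64) as u32);
        if seen.insert(pos) {
            positions.push(pos);
        }
    }
    positions
}
```

Note that de-duplication makes the loop depend on `count <= width * height`; the `embed_lsb` size check in the diff provides that guarantee for the embed path, and the sketch asserts it explicitly.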
| @@ -6,9 +6,10 @@ | ||
| //! - **L2** trailing whitespace (`oversight-watermark::embed_ws` / `extract_ws`) | ||
| //! - **L3** semantic synonym rotation (`oversight-semantic::embed_synonyms` / `verify_synonyms`) | ||
| //! | ||
| - | //! Layer order on embed: L3 runs first (rewrites visible words), then L2 | |
| - | //! (trailing whitespace), then L1 (zero-width chars). This matches the | |
| - | //! Python `oversight_core.formats.text` adapter. | |
| + | //! Layer order on embed: L3 runs first (rewrites visible words), then L1 | |
| + | //! (zero-width chars), then L2 (trailing whitespace). L2 runs last so later | |
| + | //! zero-width frame insertion cannot move trailing whitespace away from the | |
| + | //! physical end of a line. | |
| use crate::{FormatAdapter, FormatError, WatermarkCandidate}; | ||
| @@ -33,7 +34,9 @@ impl FormatAdapter for TextAdapter { | ||
| } | ||
| fn extensions(&self) -> &[&str] { | ||
| - | &["txt", "md", "rst", "csv", "log", "json", "xml", "yaml", "yml", "toml"] | |
| + | &[ | |
| + | "txt", "md", "rst", "csv", "log", "json", "xml", "yaml", "yml", "toml", | |
| + | ] | |
| } | ||
| fn can_handle(&self, data: &[u8]) -> bool { | ||
| @@ -81,16 +84,17 @@ impl FormatAdapter for TextAdapter { | ||
| /// Apply all three watermark layers to plaintext. | ||
| /// | ||
| - | /// Layer order: L3 first (rewrites visible words), then L2 (trailing | |
| - | /// whitespace), then L1 (zero-width chars). This order ensures that | |
| - | /// steganographic layers don't get clobbered by semantic rewriting. | |
| + | /// Layer order: L3 first (rewrites visible words), then L1 (zero-width | |
| + | /// chars), then L2 (trailing whitespace). This order ensures that semantic | |
| + | /// rewriting does not fragment invisible frames and that L2 remains at line | |
| + | /// endings. | |
| pub fn embed_all_layers(text: &str, mark_id: &[u8]) -> String { | ||
| // L3: semantic synonym rotation | ||
| let t = oversight_semantic::embed_synonyms(text, mark_id, L3_MIN_INSTANCES); | ||
| - | // L2: trailing whitespace | |
| - | let t = oversight_watermark::embed_ws(&t, mark_id); | |
| // L1: zero-width unicode | ||
| - | oversight_watermark::embed_zw(&t, mark_id, ZW_DENSITY) | |
| + | let t = oversight_watermark::embed_zw(&t, mark_id, ZW_DENSITY); | |
| + | // L2: trailing whitespace | |
| + | oversight_watermark::embed_ws(&t, mark_id) | |
| } | ||
| /// Apply only specific layers. `layers` is a slice of layer names: "L1", "L2", "L3". | ||
| @@ -99,12 +103,12 @@ pub fn embed_layers(text: &str, mark_id: &[u8], layers: &[&str]) -> String { | ||
| if layers.contains(&"L3") { | ||
| t = oversight_semantic::embed_synonyms(&t, mark_id, L3_MIN_INSTANCES); | ||
| } | ||
| - | if layers.contains(&"L2") { | |
| - | t = oversight_watermark::embed_ws(&t, mark_id); | |
| - | } | |
| if layers.contains(&"L1") { | ||
| t = oversight_watermark::embed_zw(&t, mark_id, ZW_DENSITY); | ||
| } | ||
| + | if layers.contains(&"L2") { | |
| + | t = oversight_watermark::embed_ws(&t, mark_id); | |
| + | } | |
| t | ||
| } | ||
| @@ -145,7 +149,8 @@ pub fn extract_all_layers(text: &str) -> Vec<WatermarkCandidate> { | ||
| /// Returns `Some(WatermarkCandidate)` if the candidate matches with score | ||
| /// above the threshold, `None` otherwise. | ||
| pub fn verify_l3(text: &str, candidate_mark_id: &[u8]) -> Option<WatermarkCandidate> { | ||
| - | let (matched, score) = oversight_semantic::verify_synonyms(text, candidate_mark_id, L3_THRESHOLD); | |
| + | let (matched, score) = | |
| + | oversight_semantic::verify_synonyms(text, candidate_mark_id, L3_THRESHOLD); | |
| if matched { | ||
| Some(WatermarkCandidate { | ||
| mark_id: candidate_mark_id.to_vec(), | ||
| @@ -230,17 +235,16 @@ fn normalize_text(text: &str) -> String { | ||
| mod tests { | ||
| use super::*; | ||
| - | const LONG_TEXT: &str = "The quick brown fox jumps over the lazy dog. \ | |
| - | Revenue performance exceeded expectations across all business units. \ | |
| - | The team plans to continue the expansion strategy outlined in the report. \ | |
| - | However, there are important risks to consider before we commence the next \ | |
| - | phase. We need to carefully review the competitive situation and determine \ | |
| - | whether our current approach is the right one. The board will also request \ | |
| - | that we improve internal reporting and reduce operational overhead. It is \ | |
| - | difficult to know exactly how quickly the market will change, but we should \ | |
| - | respond rapidly when opportunities appear. Overall the results show clear \ | |
| - | momentum and a strong basis for continued growth. The organization has \ | |
| - | demonstrated significant progress in multiple areas this quarter."; | |
| + | fn long_text() -> String { | |
| + | (0..80) | |
| + | .map(|i| { | |
| + | format!( | |
| + | "Line {i}: The quick brown fox jumps over the lazy dog while revenue performance and operational plans remain under review." | |
| + | ) | |
| + | }) | |
| + | .collect::<Vec<_>>() | |
| + | .join("\n") | |
| + | } | |
| #[test] | ||
| fn text_adapter_can_handle() { | ||
| @@ -264,7 +268,8 @@ mod tests { | ||
| #[test] | ||
| fn embed_extract_round_trip_l1_l2() { | ||
| let mark = oversight_watermark::new_mark_id(MARK_LEN); | ||
| - | let marked = embed_layers(LONG_TEXT, &mark, &["L1", "L2"]); | |
| + | let text = long_text(); | |
| + | let marked = embed_layers(&text, &mark, &["L1", "L2"]); | |
| let candidates = extract_all_layers(&marked); | ||
| let l1_hits: Vec<_> = candidates.iter().filter(|c| c.layer == "L1").collect(); | ||
| @@ -279,7 +284,8 @@ mod tests { | ||
| #[test] | ||
| fn embed_extract_all_layers_round_trip() { | ||
| let mark = oversight_watermark::new_mark_id(MARK_LEN); | ||
| - | let marked = embed_all_layers(LONG_TEXT, &mark); | |
| + | let text = long_text(); | |
| + | let marked = embed_all_layers(&text, &mark); | |
| // L1 + L2 direct extraction | ||
| let candidates = extract_all_layers(&marked); | ||
| @@ -296,7 +302,8 @@ mod tests { | ||
| #[test] | ||
| fn verify_all_layers_correct_mark() { | ||
| let mark = oversight_watermark::new_mark_id(MARK_LEN); | ||
| - | let marked = embed_all_layers(LONG_TEXT, &mark); | |
| + | let text = long_text(); | |
| + | let marked = embed_all_layers(&text, &mark); | |
| let results = verify_all_layers(&marked, &mark); | ||
| let layers: Vec<&str> = results.iter().map(|r| r.layer.as_str()).collect(); | ||
| assert!(layers.contains(&"L1"), "L1 should verify"); | ||
| @@ -308,22 +315,32 @@ mod tests { | ||
| fn verify_all_layers_wrong_mark() { | ||
| let good = oversight_watermark::new_mark_id(MARK_LEN); | ||
| let bad = oversight_watermark::new_mark_id(MARK_LEN); | ||
| - | let marked = embed_all_layers(LONG_TEXT, &good); | |
| + | let text = long_text(); | |
| + | let marked = embed_all_layers(&text, &good); | |
| let results = verify_all_layers(&marked, &bad); | ||
| // Wrong mark should not match any layer (with overwhelmingly high probability) | ||
| - | assert!(results.is_empty() || results.iter().all(|r| r.layer == "L3" && r.confidence < 0.80)); | |
| + | assert!( | |
| + | results.is_empty() | |
| + | || results | |
| + | .iter() | |
| + | .all(|r| r.layer == "L3" && r.confidence < 0.80) | |
| + | ); | |
| } | ||
| #[test] | ||
| fn adapter_embed_extract_via_trait() { | ||
| let adapter = TextAdapter; | ||
| let mark = oversight_watermark::new_mark_id(MARK_LEN); | ||
| - | let data = LONG_TEXT.as_bytes(); | |
| + | let text = long_text(); | |
| + | let data = text.as_bytes(); | |
| let marked_bytes = adapter.embed_watermark(data, &mark).unwrap(); | ||
| let candidates = adapter.extract_watermark(&marked_bytes).unwrap(); | ||
| - | assert!(!candidates.is_empty(), "should extract at least one candidate"); | |
| + | assert!( | |
| + | !candidates.is_empty(), | |
| + | "should extract at least one candidate" | |
| + | ); | |
| assert!(candidates.iter().any(|c| c.mark_id == mark)); | ||
| } | ||
| @@ -341,9 +358,7 @@ mod tests { | ||
| fn normalize_collapses_whitespace() { | ||
| let adapter = TextAdapter; | ||
| let text = " Hello world \n\n foo "; | ||
| - | let normalized = adapter | |
| - | .normalize_for_fingerprint(text.as_bytes()) | |
| - | .unwrap(); | |
| + | let normalized = adapter.normalize_for_fingerprint(text.as_bytes()).unwrap(); | |
| assert_eq!(normalized, "hello world foo"); | ||
| } | ||
| @@ -351,7 +366,8 @@ mod tests { | ||
| fn l1_survives_stripped_whitespace() { | ||
| // L1 zero-width chars survive trailing-whitespace stripping | ||
| let mark = oversight_watermark::new_mark_id(MARK_LEN); | ||
| - | let marked = embed_all_layers(LONG_TEXT, &mark); | |
| + | let text = long_text(); | |
| + | let marked = embed_all_layers(&text, &mark); | |
| let stripped: String = marked | ||
| .lines() | ||
| .map(|l| l.trim_end()) | ||
| @@ -359,7 +375,10 @@ mod tests { | ||
| .join("\n"); | ||
| let candidates = extract_all_layers(&stripped); | ||
| let l1_hits: Vec<_> = candidates.iter().filter(|c| c.layer == "L1").collect(); | ||
| - | assert!(!l1_hits.is_empty(), "L1 should survive whitespace stripping"); | |
| + | assert!( | |
| + | !l1_hits.is_empty(), | |
| + | "L1 should survive whitespace stripping" | |
| + | ); | |
| assert_eq!(l1_hits[0].mark_id, mark); | ||
| } | ||
| } |
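Reviewer sketch of why L2 has to run after L1. The toy functions below are hypothetical stand-ins, not the `oversight_watermark` API: `toy_embed_zw` is assumed to append a zero-width frame at the end of every line, and `toy_embed_ws` encodes one bit per line as a trailing space or tab. With those assumptions, embedding whitespace first leaves zero-width characters after it, so the whitespace is no longer at the physical end of the line and extraction fails.

```rust
/// Zero-width space + zero-width non-joiner, used as a toy "frame".
const ZW_FRAME: &str = "\u{200B}\u{200C}";

/// Toy L1: append a zero-width frame to every line (assumption for illustration).
fn toy_embed_zw(text: &str) -> String {
    text.lines()
        .map(|line| format!("{}{}", line, ZW_FRAME))
        .collect::<Vec<_>>()
        .join("\n")
}

/// Toy L2: one bit per line, tab = 1, space = 0, appended at the line end.
fn toy_embed_ws(text: &str, bits: &[bool]) -> String {
    text.lines()
        .zip(bits.iter().cycle())
        .map(|(line, &bit)| format!("{}{}", line, if bit { '\t' } else { ' ' }))
        .collect::<Vec<_>>()
        .join("\n")
}

/// Toy L2 extractor: reads the last character of each line and returns
/// `None` as soon as any line does not end in a space or tab.
fn toy_extract_ws(text: &str) -> Option<Vec<bool>> {
    text.lines()
        .map(|line| match line.chars().last() {
            Some(' ') => Some(false),
            Some('\t') => Some(true),
            _ => None,
        })
        .collect()
}
```

Calling `toy_embed_ws(&toy_embed_zw(text), &bits)` (the new order) round-trips the bits, while `toy_embed_zw(&toy_embed_ws(text, &bits))` (the old order) makes `toy_extract_ws` return `None`.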
| @@ -48,6 +48,7 @@ pub struct Manifest { | ||
| pub suite: String, | ||
| pub original_filename: String, | ||
| pub content_hash: String, | ||
| + | pub canonical_content_hash: String, | |
| pub content_type: String, | ||
| pub size_bytes: u64, | ||
| pub issuer_id: String, | ||
| @@ -57,6 +58,7 @@ pub struct Manifest { | ||
| pub watermarks: Vec<WatermarkRef>, | ||
| pub beacons: Vec<serde_json::Value>, | ||
| pub policy: serde_json::Value, | ||
| + | pub l3_policy: serde_json::Value, | |
| pub signature_ed25519: String, | ||
| pub signature_ml_dsa: String, | ||
| } | ||
| @@ -70,6 +72,7 @@ impl Default for Manifest { | ||
| suite: crypto::SUITE_CLASSIC_V1.into(), | ||
| original_filename: String::new(), | ||
| content_hash: String::new(), | ||
| + | canonical_content_hash: String::new(), | |
| content_type: "application/octet-stream".into(), | ||
| size_bytes: 0, | ||
| issuer_id: String::new(), | ||
| @@ -78,6 +81,7 @@ impl Default for Manifest { | ||
| watermarks: Vec::new(), | ||
| beacons: Vec::new(), | ||
| policy: serde_json::json!({}), | ||
| + | l3_policy: serde_json::json!({}), | |
| signature_ed25519: String::new(), | ||
| signature_ml_dsa: String::new(), | ||
| } | ||
| @@ -117,6 +121,7 @@ impl Manifest { | ||
| .unwrap_or(0), | ||
| original_filename: original_filename.into(), | ||
| content_hash: content_hash.into(), | ||
| + | canonical_content_hash: String::new(), | |
| content_type: content_type.into(), | ||
| size_bytes, | ||
| issuer_id: issuer_id.into(), | ||
| @@ -125,6 +130,14 @@ impl Manifest { | ||
| policy, | ||
| ..Default::default() | ||
| } | ||
| + | .with_default_canonical_content_hash() | |
| + | } | |
| + | ||
| + | fn with_default_canonical_content_hash(mut self) -> Self { | |
| + | if self.canonical_content_hash.is_empty() { | |
| + | self.canonical_content_hash = self.content_hash.clone(); | |
| + | } | |
| + | self | |
| } | ||
| /// Canonical bytes (excluding signatures) - this is what gets signed. | ||
| @@ -138,6 +151,21 @@ impl Manifest { | ||
| serde_jcs::to_vec(&v).map_err(|_| ManifestError::Canonicalization) | ||
| } | ||
| + | fn legacy_canonical_bytes_without_new_defaults(&self) -> Result<Vec<u8>, ManifestError> { | |
| + | let mut v = serde_json::to_value(self)?; | |
| + | if let Some(obj) = v.as_object_mut() { | |
| + | obj.insert("signature_ed25519".into(), serde_json::json!("")); | |
| + | obj.insert("signature_ml_dsa".into(), serde_json::json!("")); | |
| + | if self.canonical_content_hash.is_empty() { | |
| + | obj.remove("canonical_content_hash"); | |
| + | } | |
| + | if self.l3_policy.as_object().is_some_and(|o| o.is_empty()) { | |
| + | obj.remove("l3_policy"); | |
| + | } | |
| + | } | |
| + | serde_jcs::to_vec(&v).map_err(|_| ManifestError::Canonicalization) | |
| + | } | |
| + | ||
| pub fn to_json(&self) -> Result<Vec<u8>, ManifestError> { | ||
| let v = serde_json::to_value(self)?; | ||
| serde_jcs::to_vec(&v).map_err(|_| ManifestError::Canonicalization) | ||
| @@ -165,7 +193,14 @@ impl Manifest { | ||
| let bytes = self.canonical_bytes()?; | ||
| let sig = hex::decode(&self.signature_ed25519)?; | ||
| let pub_key = hex::decode(&self.issuer_ed25519_pub)?; | ||
| - | Ok(crypto::verify_message(&bytes, &sig, &pub_key)) | |
| + | if crypto::verify_message(&bytes, &sig, &pub_key) { | |
| + | return Ok(true); | |
| + | } | |
| + | let legacy_bytes = self.legacy_canonical_bytes_without_new_defaults()?; | |
| + | if legacy_bytes != bytes { | |
| + | return Ok(crypto::verify_message(&legacy_bytes, &sig, &pub_key)); | |
| + | } | |
| + | Ok(false) | |
| } | ||
| } | ||
| @@ -233,4 +268,45 @@ mod tests { | ||
| assert_eq!(m, parsed); | ||
| assert!(parsed.verify().unwrap()); | ||
| } | ||
| + | ||
| + | #[test] | |
| + | fn verify_legacy_manifest_missing_l3_fields() { | |
| + | let issuer = ClassicIdentity::generate(); | |
| + | let recipient = ClassicIdentity::generate(); | |
| + | let m = Manifest::new( | |
| + | "doc.txt", | |
| + | crypto::content_hash(b"hello"), | |
| + | 5, | |
| + | "issuer@test", | |
| + | hex::encode(issuer.ed25519_pub), | |
| + | Recipient { | |
| + | recipient_id: "alice@test".into(), | |
| + | x25519_pub: hex::encode(recipient.x25519_pub), | |
| + | ed25519_pub: None, | |
| + | }, | |
| + | "https://registry.test", | |
| + | "text/plain", | |
| + | None, | |
| + | None, | |
| + | "GLOBAL", | |
| + | ); | |
| + | ||
| + | let mut value = serde_json::to_value(&m).unwrap(); | |
| + | { | |
| + | let obj = value.as_object_mut().unwrap(); | |
| + | obj.remove("canonical_content_hash"); | |
| + | obj.remove("l3_policy"); | |
| + | obj.insert("signature_ed25519".into(), serde_json::json!("")); | |
| + | obj.insert("signature_ml_dsa".into(), serde_json::json!("")); | |
| + | } | |
| + | let legacy_bytes = serde_jcs::to_vec(&value).unwrap(); | |
| + | let sig = crypto::sign_message(&legacy_bytes, issuer.ed25519_priv.as_ref()).unwrap(); | |
| + | value.as_object_mut().unwrap().insert( | |
| + | "signature_ed25519".into(), | |
| + | serde_json::json!(hex::encode(sig)), | |
| + | ); | |
| + | ||
| + | let parsed: Manifest = serde_json::from_value(value).unwrap(); | |
| + | assert!(parsed.verify().unwrap()); | |
| + | } | |
| } |
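Reviewer sketch of the verify-then-fallback pattern added to `Manifest::verify`. This is a simplified model, not the crate's code: a `BTreeMap` with `key=value;` joining stands in for JCS canonicalization, an empty string stands in for a field left at its default, and the signature check is passed in as a closure (`sig_ok`, a hypothetical name) so the example needs no crypto dependency.

```rust
use std::collections::BTreeMap;

/// Fields that older signers did not include in the signed bytes.
const NEW_DEFAULT_FIELDS: [&str; 2] = ["canonical_content_hash", "l3_policy"];

/// Toy canonicalization: keys in sorted order, joined as `key=value;`.
fn canonical(fields: &BTreeMap<String, String>) -> Vec<u8> {
    fields
        .iter()
        .map(|(k, v)| format!("{}={};", k, v))
        .collect::<String>()
        .into_bytes()
}

/// Legacy form: drop the new fields, but only when they still hold the default.
fn legacy_canonical(fields: &BTreeMap<String, String>) -> Vec<u8> {
    let mut old = fields.clone();
    for name in NEW_DEFAULT_FIELDS {
        if matches!(old.get(name), Some(v) if v.is_empty()) {
            old.remove(name);
        }
    }
    canonical(&old)
}

/// Try the current canonical bytes first; only if that fails, and the legacy
/// bytes actually differ, retry against the legacy form.
fn verify_with_fallback(
    fields: &BTreeMap<String, String>,
    sig_ok: impl Fn(&[u8]) -> bool,
) -> bool {
    let current = canonical(fields);
    if sig_ok(&current) {
        return true;
    }
    let legacy = legacy_canonical(fields);
    legacy != current && sig_ok(&legacy)
}
```

The fallback only removes fields that are still at their default, so a manifest that carries a non-empty `canonical_content_hash` or `l3_policy` can never verify through the legacy path with those values silently ignored.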
| @@ -14,9 +14,8 @@ use crate::models::*; | ||
| pub async fn create_pool(db_path: &Path) -> Result<SqlitePool> { | ||
| // Ensure parent directory exists. | ||
| if let Some(parent) = db_path.parent() { | ||
| - | std::fs::create_dir_all(parent).map_err(|e| { | |
| - | RegistryError::Internal(format!("cannot create db directory: {e}")) | |
| - | })?; | |
| + | std::fs::create_dir_all(parent) | |
| + | .map_err(|e| RegistryError::Internal(format!("cannot create db directory: {e}")))?; | |
| } | ||
| let db_url = format!("sqlite://{}?mode=rwc", db_path.display()); | ||
| @@ -136,16 +135,12 @@ pub async fn run_migrations(pool: &SqlitePool) -> Result<()> { | ||
| // ---- Manifest queries --------------------------------------------------- | ||
| /// Look up the issuer pubkey for an existing file_id. Returns None if not found. | ||
| - | pub async fn get_manifest_issuer_pub( | |
| - | pool: &SqlitePool, | |
| - | file_id: &str, | |
| - | ) -> Result<Option<String>> { | |
| - | let row: Option<(String,)> = sqlx::query_as( | |
| - | "SELECT issuer_ed25519_pub FROM manifests WHERE file_id = ?", | |
| - | ) | |
| - | .bind(file_id) | |
| - | .fetch_optional(pool) | |
| - | .await?; | |
| + | pub async fn get_manifest_issuer_pub(pool: &SqlitePool, file_id: &str) -> Result<Option<String>> { | |
| + | let row: Option<(String,)> = | |
| + | sqlx::query_as("SELECT issuer_ed25519_pub FROM manifests WHERE file_id = ?") | |
| + | .bind(file_id) | |
| + | .fetch_optional(pool) | |
| + | .await?; | |
| Ok(row.map(|r| r.0)) | ||
| } | ||
| @@ -287,10 +282,7 @@ pub async fn get_watermark( | ||
| } | ||
| /// Get all watermarks for a file_id. | ||
| - | pub async fn get_watermarks_by_file( | |
| - | pool: &SqlitePool, | |
| - | file_id: &str, | |
| - | ) -> Result<Vec<WatermarkRow>> { | |
| + | pub async fn get_watermarks_by_file(pool: &SqlitePool, file_id: &str) -> Result<Vec<WatermarkRow>> { | |
| let rows = sqlx::query_as::<_, WatermarkRow>( | ||
| "SELECT mark_id, layer, file_id, recipient_id, issuer_id, registered_at FROM watermarks WHERE file_id = ?", | ||
| ) | ||
| @@ -352,6 +344,17 @@ pub async fn get_recent_events( | ||
| Ok(rows) | ||
| } | ||
| + | /// Get all events for a file_id, oldest first. | |
| + | pub async fn get_events_by_file(pool: &SqlitePool, file_id: &str) -> Result<Vec<EventRow>> { | |
| + | let rows = sqlx::query_as::<_, EventRow>( | |
| + | "SELECT id, token_id, file_id, recipient_id, issuer_id, kind, source_ip, user_agent, extra, timestamp, qualified_timestamp, tlog_index FROM events WHERE file_id = ? ORDER BY timestamp ASC", | |
| + | ) | |
| + | .bind(file_id) | |
| + | .fetch_all(pool) | |
| + | .await?; | |
| + | Ok(rows) | |
| + | } | |
| + | ||
| // ---- Corpus queries ----------------------------------------------------- | ||
| /// Insert or replace a corpus hash entry. | ||
| @@ -387,3 +390,31 @@ pub async fn lookup_by_perceptual_hash( | ||
| .await?; | ||
| Ok(row) | ||
| } | ||
| + | ||
| + | /// Return recent L3 semantic watermark candidates for verifier scrapers. | |
| + | pub async fn get_semantic_candidates( | |
| + | pool: &SqlitePool, | |
| + | limit: i64, | |
| + | since: Option<i64>, | |
| + | ) -> Result<Vec<SemanticCandidateRow>> { | |
| + | let rows = match since { | |
| + | Some(since) => { | |
| + | sqlx::query_as::<_, SemanticCandidateRow>( | |
| + | "SELECT mark_id, file_id, recipient_id, registered_at FROM watermarks WHERE layer = 'L3_semantic' AND registered_at >= ? ORDER BY registered_at DESC LIMIT ?", | |
| + | ) | |
| + | .bind(since) | |
| + | .bind(limit) | |
| + | .fetch_all(pool) | |
| + | .await? | |
| + | } | |
| + | None => { | |
| + | sqlx::query_as::<_, SemanticCandidateRow>( | |
| + | "SELECT mark_id, file_id, recipient_id, registered_at FROM watermarks WHERE layer = 'L3_semantic' ORDER BY registered_at DESC LIMIT ?", | |
| + | ) | |
| + | .bind(limit) | |
| + | .fetch_all(pool) | |
| + | .await? | |
| + | } | |
| + | }; | |
| + | Ok(rows) | |
| + | } |
| @@ -16,6 +16,9 @@ pub enum RegistryError { | ||
| #[error("conflict: {0}")] | ||
| Conflict(String), | ||
| + | #[error("unauthorized: {0}")] | |
| + | Unauthorized(String), | |
| + | ||
| #[error("rate limit exceeded")] | ||
| RateLimited, | ||
| @@ -32,16 +35,14 @@ impl IntoResponse for RegistryError { | ||
| RegistryError::BadRequest(msg) => (StatusCode::BAD_REQUEST, msg.clone()), | ||
| RegistryError::NotFound(msg) => (StatusCode::NOT_FOUND, msg.clone()), | ||
| RegistryError::Conflict(msg) => (StatusCode::CONFLICT, msg.clone()), | ||
| + | RegistryError::Unauthorized(msg) => (StatusCode::UNAUTHORIZED, msg.clone()), | |
| RegistryError::RateLimited => { | ||
| (StatusCode::TOO_MANY_REQUESTS, "rate limit exceeded".into()) | ||
| } | ||
| - | RegistryError::Database(_) => ( | |
| - | StatusCode::INTERNAL_SERVER_ERROR, | |
| - | "database error".into(), | |
| - | ), | |
| - | RegistryError::Internal(msg) => { | |
| - | (StatusCode::INTERNAL_SERVER_ERROR, msg.clone()) | |
| + | RegistryError::Database(_) => { | |
| + | (StatusCode::INTERNAL_SERVER_ERROR, "database error".into()) | |
| } | ||
| + | RegistryError::Internal(msg) => (StatusCode::INTERNAL_SERVER_ERROR, msg.clone()), | |
| }; | ||
| // Log server-side errors at error level; client errors at debug. |
| @@ -26,7 +26,7 @@ use std::sync::{Arc, Mutex}; | ||
| use std::time::Instant; | ||
| use axum::extract::{ConnectInfo, State}; | ||
| - | use axum::http::{HeaderMap, Request, StatusCode}; | |
| + | use axum::http::{header, HeaderMap, HeaderValue, Method, Request, StatusCode}; | |
| use axum::middleware::{self, Next}; | ||
| use axum::response::Response; | ||
| use axum::routing::{get, post}; | ||
| @@ -34,7 +34,7 @@ use axum::Router; | ||
| use clap::Parser; | ||
| use oversight_tlog::TransparencyLog; | ||
| use sqlx::SqlitePool; | ||
| - | use tower_http::cors::CorsLayer; | |
| + | use tower_http::cors::{AllowOrigin, CorsLayer}; | |
| use tower_http::trace::TraceLayer; | ||
| pub const VERSION: &str = "1.0.0"; | ||
| @@ -250,6 +250,41 @@ fn load_or_create_identity(data_dir: &PathBuf) -> Option<RegistryIdentity> { | ||
| }) | ||
| } | ||
| + | fn allowed_cors_origins() -> Vec<HeaderValue> { | |
| + | let mut origins = vec![ | |
| + | "https://oversight-protocol.github.io".to_string(), | |
| + | "https://oversightprotocol.dev".to_string(), | |
| + | "https://www.oversightprotocol.dev".to_string(), | |
| + | "http://localhost:8000".to_string(), | |
| + | "http://127.0.0.1:8000".to_string(), | |
| + | "http://localhost:8787".to_string(), | |
| + | "http://127.0.0.1:8787".to_string(), | |
| + | ]; | |
| + | origins.extend( | |
| + | std::env::var("OVERSIGHT_CORS_ORIGINS") | |
| + | .unwrap_or_default() | |
| + | .split(',') | |
| + | .map(str::trim) | |
| + | .filter(|origin| !origin.is_empty()) | |
| + | .map(str::to_string), | |
| + | ); | |
| + | origins | |
| + | .into_iter() | |
| + | .filter_map(|origin| HeaderValue::from_str(&origin).ok()) | |
| + | .collect() | |
| + | } | |
| + | ||
| + | fn cors_layer() -> CorsLayer { | |
| + | let allowed = allowed_cors_origins(); | |
| + | CorsLayer::new() | |
| + | .allow_origin(AllowOrigin::predicate(move |origin, _| { | |
| + | allowed.iter().any(|candidate| candidate == origin) | |
| + | })) | |
| + | .allow_methods([Method::GET, Method::OPTIONS]) | |
| + | .allow_headers([header::ACCEPT, header::CONTENT_TYPE]) | |
| + | .max_age(std::time::Duration::from_secs(3600)) | |
| + | } | |
| + | ||
| // ---- Rate-limit middleware ---------------------------------------------- | ||
| async fn rate_limit_middleware( | ||
| @@ -293,8 +328,7 @@ async fn main() -> anyhow::Result<()> { | ||
| .or_else(|| args.db.clone()) | ||
| .unwrap_or_else(|| { | ||
| if cfg!(windows) { | ||
| - | std::env::var("TEMP") | |
| - | .unwrap_or_else(|_| "C:\\Temp".to_string()) | |
| + | std::env::var("TEMP").unwrap_or_else(|_| "C:\\Temp".to_string()) | |
| + "\\oversight-registry.sqlite" | ||
| } else { | ||
| "/tmp/oversight-registry.sqlite".to_string() | ||
| @@ -308,8 +342,7 @@ async fn main() -> anyhow::Result<()> { | ||
| .or_else(|| args.data_dir.clone()) | ||
| .unwrap_or_else(|| { | ||
| if cfg!(windows) { | ||
| - | std::env::var("TEMP") | |
| - | .unwrap_or_else(|_| "C:\\Temp".to_string()) | |
| + | std::env::var("TEMP").unwrap_or_else(|_| "C:\\Temp".to_string()) | |
| + "\\oversight-data" | ||
| } else { | ||
| "/tmp/oversight-data".to_string() | ||
| @@ -317,10 +350,7 @@ async fn main() -> anyhow::Result<()> { | ||
| }), | ||
| ); | ||
| - | let trusted_proxy = std::env::var("TRUSTED_PROXY") | |
| - | .unwrap_or_default() | |
| - | .trim() | |
| - | == "1"; | |
| + | let trusted_proxy = std::env::var("TRUSTED_PROXY").unwrap_or_default().trim() == "1"; | |
| let rekor_enabled = std::env::var("OVERSIGHT_REKOR_ENABLED") | ||
| .unwrap_or_default() | ||
| @@ -373,15 +403,38 @@ async fn main() -> anyhow::Result<()> { | ||
| // Build router. | ||
| let app = Router::new() | ||
| .route("/health", get(routes::health::health)) | ||
| + | .route( | |
| + | "/.well-known/oversight-registry", | |
| + | get(routes::well_known::well_known), | |
| + | ) | |
| .route("/register", post(routes::register::register)) | ||
| .route("/attribute", post(routes::attribute::attribute)) | ||
| .route("/query/:file_id", get(routes::query::query_file)) | ||
| + | .route("/evidence/:file_id", get(routes::evidence::evidence_bundle)) | |
| + | .route("/tlog/head", get(routes::tlog::tlog_head)) | |
| + | .route("/tlog/proof/:index", get(routes::tlog::tlog_proof)) | |
| + | .route("/tlog/range", get(routes::tlog::tlog_range)) | |
| + | .route("/p/:token_id", get(routes::beacon::beacon_png)) | |
| + | .route( | |
| + | "/r/:token_id", | |
| + | get(routes::beacon::beacon_ocsp).post(routes::beacon::beacon_ocsp), | |
| + | ) | |
| + | .route( | |
| + | "/ocsp/r/:token_id", | |
| + | get(routes::beacon::beacon_ocsp).post(routes::beacon::beacon_ocsp), | |
| + | ) | |
| + | .route("/v/:token_id", get(routes::beacon::beacon_license)) | |
| + | .route("/lic/v/:token_id", get(routes::beacon::beacon_license)) | |
| + | .route( | |
| + | "/candidates/semantic", | |
| + | get(routes::semantic::candidates_semantic), | |
| + | ) | |
| .route("/dns_event", post(routes::dns_event::dns_event)) | ||
| .layer(middleware::from_fn_with_state( | ||
| state.clone(), | ||
| rate_limit_middleware, | ||
| )) | ||
| - | .layer(CorsLayer::permissive()) | |
| + | .layer(cors_layer()) | |
| .layer(TraceLayer::new_for_http()) | ||
| .with_state(state); | ||
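The allowlist that replaces `CorsLayer::permissive()` can be modelled outside the server to check what a given `OVERSIGHT_CORS_ORIGINS` value would admit. The following is a minimal Python sketch of that behaviour, not the registry's code: `DEFAULT_ORIGINS` is an abbreviated copy of the built-in list, and `parse_extra_origins`, `allowed_origins`, and `origin_allowed` are hypothetical names. The Rust version additionally drops any entry that `HeaderValue::from_str` rejects, which this sketch does not model.

```python
import os
from typing import List, Optional

# Abbreviated copy of the built-in allowlist from the diff (illustration only).
DEFAULT_ORIGINS = [
    "https://oversight-protocol.github.io",
    "https://oversightprotocol.dev",
    "http://localhost:8000",
]


def parse_extra_origins(raw: str) -> List[str]:
    """Split a comma-separated list, trim whitespace, and drop empty entries."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def allowed_origins(env_value: Optional[str] = None) -> List[str]:
    """Built-in origins followed by any extras from OVERSIGHT_CORS_ORIGINS."""
    if env_value is None:
        env_value = os.environ.get("OVERSIGHT_CORS_ORIGINS", "")
    return DEFAULT_ORIGINS + parse_extra_origins(env_value)


def origin_allowed(origin: str, allowed: List[str]) -> bool:
    """Exact string match, like the predicate: no wildcard or suffix matching."""
    return origin in allowed
```

Because the match is exact, an origin such as `https://a.example.evil.test` is not admitted by an allowlist entry of `https://a.example`, and scheme and port have to match as well.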
| @@ -151,3 +151,11 @@ pub struct EventRow { | ||
| pub qualified_timestamp: Option<String>, | ||
| pub tlog_index: Option<i64>, | ||
| } | ||
| + | ||
| + | #[derive(Debug, Clone, Serialize, Deserialize, sqlx::FromRow)] | |
| + | pub struct SemanticCandidateRow { | |
| + | pub mark_id: String, | |
| + | pub file_id: String, | |
| + | pub recipient_id: String, | |
| + | pub registered_at: i64, | |
| + | } |
| @@ -0,0 +1,118 @@ | ||
| + | //! Beacon callback endpoints: HTTP image, OCSP-style, and license checks. | |
| + | ||
| + | use axum::extract::{ConnectInfo, Path, State}; | |
| + | use axum::http::{header, HeaderMap, StatusCode}; | |
| + | use axum::response::{IntoResponse, Response}; | |
| + | use axum::Json; | |
| + | use std::net::SocketAddr; | |
| + | use std::sync::Arc; | |
| + | use std::time::{SystemTime, UNIX_EPOCH}; | |
| + | ||
| + | use crate::db; | |
| + | use crate::error::{RegistryError, Result}; | |
| + | use crate::models::MAX_ID_LEN; | |
| + | use crate::AppState; | |
| + | ||
| + | const ONE_PX_PNG: &[u8] = &[ | |
| + | 0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a, 0x00, 0x00, 0x00, 0x0d, 0x49, 0x48, 0x44, 0x52, | |
| + | 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x01, 0x08, 0x06, 0x00, 0x00, 0x00, 0x1f, 0x15, 0xc4, | |
| + | 0x89, 0x00, 0x00, 0x00, 0x0d, 0x49, 0x44, 0x41, 0x54, 0x78, 0x9c, 0x62, 0x60, 0x00, 0x00, 0x00, | |
| + | 0x00, 0x05, 0x00, 0x01, 0xa5, 0xf6, 0x45, 0x40, 0x00, 0x00, 0x00, 0x00, 0x49, 0x45, 0x4e, 0x44, | |
| + | 0xae, 0x42, 0x60, 0x82, | |
| + | ]; | |
| + | ||
| + | pub async fn beacon_png( | |
| + | State(state): State<Arc<AppState>>, | |
| + | ConnectInfo(addr): ConnectInfo<SocketAddr>, | |
| + | headers: HeaderMap, | |
| + | Path(token_id): Path<String>, | |
| + | ) -> Result<Response> { | |
| + | let token_id = token_id | |
| + | .strip_suffix(".png") | |
| + | .unwrap_or(token_id.as_str()) | |
| + | .to_string(); | |
| + | record_event(&state, &addr, &headers, &token_id, "http_img").await?; | |
| + | Ok(([(header::CONTENT_TYPE, "image/png")], ONE_PX_PNG).into_response()) | |
| + | } | |
| + | ||
| + | pub async fn beacon_ocsp( | |
| + | State(state): State<Arc<AppState>>, | |
| + | ConnectInfo(addr): ConnectInfo<SocketAddr>, | |
| + | headers: HeaderMap, | |
| + | Path(token_id): Path<String>, | |
| + | ) -> Result<StatusCode> { | |
| + | record_event(&state, &addr, &headers, &token_id, "ocsp").await?; | |
| + | Ok(StatusCode::OK) | |
| + | } | |
| + | ||
| + | pub async fn beacon_license( | |
| + | State(state): State<Arc<AppState>>, | |
| + | ConnectInfo(addr): ConnectInfo<SocketAddr>, | |
| + | headers: HeaderMap, | |
| + | Path(token_id): Path<String>, | |
| + | ) -> Result<Json<serde_json::Value>> { | |
| + | record_event(&state, &addr, &headers, &token_id, "license").await?; | |
| + | Ok(Json(serde_json::json!({"valid": true}))) | |
| + | } | |
| + | ||
| + | async fn record_event( | |
| + | state: &AppState, | |
| + | addr: &SocketAddr, | |
| + | headers: &HeaderMap, | |
| + | token_id: &str, | |
| + | kind: &str, | |
| + | ) -> Result<i64> { | |
| + | if token_id.is_empty() || token_id.len() > MAX_ID_LEN { | |
| + | return Err(RegistryError::BadRequest("invalid token_id".into())); | |
| + | } | |
| + | ||
| + | let beacon = db::get_beacon(&state.db, token_id).await?; | |
| + | let file_id = beacon.as_ref().map(|b| b.file_id.as_str()); | |
| + | let recipient_id = beacon.as_ref().map(|b| b.recipient_id.as_str()); | |
| + | let issuer_id = beacon.as_ref().map(|b| b.issuer_id.as_str()); | |
| + | let source_ip = addr.ip().to_string(); | |
| + | let user_agent = headers | |
| + | .get(header::USER_AGENT) | |
| + | .and_then(|v| v.to_str().ok()) | |
| + | .unwrap_or(""); | |
| + | let timestamp_str = crate::timestamp_stub(); | |
| + | ||
| + | let tlog_event = serde_json::json!({ | |
| + | "event": "beacon", | |
| + | "kind": kind, | |
| + | "token_id": token_id, | |
| + | "file_id": file_id, | |
| + | "recipient_id": recipient_id, | |
| + | "source_ip": source_ip, | |
| + | "user_agent": user_agent, | |
| + | "timestamp": timestamp_str, | |
| + | }); | |
| + | let tlog_idx = state | |
| + | .tlog | |
| + | .append_event(&tlog_event) | |
| + | .map(|idx| idx as i64) | |
| + | .unwrap_or(-1); | |
| + | ||
| + | let now = SystemTime::now() | |
| + | .duration_since(UNIX_EPOCH) | |
| + | .unwrap_or_default() | |
| + | .as_secs() as i64; | |
| + | ||
| + | db::insert_event( | |
| + | &state.db, | |
| + | token_id, | |
| + | file_id, | |
| + | recipient_id, | |
| + | issuer_id, | |
| + | kind, | |
| + | Some(&source_ip), | |
| + | Some(user_agent), | |
| + | Some("{}"), | |
| + | now, | |
| + | Some(&timestamp_str), | |
| + | Some(tlog_idx), | |
| + | ) | |
| + | .await?; | |
| + | ||
| + | Ok(tlog_idx) | |
| + | } |
| @@ -24,7 +24,10 @@ pub async fn dns_event( | ||
| if evt.token_id.is_empty() || evt.token_id.len() > MAX_ID_LEN { | ||
| return Err(RegistryError::BadRequest("invalid token_id".into())); | ||
| } | ||
| - | if evt.client_ip.as_deref().is_some_and(|v| v.len() > MAX_ID_LEN) | |
| + | if evt | |
| + | .client_ip | |
| + | .as_deref() | |
| + | .is_some_and(|v| v.len() > MAX_ID_LEN) | |
| || evt.qtype.as_deref().is_some_and(|v| v.len() > MAX_ID_LEN) | ||
| || evt.qname.as_deref().is_some_and(|v| v.len() > MAX_ID_LEN) | ||
| { | ||
| @@ -97,11 +100,7 @@ pub async fn dns_event( | ||
| })) | ||
| } | ||
| - | fn verify_dns_event_auth( | |
| - | state: &AppState, | |
| - | headers: &HeaderMap, | |
| - | addr: &SocketAddr, | |
| - | ) -> Result<()> { | |
| + | fn verify_dns_event_auth(state: &AppState, headers: &HeaderMap, addr: &SocketAddr) -> Result<()> { | |
| if let Some(secret) = state.dns_event_secret.as_deref() { | ||
| let supplied = headers | ||
| .get("x-oversight-dns-secret") | ||
| @@ -110,7 +109,7 @@ fn verify_dns_event_auth( | ||
| if constant_time_eq(supplied.as_bytes(), secret.as_bytes()) { | ||
| return Ok(()); | ||
| } | ||
| - | return Err(RegistryError::BadRequest( | |
| + | return Err(RegistryError::Unauthorized( | |
| "invalid dns event authentication".into(), | ||
| )); | ||
| } |
| @@ -0,0 +1,237 @@ | ||
| + | //! GET /evidence/{file_id} - signed provenance bundle for a registered file. | |
| + | ||
| + | use axum::extract::{Path, State}; | |
| + | use axum::Json; | |
| + | use ed25519_dalek::{Signer, SigningKey}; | |
| + | use std::sync::Arc; | |
| + | ||
| + | use crate::db; | |
| + | use crate::error::{RegistryError, Result}; | |
| + | use crate::models::{EventRow, MAX_ID_LEN}; | |
| + | use crate::AppState; | |
| + | ||
| + | pub async fn evidence_bundle( | |
| + | State(state): State<Arc<AppState>>, | |
| + | Path(file_id): Path<String>, | |
| + | ) -> Result<Json<serde_json::Value>> { | |
| + | if file_id.len() > MAX_ID_LEN { | |
| + | return Err(RegistryError::BadRequest("file_id too long".into())); | |
| + | } | |
| + | ||
| + | let manifest_row = db::get_manifest(&state.db, &file_id) | |
| + | .await? | |
| + | .ok_or_else(|| RegistryError::NotFound("unknown file_id".into()))?; | |
| + | let manifest: serde_json::Value = serde_json::from_str(&manifest_row.manifest_json) | |
| + | .map_err(|e| RegistryError::Internal(format!("stored manifest is invalid JSON: {e}")))?; | |
| + | let beacons = db::get_beacons_by_file(&state.db, &file_id).await?; | |
| + | let watermarks = db::get_watermarks_by_file(&state.db, &file_id).await?; | |
| + | let events = db::get_events_by_file(&state.db, &file_id).await?; | |
| + | let event_values: Vec<serde_json::Value> = events | |
| + | .iter() | |
| + | .map(|event| serde_json::to_value(event).unwrap_or_else(|_| serde_json::json!({}))) | |
| + | .collect(); | |
| + | ||
| + | let identity = state | |
| + | .identity | |
| + | .as_ref() | |
| + | .ok_or_else(|| RegistryError::Internal("registry identity not initialized".into()))?; | |
| + | ||
| + | let mut bundle = serde_json::json!({ | |
| + | "file_id": file_id, | |
| + | "bundle_generated_at": crate::timestamp_stub(), | |
| + | "registry_pub": identity.ed25519_pub, | |
| + | "manifest": manifest, | |
| + | "beacons": beacons, | |
| + | "watermarks": watermarks, | |
| + | "events": event_values, | |
| + | "tlog_head": state.tlog.signed_head(), | |
| + | "tlog_proofs": tlog_proofs_for_events(&state, &events), | |
| + | "disclaimer": "This bundle is a provenance record, not a legal finding. For court use, supplement with RFC 3161 qualified timestamps and ISO/IEC 27037 chain-of-custody.", | |
| + | }); | |
| + | ||
| + | let signature = sign_bundle(identity.ed25519_priv.as_str(), &bundle)?; | |
| + | bundle["bundle_signature_ed25519"] = serde_json::Value::String(signature); | |
| + | Ok(Json(bundle)) | |
| + | } | |
| + | ||
| + | fn tlog_proofs_for_events(state: &AppState, events: &[EventRow]) -> Vec<serde_json::Value> { | |
| + | events | |
| + | .iter() | |
| + | .enumerate() | |
| + | .filter_map(|(event_row, event)| { | |
| + | let idx = event.tlog_index?; | |
| + | if idx < 0 { | |
| + | return None; | |
| + | } | |
| + | let proof = state.tlog.inclusion_proof(idx as usize)?; | |
| + | Some(serde_json::json!({ | |
| + | "event_row": event_row, | |
| + | "tlog_index": idx, | |
| + | "proof": proof, | |
| + | })) | |
| + | }) | |
| + | .collect() | |
| + | } | |
| + | ||
| + | fn sign_bundle(priv_hex: &str, bundle: &serde_json::Value) -> Result<String> { | |
| + | let priv_bytes = hex::decode(priv_hex).map_err(|e| { | |
| + | RegistryError::Internal(format!("registry identity private key is invalid hex: {e}")) | |
| + | })?; | |
| + | if priv_bytes.len() != 32 { | |
| + | return Err(RegistryError::Internal(format!( | |
| + | "registry identity private key must be 32 bytes, got {}", | |
| + | priv_bytes.len() | |
| + | ))); | |
| + | } | |
| + | let mut arr = [0u8; 32]; | |
| + | arr.copy_from_slice(&priv_bytes); | |
| + | let signing_key = SigningKey::from_bytes(&arr); | |
| + | let msg = serde_jcs::to_vec(bundle) | |
| + | .map_err(|_| RegistryError::Internal("could not canonicalize evidence bundle".into()))?; | |
| + | Ok(hex::encode(signing_key.sign(&msg).to_bytes())) | |
| + | } | |
| + | ||
| + | #[cfg(test)] | |
| + | mod tests { | |
| + | use super::*; | |
| + | use crate::db; | |
| + | use crate::{AppState, RateLimiter, RegistryIdentity}; | |
| + | use oversight_tlog::TransparencyLog; | |
| + | use sqlx::SqlitePool; | |
| + | use std::path::PathBuf; | |
| + | use std::sync::Arc; | |
| + | ||
| + | fn temp_path(label: &str) -> PathBuf { | |
| + | let unique = format!( | |
| + | "oversight-registry-{label}-{}", | |
| + | std::time::SystemTime::now() | |
| + | .duration_since(std::time::UNIX_EPOCH) | |
| + | .unwrap() | |
| + | .as_nanos() | |
| + | ); | |
| + | std::env::temp_dir().join(unique) | |
| + | } | |
| + | ||
| + | async fn test_state() -> (Arc<AppState>, PathBuf) { | |
| + | let dir = temp_path("evidence"); | |
| + | std::fs::create_dir_all(&dir).unwrap(); | |
| + | let pool = db::create_pool(&dir.join("registry.sqlite")).await.unwrap(); | |
| + | db::run_migrations(&pool).await.unwrap(); | |
| + | ||
| + | let priv_hex = "11".repeat(32); | |
| + | let tlog = TransparencyLog::open_with_signer(dir.join("tlog"), Some(&priv_hex)).unwrap(); | |
| + | let pub_hex = { | |
| + | let mut bytes = [0u8; 32]; | |
| + | bytes.copy_from_slice(&hex::decode(&priv_hex).unwrap()); | |
| + | hex::encode(SigningKey::from_bytes(&bytes).verifying_key().to_bytes()) | |
| + | }; | |
| + | let state = AppState { | |
| + | db: pool, | |
| + | tlog, | |
| + | identity: Some(RegistryIdentity { | |
| + | ed25519_priv: priv_hex, | |
| + | ed25519_pub: pub_hex, | |
| + | }), | |
| + | rate_limiter: RateLimiter::new(10.0, 30.0, 100), | |
| + | trusted_proxy: false, | |
| + | dns_event_secret: None, | |
| + | rekor_enabled: false, | |
| + | rekor_url: String::new(), | |
| + | }; | |
| + | (Arc::new(state), dir) | |
| + | } | |
| + | ||
| + | async fn seed_file(pool: &SqlitePool, state: &AppState) { | |
| + | let manifest = serde_json::json!({ | |
| + | "file_id": "file-1", | |
| + | "issuer_id": "issuer-1", | |
| + | "issuer_ed25519_pub": "ab".repeat(32), | |
| + | "recipient": {"recipient_id": "recipient-1"}, | |
| + | }); | |
| + | db::upsert_manifest( | |
| + | pool, | |
| + | "file-1", | |
| + | "recipient-1", | |
| + | "issuer-1", | |
| + | &"ab".repeat(32), | |
| + | &serde_json::to_string(&manifest).unwrap(), | |
| + | 10, | |
| + | ) | |
| + | .await | |
| + | .unwrap(); | |
| + | db::upsert_beacon( | |
| + | pool, | |
| + | "token-1", | |
| + | "file-1", | |
| + | "recipient-1", | |
| + | "issuer-1", | |
| + | "dns", | |
| + | 10, | |
| + | ) | |
| + | .await | |
| + | .unwrap(); | |
| + | db::upsert_watermark( | |
| + | pool, | |
| + | "mark-1", | |
| + | "L1_zero_width", | |
| + | "file-1", | |
| + | "recipient-1", | |
| + | "issuer-1", | |
| + | 10, | |
| + | ) | |
| + | .await | |
| + | .unwrap(); | |
| + | ||
| + | let event = serde_json::json!({"event": "beacon", "token_id": "token-1"}); | |
| + | let tlog_index = state.tlog.append_event(&event).unwrap() as i64; | |
| + | db::insert_event( | |
| + | pool, | |
| + | "token-1", | |
| + | Some("file-1"), | |
| + | Some("recipient-1"), | |
| + | Some("issuer-1"), | |
| + | "dns", | |
| + | Some("127.0.0.1"), | |
| + | Some("test"), | |
| + | Some("{}"), | |
| + | 10, | |
| + | Some("2026-05-03T00:00:00Z"), | |
| + | Some(tlog_index), | |
| + | ) | |
| + | .await | |
| + | .unwrap(); | |
| + | } | |
| + | ||
| + | #[tokio::test] | |
| + | async fn evidence_bundle_contains_signed_tlog_proof() { | |
| + | let (state, dir) = test_state().await; | |
| + | seed_file(&state.db, &state).await; | |
| + | ||
| + | let Json(body) = evidence_bundle(State(state), Path("file-1".into())) | |
| + | .await | |
| + | .unwrap(); | |
| + | assert_eq!(body["file_id"], "file-1"); | |
| + | assert!(body["manifest"].is_object()); | |
| + | assert_eq!(body["beacons"].as_array().unwrap().len(), 1); | |
| + | assert_eq!(body["watermarks"].as_array().unwrap().len(), 1); | |
| + | assert_eq!(body["events"].as_array().unwrap().len(), 1); | |
| + | assert_eq!(body["tlog_proofs"].as_array().unwrap().len(), 1); | |
| + | assert_eq!(body["tlog_proofs"][0]["tlog_index"], 0); | |
| + | assert_eq!( | |
| + | body["bundle_signature_ed25519"].as_str().unwrap().len(), | |
| + | 128 | |
| + | ); | |
| + | ||
| + | let _ = std::fs::remove_dir_all(dir); | |
| + | } | |
| + | ||
| + | #[tokio::test] | |
| + | async fn evidence_bundle_returns_404_for_unknown_file() { | |
| + | let (state, dir) = test_state().await; | |
| + | let err = evidence_bundle(State(state), Path("missing".into())) | |
| + | .await | |
| + | .unwrap_err(); | |
| + | assert!(matches!(err, RegistryError::NotFound(_))); | |
| + | let _ = std::fs::remove_dir_all(dir); | |
| + | } | |
| + | } |
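In `evidence_bundle` the signature is computed over the bundle before `bundle_signature_ed25519` is inserted, so a verifier has to remove that field before canonicalizing. The following Python sketch shows that split; `split_signed_bundle` is a hypothetical name, and the `json.dumps` call is only an approximation of the RFC 8785 (JCS) output that `serde_jcs` produces. It is expected to agree for ASCII keys, strings, and integers, but may differ for floats and for some non-ASCII keys.

```python
import json
from typing import Tuple

SIGNATURE_FIELD = "bundle_signature_ed25519"  # field name taken from the diff


def split_signed_bundle(bundle: dict) -> Tuple[bytes, bytes]:
    """Return (payload_bytes, signature_bytes) for an evidence bundle."""
    # The signature covers every field except the signature itself.
    unsigned = {k: v for k, v in bundle.items() if k != SIGNATURE_FIELD}
    # Approximate JCS: sorted keys, no insignificant whitespace, raw UTF-8.
    payload = json.dumps(
        unsigned, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    ).encode("utf-8")
    # The handler hex-encodes the 64-byte Ed25519 signature (128 hex characters).
    signature = bytes.fromhex(bundle[SIGNATURE_FIELD])
    return payload, signature
```

The two values would then be passed to an Ed25519 verifier, for example `Ed25519PublicKey.from_public_bytes(...).verify(signature, payload)` from the `cryptography` package. The `registry_pub` field inside the bundle is self-asserted, so a verifier would presumably want to compare it against a key obtained independently, such as the one advertised at `/.well-known/oversight-registry`.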
| @@ -1,7 +1,12 @@ | ||
| //! Route modules for the Oversight registry. | ||
| pub mod attribute; | ||
| + | pub mod beacon; | |
| pub mod dns_event; | ||
| + | pub mod evidence; | |
| pub mod health; | ||
| pub mod query; | ||
| pub mod register; | ||
| + | pub mod semantic; | |
| + | pub mod tlog; | |
| + | pub mod well_known; |
| @@ -124,13 +124,25 @@ pub async fn register( | ||
| .and_then(|v| v.as_str()) | ||
| .unwrap_or("unknown"); | ||
| if token_id.is_empty() || token_id.len() > MAX_ID_LEN { | ||
| - | return Err(RegistryError::BadRequest("signed beacon has invalid token_id".into())); | |
| + | return Err(RegistryError::BadRequest( | |
| + | "signed beacon has invalid token_id".into(), | |
| + | )); | |
| } | ||
| if kind.is_empty() || kind.len() > MAX_ID_LEN { | ||
| - | return Err(RegistryError::BadRequest("signed beacon has invalid kind".into())); | |
| + | return Err(RegistryError::BadRequest( | |
| + | "signed beacon has invalid kind".into(), | |
| + | )); | |
| } | ||
| - | db::upsert_beacon(&state.db, token_id, file_id, recipient_id, issuer_id, kind, now) | |
| - | .await?; | |
| + | db::upsert_beacon( | |
| + | &state.db, | |
| + | token_id, | |
| + | file_id, | |
| + | recipient_id, | |
| + | issuer_id, | |
| + | kind, | |
| + | now, | |
| + | ) | |
| + | .await?; | |
| } | ||
| for watermark in &signed_watermarks { | ||
| @@ -143,13 +155,25 @@ pub async fn register( | ||
| .and_then(|v| v.as_str()) | ||
| .unwrap_or("unknown"); | ||
| if mark_id.is_empty() || mark_id.len() > MAX_ID_LEN { | ||
| - | return Err(RegistryError::BadRequest("signed watermark has invalid mark_id".into())); | |
| + | return Err(RegistryError::BadRequest( | |
| + | "signed watermark has invalid mark_id".into(), | |
| + | )); | |
| } | ||
| if layer.is_empty() || layer.len() > MAX_ID_LEN { | ||
| - | return Err(RegistryError::BadRequest("signed watermark has invalid layer".into())); | |
| + | return Err(RegistryError::BadRequest( | |
| + | "signed watermark has invalid layer".into(), | |
| + | )); | |
| } | ||
| - | db::upsert_watermark(&state.db, mark_id, layer, file_id, recipient_id, issuer_id, now) | |
| - | .await?; | |
| + | db::upsert_watermark( | |
| + | &state.db, | |
| + | mark_id, | |
| + | layer, | |
| + | file_id, | |
| + | recipient_id, | |
| + | issuer_id, | |
| + | now, | |
| + | ) | |
| + | .await?; | |
| } | ||
| // Corpus hashes (optional) | ||
| @@ -189,7 +213,14 @@ pub async fn register( | ||
| // ---- Optional Rekor attestation ---- | ||
| let rekor_result = if state.rekor_enabled { | ||
| - | attest_to_rekor(&state, file_id, &issuer_pub, recipient_id, manifest, &signed_watermarks) | |
| + | attest_to_rekor( | |
| + | &state, | |
| + | file_id, | |
| + | &issuer_pub, | |
| + | recipient_id, | |
| + | manifest, | |
| + | &signed_watermarks, | |
| + | ) | |
| } else { | ||
| None | ||
| }; | ||
| @@ -244,7 +275,8 @@ fn attest_to_rekor( | ||
| let Some(mark_id_hex) = signed_watermarks | ||
| .iter() | ||
| - | .find_map(|w| w.get("mark_id").and_then(|v| v.as_str())) else { | |
| + | .find_map(|w| w.get("mark_id").and_then(|v| v.as_str())) | |
| + | else { | |
| return Some(serde_json::json!({ | ||
| "skipped": "no signed watermark mark_id to attest", | ||
| "tlog_kind": oversight_rekor::TLOG_KIND, | ||
| @@ -254,10 +286,7 @@ fn attest_to_rekor( | ||
| let mut wm_map = std::collections::BTreeMap::new(); | ||
| for (i, w) in signed_watermarks.iter().enumerate() { | ||
| let fallback = format!("layer_{i}"); | ||
| - | let layer = w | |
| - | .get("layer") | |
| - | .and_then(|v| v.as_str()) | |
| - | .unwrap_or(&fallback); | |
| + | let layer = w.get("layer").and_then(|v| v.as_str()).unwrap_or(&fallback); | |
| if let Some(mid) = w.get("mark_id").and_then(|v| v.as_str()) { | ||
| wm_map.insert( | ||
| layer.to_string(), |
| @@ -0,0 +1,34 @@ | ||
| + | //! GET /candidates/semantic - recent L3 semantic mark IDs for scrapers. | |
| + | ||
| + | use axum::extract::{Query, State}; | |
| + | use axum::Json; | |
| + | use serde::Deserialize; | |
| + | use std::sync::Arc; | |
| + | ||
| + | use crate::db; | |
| + | use crate::error::Result; | |
| + | use crate::AppState; | |
| + | ||
| + | #[derive(Debug, Deserialize)] | |
| + | pub struct SemanticParams { | |
| + | #[serde(default = "default_limit")] | |
| + | limit: i64, | |
| + | since: Option<i64>, | |
| + | } | |
| + | ||
| + | fn default_limit() -> i64 { | |
| + | 1000 | |
| + | } | |
| + | ||
| + | pub async fn candidates_semantic( | |
| + | State(state): State<Arc<AppState>>, | |
| + | Query(params): Query<SemanticParams>, | |
| + | ) -> Result<Json<serde_json::Value>> { | |
| + | let limit = params.limit.clamp(1, 10_000); | |
| + | let candidates = db::get_semantic_candidates(&state.db, limit, params.since).await?; | |
| + | Ok(Json(serde_json::json!({ | |
| + | "generated_at": crate::timestamp_stub(), | |
| + | "count": candidates.len(), | |
| + | "candidates": candidates, | |
| + | }))) | |
| + | } |
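A client of `/candidates/semantic` can build its query so that it stays within the bounds the handler enforces. This is a minimal Python sketch under stated assumptions: `semantic_candidates_url` and the `registry.example` host used with it are hypothetical, and `since` is assumed to be a Unix timestamp in seconds compared against `registered_at`, which the diff does not show.

```python
from typing import Optional
from urllib.parse import urlencode


def semantic_candidates_url(
    base_url: str, limit: int = 1000, since: Optional[int] = None
) -> str:
    """Build a /candidates/semantic URL, clamping limit to 1..10000 like the handler."""
    params = {"limit": max(1, min(limit, 10_000))}
    if since is not None:
        # Assumption: Unix seconds; the comparison itself lives in the db layer.
        params["since"] = since
    return base_url.rstrip("/") + "/candidates/semantic?" + urlencode(params)
```

Clamping on the client is optional, since the handler clamps again, but it keeps the requested page size and the returned `count` easy to reconcile.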
| @@ -0,0 +1,84 @@ | ||
| + | //! Transparency-log read endpoints for federated verifiers. | |
| + | ||
| + | use axum::extract::{Path, Query, State}; | |
| + | use axum::Json; | |
| + | use serde::Deserialize; | |
| + | use std::io::{BufRead, BufReader}; | |
| + | use std::sync::Arc; | |
| + | ||
| + | use crate::error::{RegistryError, Result}; | |
| + | use crate::AppState; | |
| + | ||
| + | #[derive(Debug, Deserialize)] | |
| + | pub struct RangeParams { | |
| + | #[serde(default)] | |
| + | start: usize, | |
| + | #[serde(default = "default_limit")] | |
| + | limit: usize, | |
| + | } | |
| + | ||
| + | fn default_limit() -> usize { | |
| + | 500 | |
| + | } | |
| + | ||
| + | pub async fn tlog_head(State(state): State<Arc<AppState>>) -> Result<Json<serde_json::Value>> { | |
| + | Ok(Json( | |
| + | serde_json::to_value(state.tlog.signed_head()) | |
| + | .map_err(|e| RegistryError::Internal(format!("could not serialize tlog head: {e}")))?, | |
| + | )) | |
| + | } | |
| + | ||
| + | pub async fn tlog_proof( | |
| + | State(state): State<Arc<AppState>>, | |
| + | Path(index): Path<usize>, | |
| + | ) -> Result<Json<serde_json::Value>> { | |
| + | let proof = state | |
| + | .tlog | |
| + | .inclusion_proof(index) | |
| + | .ok_or_else(|| RegistryError::NotFound("index out of range".into()))?; | |
| + | Ok(Json(serde_json::to_value(proof).map_err(|e| { | |
| + | RegistryError::Internal(format!("could not serialize tlog proof: {e}")) | |
| + | })?)) | |
| + | } | |
| + | ||
| + | pub async fn tlog_range( | |
| + | State(state): State<Arc<AppState>>, | |
| + | Query(params): Query<RangeParams>, | |
| + | ) -> Result<Json<serde_json::Value>> { | |
| + | let limit = params.limit.clamp(1, 1000); | |
| + | let leaves_path = state.tlog.data_dir().join("leaves.jsonl"); | |
| + | if !leaves_path.exists() { | |
| + | return Ok(Json(serde_json::json!({ | |
| + | "start": params.start, | |
| + | "count": 0, | |
| + | "entries": [], | |
| + | }))); | |
| + | } | |
| + | ||
| + | let file = std::fs::File::open(&leaves_path) | |
| + | .map_err(|e| RegistryError::Internal(format!("could not open tlog leaves: {e}")))?; | |
| + | let reader = BufReader::new(file); | |
| + | let mut entries = Vec::new(); | |
| + | for (idx, line) in reader.lines().enumerate() { | |
| + | if idx < params.start { | |
| + | continue; | |
| + | } | |
| + | if entries.len() >= limit { | |
| + | break; | |
| + | } | |
| + | let line = | |
| + | line.map_err(|e| RegistryError::Internal(format!("could not read tlog leaf: {e}")))?; | |
| + | if line.trim().is_empty() { | |
| + | continue; | |
| + | } | |
| + | if let Ok(value) = serde_json::from_str::<serde_json::Value>(&line) { | |
| + | entries.push(value); | |
| + | } | |
| + | } | |
| + | ||
| + | Ok(Json(serde_json::json!({ | |
| + | "start": params.start, | |
| + | "count": entries.len(), | |
| + | "entries": entries, | |
| + | }))) | |
| + | } |
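The windowing in `tlog_range` has one subtlety for paginating clients: blank or unparseable lines consume a line index but produce no entry. The following Python sketch models the loop over an in-memory list of lines instead of `leaves.jsonl`; the function is an illustration of the handler's logic as read from the diff, not a client API.

```python
import json
from typing import Iterable, List


def tlog_range(lines: Iterable[str], start: int = 0, limit: int = 500) -> dict:
    """Model of the handler's window over leaves.jsonl, one JSON leaf per line."""
    limit = max(1, min(limit, 1000))  # same clamp as the handler
    entries: List[object] = []
    for idx, line in enumerate(lines):
        if idx < start:
            continue
        if len(entries) >= limit:
            break
        if not line.strip():
            continue  # blank lines consume an index but produce no entry
        try:
            entries.append(json.loads(line))
        except json.JSONDecodeError:
            continue  # unparseable leaves are skipped silently, as in the handler
    return {"start": start, "count": len(entries), "entries": entries}
```

If the leaf file can ever contain blank or malformed lines, `start + count` is not a reliable next cursor. With the leaves `['{"i":0}', '', '{"i":2}']` and `limit=2`, the first page ends at line index 2 but reports `count` 2, so resuming at index 2 would return `{"i":2}` a second time.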
| @@ -0,0 +1,17 @@ | ||
| + | //! GET /.well-known/oversight-registry - registry identity advertisement. | |
| + | ||
| + | use axum::extract::State; | |
| + | use axum::Json; | |
| + | use std::sync::Arc; | |
| + | ||
| + | use crate::error::Result; | |
| + | use crate::AppState; | |
| + | ||
| + | pub async fn well_known(State(state): State<Arc<AppState>>) -> Result<Json<serde_json::Value>> { | |
| + | Ok(Json(serde_json::json!({ | |
| + | "ed25519_pub": state.identity.as_ref().map(|i| i.ed25519_pub.as_str()), | |
| + | "version": crate::VERSION, | |
| + | "jurisdiction": std::env::var("OVERSIGHT_JURISDICTION").unwrap_or_else(|_| "GLOBAL".into()), | |
| + | "tlog_size": state.tlog.size(), | |
| + | }))) | |
| + | } |
| @@ -0,0 +1,229 @@ | ||
| + | """Generate a hybrid (OSGT-HYBRID-v1) .sealed sample + matching identity JSON. | |
| + | ||
| + | Self-contained: depends on `cryptography` and `oqs` (liboqs-python). Mirrors the | |
| + | binary container format from oversight_core/container.py and the hybrid wrap | |
| + | construction from oversight_core/crypto.py:hybrid_wrap_dek, so the produced | |
| + | sample is byte-compatible with the production reference implementation. | |
| + | ||
| + | Usage (from any host where `oqs` is installed): | |
| + | python3 gen_hybrid_sample.py --out-dir ./out | |
| + | ||
| + | Outputs: | |
| + | out/tutorial-hybrid.sealed - viewer test fixture | |
| + | out/tutorial-hybrid-identity.json - recipient X25519 + ML-KEM-768 priv/pub | |
| + | ||
| + | The identity is a public test key; NEVER use it for real content. | |
| + | """ | |
| + | from __future__ import annotations | |
| + | ||
| + | import argparse | |
| + | import hashlib | |
| + | import json | |
| + | import os | |
| + | import struct | |
| + | import sys | |
| + | from pathlib import Path | |
| + | ||
| + | import oqs | |
| + | from cryptography.hazmat.primitives import hashes, serialization | |
| + | from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey | |
| + | from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey, X25519PublicKey | |
| + | from cryptography.hazmat.primitives.kdf.hkdf import HKDF | |
| + | from cryptography.hazmat.primitives.ciphers.aead import ChaCha20Poly1305 | |
| + | ||
| + | # ---------- format constants (must match oversight_core/container.py) ---------- | |
| + | MAGIC = b"OSGT\x01\x00" | |
| + | FORMAT_VERSION = 1 | |
| + | SUITE_HYBRID_V1_ID = 2 | |
| + | SUITE_HYBRID_V1 = "OSGT-HYBRID-v1" | |
| + | ||
| + | ||
| + | # ---------- XChaCha20-Poly1305 (HChaCha20 + ChaCha20-Poly1305) ----------------- | |
| + | # Python's `cryptography` ships ChaCha20Poly1305 (12-byte nonce, RFC 7539) but | |
| + | # not XChaCha20-Poly1305 (24-byte nonce). We derive a per-message subkey via | |
| + | # HChaCha20 over the first 16 bytes of the 24-byte nonce, then run ChaCha20- | |
| + | # Poly1305 with the remaining 8 bytes (zero-padded to 12). This matches the | |
| + | # construction used by the reference implementation and noble/ciphers. | |
| + | ||
| + | def _hchacha20(key: bytes, nonce16: bytes) -> bytes: | |
| + | assert len(key) == 32 and len(nonce16) == 16 | |
| + | state = bytearray(64) | |
| + | state[0:4] = b"expa"; state[4:8] = b"nd 3"; state[8:12] = b"2-by"; state[12:16] = b"te k" | |
| + | state[16:48] = key | |
| + | state[48:64] = nonce16 | |
| + | s = list(struct.unpack("<16I", bytes(state))) | |
| + | ||
| + | def rotl(v, n): return ((v << n) & 0xFFFFFFFF) | (v >> (32 - n)) | |
| + | def qr(a, b, c, d): | |
| + | s[a] = (s[a] + s[b]) & 0xFFFFFFFF; s[d] = rotl(s[d] ^ s[a], 16) | |
| + | s[c] = (s[c] + s[d]) & 0xFFFFFFFF; s[b] = rotl(s[b] ^ s[c], 12) | |
| + | s[a] = (s[a] + s[b]) & 0xFFFFFFFF; s[d] = rotl(s[d] ^ s[a], 8) | |
| + | s[c] = (s[c] + s[d]) & 0xFFFFFFFF; s[b] = rotl(s[b] ^ s[c], 7) | |
| + | ||
| + | for _ in range(10): | |
| + | qr(0, 4, 8, 12); qr(1, 5, 9, 13); qr(2, 6, 10, 14); qr(3, 7, 11, 15) | |
| + | qr(0, 5, 10, 15); qr(1, 6, 11, 12); qr(2, 7, 8, 13); qr(3, 4, 9, 14) | |
| + | return struct.pack("<8I", s[0], s[1], s[2], s[3], s[12], s[13], s[14], s[15]) | |
| + | ||
| + | ||
| + | def xchacha20poly1305_encrypt(key: bytes, nonce24: bytes, plaintext: bytes, aad: bytes) -> bytes: | |
| + | if len(key) != 32 or len(nonce24) != 24: | |
| + | raise ValueError("xchacha20poly1305 requires 32-byte key and 24-byte nonce") | |
| + | subkey = _hchacha20(key, nonce24[:16]) | |
| + | nonce12 = b"\x00\x00\x00\x00" + nonce24[16:24] | |
| + | return ChaCha20Poly1305(subkey).encrypt(nonce12, plaintext, aad) | |
| + | ||
| + | ||
| + | # ---------- canonical JSON (must match Python json.dumps sort_keys+compact) ---- | |
| + | def canonical_bytes(obj: dict) -> bytes: | |
| + | return json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=True).encode("utf-8") | |
| + | ||
| + | ||
| + | def strip_none(obj): | |
| + | if isinstance(obj, dict): | |
| + | return {k: strip_none(v) for k, v in obj.items() if v is not None} | |
| + | if isinstance(obj, list): | |
| + | return [strip_none(v) for v in obj if v is not None] | |
| + | return obj | |
| + | ||
| + | ||
| + | # ---------- hybrid DEK wrap (mirrors crypto.py:hybrid_wrap_dek) ---------------- | |
| + | def hybrid_wrap_dek(dek: bytes, x25519_pub: bytes, mlkem_pub: bytes) -> tuple[dict, bytes, bytes]: | |
| + | eph = X25519PrivateKey.generate() | |
| + | eph_pub = eph.public_key().public_bytes( | |
| + | encoding=serialization.Encoding.Raw, format=serialization.PublicFormat.Raw | |
| + | ) | |
| + | peer_x = X25519PublicKey.from_public_bytes(x25519_pub) | |
| + | ss_x = eph.exchange(peer_x) | |
| + | ||
| + | with oqs.KeyEncapsulation("ML-KEM-768") as kem: | |
| + | mlkem_ct, ss_pq = kem.encap_secret(mlkem_pub) | |
| + | ||
| + | ikm = ss_x + ss_pq + eph_pub + mlkem_ct  # X-wing-style: binds both shared secrets and both ephemeral inputs | |
| + | kek = HKDF( | |
| + | algorithm=hashes.SHA256(), length=32, salt=None, | |
| + | info=b"oversight-hybrid-v1-dek-wrap", | |
| + | ).derive(ikm) | |
| + | ||
| + | nonce = os.urandom(24) | |
| + | wrapped = xchacha20poly1305_encrypt(kek, nonce, dek, aad=b"oversight-hybrid-dek") | |
| + | return ({ | |
| + | "suite": SUITE_HYBRID_V1, | |
| + | "x25519_ephemeral_pub": eph_pub.hex(), | |
| + | "mlkem_ciphertext": mlkem_ct.hex(), | |
| + | "nonce": nonce.hex(), | |
| + | "wrapped_dek": wrapped.hex(), | |
| + | }, mlkem_ct, eph_pub) | |
| + | ||
| + | ||
| + | # ---------- main -------------------------------------------------------------- | |
| + | def main() -> int: | |
| + | p = argparse.ArgumentParser() | |
| + | p.add_argument("--out-dir", required=True, type=Path) | |
| + | p.add_argument("--message", default="hello hybrid post-quantum oversight\n") | |
| + | args = p.parse_args() | |
| + | args.out_dir.mkdir(parents=True, exist_ok=True) | |
| + | ||
| + | # Recipient identity | |
| + | rx_priv = X25519PrivateKey.generate() | |
| + | rx_priv_bytes = rx_priv.private_bytes( | |
| + | encoding=serialization.Encoding.Raw, | |
| + | format=serialization.PrivateFormat.Raw, | |
| + | encryption_algorithm=serialization.NoEncryption(), | |
| + | ) | |
| + | rx_pub_bytes = rx_priv.public_key().public_bytes( | |
| + | encoding=serialization.Encoding.Raw, format=serialization.PublicFormat.Raw | |
| + | ) | |
| + | with oqs.KeyEncapsulation("ML-KEM-768") as kem: | |
| + | mlkem_pub = kem.generate_keypair() | |
| + | mlkem_priv = kem.export_secret_key() | |
| + | ||
| + | # Issuer Ed25519 (no ML-DSA: viewer.js verifies Ed25519 only; ML-DSA stays "") | |
| + | issuer = Ed25519PrivateKey.generate() | |
| + | issuer_pub_bytes = issuer.public_key().public_bytes( | |
| + | encoding=serialization.Encoding.Raw, format=serialization.PublicFormat.Raw | |
| + | ) | |
| + | ||
| + | # Plaintext + DEK | |
| + | plaintext = args.message.encode("utf-8") | |
| + | content_hash = hashlib.sha256(plaintext).hexdigest() | |
| + | canonical_content_hash = content_hash # 1:1 for plain text without canonicalization | |
| + | dek = os.urandom(32) | |
| + | ||
| + | # Outer AEAD: the DEK encrypts the plaintext; AAD is the hex content_hash encoded as ASCII | |
| + | aead_nonce = os.urandom(24) | |
| + | ciphertext = xchacha20poly1305_encrypt( | |
| + | dek, aead_nonce, plaintext, aad=content_hash.encode("ascii") | |
| + | ) | |
| + | ||
| + | # Hybrid wrap of DEK | |
| + | wrapped_dek, _mlkem_ct, _eph_pub = hybrid_wrap_dek(dek, rx_pub_bytes, mlkem_pub) | |
| + | ||
| + | # Manifest | |
| + | manifest = { | |
| + | "suite": SUITE_HYBRID_V1, | |
| + | "format": "oversight/v1", | |
| + | "issuer_id": "tutorial-hybrid@oversightprotocol.dev", | |
| + | "issuer_ed25519_pub": issuer_pub_bytes.hex(), | |
| + | "issuer_ml_dsa_pub": "", | |
| + | "recipient": { | |
| + | "id": "tutorial@oversightprotocol.dev", | |
| + | "x25519_pub": rx_pub_bytes.hex(), | |
| + | "mlkem_pub": mlkem_pub.hex(), | |
| + | }, | |
| + | "content_type": "text/plain", | |
| + | "content_hash": content_hash, | |
| + | "canonical_content_hash": canonical_content_hash, | |
| + | "l3_policy": {"enabled": False, "mode": "off"}, | |
| + | "filename": "hello-hybrid.txt", | |
| + | "signature_ed25519": "", | |
| + | "signature_ml_dsa": "", | |
| + | } | |
| + | ||
| + | # Sign canonical bytes (with both signature fields blanked) | |
| + | manifest_for_sign = strip_none(manifest) | |
| + | manifest_for_sign["signature_ed25519"] = "" | |
| + | manifest_for_sign["signature_ml_dsa"] = "" | |
| + | sig_bytes = issuer.sign(canonical_bytes(manifest_for_sign)) | |
| + | manifest["signature_ed25519"] = sig_bytes.hex() | |
| + | ||
| + | manifest_serialized = canonical_bytes(strip_none(manifest)) | |
| + | wrapped_dek_serialized = canonical_bytes(wrapped_dek) | |
| + | ||
| + | # Build container per oversight_core/container.py | |
| + | container = bytearray() | |
| + | container.extend(MAGIC) | |
| + | container.extend(bytes([FORMAT_VERSION, SUITE_HYBRID_V1_ID])) | |
| + | container.extend(struct.pack(">I", len(manifest_serialized))) | |
| + | container.extend(manifest_serialized) | |
| + | container.extend(struct.pack(">I", len(wrapped_dek_serialized))) | |
| + | container.extend(wrapped_dek_serialized) | |
| + | container.extend(aead_nonce) | |
| + | container.extend(struct.pack(">I", len(ciphertext))) | |
| + | container.extend(ciphertext) | |
| + | ||
| + | sealed_path = args.out_dir / "tutorial-hybrid.sealed" | |
| + | identity_path = args.out_dir / "tutorial-hybrid-identity.json" | |
| + | ||
| + | sealed_path.write_bytes(bytes(container)) | |
| + | identity = { | |
| + | "recipient_id": "tutorial@oversightprotocol.dev", | |
| + | "x25519_priv": rx_priv_bytes.hex(), | |
| + | "x25519_pub": rx_pub_bytes.hex(), | |
| + | "mlkem_priv": mlkem_priv.hex(), | |
| + | "mlkem_pub": mlkem_pub.hex(), | |
| + | "ed25519_priv": "public-tutorial-key-does-not-sign", | |
| + | "ed25519_pub": "public-tutorial-key-does-not-sign", | |
| + | "_note": "PUBLIC TUTORIAL KEY. Demo-only. Do not use for real content.", | |
| + | } | |
| + | identity_path.write_text(json.dumps(identity, indent=2), encoding="utf-8") | |
| + | ||
| + | print(f"[+] wrote {sealed_path} ({sealed_path.stat().st_size} bytes)") | |
| + | print(f"[+] wrote {identity_path}") | |
| + | print(f" plaintext SHA-256: {content_hash}") | |
| + | print(f" suite: {SUITE_HYBRID_V1}") | |
| + | return 0 | |
| + | ||
| + | ||
| + | if __name__ == "__main__": | |
| + | sys.exit(main()) |
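
The container layout assembled in `main()` above (magic, two header bytes, length-prefixed manifest, length-prefixed wrapped DEK, 24-byte AEAD nonce, length-prefixed ciphertext) can be read back with a few `struct` calls. The following is a minimal sketch under stated assumptions, not the project's parser: `parse_container` is a hypothetical helper, and the magic value and header bytes in the demo are placeholders, since the real `MAGIC`, `FORMAT_VERSION`, and `SUITE_HYBRID_V1_ID` are defined elsewhere in the generator.

```python
import json
import struct


def parse_container(blob: bytes, magic: bytes) -> dict:
    """Split a sealed container into its fields (hypothetical helper)."""
    if not blob.startswith(magic):
        raise ValueError("bad magic")
    pos = len(magic)
    if len(blob) < pos + 2:
        raise ValueError("truncated header")
    version, suite_id = blob[pos], blob[pos + 1]
    pos += 2

    def take(n: int) -> bytes:
        # Consume exactly n bytes, failing loudly on a short read.
        nonlocal pos
        if pos + n > len(blob):
            raise ValueError("truncated container")
        chunk = blob[pos:pos + n]
        pos += n
        return chunk

    def take_prefixed() -> bytes:
        # Each variable-length field carries a 4-byte big-endian length.
        (length,) = struct.unpack(">I", take(4))
        return take(length)

    manifest = json.loads(take_prefixed())
    wrapped_dek = json.loads(take_prefixed())
    aead_nonce = take(24)  # XChaCha20-Poly1305 nonce, fixed width
    ciphertext = take_prefixed()
    return {
        "version": version,
        "suite_id": suite_id,
        "manifest": manifest,
        "wrapped_dek": wrapped_dek,
        "aead_nonce": aead_nonce,
        "ciphertext": ciphertext,
    }


# Demo with placeholder values (assumptions, not the real constants).
demo_magic = b"DEMO"
demo_manifest = json.dumps({"suite": "demo"}).encode("utf-8")
demo_wrapped = json.dumps({"nonce": "00"}).encode("utf-8")
demo_blob = (
    demo_magic
    + bytes([1, 2])
    + struct.pack(">I", len(demo_manifest)) + demo_manifest
    + struct.pack(">I", len(demo_wrapped)) + demo_wrapped
    + bytes(24)
    + struct.pack(">I", 3) + b"abc"
)
demo_parsed = parse_container(demo_blob, demo_magic)
```

Reading the fields in the same order they were written keeps the parser free of offsets tables; a truncated file surfaces as a `ValueError` rather than a silently short slice.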
| @@ -0,0 +1,46 @@ | ||
| + | // End-to-end test of viewer.js hybrid decrypt against the generated sample. | |
| + | // Run with: | |
| + | // node tools/test_hybrid_decrypt_node.mjs --viewer-dir ../site/viewer | |
| + | import { readFileSync } from 'node:fs'; | |
| + | import { join, resolve } from 'node:path'; | |
| + | import { pathToFileURL } from 'node:url'; | |
| + | ||
| + | function argValue(name, fallback) { | |
| + | const idx = process.argv.indexOf(name); | |
| + | if (idx >= 0 && idx + 1 < process.argv.length) return process.argv[idx + 1]; | |
| + | return fallback; | |
| + | } | |
| + | ||
| + | const viewerDir = resolve(argValue('--viewer-dir', process.env.OVERSIGHT_VIEWER_DIR || 'site/viewer')); | |
| + | const sealedPath = resolve(argValue('--sealed', join(viewerDir, 'samples/tutorial-hybrid.sealed'))); | |
| + | const identityPath = resolve(argValue('--identity', join(viewerDir, 'samples/tutorial-hybrid-identity.json'))); | |
| + | ||
| + | const VIEWER = pathToFileURL(join(viewerDir, 'viewer.js')).href; | |
| + | const NOBLE_CHACHA = pathToFileURL(join(viewerDir, 'vendor/noble-ciphers-chacha-1.3.0.js')).href; | |
| + | const NOBLE_MLKEM = pathToFileURL(join(viewerDir, 'vendor/noble-post-quantum-ml-kem-0.6.1.js')).href; | |
| + | ||
| + | const { parseSealed, verifyManifestSignature, decryptSealed } = await import(VIEWER); | |
| + | const { xchacha20poly1305 } = await import(NOBLE_CHACHA); | |
| + | const { ml_kem768 } = await import(NOBLE_MLKEM); | |
| + | ||
| + | const sealedBuf = readFileSync(sealedPath); | |
| + | const identity = JSON.parse(readFileSync(identityPath, 'utf8')); | |
| + | ||
| + | const parsed = parseSealed(sealedBuf.buffer.slice(sealedBuf.byteOffset, sealedBuf.byteOffset + sealedBuf.byteLength)); | |
| + | console.log('parsed.suiteName :', parsed.suiteName); | |
| + | console.log('parsed.manifest.suite :', parsed.manifest.suite); | |
| + | console.log('parsed.content_hash :', parsed.manifest.content_hash); | |
| + | console.log('parsed.ciphertextLen :', parsed.ciphertextLen); | |
| + | console.log('parsed.wrappedDek keys :', Object.keys(parsed.wrappedDek || {}).join(',')); | |
| + | ||
| + | const sigCheck = await verifyManifestSignature(parsed.manifest); | |
| + | console.log('Ed25519 signature verify:', sigCheck.ok, sigCheck.reason || ''); | |
| + | ||
| + | const plaintext = await decryptSealed(parsed, identity, xchacha20poly1305, ml_kem768); | |
| + | const text = new TextDecoder().decode(plaintext); | |
| + | console.log('decrypted plaintext :', JSON.stringify(text)); | |
| + | ||
| + | const expected = 'hello hybrid post-quantum oversight\n'; // must match the generator's default --message | |
| + | const pass = sigCheck.ok && text === expected; | |
| + | console.log('TEST :', pass ? 'PASS' : 'FAIL'); | |
| + | process.exit(pass ? 0 : 1); |
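
The Ed25519 check in the test above can only pass if the verifier rebuilds the exact bytes the generator signed: the manifest with `None` values dropped, both signature fields blanked, and keys sorted in compact JSON. A minimal Python sketch of that reconstruction follows, mirroring `strip_none` and `canonical_bytes` from the generator. `signing_bytes` is a hypothetical name introduced here, and this is not a description of how `viewer.js` implements the step.

```python
import json


def strip_none(obj):
    """Recursively drop None values from dicts and lists."""
    if isinstance(obj, dict):
        return {k: strip_none(v) for k, v in obj.items() if v is not None}
    if isinstance(obj, list):
        return [strip_none(v) for v in obj if v is not None]
    return obj


def canonical_bytes(obj: dict) -> bytes:
    """Sorted-key, compact, ASCII-only JSON encoding."""
    return json.dumps(
        obj, sort_keys=True, separators=(",", ":"), ensure_ascii=True
    ).encode("utf-8")


def signing_bytes(manifest: dict) -> bytes:
    """Bytes covered by the signature (hypothetical helper name)."""
    unsigned = strip_none(manifest)
    unsigned["signature_ed25519"] = ""
    unsigned["signature_ml_dsa"] = ""
    return canonical_bytes(unsigned)


# A signed manifest and its unsigned form yield identical signing bytes,
# regardless of key order or None-valued fields.
signed = {
    "suite": "X",
    "signature_ed25519": "ff",
    "content_hash": "ab",
    "signature_ml_dsa": "",
    "filename": None,
}
unsigned = {"content_hash": "ab", "suite": "X"}
same = signing_bytes(signed) == signing_bytes(unsigned)
```

Because the signature fields are blanked rather than removed, a verifier that deletes them instead would compute different bytes and reject a valid manifest.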